RefSeq Release 220

RefSeq release 220 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of September 5, 2023, this full release incorporates genomic, transcript, and protein data containing:

  • 391,350,361 records
  • 289,333,423 proteins
  • 56,423,426 RNAs
  • sequences from 141,099 organisms 
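To put these totals in proportion, here is a minimal Python sketch of the arithmetic. It assumes that the protein and RNA counts are disjoint subsets of the overall record count, which the announcement does not state explicitly; under that assumption the remainder would be the records that are neither protein nor RNA (presumably genomic). The function names are illustrative only.

```python
# Counts quoted in the release announcement (as of September 5, 2023).
TOTAL_RECORDS = 391_350_361
PROTEINS = 289_333_423
RNAS = 56_423_426


def remaining_records(total: int, *parts: int) -> int:
    """Return how many records are left after subtracting the listed parts.

    Assumption: the parts are disjoint subsets of the total.
    """
    return total - sum(parts)


def share(part: int, total: int) -> float:
    """Return part as a percentage of total, rounded to one decimal place."""
    return round(100 * part / total, 1)


other = remaining_records(TOTAL_RECORDS, PROTEINS, RNAS)
print(f"Records that are neither protein nor RNA: {other:,}")
print(f"Protein share of all records: {share(PROTEINS, TOTAL_RECORDS)}%")
```

If the categories overlap or other record types are counted separately, the remainder should not be read as a genomic record count.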

The release is provided in several directories as a complete dataset and divided by logical groupings.
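As a rough illustration of how the directory layout can be addressed, the sketch below builds the URL of one release directory. The base URL and the example directory names (`complete`, `viral`) are assumptions about the FTP site layout and are not taken from this announcement, so check the site itself before relying on them. The helper name `release_dir_url` is hypothetical.

```python
from urllib.parse import urljoin

# Assumption: base location of the RefSeq release area on the NCBI FTP site.
REFSEQ_RELEASE_BASE = "https://ftp.ncbi.nlm.nih.gov/refseq/release/"


def release_dir_url(group: str) -> str:
    """Build the URL of one release directory.

    `group` is a directory name such as "complete" (the full dataset) or
    "viral" (one logical grouping); both names are assumed, not confirmed
    by the announcement.
    """
    name = group.strip("/")
    if not name:
        raise ValueError("group must be a non-empty directory name")
    return urljoin(REFSEQ_RELEASE_BASE, name + "/")


for example_group in ("complete", "viral"):
    print(release_dir_url(example_group))
```

The function only constructs strings and makes no network request, so it is safe to run offline; fetching and parsing the directory contents is left out on purpose.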

Updates & announcements

New eukaryotic genome annotations

This release includes new annotations generated by NCBI’s eukaryotic genome annotation pipeline for 35 species.

Future changes

The Eukaryotic Genome Annotation Pipeline software was recently updated to version 10.2, and release notes for the update are available. The updated processes and reporting will apply to new annotations in the next release (November 2023).

Stay up to date

RefSeq is part of the NIH Comparative Genomics Resource (CGR). CGR facilitates reliable comparative genomics analyses for all eukaryotic organisms through an NCBI Toolkit and community collaboration. Follow us on social media (@NCBI) and join our mailing list to keep up to date with RefSeq and other CGR news.

Questions?

If you have questions or would like to provide feedback, please reach out to us! 
