RefSeq Release 217

RefSeq release 217 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of March 8, 2023, this full release incorporates genomic, transcript, and protein data, containing:

  • 348,351,219 records
  • 254,500,694 proteins
  • 50,975,429 RNAs
  • sequences from 130,837 organisms

The release is available in several directories, both as a complete dataset and divided into logical groupings.

Updates & announcements

Prokaryote phylum names
As previously announced, NCBI Taxonomy began updating phylum names for prokaryotes in January 2023. Long-used informal phylum names (e.g., Firmicutes, Proteobacteria) were changed to newly formalized names (e.g., Bacillota and Pseudomonadota, respectively). This update affected over 40 NCBI TaxIDs at phylum rank. The rollout of new phylum names is now complete! The flatfiles in this release contain the new phylum names.
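
If you maintain scripts that match lineage strings against the old phylum names, a small lookup table may help during the transition. The following is a minimal sketch only: it covers just the two renames named above (the full update touched over 40 phylum-rank TaxIDs), and the function name `update_phylum_names` and the sample lineage string are hypothetical illustrations, not part of any NCBI tool.

```python
import re

# Only the two renames mentioned in this announcement.
# The complete update covers 40+ phylum-rank TaxIDs; extend as needed.
PHYLUM_RENAMES = {
    "Firmicutes": "Bacillota",
    "Proteobacteria": "Pseudomonadota",
}

# Match whole words only, so class names such as "Gammaproteobacteria"
# are left untouched.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, PHYLUM_RENAMES)) + r")\b"
)


def update_phylum_names(lineage: str) -> str:
    """Replace old informal phylum names in a lineage string (hypothetical helper)."""
    return _PATTERN.sub(lambda m: PHYLUM_RENAMES[m.group(1)], lineage)


# Illustrative lineage string, not copied from a specific record.
example = "Bacteria; Firmicutes; Bacilli; Bacillales; Bacillaceae; Bacillus."
print(update_phylum_names(example))
```

The match is case-sensitive and anchored on word boundaries, so only the exact phylum names are rewritten; lower-rank names that merely contain them pass through unchanged.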

New eukaryotic genome annotations
This release includes new annotations generated by NCBI’s eukaryotic genome annotation pipeline for 33 species, including:

