Date: 2023-09-11

RefSeq release 220 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of September 5, 2023, this full release incorporates genomic, transcript, and protein data containing:

391,350,361 records

289,333,423 proteins

56,423,426 RNAs

sequences from 141,099 organisms

The release is provided in several directories, both as a complete dataset and divided into logical groupings.

Updates & announcements

New eukaryotic genome annotations

This release includes new annotations generated by NCBI’s eukaryotic genome annotation pipeline for 35 species, including:

Silvery gibbon, based on updated assembly HMol_V3 (GCF_009828535.3_2023_07)

Cactus mouse, based on new assembly PerEre_H2_v1 (GCF_949786415.1_2023_08)

Black rhinoceros, based on new assembly mDicBic1.mat.cur (GCF_020826845.1_2023_07)

Ahaetulla prasina, based on new assembly ASM2864084v1 (GCF_028640845.1_2023_07)

Tigriopus californicus, based on new assembly Tcal_SD_v2.1 (GCF_007210705.1_2023_08)

Magnolia sinica, based on new assembly MsV1 (GCF_029962835.1_2023_07)
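
The identifiers in parentheses above can be split into their parts programmatically. The following is a minimal sketch, not an NCBI tool: it assumes each identifier is a RefSeq assembly accession (GCF_ plus nine digits), a version number, and a trailing year and month, which is an interpretation of the pattern seen in this list rather than a documented format. The names `AnnotationId` and `parse_annotation_id` are hypothetical.

```python
import re
from typing import NamedTuple


class AnnotationId(NamedTuple):
    """Parts of an identifier such as GCF_009828535.3_2023_07 (hypothetical type)."""
    accession: str  # assembly accession without the version, e.g. GCF_009828535
    version: int    # assembly version, e.g. 3
    year: int       # assumed to be the annotation year
    month: int      # assumed to be the annotation month


# Assumed layout: GCF_<9 digits>.<version>_<YYYY>_<MM>
_PATTERN = re.compile(r"^(GCF_\d{9})\.(\d+)_(\d{4})_(\d{2})$")


def parse_annotation_id(text: str) -> AnnotationId:
    """Split an identifier into its parts, or raise ValueError if it does not match."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized identifier: {text!r}")
    accession, version, year, month = match.groups()
    return AnnotationId(accession, int(version), int(year), int(month))


if __name__ == "__main__":
    # Identifiers copied from the list above.
    for name in ("GCF_009828535.3_2023_07", "GCF_949786415.1_2023_08"):
        print(parse_annotation_id(name))
```

Separating the version from the accession makes it easy to tell an updated assembly (version 3 for the silvery gibbon) from a first-version one (version 1 for the cactus mouse).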

Future changes

The Eukaryotic Genome Annotation Pipeline software was recently updated to version 10.2.

The release notes are available here. The updated processes and reporting will apply to new annotations in the next release (November 2023).

