GenBank Release 259.0 is Available!

GenBank release 259.0 (12/22/2023) is now available on the NCBI FTP site. This release has 27.94 trillion bases and 3.96 billion records.

The current release has:

  • 249,060,436 traditional records containing 2,570,711,588,044 base pairs of sequence data
  • 2,863,228,552 WGS records containing 24,651,580,464,335 base pairs of sequence data
  • 715,803,123 bulk-oriented TSA records containing 668,807,109,326 base pairs of sequence data
  • 132,355,132 bulk-oriented TLS records containing 51,568,356,978 base pairs of sequence data

What’s new?

During the 50 days between the close dates for GenBank releases 258.0 and 259.0, the traditional portion of GenBank grew by 137,320,423,169 base pairs and by 1,282,675 sequence records. We updated 58,447 records during that same period. We added and/or updated an average of 26,822 traditional records per day!
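
The per-day figure follows directly from the numbers in this paragraph. As a minimal sketch of that arithmetic (the constant and function names are illustrative, not part of any NCBI tool), the new and updated record counts are summed and divided by the 50-day window:

```python
# Figures quoted in the paragraph above (traditional GenBank records only).
DAYS_BETWEEN_RELEASES = 50
NEW_RECORDS = 1_282_675
UPDATED_RECORDS = 58_447
NEW_BASE_PAIRS = 137_320_423_169


def daily_average(total: int, days: int) -> int:
    """Return the whole-number daily average (fractional part dropped)."""
    return total // days


# Records that were either added or updated during the window.
records_per_day = daily_average(NEW_RECORDS + UPDATED_RECORDS, DAYS_BETWEEN_RELEASES)
# Base pairs added to the traditional portion during the window.
base_pairs_per_day = daily_average(NEW_BASE_PAIRS, DAYS_BETWEEN_RELEASES)

print(f"{records_per_day:,} records added or updated per day")  # 26,822
print(f"{base_pairs_per_day:,} base pairs added per day")
```

The exact quotient is 26,822.44 records per day, which the announcement reports as 26,822.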

Between releases 258.0 and 259.0, the WGS component of GenBank grew by 1,051,380,577,104 base pairs and by 88,022,953 sequence records. The TSA component of GenBank grew by 8,882,205,015 base pairs and by 14,467,034 sequence records. The TLS component of GenBank grew by 699,949,072 base pairs and by 1,700,564 sequence records.
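
To put the four growth figures side by side, the short sketch below adds them up and reports each component's share of the total base-pair growth. The dictionary and variable names are illustrative; only the numbers come from the text above.

```python
# Base-pair growth between releases 258.0 and 259.0, as quoted above.
GROWTH_BP = {
    "traditional": 137_320_423_169,
    "WGS": 1_051_380_577_104,
    "TSA": 8_882_205_015,
    "TLS": 699_949_072,
}

# Total growth across all four components.
total_growth = sum(GROWTH_BP.values())

# Percentage of the total contributed by each component.
shares = {name: 100 * bp / total_growth for name, bp in GROWTH_BP.items()}

print(f"total growth: {total_growth:,} bp")
for name, pct in shares.items():
    print(f"{name:>11}: {pct:5.2f}%")
```

By this calculation, WGS accounts for close to 88 percent of the base pairs added between the two releases.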

The total number of sequence data files increased by 418 with this release. The divisions are as follows:

  • BCT: 24 new files, now a total of 1,069
  • CON: 1 new file, now a total of 238
  • ENV: 15 new files, now a total of 95
  • EST: 1 new file, now a total of 579
  • INV: 237 new files, now a total of 2,099
  • MAM: 1 new file, now a total of 273
  • PAT: 2 new files, now a total of 263
  • PLN: 66 new files, now a total of 1,713
  • PRI: 19 new files, now a total of 77
  • SYN: 1 new file, now a total of 30
  • VRL: 26 new files, now a total of 1,063
  • VRT: 25 new files, now a total of 509
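
As a quick consistency check, the per-division counts listed above can be summed and compared with the stated increase of 418 files. This is only a sketch of that arithmetic; the dictionary name is illustrative.

```python
# New sequence data files per division, as listed above.
NEW_FILES = {
    "BCT": 24, "CON": 1, "ENV": 15, "EST": 1,
    "INV": 237, "MAM": 1, "PAT": 2, "PLN": 66,
    "PRI": 19, "SYN": 1, "VRL": 26, "VRT": 25,
}

# Sum across divisions; this should match the headline figure.
total_new = sum(NEW_FILES.values())

# Division that contributed the most new files.
largest = max(NEW_FILES, key=NEW_FILES.get)

print(f"{total_new} new files in total")  # 418
print(f"largest contributor: {largest} ({NEW_FILES[largest]} files)")  # INV (237 files)
```

The listed divisions account for all 418 new files, with INV contributing more than half of them.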

Upcoming changes

In collaboration with our partners at the International Nucleotide Sequence Database Collaboration (INSDC), we are renaming the GenBank qualifier “/country” to “/geo_loc_name”. As previously announced, this change takes effect in June 2024 and will better represent the diversity of sample collection location types.
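
If you parse GenBank flat files with your own scripts, one way to prepare for the rename is to accept both qualifier names during the transition. The sketch below is a hypothetical helper, not an NCBI tool, and it assumes the qualifier and its quoted value sit on a single flat-file line (values that wrap onto continuation lines are not handled).

```python
import re
from typing import Optional

# Matches a feature-table qualifier line such as
#   /country="USA: Maryland"   or   /geo_loc_name="USA: Maryland"
# Assumption: the quoted value does not wrap onto a continuation line.
_LOCATION_QUALIFIER = re.compile(r'^\s*/(?:country|geo_loc_name)="([^"]*)"')


def extract_location(qualifier_line: str) -> Optional[str]:
    """Return the location value from either qualifier name, or None."""
    match = _LOCATION_QUALIFIER.match(qualifier_line)
    return match.group(1) if match else None


# Both the old and the new qualifier name yield the same value.
print(extract_location('                     /country="USA: Maryland"'))
print(extract_location('                     /geo_loc_name="USA: Maryland"'))
```

Accepting both names keeps a script working on files produced before and after the change.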

GenBank will also have new allowed values for the “/collection_date” qualifier, effective December 2024.

Additional information

For downloading purposes, please keep in mind that the uncompressed GenBank release 259.0 sequence data flat files require roughly 4,175 GB. The ASN.1 data files require approximately 1,820 GB.
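
Before starting a download, it can help to confirm that the target volume has enough free space for the uncompressed files. The following is a minimal sketch using Python's standard library; the target path is a placeholder, and treating 1 GB as 10^9 bytes is an assumption, since the announcement does not say whether decimal or binary units are meant.

```python
import shutil

# Approximate uncompressed sizes quoted above, in GB.
FLAT_FILES_GB = 4_175
ASN1_GB = 1_820

# Assumption: decimal gigabytes (10**9 bytes).
BYTES_PER_GB = 10**9


def required_bytes(size_gb: int) -> int:
    """Convert a size in GB to bytes under the decimal-GB assumption."""
    return size_gb * BYTES_PER_GB


def has_room(path: str, size_gb: int) -> bool:
    """Return True if the volume holding `path` has at least `size_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes(size_gb)


target = "."  # placeholder: replace with your download directory
for label, size_gb in (("flat files", FLAT_FILES_GB), ("ASN.1 files", ASN1_GB)):
    status = "enough space" if has_room(target, size_gb) else "NOT enough space"
    print(f"{label}: about {size_gb:,} GB needed, {status} on {target!r}")
```

The check covers only the uncompressed sizes, so allow additional headroom if the compressed downloads are kept alongside the extracted files.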

For more information about GenBank release 259.0, see the release notes, as well as the README files in the GenBank and ASN.1 (ncbi-asn1) directories on FTP.

Stay up to date

Follow us on social media @NCBI and join our mailing list to keep up to date with GenBank and other NCBI news.

Questions?

Please send any comments or questions to info@ncbi.nlm.nih.gov.
