GenBank release 243.0

GenBank release 243.0 (5/26/2021) is now available on the NCBI FTP site. This release has 14.03 trillion bases and 2.40 billion records.

The current release has 227,123,201 traditional records containing 832,400,799,511 base pairs of sequence data. There are also 1,590,670,459 WGS records containing 12,732,048,052,023 base pairs of sequence data, 481,154,920 bulk-oriented TSA records containing 425,076,483,459 base pairs of sequence data, and 102,395,753 bulk-oriented TLS records containing 37,998,534,461 base pairs of sequence data. 
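The headline totals can be cross-checked against the four components listed above. The following is only a minimal sketch of that arithmetic; the dictionary layout and variable names are illustrative and not part of any NCBI tooling.

```python
# Record and base-pair counts copied from the paragraph above.
# Each entry is (records, base pairs).
components = {
    "traditional": (227_123_201, 832_400_799_511),
    "WGS": (1_590_670_459, 12_732_048_052_023),
    "TSA": (481_154_920, 425_076_483_459),
    "TLS": (102_395_753, 37_998_534_461),
}

total_records = sum(records for records, _ in components.values())
total_bases = sum(bases for _, bases in components.values())

print(f"{total_records / 1e9:.2f} billion records")
print(f"{total_bases / 1e12:.2f} trillion bases")
```

The sums round to the 2.40 billion records and 14.03 trillion bases quoted for the release.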

Growth between releases

During the 72 days between the close dates for GenBank Releases 242.0 and 243.0, the ‘traditional’ portion of GenBank grew by 56,109,588,405 base pairs and by 881,725 sequence records. During that same period, 68,924 records were updated. An average of 13,203 ‘traditional’ records were added and/or updated per day.
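The daily average appears to follow from the added and updated record counts divided by the number of days between close dates. A short sketch of that calculation, with illustrative variable names:

```python
# Figures copied from the paragraph above.
days_between_close_dates = 72
records_added = 881_725
records_updated = 68_924

# Average number of 'traditional' records added and/or updated per day.
per_day = (records_added + records_updated) / days_between_close_dates
print(round(per_day))
```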

Between releases 242.0 and 243.0, the WGS component of GenBank grew by 461,330,842,411 base pairs and by 26,732,416 sequence records. The TSA component of GenBank grew by 17,471,073,511 base pairs and by 18,003,920 sequence records. The TLS component of GenBank grew by 4,364,411,466 base pairs and by 12,265,192 sequence records.

The total number of sequence data files increased by 186 with this release. The new files are distributed across the divisions as follows:

  • BCT: 31 new files, now a total of 588
  • CON: 1 new file, now a total of 221
  • ENV: 1 new file, now a total of 65
  • EST: 1 new file, now a total of 575
  • INV: 67 new files, now a total of 271
  • MAM: 15 new files, now a total of 91
  • PAT: 2 new files, now a total of 230
  • PHG: 1 new file, now a total of 5
  • PLN: 35 new files, now a total of 657
  • SYN: 1 new file, now a total of 29
  • VRL: 27 new files, now a total of 76
  • VRT: 4 new files, now a total of 259
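
As a quick consistency check, the per-division counts above can be summed and compared with the stated increase of 186 files. This is a minimal sketch; the dictionary simply restates the list, and the names are illustrative.

```python
# New sequence data files per division, copied from the list above.
new_files = {
    "BCT": 31, "CON": 1, "ENV": 1, "EST": 1,
    "INV": 67, "MAM": 15, "PAT": 2, "PHG": 1,
    "PLN": 35, "SYN": 1, "VRL": 27, "VRT": 4,
}

total_new_files = sum(new_files.values())
print(total_new_files)

# Divisions ordered by how many files they gained, largest first.
for division, count in sorted(new_files.items(), key=lambda kv: -kv[1]):
    print(f"{division}: {count}")
```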

Delay in GenBank 243.0

The GenBank 243.0 data files became available more than one month later than usual. Close-of-data occurred about two weeks later than normal, on April 26, 2021. The release files were then made available four weeks later, on May 26, 2021. We regret any inconvenience caused by the delay.

Additional Information

For downloading purposes, please keep in mind that the uncompressed GenBank release 243.0 sequence data flatfiles require roughly 1,725 GB. The ASN.1 data files require approximately 1,037 GB.
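
Before starting a download, it may help to confirm that the target filesystem has enough free space. The sketch below uses Python's standard `shutil.disk_usage`; the helper name `has_room`, the choice of the current directory as the target, and the assumption that 1 GB means 10^9 bytes are all illustrative and not stated in the release notes.

```python
import shutil

# Approximate uncompressed sizes, copied from the paragraph above.
FLATFILE_GB = 1_725  # sequence data flatfiles
ASN1_GB = 1_037      # ASN.1 data files

# Assumption: decimal gigabytes (10**9 bytes); the source does not say.
BYTES_PER_GB = 10**9


def has_room(path: str, required_gb: int) -> bool:
    """Return True if the filesystem holding `path` has at least
    `required_gb` gigabytes free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * BYTES_PER_GB


# "." is a stand-in for the intended download directory.
for label, size_gb in [("flatfiles", FLATFILE_GB), ("ASN.1", ASN1_GB)]:
    status = "enough space" if has_room(".", size_gb) else "not enough space"
    print(f"{label} (~{size_gb} GB): {status}")
```

Because the quoted sizes are rough, leaving some headroom beyond them is sensible.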

More information about GenBank release 243.0 is available in the release notes, as well as in the README files in the GenBank and ASN.1 (ncbi-asn1) directories on FTP.
