Now over 3 billion records!
Growth between releases
During the 63 days between the close dates for GenBank Releases 251.0 and 252.0, the traditional portion of GenBank grew by 70,162,662,354 basepairs and by 623,496 sequence records. During that same period, 25,466 records were updated. An average of 10,301 traditional records were added and/or updated per day.
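The per-day average quoted above can be reproduced from the other figures in the same paragraph. The sketch below assumes the average is the sum of new and updated records divided by the 63 days, rounded to the nearest whole record. The function and constant names are illustrative and not part of any GenBank tooling.

```python
# Figures quoted in the release notes for the traditional portion of GenBank.
DAYS_BETWEEN_RELEASES = 63
NEW_RECORDS = 623_496
UPDATED_RECORDS = 25_466


def average_per_day(new_records: int, updated_records: int, days: int) -> int:
    """Return records added and/or updated per day, rounded to the nearest integer.

    Assumption: the published average is (new + updated) / days.
    """
    return round((new_records + updated_records) / days)


daily_average = average_per_day(NEW_RECORDS, UPDATED_RECORDS, DAYS_BETWEEN_RELEASES)
print(f"{daily_average:,} records added and/or updated per day")
```

Under that assumption the result agrees with the 10,301 records per day stated above.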
Between Releases 251.0 and 252.0, the WGS component of GenBank grew by 720,151,132,199 basepairs and by 143,800,629 sequence records. The TSA component of GenBank grew by 13,975,407,571 basepairs and by 13,823,250 sequence records. The TLS component of GenBank grew by 8,232,104 basepairs and by 19,779 sequence records.
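For readers who want a single combined figure, the growth numbers for the four components can be summed. This is a minimal sketch using only the values quoted above. The combined totals are derived here and are not stated in the release notes, and the dictionary layout is an illustrative choice.

```python
# Growth between Releases 251.0 and 252.0, as quoted in the text:
# component -> (basepairs added, sequence records added)
growth = {
    "traditional": (70_162_662_354, 623_496),
    "WGS": (720_151_132_199, 143_800_629),
    "TSA": (13_975_407_571, 13_823_250),
    "TLS": (8_232_104, 19_779),
}

# Sum each column across all components.
total_basepairs = sum(basepairs for basepairs, _ in growth.values())
total_records = sum(records for _, records in growth.values())

# Share of the new basepairs contributed by each component.
basepair_share = {
    name: basepairs / total_basepairs for name, (basepairs, _) in growth.items()
}

print(f"Combined growth: {total_basepairs:,} basepairs, {total_records:,} records")
for name, share in basepair_share.items():
    print(f"  {name}: {share:.2%} of new basepairs")
```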
The total number of sequence data files increased by 216 with this release. The changes by division are as follows:
- BCT: 37 new files, now a total of 857
- CON: 28 files removed, now a total of 231
- ENV: 3 new files, now a total of 75
- INV: 99 new files, now a total of 965
- PLN: 61 new files, now a total of 1013
- VRL: 39 new files, now a total of 813
- VRT: 5 new files, now a total of 320
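The net change of 216 files can be checked against the per-division figures in the list above. The sketch below treats removed files as negative changes. The previous-release counts it prints are derived by subtraction and are not stated in the release notes.

```python
# Per-division change in file count and the new total, from the list above.
# Removed files (CON) are recorded as a negative change.
divisions = {
    "BCT": (37, 857),
    "CON": (-28, 231),
    "ENV": (3, 75),
    "INV": (99, 965),
    "PLN": (61, 1013),
    "VRL": (39, 813),
    "VRT": (5, 320),
}

# Net change across all listed divisions.
net_change = sum(change for change, _ in divisions.values())

# Implied file count per division in the previous release (derived, not quoted).
previous_totals = {name: total - change for name, (change, total) in divisions.items()}

print(f"Net change in sequence data files: {net_change:+d}")
for name, previous in previous_totals.items():
    print(f"  {name}: {previous} -> {divisions[name][1]}")
```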
Sequence data file notes
With GenBank Release 249.0 in April 2022, we noticed an unusually large increase of 36 sequence flatfiles for the CON-division. The increase was caused by “external annotation” that was erroneously incorporated into the ASN.1 version of 174 WGS-associated chromosomal scaffolds within a set of CON records.
The rendering and content of these 174 records in the GenBank flatfile representation were not negatively impacted by this error. However, customers who use the ASN.1 representation of GenBank records would have seen a dramatic increase in the size of these records.
We corrected the problem with this October 2022 GenBank Release 252.0, and the overall number of CON-division files has decreased. We apologize for any difficulties this caused.
For downloading purposes, please keep in mind that the uncompressed GenBank Release 252.0 sequence data flatfiles require roughly 2,815 GB. The ASN.1 data files require approximately 1,432 GB.
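Before starting a download, it may help to confirm that the target volume has enough free space. The sketch below uses Python's standard `shutil.disk_usage` and the sizes quoted above. It assumes 1 GB means 10^9 bytes, which the release notes do not specify, and the `has_room` helper is a hypothetical name used only for illustration.

```python
import shutil
from typing import Optional

# Approximate uncompressed sizes quoted in the release notes.
FLATFILE_GB = 2_815
ASN1_GB = 1_432

# Assumption: 1 GB = 10**9 bytes (the release notes do not define the unit).
BYTES_PER_GB = 10**9


def has_room(path: str, required_gb: int, free_bytes: Optional[int] = None) -> bool:
    """Return True if the volume holding `path` has at least `required_gb` free.

    `free_bytes` can be passed explicitly to check a known value;
    otherwise the free space is read from the filesystem.
    """
    if free_bytes is None:
        free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * BYTES_PER_GB


if __name__ == "__main__":
    for label, size_gb in (("flatfiles", FLATFILE_GB), ("ASN.1", ASN1_GB)):
        status = "enough" if has_room(".", size_gb) else "not enough"
        print(f"{label}: about {size_gb:,} GB needed, {status} free space here")
```

The check covers only the uncompressed data. Space for the compressed archives during extraction would need to be budgeted separately.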