GenBank release 235.0 (12/11/2019) is now available on the NCBI FTP site. This release has 7 trillion bases and 1.74 billion records.
The current release has 215,333,020 traditional records containing 388,417,258,009 base pairs of sequence data. There are also 1,127,023,870 WGS records containing 6,277,551,200,690 base pairs of sequence data, 367,193,844 bulk-oriented TSA records containing 325,433,016,129 base pairs of sequence data, and 28,227,180 bulk-oriented TLS records containing 11,280,596,614 base pairs of sequence data.
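The headline totals can be cross-checked by summing the four components listed above. The following is a minimal sketch; the dictionary layout and variable names are illustrative only and are not part of any NCBI tooling.

```python
# Per-component figures copied from the release summary above:
# (sequence records, base pairs)
components = {
    "traditional": (215_333_020, 388_417_258_009),
    "WGS": (1_127_023_870, 6_277_551_200_690),
    "TSA": (367_193_844, 325_433_016_129),
    "TLS": (28_227_180, 11_280_596_614),
}

total_records = sum(records for records, _ in components.values())
total_bases = sum(bases for _, bases in components.values())

print(f"records: {total_records:,}")  # records: 1,737,777,914
print(f"bases:   {total_bases:,}")    # bases:   7,002,682,071,442
```

The sums round to the 1.74 billion records and 7 trillion bases quoted for the release.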
Growth between releases
During the 54 days between the close dates for GenBank releases 234.0 and 235.0, the traditional portion of GenBank grew by 2,220,239,471 base pairs and decreased by 1,430,686 sequence records.* During that same period, 40,779 records were updated. An average of 755 traditional records were added and/or updated per day.
*Please see section 1.3.2 of the GenBank 235.0 release notes for more information about the overall decrease in the number of traditional sequence records.
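The per-day figure can be reproduced from the numbers above. This sketch assumes the average is simply the count of updated records divided by the days between close dates, since the net record count fell over the period. That reading is an interpretation, not something the release notes state here, and the variable names are illustrative.

```python
# Figures copied from the paragraph above.
days_between_releases = 54
records_updated = 40_779
base_pair_growth = 2_220_239_471

# Assumption: the reported daily average counts updated records only,
# because the traditional record count decreased overall.
updates_per_day = records_updated / days_between_releases
base_pairs_per_day = base_pair_growth / days_between_releases

print(round(updates_per_day))  # 755
print(f"{base_pairs_per_day:,.0f} base pairs added per day on average")
```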
Between releases 234.0 and 235.0, the WGS component of GenBank grew by 399,327,917,868 base pairs and by 22,356,959 sequence records. The TSA component of GenBank grew by 10,644,726,229 base pairs and by 11,463,344 sequence records. The TLS component of GenBank grew by 316,654,540 base pairs and by 1,097,033 sequence records.
The total number of sequence data files increased by 7 with this release. The changes by division are as follows:
- BCT: 14 new files, now a total of 401
- CON: 1 new file, now a total of 212
- INV: 33 fewer files, now a total of 80
- MAM: 6 new files, now a total of 39
- PAT: 1 new file, now a total of 202
- ROD: 16 new files, now a total of 34
- VRL: 1 new file, now a total of 35
- VRT: 1 new file, now a total of 166
For downloading purposes, please keep in mind that the uncompressed GenBank release 235.0 flatfiles require roughly 1091 GB (sequence files only). The ASN.1 data require approximately 827 GB.
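Before starting a download, it may help to confirm that the target volume has enough free space for the uncompressed data. The sketch below uses Python's standard `shutil.disk_usage`. The helper names `required_bytes` and `has_room` are hypothetical, and treating 1 GB as 10^9 bytes is an assumption, since the announcement does not say which unit convention it uses.

```python
import shutil

# Sizes quoted above for the uncompressed data, in GB.
FLATFILE_GB = 1091
ASN1_GB = 827
BYTES_PER_GB = 10**9  # assumption: decimal gigabytes


def required_bytes(include_flatfiles=True, include_asn1=False):
    """Return the uncompressed size in bytes of the selected data sets."""
    total_gb = 0
    if include_flatfiles:
        total_gb += FLATFILE_GB
    if include_asn1:
        total_gb += ASN1_GB
    return total_gb * BYTES_PER_GB


def has_room(path, needed_bytes):
    """Return True if the volume holding `path` has at least `needed_bytes` free."""
    return shutil.disk_usage(path).free >= needed_bytes


if __name__ == "__main__":
    needed = required_bytes(include_asn1=True)
    print(f"{needed // BYTES_PER_GB} GB needed; enough room here: {has_room('.', needed)}")
```

Holding both the flatfiles and the ASN.1 data comes to roughly 1918 GB under this assumption, before allowing for the compressed archives themselves.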