GenBank release 247.0 (12/19/2021) is now available on the NCBI FTP site. This release has 16.47 trillion bases and 2.59 billion records.
The current release has 234,557,297 traditional records containing 1,053,275,115,030 base pairs of sequence data. There are also 1,734,664,952 WGS records containing 14,922,033,922,302 base pairs of sequence data, 514,158,576 bulk-oriented TSA records containing 455,870,853,358 base pairs of sequence data, and 109,379,021 bulk-oriented TLS records containing 41,143,480,750 base pairs of sequence data.
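The headline totals can be checked against the per-component figures. The following is a minimal sketch that sums the numbers quoted above; the dictionary layout and variable names are illustrative and not part of any NCBI tool.

```python
# Per-component figures copied from the text above: (records, base pairs).
components = {
    "traditional": (234_557_297, 1_053_275_115_030),
    "WGS": (1_734_664_952, 14_922_033_922_302),
    "TSA": (514_158_576, 455_870_853_358),
    "TLS": (109_379_021, 41_143_480_750),
}

total_records = sum(records for records, _ in components.values())
total_bases = sum(bases for _, bases in components.values())

# Express the sums in the same units as the headline figures.
print(f"{total_records:,} records (~{total_records / 1e9:.2f} billion)")
print(f"{total_bases:,} bases (~{total_bases / 1e12:.2f} trillion)")
```

The sums come to 2,592,759,846 records and 16,472,323,371,440 bases, which round to the 2.59 billion records and 16.47 trillion bases stated for the release.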
Growth between releases
During the 46 days between the close dates for GenBank releases 246.0 and 247.0, the ‘traditional’ portion of GenBank grew by 38,511,362,917 base pairs and by 914,404 sequence records. During that same period, 76,765 records were updated. An average of 21,547 ‘traditional’ records were added and/or updated per day.
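The daily average follows from the added and updated record counts divided by the number of days between close dates. A short sketch of that arithmetic, assuming added and updated records are simply summed (variable names are illustrative):

```python
days_between_releases = 46
records_added = 914_404
records_updated = 76_765

# Added and updated records are combined, then averaged over the interval.
records_touched = records_added + records_updated
per_day = records_touched / days_between_releases

print(f"{records_touched:,} records over {days_between_releases} days")
print(f"about {round(per_day):,} records per day")
```

This gives 991,169 records over 46 days, or about 21,547 per day, matching the stated average.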
Between releases 246.0 and 247.0, the WGS component of GenBank grew by 322,932,347,755 base pairs and by 13,600,851 sequence records. The TSA component of GenBank grew by 5,979,836,761 base pairs and by 5,839,185 sequence records. The TLS component of GenBank grew by 974,605,935 base pairs and by 1,809,086 sequence records.
The total number of sequence data files increased by 154 with this release. The new files are distributed across divisions as follows:
- BCT: 25 new files, now a total of 688
- CON: 1 new file, now a total of 223
- INV: 27 new files, now a total of 488
- MAM: 17 new files, now a total of 116
- PLN: 5 new files, now a total of 728
- PRI: 1 new file, now a total of 56
- VRL: 78 new files, now a total of 375
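The per-division counts can be summed to confirm the overall increase, and the file counts for the previous release can be derived by subtraction. This is a small sketch using only the numbers in the list above; the derived previous-release counts are a calculation, not figures quoted from the 246.0 release notes.

```python
# Division code -> (new files in this release, total files now).
divisions = {
    "BCT": (25, 688),
    "CON": (1, 223),
    "INV": (27, 488),
    "MAM": (17, 116),
    "PLN": (5, 728),
    "PRI": (1, 56),
    "VRL": (78, 375),
}

new_file_count = sum(new for new, _ in divisions.values())

# Implied file count per division before this release.
previous_counts = {code: total - new for code, (new, total) in divisions.items()}

print(f"new files in this release: {new_file_count}")
for code, count in previous_counts.items():
    print(f"{code}: {count} files before this release")
```

The new files sum to 154, in agreement with the stated increase. Only divisions that gained files appear in the list, so the totals here do not cover every GenBank division.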
For downloading purposes, please keep in mind that the uncompressed GenBank release 247.0 sequence data flatfiles require roughly 2,072 GB. The ASN.1 data files require approximately 1,180 GB.
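Before downloading, it may be useful to check free disk space against these figures. The sketch below uses Python's standard `shutil.disk_usage`; the helper name `has_room` is hypothetical, and it assumes 1 GB means 10^9 bytes, which the release announcement does not specify. It covers only the uncompressed sizes quoted above, not the compressed files as distributed.

```python
import shutil

# Approximate uncompressed sizes quoted for release 247.0, in GB.
FLATFILE_GB = 2_072
ASN1_GB = 1_180


def has_room(path: str, required_gb: float, bytes_per_gb: int = 10**9) -> bool:
    """Return True if the filesystem holding `path` has enough free space.

    Assumption: 1 GB is taken as 10**9 bytes; pass 2**30 to use binary units.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * bytes_per_gb


if __name__ == "__main__":
    target = "."  # hypothetical download directory
    print("flatfiles fit:", has_room(target, FLATFILE_GB))
    print("ASN.1 files fit:", has_room(target, ASN1_GB))
    print("both fit:", has_room(target, FLATFILE_GB + ASN1_GB))
```

Holding both formats uncompressed at once would need roughly 3,252 GB under the same assumption.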