GenBank release 220.0 (6/18/2017) has 201,663,568 traditional records containing 234,997,362,623 base pairs of sequence data. In addition, there are 487,891,767 WGS records containing 2,164,683,993,369 base pairs of sequence data, 176,812,130 TSA records containing 158,112,969,073 base pairs of sequence data, and 1,628,475 TLS records containing 824,191,338 base pairs of sequence data.
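For readers who want a combined figure across the four components, the counts above can be summed as in the minimal sketch below. The dictionary layout and the `totals` helper are illustrative names, not part of any NCBI tool. The numbers are copied from the paragraph above.

```python
# Record and base-pair counts for GenBank release 220.0, as quoted above.
# Each value is (sequence records, base pairs).
components = {
    "traditional": (201_663_568, 234_997_362_623),
    "WGS": (487_891_767, 2_164_683_993_369),
    "TSA": (176_812_130, 158_112_969_073),
    "TLS": (1_628_475, 824_191_338),
}


def totals(data):
    """Return (total records, total base pairs) across all components."""
    records = sum(r for r, _ in data.values())
    bases = sum(b for _, b in data.values())
    return records, bases


if __name__ == "__main__":
    total_records, total_bases = totals(components)
    print(f"records:    {total_records:,}")
    print(f"base pairs: {total_bases:,}")
```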
During the 65 days between the close dates for GenBank releases 219.0 and 220.0, the traditional portion of GenBank grew by 3,172,411,071 base pairs and by 785,684 sequence records. During that same period, 80,500 records were updated – an average of 13,236 records added or updated per day.
Between releases 219.0 and 220.0, the WGS component of GenBank grew by 129,651,353,562 base pairs and by 36,051,620 sequence records. The TSA component grew by 9,074,061,474 base pairs and by 11,743,588 sequence records. The TLS component of GenBank grew by 187,268,043 base pairs and by 190,126 sequence records.
The total number of sequence data files increased by 32 with this release. The divisions are as follows:
- BCT: 20 new files, now a total of 370
- CON: 4 new files, now a total of 363
- EST: 1 fewer file, now a total of 482
- GSS: 1 new file, now a total of 304
- INV: 1 new file, now a total of 154
- PAT: 1 new file, now a total of 291
- PHG: 1 new file, now a total of 4
- PLN: 3 new files, now a total of 148
- VRL: 1 new file, now a total of 49
For downloading purposes, please keep in mind that the uncompressed GenBank release 220.0 flatfiles require roughly 830 GB (sequence files only). The ASN.1 data require approximately 691 GB.
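Before starting a download, it may help to confirm that the target volume has enough free space. The sketch below is one possible check, not an NCBI-provided utility. It assumes the sizes above are decimal gigabytes (10^9 bytes), and the `has_room` helper and its 10% safety margin are illustrative choices.

```python
import shutil

# Approximate uncompressed sizes quoted above, in GB.
# Assumption: 1 GB = 10**9 bytes.
FLATFILE_GB = 830
ASN1_GB = 691


def has_room(free_bytes, required_gb, margin=0.10):
    """Return True if free_bytes covers required_gb plus a safety margin."""
    required_bytes = required_gb * 1e9 * (1 + margin)
    return free_bytes >= required_bytes


if __name__ == "__main__":
    # Check the volume that holds the current working directory.
    free = shutil.disk_usage(".").free
    for label, size_gb in (("flatfiles", FLATFILE_GB), ("ASN.1", ASN1_GB)):
        status = "enough space" if has_room(free, size_gb) else "not enough space"
        print(f"{label}: needs about {size_gb} GB uncompressed, {status}")
```

The margin is there because the quoted sizes are approximate and cover sequence files only. Adjust it, or the path passed to `shutil.disk_usage`, to match the local setup.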
More information about GenBank release 220.0 is available in the release notes, as well as in the README files in the genbank (ftp.ncbi.nih.gov) and ASN.1 (ncbi-asn1) directories.