GenBank release 223.0 (December 15, 2017) has 206,293,625 traditional records (including non-bulk-oriented TSA) containing 249,722,163,594 base pairs of sequence data. In addition, there are 551,063,065 WGS records containing 2,466,098,053,327 base pairs of sequence data, 201,559,502 TSA records containing 181,394,660,188 base pairs of sequence data, and 12,695,198 TLS records containing 4,458,042,616 base pairs of sequence data.
During the 62 days between the close dates for GenBank releases 222.0 and 223.0, the traditional portion of GenBank grew by 4,807,458,126 base pairs and by 2,339,943 sequence records. During that same period, 112,692 records were updated, for an average of 39,559 records added or updated per day.
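The per-day average quoted above can be reproduced from the figures in the same paragraph. The short Python sketch below is only an illustration of that arithmetic; the constant and function names are made up for this example and do not come from any NCBI tool.

```python
# Figures quoted in the paragraph above (GenBank release 222.0 -> 223.0).
DAYS_BETWEEN_RELEASES = 62
RECORDS_ADDED = 2_339_943    # new traditional sequence records
RECORDS_UPDATED = 112_692    # existing records that were updated


def average_per_day(added: int, updated: int, days: int) -> float:
    """Mean number of records added or updated per day (hypothetical helper)."""
    return (added + updated) / days


avg = average_per_day(RECORDS_ADDED, RECORDS_UPDATED, DAYS_BETWEEN_RELEASES)
print(round(avg))  # 39559
```

Rounding to the nearest whole record gives the 39,559 figure stated above.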
Between releases 222.0 and 223.0, the WGS component of GenBank grew by 147,941,691,328 base pairs and by 42,237,734 sequence records. The TSA component grew by 8,485,391,653 base pairs and by 8,804,698 sequence records. The TLS component grew by 1,464,224,301 base pairs and by 3,215,738 sequence records.
The total number of sequence data files increased by 33 with this release. The divisions are as follows:
- BCT: 21 new files, now a total of 428
- CON: 3 new files, now a total of 363
- ENV: 1 new file, now a total of 100
- EST: 3 new files, now a total of 485
- INV: 2 new files, now a total of 159
- PAT: 19 new files, now a total of 320
- PLN: 9 new files, now a total of 166
- PRI: 1 new file, now a total of 58
- VRL: 1 new file, now a total of 51
For downloading purposes, please keep in mind that the uncompressed GenBank release 223.0 flatfiles require roughly 862 GB (sequence files only). The ASN.1 data require approximately 712 GB.
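Before starting a full download, it may be worth checking that the target volume has enough free space. The following Python sketch is one possible way to do that with the standard library. It assumes that "GB" means 10^9 bytes, which the text above does not specify, and the `has_room` helper and the constant names are hypothetical.

```python
import shutil

# Approximate uncompressed sizes quoted above, in GB.
FLATFILE_GB = 862   # GenBank flatfiles, sequence files only
ASN1_GB = 712       # ASN.1 data

# Assumption: 1 GB = 10**9 bytes (the announcement does not say GB vs GiB).
BYTES_PER_GB = 10**9


def has_room(path: str, required_gb: float) -> bool:
    """Return True if the filesystem holding `path` has at least
    `required_gb` gigabytes free (hypothetical helper)."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * BYTES_PER_GB


if __name__ == "__main__":
    for label, size_gb in (("flatfiles", FLATFILE_GB), ("ASN.1", ASN1_GB)):
        status = "enough" if has_room(".", size_gb) else "NOT enough"
        print(f"{label}: need ~{size_gb} GB, {status} free space here")
```

The compressed files are smaller than these figures, so this check is conservative for the download itself and more relevant to the space needed after decompression.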
More information about GenBank release 223.0, including upcoming changes, is available in the release notes, as well as in the README files in the genbank and ASN.1 (ncbi-asn1) directories on FTP.