GenBank release 222.0 is available via FTP, Entrez and BLAST

GenBank release 222.0 (10/14/2017) has 203,953,682 traditional records (including non-bulk-oriented TSA) containing 244,914,705,468 base pairs of sequence data. In addition, there are 508,825,331 WGS records containing 2,318,156,361,999 base pairs of sequence data, 192,754,804 TSA records containing 172,909,268,535 base pairs of sequence data, and 9,479,460 TLS records containing 2,993,818,315 base pairs of sequence data.

During the 62 days between the close dates for GenBank releases 221.0 and 222.0, the traditional portion of GenBank grew by 4,571,327,210 base pairs and by 773,076 sequence records. During that same period, 151,221 records were updated – an average of 14,908 records added or updated per day.
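The daily average quoted above can be reproduced from the figures in that paragraph. The following is a minimal sketch; the variable names are illustrative, and it assumes the average is simply added plus updated records divided by the 62-day interval.

```python
# Figures quoted in the paragraph above (traditional portion of GenBank).
days_between_releases = 62
records_added = 773_076
records_updated = 151_221

# Assumption: "added or updated" means the simple sum of the two counts.
records_touched = records_added + records_updated
avg_records_per_day = records_touched / days_between_releases

print(f"records added or updated: {records_touched:,}")
print(f"average per day: {avg_records_per_day:,.0f}")
```

Rounded to the nearest whole record, this matches the stated 14,908 per day.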

Between releases 221.0 and 222.0, the WGS component of GenBank grew by 75,861,752,489 base pairs and by 8,859,609 sequence records. The TSA component grew by 5,863,605,118 base pairs and by 5,977,698 sequence records. The TLS component grew by 2,169,626,977 base pairs and by 7,850,985 sequence records.

The total number of sequence data files increased by 33 with this release. The divisions are as follows:

  • BCT: 16 new files, now a total of 407
  • CON: 1 new file, now a total of 360
  • INV: 2 new files, now a total of 157
  • PAT: 7 new files, now a total of 301
  • PLN: 7 new files, now a total of 157
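
As a quick consistency check, the per-division counts listed above can be summed and compared with the stated overall increase of 33 files. This is a small sketch using only the numbers from the list; the dictionary name is illustrative.

```python
# New sequence data files per division, as listed above.
new_files_by_division = {"BCT": 16, "CON": 1, "INV": 2, "PAT": 7, "PLN": 7}

total_new_files = sum(new_files_by_division.values())
print(f"total new files: {total_new_files}")

# The announcement states an overall increase of 33 files.
matches_announcement = total_new_files == 33
print(f"matches stated increase: {matches_announcement}")
```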

For downloading purposes, please keep in mind that the uncompressed GenBank release 222.0 flatfiles require roughly 850 GB (sequence files only). The ASN.1 data require approximately 704 GB.

More information about GenBank release 222.0 is available in the release notes, as well as in the README files in the genbank and ASN.1 (ncbi-asn1) directories on FTP.
