GenBank release 235


GenBank release 235.0 (12/11/2019) is now available on the NCBI FTP site. This release contains roughly 7 trillion bases and 1.74 billion records.

The current release has 215,333,020 traditional records containing 388,417,258,009 base pairs of sequence data. There are also 1,127,023,870 WGS records containing 6,277,551,200,690 base pairs of sequence data, 367,193,844 bulk-oriented TSA records containing 325,433,016,129 base pairs of sequence data, and 28,227,180 bulk-oriented TLS records containing 11,280,596,614 base pairs of sequence data.
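As a quick sanity check, the per-component counts above can be summed to confirm the headline totals. The sketch below simply hard-codes the figures quoted in this post; the dictionary layout and variable names are illustrative only.

```python
# Record and base-pair counts as quoted in the paragraph above:
# component -> (records, base pairs)
components = {
    "traditional": (215_333_020, 388_417_258_009),
    "WGS": (1_127_023_870, 6_277_551_200_690),
    "TSA": (367_193_844, 325_433_016_129),
    "TLS": (28_227_180, 11_280_596_614),
}

total_records = sum(records for records, _ in components.values())
total_bases = sum(bases for _, bases in components.values())

print(f"records: {total_records:,}")  # records: 1,737,777,914
print(f"bases:   {total_bases:,}")    # bases:   7,002,682,071,442
```

Both totals round to the 1.74 billion records and 7 trillion bases given at the top of the post.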

Growth between releases

During the 54 days between the close dates for GenBank releases 234.0 and 235.0, the traditional portion of GenBank grew by 2,220,239,471 base pairs and decreased by 1,430,686 sequence records.* During that same period, 40,779 records were updated. An average of 755 traditional records were added and/or updated per day.

*Please see section 1.3.2 of the GenBank 235.0 release notes for more information about the overall decrease in the number of traditional sequence records.
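The daily average quoted above appears to follow from dividing the number of updated records by the 54 days between close dates. That reading is an inference from the numbers, not something the release notes are quoted as saying here. The same growth figures also let you back out the size of the traditional portion at release 234.0. A minimal sketch, with illustrative variable names:

```python
# Figures quoted in the growth paragraph above.
days_between_releases = 54
records_updated = 40_779
base_pair_growth = 2_220_239_471
record_decrease = 1_430_686

# Traditional totals for release 235.0, from earlier in the post.
records_235 = 215_333_020
bases_235 = 388_417_258_009

# Average updated records per day (assumed to be how 755 was derived).
avg_per_day = records_updated / days_between_releases
print(round(avg_per_day))  # 755

# Implied traditional totals at release 234.0.
records_234 = records_235 + record_decrease
bases_234 = bases_235 - base_pair_growth
print(f"{records_234:,}")  # 216,763,706
print(f"{bases_234:,}")    # 386,197,018,538
```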

Between releases 234.0 and 235.0, the WGS component of GenBank grew by 399,327,917,868 base pairs and by 22,356,959 sequence records. The TSA component of GenBank grew by 10,644,726,229 base pairs and by 11,463,344 sequence records. The TLS component of GenBank grew by 316,654,540 base pairs and by 1,097,033 sequence records.

The total number of sequence data files increased by a net of 7 with this release. The changes by division are as follows:

  • BCT: 14 new files, now a total of 401
  • CON: 1 new file, now a total of 212
  • INV: 33 fewer files, now a total of 80
  • MAM: 6 new files, now a total of 39
  • PAT: 1 new file, now a total of 202
  • ROD: 16 new files, now a total of 34
  • VRL: 1 new file, now a total of 35
  • VRT: 1 new file, now a total of 166
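The net change in file count can be checked by summing the per-division changes listed above, counting the INV reduction as negative. This is only a sketch over the numbers in this post; the dictionary is illustrative.

```python
# Per-division change in the number of sequence data files,
# taken from the list above (INV lost files, so its change is negative).
file_changes = {
    "BCT": 14,
    "CON": 1,
    "INV": -33,
    "MAM": 6,
    "PAT": 1,
    "ROD": 16,
    "VRL": 1,
    "VRT": 1,
}

net_change = sum(file_changes.values())
print(net_change)  # 7
```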

For downloading purposes, please keep in mind that the uncompressed GenBank release 235.0 flatfiles require roughly 1091 GB (sequence files only). The ASN.1 data require approximately 827 GB.
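Before starting a download, it may be worth checking that the target filesystem has enough free space for the sizes quoted above. The sketch below uses Python's standard `shutil.disk_usage`. The helper name `has_room`, the target directory, and the choice of 1 GB = 10^9 bytes are all assumptions made for illustration.

```python
import shutil

# Uncompressed sizes quoted above, in GB.
FLATFILES_GB = 1091
ASN1_GB = 827

# Assumption: 1 GB = 10**9 bytes. The post does not say which convention it uses.
BYTES_PER_GB = 10**9


def has_room(path: str, required_gb: float) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * BYTES_PER_GB


if __name__ == "__main__":
    target = "."  # hypothetical download directory
    print("flatfiles fit:", has_room(target, FLATFILES_GB))
    print("ASN.1 fits:   ", has_room(target, ASN1_GB))
```

The check only covers the uncompressed data; leave extra headroom if the compressed files are kept alongside.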

More information about GenBank release 235.0 is available in the release notes, as well as in the README files in the genbank and ASN.1 (ncbi-asn1) directories on the FTP site.
