GenBank release 246.0

GenBank release 246.0 (11/2/2021) is now available on the NCBI FTP site. This release has 16.1 trillion bases and 2.57 billion records.

The current release has 233,642,893 traditional records containing 1,014,763,752,113 base pairs of sequence data. There are also 1,721,064,101 WGS records containing 14,599,101,574,547 base pairs of sequence data, 508,319,391 bulk-oriented TSA records containing 449,891,016,597 base pairs of sequence data, and 107,569,935 bulk-oriented TLS records containing 40,168,874,815 base pairs of sequence data.
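The headline totals follow from the four components listed above. As a quick sanity check, the sketch below sums the per-component figures quoted in this post; the dictionary layout and variable names are illustrative only and are not part of any NCBI tool.

```python
# Figures copied from this announcement: (records, base pairs) per component.
components = {
    "traditional": (233_642_893, 1_014_763_752_113),
    "WGS": (1_721_064_101, 14_599_101_574_547),
    "TSA": (508_319_391, 449_891_016_597),
    "TLS": (107_569_935, 40_168_874_815),
}

total_records = sum(records for records, _ in components.values())
total_bases = sum(bases for _, bases in components.values())

# Express the totals in the same rounded units used in the summary sentence.
print(f"{total_records:,} records (~{total_records / 1e9:.2f} billion)")
print(f"{total_bases:,} bases (~{total_bases / 1e12:.1f} trillion)")
```

The rounded values agree with the 2.57 billion records and 16.1 trillion bases stated at the top of the post.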

Growth between releases

During the 76 days between the close dates for GenBank Releases 245.0 and 246.0, the ‘traditional’ portion of GenBank grew by 74,250,491,387 base pairs and by 1,660,301 sequence records. During that same period, 77,037 records were updated. An average of 22,860 ‘traditional’ records were added and/or updated per day.
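The daily average can be reproduced from the numbers in the paragraph above. This is a minimal sketch that assumes the average is simply (records added + records updated) divided by the number of days; the variable names are illustrative.

```python
# Figures copied from this announcement for the 'traditional' portion.
days_between_releases = 76
records_added = 1_660_301
records_updated = 77_037

# Assumption: the average counts added and updated records together.
per_day = (records_added + records_updated) / days_between_releases

print(f"Average records added and/or updated per day: {round(per_day):,}")
```

Under that assumption the result rounds to the 22,860 records per day quoted above.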

Between releases 245.0 and 246.0, the WGS component of GenBank grew by 710,913,710,825 base pairs and by 67,637,046 sequence records. The TSA component of GenBank grew by 9,312,593,986 base pairs and by 10,014,346 sequence records. The TLS component of GenBank grew by 238,707,500 base pairs and by 574,717 sequence records.

The total number of sequence data files increased by 268 with this release. The new files are distributed across divisions as follows:

  • BCT: 24 new files, now a total of 663
  • ENV: 3 new files, now a total of 70
  • INV: 96 new files, now a total of 461
  • PAT: 1 new file, now a total of 246
  • PLN: 15 new files, now a total of 723
  • VRL: 124 new files, now a total of 297
  • VRT: 5 new files, now a total of 277
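The per-division counts above add up to the overall increase of 268 files. The sketch below checks that sum and derives the previous file count for each division by subtraction; the data structure is illustrative, and only divisions that gained files are included because those are the only ones listed here.

```python
# Figures copied from this announcement: (new files, current total) per division.
divisions = {
    "BCT": (24, 663),
    "ENV": (3, 70),
    "INV": (96, 461),
    "PAT": (1, 246),
    "PLN": (15, 723),
    "VRL": (124, 297),
    "VRT": (5, 277),
}

total_new = sum(new for new, _ in divisions.values())
print(f"New files across listed divisions: {total_new}")

# File count before this release, derived as current total minus new files.
previous = {name: total - new for name, (new, total) in divisions.items()}
for name, count in previous.items():
    print(f"{name}: {count} files before this release")
```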

New /regulatory_class values for the regulatory feature

As of this release, new values are supported for the /regulatory_class qualifier:

  • recombination_enhancer: A regulatory region that promotes or induces the process of recombination.
  • uORF (or regulatory_uORF): A short open reading frame that is found in the 5′ untranslated region of an mRNA and plays a role in translational regulation.

See the release notes for more information about the new /regulatory_class values.

Additional Information

If you plan to download the release, note that the uncompressed GenBank release 246.0 sequence data flatfiles require roughly 2,007 GB. The ASN.1 data files require approximately 1,149 GB.
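Before starting a download, it can help to confirm that the target volume has enough free space. The following is a rough sketch using Python's standard `shutil.disk_usage`; the `has_room` helper, the 10% headroom, and the assumption that 1 GB means 10^9 bytes are all illustrative choices, since this post does not specify the unit.

```python
import shutil

# Sizes quoted in this announcement, in GB.
FLATFILES_GB = 2_007
ASN1_GB = 1_149

# Assumption: 1 GB = 10**9 bytes; the announcement does not say which unit is meant.
BYTES_PER_GB = 10**9


def has_room(free_bytes: int, required_gb: int, headroom: float = 0.10) -> bool:
    """Return True if free_bytes covers required_gb plus a safety margin."""
    required_bytes = required_gb * BYTES_PER_GB * (1 + headroom)
    return free_bytes >= required_bytes


# Free space on the volume that holds the current working directory.
free = shutil.disk_usage(".").free
print(f"Free space: {free / BYTES_PER_GB:,.0f} GB")
print("Room for flatfiles:", has_room(free, FLATFILES_GB))
print("Room for ASN.1 files:", has_room(free, ASN1_GB))
```

Replace `"."` with the intended download directory to check a different volume.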

For more information about GenBank release 246.0, see the release notes, as well as the README files in the GenBank and ASN.1 (ncbi-asn1) directories on FTP.
