Author: NCBI Staff

GenBank Release 260.0 is Available!

GenBank release 260.0 (4/19/2024) is now available on the NCBI FTP site. This release has 31.18 trillion bases and 4.46 billion records.

The current release has:

250,803,006 traditional records containing 3,213,818,003,787 base pairs of sequence data
3,333,621,823 WGS records containing 27,225,116,587,937 base pairs of sequence data
741,066,498 bulk-oriented TSA records containing 689,648,317,082 base pairs of sequence data
135,115,766 bulk-oriented TLS records containing 53,492,243,256 base pairs of sequence data

Continue reading “GenBank Release 260.0 is Available!” →
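
The headline totals for release 260.0 can be cross-checked against the four division counts listed above. The following is a minimal sketch that sums those published figures; the dictionary keys are labels chosen here for illustration, not official division identifiers.

```python
# (records, base pairs) per division, copied from the release 260.0 figures above.
# The keys are illustrative labels, not official GenBank identifiers.
divisions = {
    "traditional": (250_803_006, 3_213_818_003_787),
    "WGS": (3_333_621_823, 27_225_116_587_937),
    "TSA": (741_066_498, 689_648_317_082),
    "TLS": (135_115_766, 53_492_243_256),
}

total_records = sum(records for records, _ in divisions.values())
total_bases = sum(bases for _, bases in divisions.values())

print(f"{total_records:,} records = {total_records / 1e9:.2f} billion")
print(f"{total_bases:,} bases = {total_bases / 1e12:.2f} trillion")
```

The sums agree with the rounded totals quoted for the release (4.46 billion records and 31.18 trillion bases), and they show that WGS records account for roughly 87% of all bases.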

Now Available! Updated Bacterial and Archaeal Reference Genomes Collection

Download the updated bacterial and archaeal reference genome collection! We built this collection of 19,328 genomes by selecting the “best” genome assembly for each species among the 350,000+ prokaryotic genomes in RefSeq (except for E. coli, for which two assemblies were selected as references).

What’s New?

413 species are represented in this collection for the first time
198 species are represented by a better assembly
27 species were removed because of changes in NCBI Taxonomy or uncertainty in their species assignment

Continue reading “Now Available! Updated Bacterial and Archaeal Reference Genomes Collection” →

NCBI Hidden Markov Models (HMM) Release 15.0 Now Available!

Download release 15.0 of the NCBI protein profile Hidden Markov models (HMMs) used by the Prokaryotic Genome Annotation Pipeline (PGAP)! Search this collection against your favorite prokaryotic proteins to identify their function using the HMMER sequence analysis package.
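
As a rough illustration of the search described above, the sketch below builds an hmmsearch command line (hmmsearch is part of the HMMER package) and runs it only if HMMER and the input files are present. The file names ncbi_hmms.hmm, my_proteins.faa, and hits.tbl are hypothetical placeholders, and the E-value cutoff of 1e-5 is an arbitrary assumption, not a recommendation from NCBI or PGAP.

```python
import shutil
import subprocess
from pathlib import Path


def build_hmmsearch_command(hmm_file, protein_fasta, table_out, evalue=1e-5):
    """Return an hmmsearch argument list that writes a per-target hit table."""
    return [
        "hmmsearch",
        "--tblout", table_out,  # save a tabular summary of per-target hits
        "-E", str(evalue),      # report targets with an E-value at or below this
        hmm_file,               # profile HMM file (query models)
        protein_fasta,          # protein sequences to search
    ]


# Hypothetical file names, used for illustration only.
command = build_hmmsearch_command("ncbi_hmms.hmm", "my_proteins.faa", "hits.tbl")
print(" ".join(command))

# Run the search only when HMMER is installed and both inputs exist.
inputs_exist = Path("ncbi_hmms.hmm").exists() and Path("my_proteins.faa").exists()
if shutil.which("hmmsearch") and inputs_exist:
    subprocess.run(command, check=True)
```

Building the argument list separately from running it makes the command easy to inspect or log before any search is launched.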

What’s New?

Release 15.0 contains:

16,667 HMMs maintained by NCBI
279 new HMMs since release 14.0
Several hundred HMMs with better names, EC numbers, Gene Ontology (GO) terms, gene symbols, or publications

Continue reading “NCBI Hidden Markov Models (HMM) Release 15.0 Now Available!” →

Cleaner BLAST Databases for More Accurate Results

Removing contaminated sequences using NCBI quality assurance tools

Do you use BLAST to identify a sequence or the evolutionary scope of a gene? That can be challenging if contaminated and misclassified sequences are in the BLAST databases and show up in your search results. To address this problem, we now use the NCBI quality assurance tools listed below to systematically remove these misleading sequences from the default nucleotide (nt) and protein (nr) BLAST databases.

Continue reading “Cleaner BLAST Databases for More Accurate Results” →

Conserved Domain Database Version 3.21 Now Available!

Check out the newly released Conserved Domain Database (CDD) version 3.21. Updated content is available on the CDD FTP site.

What’s New?

1,174 new or updated NCBI-curated domains
Mirrors Pfam version 35 as well as new models from the NCBIfams collection and revised models from the Clusters of Orthologous Genes (COG) database
Fine-grained classifications of the following domain families:

Continue reading “Conserved Domain Database Version 3.21 Now Available!” →

Browse Taxonomy Records with NCBI Datasets

New & improved NCBI Datasets Taxonomy pages and command-line service

NCBI Datasets is excited to introduce new features to our Taxonomy pages, making it easier for you to access, browse, and download taxonomic information about organisms at any taxonomic level.

What’s new?

Explore Taxonomy records with an updated look and feel
Access and download taxonomic metadata from the web or with our updated command-line (CLI) tools
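
For the command-line route mentioned above, here is a minimal sketch that wraps the NCBI Datasets CLI from Python. It assumes the `datasets` executable is installed and on your PATH and that it provides the `summary taxonomy taxon` subcommand; the helper names and the example taxon are chosen here for illustration only.

```python
import shutil
import subprocess


def build_taxonomy_summary_command(taxon):
    """Return the argument list for a Datasets CLI taxonomy summary request."""
    return ["datasets", "summary", "taxonomy", "taxon", taxon]


def run_if_available(command):
    """Run the command and return its stdout, or None if the tool is missing."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout


# Example taxon name, used for illustration only.
command = build_taxonomy_summary_command("Aedes albopictus")
print(" ".join(command))

output = run_if_available(command)
if output is not None:
    print(output[:500])  # show the beginning of the returned metadata
```

The wrapper returns None rather than raising an error when the CLI is not installed, so the same script can run on machines without the tool.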

Continue reading “Browse Taxonomy Records with NCBI Datasets” →

New RefSeq Annotations Now Available!

In February and March, the NCBI Eukaryotic Genome Annotation Pipeline released forty-six new annotations in RefSeq!

New Annotations

Aedes albopictus (Asian tiger mosquito)
Anolis carolinensis (green anole)
Armigeres subalbatus (mosquito)
Bacillus rossius redtenbacheri (walking stick)
Bolinopsis microptera (comb jelly)
Bombyx mori (domestic silkworm)
Bubalus kerabau (carabao)
Candoia aspera (snake)
Cavia porcellus (domestic guinea pig)

Continue reading “New RefSeq Annotations Now Available!” →

Easy Access to Genetic Test Information with the NIH Genetic Testing Registry (GTR)

We have made some exciting updates to the NIH Genetic Testing Registry (GTR) to give you a more modern, easier-to-navigate display of genetic test information. We encourage you to check out our latest improvements and let us know what you think.

What’s new?

Continue reading “Easy Access to Genetic Test Information with the NIH Genetic Testing Registry (GTR)” →

MedGen Users, We Want Your Feedback!

Do you use NCBI’s MedGen? If so, then you probably know it’s NCBI’s one-stop shop for genetic phenotype information. If you are a healthcare provider, genetic professional, researcher, or anyone who uses MedGen, we want to hear from you to help us make this resource better meet your needs!

We want to know:

How you currently use MedGen
How we can make MedGen data more useful to you

How to provide feedback

Continue reading “MedGen Users, We Want Your Feedback!” →

Changes to SRA Data Access on the Google Cloud Platform (GCP)

Sequence Read Archive (SRA) data available via the Google Cloud Platform (GCP) are migrating from multi-region storage to the single region us-east1. This migration is projected to be complete by May 2024. To minimize the impact of this change, we recommend updating your workflow to access SRA data in the us-east1 region as soon as is convenient.

Please note this change does not impact SRA data access from Amazon Web Services (AWS) or NCBI servers.

Continue reading “Changes to SRA Data Access on the Google Cloud Platform (GCP)” →