Tag: NCBI Taxonomy

Now Available! Updated Bacterial and Archaeal Reference Genomes Collection

An updated bacterial and archaeal reference genome collection is available! This collection of 18,343 genomes was built by selecting exactly one genome assembly for each species among the 312,000+ prokaryotic genomes in RefSeq, except for E. coli, for which two assemblies were selected as references.

The criteria for selecting the reference assembly for a given species include assembly contiguity and completeness and quality of the RefSeq annotation. 
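As a rough illustration of "one assembly per species, two for E. coli", the sketch below ranks candidate assemblies and keeps the top entries for each species. It is not NCBI's actual selection procedure: the field names (`contig_n50`, `completeness`, `annotation_score`), the order in which the criteria are compared, and the accessions are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assembly:
    accession: str           # placeholder identifier, not a real accession
    species: str
    contig_n50: int          # hypothetical proxy for contiguity
    completeness: float      # hypothetical completeness, 0.0 to 1.0
    annotation_score: float  # hypothetical annotation quality, 0.0 to 1.0


def pick_references(assemblies, per_species_quota=None):
    """Return {species: [best assemblies]}, one per species unless a quota says otherwise."""
    quota = per_species_quota or {}
    by_species = {}
    for asm in assemblies:
        by_species.setdefault(asm.species, []).append(asm)

    chosen = {}
    for species, candidates in by_species.items():
        # Assumed priority: completeness, then contiguity, then annotation quality.
        ranked = sorted(
            candidates,
            key=lambda a: (a.completeness, a.contig_n50, a.annotation_score),
            reverse=True,
        )
        chosen[species] = ranked[: quota.get(species, 1)]
    return chosen


demo = [
    Assembly("asm-1", "Bacillus subtilis", 1000, 0.98, 0.9),
    Assembly("asm-2", "Bacillus subtilis", 5000, 0.99, 0.8),
    Assembly("asm-3", "Escherichia coli", 4000, 0.99, 0.9),
    Assembly("asm-4", "Escherichia coli", 4500, 0.99, 0.9),
    Assembly("asm-5", "Escherichia coli", 100, 0.50, 0.1),
]
refs = pick_references(demo, per_species_quota={"Escherichia coli": 2})
for species, picked in refs.items():
    print(species, [a.accession for a in picked])
```

The per-species quota is how the sketch models the E. coli exception; every other species falls back to a single pick.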

What’s new?
  • 790 species were added to the collection
  • 199 species are represented by a better assembly (compared to the April 2023 release)
  • 70 species were removed because of changes in NCBI Taxonomy or uncertainty in their species assignment 

Important Update! Changes to ASSEMBLY_REPORTS and GENOME_REPORTS on FTP

Do you currently access genome assembly data through the FTP site? We are consolidating information provided in the ASSEMBLY_REPORTS and GENOME_REPORTS directories on the genomes FTP site to simplify access and ensure that you have the most accurate, up-to-date, and consistently reported data.

The assembly_summary files in the ASSEMBLY_REPORTS directory are gaining information in newly added columns 24-38, including statistics about the assembly (size, GC content, genome size, and number of sequences) as well as details about the provided annotation (number of genes, annotation name and date). See example below (Table 1). Check out the README for more details about the contents of the summary files.
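For readers who process these files with scripts, here is a minimal sketch of reading a tab-delimited assembly_summary file by column name instead of by position, so that added columns do not shift existing lookups. It assumes the column names sit on the last `#`-prefixed line before the data. The sample rows are invented, and the names `genome_size`, `gc_percent`, and `total_gene_count` are assumptions for illustration; the README is the authority on the real names.

```python
import io


def read_assembly_summary(handle):
    """Yield one dict per data row, keyed by the column names in the header."""
    header = None
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Keep overwriting: the last comment line before the data wins.
            header = line.lstrip("#").strip().split("\t")
            continue
        if header is None:
            raise ValueError("no header line found before the data rows")
        yield dict(zip(header, line.split("\t")))


# Invented sample content; the accession and the values are placeholders.
SAMPLE = (
    "## pointer to the README (content omitted)\n"
    "#assembly_accession\torganism_name\tgenome_size\tgc_percent\ttotal_gene_count\n"
    "EXAMPLE_0001\tExample species\t4200000\t43.5\t4100\n"
)

rows = list(read_assembly_summary(io.StringIO(SAMPLE)))
for row in rows:
    print(row["assembly_accession"], int(row["genome_size"]), float(row["gc_percent"]))
```

To read a downloaded file, pass `open(path, encoding="utf-8")` in place of the `io.StringIO` object.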

Join NCBI at ASM Microbe 2023

Houston, TX, June 15-19, 2023

NCBI is looking forward to seeing you in person at the American Society for Microbiology Annual Meeting (ASM Microbe 2023). NCBI staff will participate in a variety of activities and events and will also be available at our booth (#2410) to address your questions. We’re especially excited to share our recent efforts on the NCBI Pathogen Detection Project which integrates bacterial and fungal pathogen genomic sequences from numerous ongoing foodborne illness and environmental surveillance and research efforts. 

Check out our schedule of activities and events below (and on our conference webpage). All times are in CST.

RefSeq Release 218

RefSeq release 218 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of May 1, 2023, this full release incorporates genomic, transcript, and protein data containing:

New Release! Updated Bacterial and Archaeal Reference Genomes Collection Now Available

As previously announced, we are continuously curating a better Prokaryotic Reference Genomes Collection. An updated bacterial and archaeal reference genome collection is now available! This collection of 17,623 genomes was built by selecting exactly one genome assembly for each species among the 283,000+ prokaryotic genomes in RefSeq, except for E. coli, for which two assemblies were selected as references.

What’s new?
  • 480 species were added to this collection 
  • 178 species are represented by a better assembly 
  • 17 species were removed due to changes in NCBI Taxonomy or uncertainty in their species assignment 

RefSeq Release 217

RefSeq release 217 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of March 8, 2023, this full release incorporates genomic, transcript, and protein data, containing:

  • 348,351,219 records
  • 254,500,694 proteins
  • 50,975,429 RNAs
  • sequences from 130,837 organisms

The release is provided in several directories as a complete dataset and divided by logical groupings.

New & Improved NCBI Datasets Genome and Assembly Pages

Legacy pages will be redirected effective July 2023

In July 2023, NCBI’s Assembly and Genome record pages will be redirected to new Datasets pages as part of our ongoing effort to modernize and improve your user experience. NCBI Datasets is a new resource that makes it easier to find and download genome data.

We will update the following pages:
  • The NCBI Assembly pages will be redirected to the new Datasets Genome pages that describe assembled genomes and provide links to related NCBI tools such as Genome Data Viewer and BLAST.
  • The NCBI Genome pages will be redirected to the Datasets Taxonomy pages that provide a taxonomy-focused portal to genes, genomes, and additional NCBI resources.
  • During this transition, you will have the option to return to the legacy Genome and Assembly pages. 

Upcoming changes to influenza virus names in NCBI Taxonomy

To reflect changes to the International Code of Virus Classification and Nomenclature (ICVCN) made by the International Committee on Taxonomy of Viruses (ICTV), NCBI will introduce new binomial influenza species names like ‘Alphainfluenzavirus influenzae.’ Changes are expected to be in place around summer 2023.

We recognize that the traditional influenza virus names like ‘Influenza A virus’ and ‘Influenza B virus’ are broadly used in public health, educational institutions, and research. To minimize the impact of this change on those who use NCBI resources, the taxonomy schema will keep the former names in the lineages for each species; however, they will be moved below the (new) species taxa in the hierarchy. See example below.
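To make the new arrangement concrete for anyone who walks lineages in code, here is a small sketch (not NCBI's own example) that models the hierarchy as child-to-parent links, with each traditional name placed directly below its new binomial species. The dictionary is a hand-written slice for illustration and omits ranks and taxonomy IDs; the exact shape of the real lineages is an assumption.

```python
# Hand-written, simplified child -> parent links (illustrative only).
PARENT = {
    "Influenza A virus": "Alphainfluenzavirus influenzae",
    "Alphainfluenzavirus influenzae": "Alphainfluenzavirus",
    "Influenza B virus": "Betainfluenzavirus influenzae",
    "Betainfluenzavirus influenzae": "Betainfluenzavirus",
    "Alphainfluenzavirus": "Orthomyxoviridae",
    "Betainfluenzavirus": "Orthomyxoviridae",
}


def lineage(name):
    """Return the path from the top of this slice down to `name`."""
    path = [name]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return list(reversed(path))


def species_for(name):
    """Map a traditional name to the species taxon directly above it, if any."""
    return PARENT.get(name) if name.startswith("Influenza ") else name


print(" > ".join(lineage("Influenza A virus")))
print(species_for("Influenza B virus"))
```

Because the traditional name stays in the lineage, code that searches a lineage for ‘Influenza A virus’ keeps working; code that assumes that name sits at the species level would need a lookup like `species_for`.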

Updated bacterial and archaeal reference genomes collection now available!

An updated bacterial and archaeal reference genome collection is available! This collection of 17,163 genomes was built by selecting exactly one genome assembly for each species among the 272,000+ prokaryotic genomes in RefSeq, except for E. coli, for which two assemblies were selected as references.

A total of 497 species are included in this collection for the first time. In addition, compared with the October 2022 set, 174 species are represented by a better assembly and 15 species were removed because of changes in NCBI Taxonomy or uncertainty in their species assignment. The criteria for selecting one assembly for a given species from all assemblies available in RefSeq for that species include assembly contiguity, completeness, and the quality of the RefSeq annotation. See the documentation for details.

We have updated the nucleotide BLAST RefSeq reference genomes database (fourth in the menu) as well as the database on the Microbial Nucleotide BLAST page to reflect these changes. You can also run BLAST searches against the proteins annotated on these reference genomes (RefSeq Select proteins database, second in the menu).

Prokaryotic phylum name changes coming soon!

Beginning in the first week of January 2023, NCBI Taxonomy will initiate changes to prokaryote phylum names in accordance with the recent inclusion of the rank ‘phylum’ in the International Code of Nomenclature for Prokaryotes (ICNP). We first announced this update, which involves changes to 42 NCBI taxa, about a year ago. We will change several names that have long been in use (e.g., Firmicutes, Proteobacteria) to newly formalized names (e.g., Bacillota, Pseudomonadota) that may be unfamiliar to some.

You will still see the previous names on records and can search using them, but they will not be displayed as prominently as before. The organism names on Entrez records will not change (e.g., Bacillus subtilis). However, we will update the phylum names on the displayed lineages for ~276 million records (see an example in Figure 1 below).
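If you keep local tables or scripts keyed on phylum names, a small lookup can translate the earlier names to the new ones. The sketch below covers only the two renames mentioned above and is not the full list of 42 changed taxa; the lineage shown is simplified for illustration.

```python
# Only the two renames named in this post; the complete list is longer.
RENAMED_PHYLA = {
    "Firmicutes": "Bacillota",
    "Proteobacteria": "Pseudomonadota",
}


def update_lineage(lineage):
    """Return a copy of the lineage with renamed phyla substituted."""
    return [RENAMED_PHYLA.get(taxon, taxon) for taxon in lineage]


# Simplified lineage for Bacillus subtilis (intermediate groupings omitted).
old_lineage = [
    "Bacteria", "Firmicutes", "Bacilli", "Bacillales",
    "Bacillaceae", "Bacillus", "Bacillus subtilis",
]
new_lineage = update_lineage(old_lineage)
print("; ".join(new_lineage))
```

Names that are not in the lookup pass through unchanged, which matches the note above that organism names such as Bacillus subtilis are not affected.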