Tag: NCBI Taxonomy

Prokaryotic phylum name changes coming soon!

Beginning in the first week of January 2023, NCBI Taxonomy will initiate changes to prokaryote phylum names in accordance with the recent inclusion of the rank ‘phylum’ in the International Code of Nomenclature of Prokaryotes (ICNP). We first announced this update, which changes 42 NCBI taxa, about a year ago. Several names that have long been in use (e.g., Firmicutes, Proteobacteria) will change to newly formalized names (e.g., Bacillota, Pseudomonadota) that may be unfamiliar to some.

You will still see the previous names on records and can search using them, but they will not be displayed as prominently as before. The organism names on Entrez records will not change (e.g., Bacillus subtilis). However, we will update the phylum names on the displayed lineages for ~276 million records (see an example in Figure 1 below).

Now available: Updated prokaryote representative genomes collection

An updated bacterial and archaeal representative genomes collection is available! We selected a total of 16,665 of the 262,000 prokaryotic assemblies in RefSeq to represent their respective species. For the first time, more complete assemblies (as calculated by CheckM) were ranked higher than less complete assemblies. See the ranked list of criteria for selecting representative assemblies here.

Fungal species identification using DNA: an NCBI and USDA-APHIS collaboration with a focus on Colletotrichum

As reported in the journal Plant Disease, a recent collaboration between the National Library of Medicine’s NCBI and the U.S. Department of Agriculture’s Animal and Plant Health Inspection Service (APHIS) analyzed public sequence records for the fungal genus Colletotrichum, an important group of fungal plant pathogens that pose a significant threat to food production. Colletotrichum species are challenging to identify accurately, and public sequences may carry out-of-date taxonomic information. The study improved the accuracy of species names assigned to Colletotrichum database sequences, verified a comprehensive set of reliable reference markers for the genus, and produced a multi-marker tree as well as the genome-based interactive tree shown in Figure 1.

Figure 1. Views from the genome-assembly-derived multi-protein distance tree that shows the analysis of publicly available Colletotrichum genomes. The interactive tree is available online. You can browse, search, download, and export the tree. As an example search, you can demonstrate that assembly GCA_002901105.1 was incorrectly labeled as Colletotrichum gloeosporioides. Searching the tree for the name “Colletotrichum gloeosporioides” highlights two clades. Clicking the node for the Truncatum species complex and then clicking “Show descendants” expands the clade and shows that assembly GCA_002901105.1, which was labeled as C. gloeosporioides, clusters with the Truncatum species complex. You can find more details on the tree building process in the supplementary material for the publication and on GitHub.

ASM Microbe 2022 was a success!

NCBI had the pleasure of attending and participating in this year’s American Society for Microbiology (ASM) Microbe conference, June 9-13 in Washington, D.C. NCBI staff participated in activities and events throughout the conference. Over 4,500 attendees gathered in the exhibit hall and joined a variety of poster presentations and talks!

Reflections from a few of our NCBI experts

“It was a great honor for me to receive the ASM Elizabeth O. King Lecturer Award. Thank you to my colleagues, without whom so much of my work would not have been possible, and to all of those who attended my presentation on Making Genomics Accessible to Aid Public Health and Research.”

~Michael Feldgarden, Ph.D.

Announcing an updated prokaryotic representative genomes collection with 706 new species!

An updated bacterial and archaeal representative genomes collection is available! A total of 16,105 assemblies among the 249,000 prokaryotic assemblies in RefSeq were selected to represent their respective species. The collection has grown by 3.7% since January 2022. A total of 706 species are represented for the first time. In addition, 186 species are represented by a better assembly, and 124 species were removed because of changes in NCBI Taxonomy or uncertainty in their species assignment.
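The figures in this paragraph can be cross-checked with a little arithmetic. The sketch below assumes that the collection size changes only through newly represented species and removed species, and that swapping in a better assembly for an existing species leaves the count unchanged. The January 2022 size is derived under that assumption, not quoted from the post.

```python
# Figures quoted in the post.
current_size = 16_105   # representative assemblies in this update
new_species = 706       # species represented for the first time
removed_species = 124   # species dropped from the collection

# Assumption: net growth is additions minus removals only.
net_change = new_species - removed_species
previous_size = current_size - net_change   # implied January 2022 size
growth_percent = 100 * net_change / previous_size

print(f"net change: {net_change}")
print(f"implied previous size: {previous_size}")
print(f"growth: {growth_percent:.1f}%")
```

Under these assumptions the implied growth agrees with the 3.7% stated above.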

We updated the database on the Microbial Nucleotide BLAST page as well as the basic nucleotide BLAST RefSeq Representative genomes database (fourth in the menu) to reflect these changes. Finally, remember that you can now run BLAST searches against the proteins annotated on representative genomes (second in the menu). See more info here.

Come see NCBI at the ASM Microbe Conference 2022

The American Society for Microbiology (ASM) Microbe conference is back and will take place in person June 9-13 in Washington, D.C.

NCBI staff member Dr. Michael Feldgarden will be recognized by ASM with an award for his research. Other NCBI staff will present posters on NCBI resources and will also be available at our booth (#1128) to address your questions. Drop by to see what’s new and provide your feedback. We hope to see you there!

NCBI Taxonomy to include phylum rank in taxonomic names

NCBI Taxonomy will append a list of 42 names of prokaryote phyla published for validation purposes as required under the International Code of Nomenclature of Prokaryotes (ICNP). You can still search for previous informal names, and any informal phylum rank names not addressed in the validation list will remain unchanged.

The largest named groups affected by this change are:

Current name | New name
Firmicutes | Bacillota
Proteobacteria | Pseudomonadota
Actinobacteria | Actinomycetota
Bacteroidetes | Bacteroidota

In the first half of 2021, the International Committee on Systematics of Prokaryotes (ICSP) voted to include the rank of phylum among the taxonomic names covered by the International Code of Nomenclature of Prokaryotes (ICNP) (2008 Revision). The rank phylum was previously widely used in the literature for prokaryotic names and was included in NCBI Taxonomy, but it was not formally recognized in the ICNP. Currently, this rank is assigned to 167 bacterial and 39 archaeal informal names in NCBI Taxonomy. The newly adjusted rule (Rule 8) in the ICNP requires all formal phylum names to be formed by adding the suffix “-ota” to the stem of the name of the designated type genus. NCBI Taxonomy adheres to the rules stipulated in several codes of nomenclature, which means that several names in long-standing use will be changed accordingly.
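If you maintain scripts or tables keyed on the older phylum names, a small lookup can help during the transition. The sketch below is illustrative only: it covers just the four renames listed in this post (not all 42), the function names are hypothetical, and it matches on the name alone without checking rank, which is a simplification.

```python
# Old-to-new phylum names taken from the list in this post (4 of the 42 taxa).
PHYLUM_RENAMES = {
    "Firmicutes": "Bacillota",
    "Proteobacteria": "Pseudomonadota",
    "Actinobacteria": "Actinomycetota",
    "Bacteroidetes": "Bacteroidota",
}


def update_lineage(lineage):
    """Return a copy of a lineage list with renamed phyla replaced.

    Simplification: matches on the name only and ignores taxonomic rank.
    """
    return [PHYLUM_RENAMES.get(name, name) for name in lineage]


def has_formal_phylum_suffix(name):
    """Check the '-ota' ending that Rule 8 requires of formal phylum names."""
    return name.endswith("ota")


# Example lineage for Bacillus subtilis, written with the older phylum name.
old_lineage = ["Bacteria", "Firmicutes", "Bacilli", "Bacillales",
               "Bacillaceae", "Bacillus"]
print(update_lineage(old_lineage))
print(all(has_formal_phylum_suffix(n) for n in PHYLUM_RENAMES.values()))
```

Keeping the old name as the dictionary key mirrors the fact that the previous names remain searchable.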

NCBI Taxonomy is a curated classification and nomenclature for all of the organisms in public sequence databases. This currently represents about 10% of the described life on the planet.

March 10 Webinar: Where to find data for your research organism!

Do you work with data from organisms outside the traditional set of model organisms? Join us on March 10, 2021 to learn how to use NCBI resources, including NCBI’s Taxonomy and BLAST, that can help you find information for your organism and closely related taxa. You will see an example that shows you how to retrieve and download gene sequences for a set of species, generate multiple sequence alignments, and design primers using Primer-BLAST.

  • Date and time: Wed, March 10, 2021 12:00 PM – 12:45 PM EST
  • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

Enhanced prokaryote type strain report now with details on needed type strain data

The Prokaryote type strain report provides information on type strains for over 18,000 species. We revised and expanded the report to make it easier to identify cases where sequencing or establishing type material would have the biggest impact on improving prokaryote taxonomy and accurate identification. These cases include species with designated type strains but without a sequenced type strain assembly, and species without designated type material. We hope that the community will prioritize sequencing type strains for the former set of species (Table 1) and establishing a neotype or reftype, where applicable (as defined in Ciufo et al. 2018), for the latter set (Table 2).

Other changes from the old format file are detailed in a recent genomes announce post.

Scientific Name | Type material/co-identical strains | Assemblies
Burkholderia ubonensis | CCUG:48852, CIP:1070, … | 308
Escherichia albertii | Albert 19982, BCCM/LMG:20976, … | 181
Xanthomonas perforans | ATCC:BAA-983, DSM:18975, … | 153
Listeria innocua | ATCC:33090, BCCM/LMG:11387, … | 106
Streptococcus iniae | ATCC:29178, BCCM/LMG:14520, … | 94
Vibrio lentus | CECT:5110, CIP:107166, … | 87
Vibrio cyclitrophicus | ATCC:700982, BCCM/LMG:21359, … | 83
Pseudomonas coronafaciens | BCCM/LMG:5060, CFPB:2216, … | 77
Aliivibrio fischeri | ATCC:7744, BCCM/LMG:4414, … | 66
Xanthomonas fragariae | ATCC:33239, BCCM/LMG:708, … | 61

Table 1. The top 10 candidate species for sequencing type strains, sorted by the number of assemblies. These have designated type strains but no type strain assembly. We generated the list by sorting by “number of assemblies from type materials per species”, then by decreasing “number of assemblies per taxon”, then filtering out “type materials and coidentical strains” = “na”.

Table 2. The top 10 candidates for proposing a reftype assembly, or neotype where applicable, sorted by the number of assemblies. These species have no designated type strain. We generated the list by selecting for “type materials and coidentical strains” = “na”, “number of assemblies from type materials per species” = 0, and sorting by decreasing “number of assemblies per taxon”, then filtering out Candidatus.
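The selection steps described in the two captions can be sketched in a few lines. This is a minimal illustration, not the procedure NCBI ran: it assumes the report has already been loaded into a list of dictionaries keyed by the column names quoted in the captions, the “scientific name” key and both function names are hypothetical, and every toy row other than the two taken from Table 1 is made up.

```python
TYPE_COL = "type materials and coidentical strains"
TYPE_ASM_COL = "number of assemblies from type materials per species"
ASM_COL = "number of assemblies per taxon"
NAME_COL = "scientific name"  # hypothetical key for the species name


def sequencing_candidates(rows, n=10):
    """Table 1 steps: drop rows without type material, then sort by
    type-material assemblies (ascending) and total assemblies (descending)."""
    kept = [r for r in rows if r[TYPE_COL] != "na"]
    kept.sort(key=lambda r: (int(r[TYPE_ASM_COL]), -int(r[ASM_COL])))
    return kept[:n]


def reftype_candidates(rows, n=10):
    """Table 2 steps: keep rows with no type material and no type-material
    assemblies, drop Candidatus names, sort by total assemblies (descending)."""
    kept = [
        r for r in rows
        if r[TYPE_COL] == "na"
        and int(r[TYPE_ASM_COL]) == 0
        and not r[NAME_COL].startswith("Candidatus")
    ]
    kept.sort(key=lambda r: -int(r[ASM_COL]))
    return kept[:n]


# Toy rows: the first two reuse values from Table 1; the rest are placeholders.
rows = [
    {NAME_COL: "Escherichia albertii", TYPE_COL: "Albert 19982", TYPE_ASM_COL: "0", ASM_COL: "181"},
    {NAME_COL: "Burkholderia ubonensis", TYPE_COL: "CCUG:48852", TYPE_ASM_COL: "0", ASM_COL: "308"},
    {NAME_COL: "Placeholder species A", TYPE_COL: "XYZ:1", TYPE_ASM_COL: "1", ASM_COL: "500"},
    {NAME_COL: "Placeholder species B", TYPE_COL: "na", TYPE_ASM_COL: "0", ASM_COL: "40"},
    {NAME_COL: "Placeholder species C", TYPE_COL: "na", TYPE_ASM_COL: "0", ASM_COL: "75"},
    {NAME_COL: "Candidatus Placeholder", TYPE_COL: "na", TYPE_ASM_COL: "0", ASM_COL: "90"},
]

print([r[NAME_COL] for r in sequencing_candidates(rows, n=2)])
print([r[NAME_COL] for r in reftype_candidates(rows)])
```

Sorting on the type-material assembly count first places species with no type strain assembly at the top, which is why they fill the first rows of Table 1.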

Please contact info@ncbi.nlm.nih.gov if you want to provide information about missing type strains.

Expanded average nucleotide identity analysis now available for prokaryotic genome assemblies

As we described in an earlier post, GenBank uses average nucleotide identity (ANI) analysis to find and correct misidentified prokaryotic genome assemblies. You can now access ANI data for the more than 600,000 GenBank bacterial and archaeal genome assemblies through a downloadable report (ANI_report_prokaryotes.txt) available from the genomes/ASSEMBLY_REPORTS area of the FTP site. The README describes the contents of the report in detail. You can use the ANI data to evaluate the taxonomic identity of genome assemblies of interest for yourself.
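Once you have downloaded the report, a short script can flag assemblies whose declared species differs from the best ANI match. The sketch below is hedged heavily: it assumes a tab-delimited file whose first line is a header that may start with “#”, and the column names used here (“assembly”, “declared-species”, “best-match-species”) and the accessions are placeholders, not the real ones. Consult the README for the actual column names before adapting it.

```python
import csv
import io


def read_report(handle):
    """Yield one dict per data row of a tab-delimited report.

    Assumption: the first line is a header, optionally prefixed with '#'.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = [h.lstrip("# ").strip() for h in next(reader)]
    for fields in reader:
        if fields:  # skip blank lines
            yield dict(zip(header, fields))


# Placeholder content standing in for the downloaded file.
sample = (
    "# assembly\tdeclared-species\tbest-match-species\n"
    "GCA_000000000.0\tSpecies one\tSpecies one\n"
    "GCA_000000001.0\tSpecies two\tSpecies three\n"
)

rows = list(read_report(io.StringIO(sample)))
mismatches = [
    r["assembly"] for r in rows
    if r["declared-species"] != r["best-match-species"]
]
print(mismatches)
```

To run it on the real file, replace the `io.StringIO(sample)` handle with `open("ANI_report_prokaryotes.txt", newline="")` and substitute the column names documented in the README.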

The new ANI_report_prokaryotes.txt replaces the older ANI_report_bacteria.txt in the same directory. We are no longer updating the ANI_report_bacteria.txt file and will remove it after 31st May 2020.