As reported in the journal Plant Disease, a recent collaboration between the National Library of Medicine’s NCBI and the U.S. Department of Agriculture’s Animal and Plant Health Inspection Service (APHIS) analyzed public sequence records for the fungal genus Colletotrichum, an important group of plant pathogens that poses a significant threat to food production. Colletotrichum species are challenging to identify accurately, and public sequences may carry out-of-date taxonomic information. The study improved the accuracy of species names assigned to Colletotrichum database sequences, verified a comprehensive set of reliable reference markers for the genus, and produced a multi-marker tree as well as the genome-based interactive tree shown in Figure 1.
Figure 1. Views from the genome-assembly-derived multi-protein distance tree, which shows the analysis of publicly available Colletotrichum genomes. The interactive tree is available online. You can browse, search, download, and export the tree. As an example search, you can demonstrate that assembly GCA_002901105.1 was incorrectly labeled as Colletotrichum gloeosporioides. Searching the tree for the name “Colletotrichum gloeosporioides” highlights two clades. Clicking the node for the Truncatum species complex and then clicking “Show descendants” expands the clade and shows that assembly GCA_002901105.1, which was labeled as C. gloeosporioides, clusters with the Truncatum species complex. You can find more details on the tree-building process in the supplementary material for the publication and on GitHub.
The annotation of human assemblies GRCh38.p14 and T2T-CHM13v2.0
We are happy to announce the first de novo annotation of human T2T-CHM13v2.0, the gap-less assembly generated by the T2T Consortium, and the full re-annotation of the human reference assembly, GRCh38.p14. We hope the results will serve both the needs of those eager to explore newly sequenced regions of the genome, including telomeres and centromeres, and those interested in refreshing their interpretation of the human reference, in light of recently curated transcripts and new transcriptomic and other data incorporated in the annotation.
We maintain a curated set of ribosomal RNA (rRNA) reference sequences (Targeted Loci) with verifiable organism sources and current names. This set is critical for correctly identifying and classifying prokaryotic (bacterial and archaeal) and fungal samples (Table 1). To provide easy access to these sequences, we recently added a separate rRNA/ITS databases section to the nucleotide BLAST page, which makes it convenient to quickly identify source organisms (Figure 1).