Author: NCBI Staff

NCBI Hidden Markov Models (HMM) Release 13.0 Now Available!

Release 13.0 of the NCBI protein profile hidden Markov models (HMMs) used by the Prokaryotic Genome Annotation Pipeline (PGAP) is now available for download. Using the HMMER sequence analysis package, you can search your favorite prokaryotic proteins against this collection to identify their function.

What’s new?

The 13.0 release contains:

  • 16,143 HMMs maintained by NCBI
  • 315 new HMMs since release 12.0
  • 286 HMMs with better names, EC numbers, Gene Ontology (GO) terms, gene symbols or publications

Continue reading “NCBI Hidden Markov Models (HMM) Release 13.0 Now Available!”

GenBank Release 257.0 is Available!

GenBank release 257.0 (August 15, 2023) is now available on the NCBI FTP site. This release has 25.10 trillion bases and 3.69 billion records.

The current release has:

  • 246,119,175 traditional records containing 2,112,058,517,945 base pairs of sequence data
  • 2,631,493,489 WGS records containing 22,294,446,104,543 base pairs of sequence data
  • 686,271,945 bulk-oriented TSA records containing 646,176,166,908 base pairs of sequence data
  • 124,421,006 bulk-oriented TLS records containing 48,289,699,026 base pairs of sequence data

During the 59 days between the close dates for GenBank Releases 256.0 and 257.0, the traditional portion of GenBank grew by 145,578,541,799 base pairs and by 2,558,312 sequence records. We updated 34,840 records during that same period. We added and/or updated an average of 43,952 traditional records per day!

Continue reading “GenBank Release 257.0 is Available!”
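The headline figures in this announcement can be cross-checked from the per-division counts. The short Python sketch below is illustrative only: the variable names are ours, and it assumes that the per-day average is the sum of new and updated traditional records divided by the 59 days between close dates.

```python
# Record and base-pair counts copied from the GenBank 257.0 announcement.
# Each entry is (records, base pairs).
divisions = {
    "traditional": (246_119_175, 2_112_058_517_945),
    "WGS": (2_631_493_489, 22_294_446_104_543),
    "TSA": (686_271_945, 646_176_166_908),
    "TLS": (124_421_006, 48_289_699_026),
}

total_records = sum(records for records, _ in divisions.values())
total_bases = sum(bases for _, bases in divisions.values())

# Growth of the traditional division between releases 256.0 and 257.0.
# Assumption: "added and/or updated" means new records plus updated records.
days_between_releases = 59
new_records = 2_558_312
updated_records = 34_840
records_per_day = (new_records + updated_records) / days_between_releases

print(f"{total_bases / 1e12:.2f} trillion bases")
print(f"{total_records / 1e9:.2f} billion records")
print(f"{round(records_per_day):,} traditional records added or updated per day")
```

Under that assumption, the rounded results agree with the 25.10 trillion bases, 3.69 billion records, and 43,952 records per day quoted in the post.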

Now Live! New and Improved SciENcv Biographical Sketch Experience 

Required for NSF grant application submissions beginning October 2023 

New in Science Experts Network Curriculum Vitae (SciENcv)! We are excited to introduce an updated experience for the National Science Foundation (NSF) Biographical Sketch document, allowing you to submit federal grant applications more quickly and easily than ever!

Features & Benefits
  • Enhanced user experience with a modern look and feel   
  • Intuitive and easy process that helps you fill out forms correctly   
  • Revised navigation to reduce administrative burden   
  • Document preview allows you to view your document prior to certification   
  • Eliminates the need to repeatedly enter biographical sketch and financial document information   
  • Reduces the administrative burden associated with federal grant application and reporting requirements   
  • Allows you to describe your scientific contributions in your own words   
  • Document certification  

Continue reading “Now Live! New and Improved SciENcv Biographical Sketch Experience”

NCBI at the Biodiversity Genomics Academy 2023 (BGA23)

Virtual Talks, September 14, 2023

NCBI will be presenting virtually at the Biodiversity Genomics Academy 2023 (BGA23) on September 14, 2023. Our short, interactive talks will focus on NCBI Datasets and the Comparative Genome Viewer (CGV). Both resources are part of the NIH Comparative Genomics Resource (CGR), which facilitates reliable comparative genomics analyses for all eukaryotic organisms through an NCBI Toolkit and community collaboration.

Recordings will be made available post-event!

Continue reading “NCBI at the Biodiversity Genomics Academy 2023 (BGA23)”

Using Average Nucleotide Identity (ANI) to Expose Potentially Problematic Taxonomic Merges

Help us improve our microbial taxonomy

NCBI uses Average Nucleotide Identity (ANI) to evaluate the taxonomic classification of prokaryotic genomes submitted to GenBank. As part of this effort, we identified heterotypic synonyms that fail to match each other with high ANI, and we invite you to help us evaluate these cases.

What is Heterotypic Synonymy?

Heterotypic synonymy refers to two or more names for different taxa (such as species) that were described independently but have been subsequently merged into a single taxon. The merged taxon will generally be referred to by the oldest name.

Continue reading “Using Average Nucleotide Identity (ANI) to Expose Potentially Problematic Taxonomic Merges”

New Annotations in RefSeq!

In April, May, and June, the NCBI Eukaryotic Genome Annotation Pipeline released eighty-two new annotations in RefSeq!

Highlights:

  • Homo sapiens (human) T2T-CHM13v2.0 now includes many more alternative splice variants
  • Homo sapiens (human) GRCh38.p14 includes all transcripts from MANE v1.2 and over 78,000 new RefSeq Functional Element (RefSeqFE) features added since our last annotation in 2022
  • Mus musculus (house mouse) GRCm39 integrates curation for over 3,000 genes and 14,000 transcripts since September 2020
  • Rattus norvegicus (Norway rat) mRatBN7.2 includes curation of over 5,000 genes since our last annotation in 2021

New annotations:

Continue reading “New Annotations in RefSeq!”

New and Improved SciENcv Biographical Sketch Experience Coming Soon!

Required for NSF grant application submissions beginning October 2023

We recently introduced a new experience for the Science Experts Network Curriculum Vitae (SciENcv) Current & Pending (Other) Support Forms with updated features and functionality. Beginning in August 2023, we will offer a similar updated experience for the National Science Foundation (NSF) Biographical Sketch document, too. Submit federal grant applications more quickly and easily than ever!

Continue reading “New and Improved SciENcv Biographical Sketch Experience Coming Soon!”

RefSeq Release 219

RefSeq release 219 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of July 18, 2023, this full release incorporates genomic, transcript, and protein data containing:

  • 371,291,248 records
  • 3,752,372,037,103 nucleotide bases
  • 106,842,615,422 amino acids
  • sequences from 138,491 organisms

The release is provided in several directories as a complete dataset and divided by logical groupings.

Updates & announcements

Continue reading “RefSeq Release 219”

dbGaP: Making it Easier to Find Study Data with Third-Party Annotations

The database of Genotypes and Phenotypes (dbGaP) is a free resource that contains human data from a variety of large-scale studies. While you can’t view individual-level data without applying for controlled access, you can easily find dbGaP studies using the dbGaP Advanced Search (see screenshot below) and quickly filter studies based on study variables, molecular data type, study focus, NIH Institute, study consent, and more. Third-party annotations and mapping of phenotypic and study variables to controlled vocabularies allow you to search across studies. Once you find a study of interest, you can follow the Authorized Access link on records to apply for access. 

Phenotypic and study variables include:
  • Clinical measures (e.g., height, weight, blood pressure) 
  • Demographic information (e.g., age, gender, ethnicity) 
  • Sample information (e.g., analyte type, body site) 
  • Molecular data type (e.g., DNA sequence, genotypes, gene expression) 

Continue reading “dbGaP: Making it Easier to Find Study Data with Third-Party Annotations”

New & Improved NCBI Datasets Genome and Assembly Pages 

Legacy pages now redirect 

Effective July 10, 2023, NCBI’s Assembly and Genome record pages now redirect to new NCBI Datasets pages. As previously announced, these updates are part of our ongoing effort to modernize and improve your user experience. NCBI Datasets is a new resource that makes it easier to find and download genome data.   

The following pages have been updated:
  • The NCBI Assembly record pages now redirect to the new NCBI Datasets Genome record pages that describe assembled genomes and provide links to related NCBI tools such as Genome Data Viewer and BLAST.
  • The NCBI Genome record pages now redirect to the NCBI Datasets Taxonomy record pages that provide a taxonomy-focused portal to genes, genomes, and additional NCBI resources.

During this transition, you will have the option to return to the legacy Genome and Assembly record pages. We will remove the legacy pages in early 2024.

Continue reading “New & Improved NCBI Datasets Genome and Assembly Pages”