New! NIH Genetic Testing Registry (GTR) API

Want to automate submitting genetic test-related information to the NIH Genetic Testing Registry? Now you can! In September 2022, GTR released a submission API that supports fully automated submission of test data to GTR. The new API is one more way, in addition to the Submission Portal wizard and bulk submission using a spreadsheet template, to submit test data to GTR.

Why an API?

An API will allow you to programmatically generate and deposit your latest information into GTR, especially for a large volume of genetic tests. Our customers rely on your up-to-date information to make accurate decisions for their patients. The API creates a one-time setup, multiple-time reuse pathway for timely updates.

How to get started

To start the new submission process:

  1. If you haven’t already, register your lab with GTR
  2. Request an API service account from the GTR staff
  3. Once we’ve established your service account, create an API key

Connect with NCBI at ASHG 2022

Join us October 25-29 in Los Angeles, CA

We look forward to seeing you in person at the American Society of Human Genetics (ASHG) annual meeting, October 25-29, 2022, in Los Angeles, California.

We will present a variety of talks and posters featuring our clinical and human genetic resources, as well as genome products and tools. We are excited to introduce the NIH Comparative Genomics Resource (CGR), a multi-year National Library of Medicine (NLM) project to maximize the impact of eukaryotic research organisms and their genomic data resources to biomedical research. If you’re interested in providing feedback that will be used to help drive CGR forward, consider joining our round table discussion.  

Check out NCBI’s schedule of activities and events: 

New Upcoming NCBI Virtual Workshops!

Apply to attend October 2022 interactive, hands-on workshops

Want to learn more about NCBI resources and how to implement our cutting-edge tools in your research? NCBI offers a variety of educational opportunities, including workshops, webinars, codeathons, tutorials, and more!

We are excited to announce our upcoming virtual workshop series for October 2022. Our interactive, hands-on workshops are taught by experienced NCBI Education Faculty. Applications are open to the public; however, each workshop will accept a limited number of participants to facilitate the best possible educational experience.

NCBI Workshop at the ASM NGS 2022 Meeting

NCBI Microbial Pathogen and SARS-CoV-2 Resources in the Cloud

Get hands-on experience with NCBI Pathogen Detection and SARS-CoV-2 Surveillance data in the cloud. No prior cloud experience necessary!

NCBI staff are presenting a workshop at the American Society for Microbiology Next-Generation Sequencing (ASM NGS) 2022 Meeting on Sunday, October 16, 2022, from 10 am – 3 pm ET (with a 1-hour break) to help conference attendees learn about two NCBI cloud-hosted resources, Pathogen Detection and SARS-CoV-2 Genome Sequence datasets.

Stephen Sherry, PhD, is the new NCBI Director and NLM Associate Director for Scientific Data Resources

We are excited that our own Stephen Sherry, PhD, is now the new NCBI Director at the National Library of Medicine (NLM), and the NLM Associate Director for Scientific Data Resources. In these roles, Dr. Sherry will oversee the development and deployment of advanced computational solutions to meet life and health science information needs and facilitate open science and scholarship through a growing array of data, literature, and other information offerings and services from NLM.

Dr. Sherry brings a history of innovation and leadership to the NCBI Director position. Most recently, he served as Acting Director of NCBI, bringing a vision of customer engagement, and modular, interoperable, and cloud-based approaches to the technical platforms for NLM offerings and services. He is also recognized for his inventiveness in leveraging research for public health emergency response. Dr. Sherry has been central in making key innovations at NLM including the ClinicalTrials.gov modernization effort and development of the NIH Comparative Genomics Resource, ensuring public input and technical innovation in the process. Dr. Sherry positioned NCBI as a strong collaborative force across the NIH and in supporting major NLM projects including the MEDLINE 2022 initiative, which resulted in 100% automated indexing of the biomedical literature available through NLM’s PubMed and PubMed Central (PMC).

“Dr. Sherry has the skills, knowledge, and insight to deliver creative, forward-thinking scientific and operational leadership for NLM and the communities we serve,” said NLM Director Patricia Flatley Brennan, RN, PhD. “His vast experience, expertise, and vision for NCBI is a great fit for NLM’s eye to the future and its commitment to drive innovation.”

Throughout his tenure at NCBI, Dr. Sherry has participated in many NIH efforts to characterize human genetic diversity and has served on numerous working groups across NIH to address a range of data science issues including the development of the genomic data sharing policy, privacy analysis for risk-sensitive data sets, and advances in scientific publications.

Dr. Sherry earned his PhD in Anthropology at the Pennsylvania State University in 1996 and completed a postdoctoral fellowship at the Louisiana State University Medical Center prior to joining NLM in 1998.

Coming soon! Changes to NCBI Datasets command-line tool in version 14 (CLIv14.0.0)

In October 2022, NCBI Datasets will release version 14 of our datasets and dataformat command-line tools. This release will contain breaking changes to the command syntax, content of the data packages and data reports. Thank you for your feedback that inspired these new features. We hope they will improve your experience!

We will continue to support CLI v13.x, although new features and improvements will be exclusive to CLI v14.0.0 and later releases.

NCBI Datasets supports the NIH Comparative Genomics Resource (CGR), an NLM project to establish an ecosystem to facilitate reliable comparative genomics analyses for all eukaryotic organisms. Join our mailing list to keep up to date with NCBI Datasets and other CGR news.

More details

How is version 14 of the Datasets command-line tools (CLI v14.x) different from CLI v13.x and previous versions?

Conserved Domain Database version 3.20 is available!

A new version of the Conserved Domain Database (CDD) is now available. Version 3.20 contains 1,614 new or updated NCBI/CDD-curated domains and now mirrors Pfam version 34 as well as new models from the NCBIfam collection. Fine-grained classifications of the [(+)ssRNA] virus RNA-dependent RNA polymerase catalytic domain, RING-finger/U-box, dimerization/docking domains of the cAMP-dependent protein kinase regulatory subunit, and Galactose/rhamnose-binding lectin domain superfamily have been added, along with many other new models.

We have significantly increased the fraction of CD-Search and interactive BATCH CD-Search queries that yield results showing conserved domain architecture information and attributes that further characterize protein function through links to information-rich resources such as Enzyme Commission (EC) numbers, Gene Ontology (GO) terms, PubMed IDs, and identifiers from the CAZy, TCDB, and MEROPS databases. See our earlier post for additional details. You can access CDD and find updated content on the CDD FTP site at CDD version 3.20.

Database statistics for CDD version 3.20:

Models   Source
64,234   Total models from all source databases (organized into 4,541 multi-model superfamilies)
18,882   NCBI CDD curation effort
1,125    NCBIfams
1,009    SMART v6.0
19,178   PFAM v34
4,871    COGs v1.0
10,140   NCBI Protein Clusters
4,488    TIGRFAM v15
59,693   Total models in the default CD-Search database
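
As a quick sanity check on the table above, the per-source counts can be added up and compared with the two reported totals. The sketch below only does arithmetic on the published figures. The variable names are our own, and the post does not say how the grand total relates to the superfamily count, so the last comparison is an observation rather than a documented rule.

```python
# Per-source model counts, copied from the CDD v3.20 statistics table.
default_db_sources = {
    "NCBI CDD curation effort": 18_882,
    "NCBIfams": 1_125,
    "SMART v6.0": 1_009,
    "PFAM v34": 19_178,
    "COGs v1.0": 4_871,
    "NCBI Protein Clusters": 10_140,
    "TIGRFAM v15": 4_488,
}

# Totals as reported in the same table.
reported_default_total = 59_693   # default CD-Search database
reported_grand_total = 64_234     # all source databases
reported_superfamilies = 4_541    # multi-model superfamilies

# Sum the individual sources and compare with the reported default total.
default_total = sum(default_db_sources.values())
print("Sum of listed sources:", default_total)
print("Matches reported default total:", default_total == reported_default_total)

# Difference between the grand total and the default-database total.
remainder = reported_grand_total - default_total
print("Grand total minus default total:", remainder)
print("Equals superfamily count:", remainder == reported_superfamilies)
```

The listed sources add up exactly to the default CD-Search total, and the difference between the two totals is the same number as the superfamily count.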

CD Search is part of the NIH Comparative Genomics Resource (CGR), an NLM project to establish an ecosystem to facilitate reliable comparative genomics analyses for all eukaryotic organisms.

Join our mailing list to keep up to date with CD Search and other CGR news.

RefSeq release 214 is available!

RefSeq release 214 is now available online, from the FTP site, and through NCBI’s Entrez programming utilities, E-utilities.

This full release incorporates genomic, transcript, and protein data available as of September 12, 2022, and contains 328,588,569 records, including 239,609,016 proteins, 47,387,931 RNAs, and sequences from 123,394 organisms. The release is provided in several directories as a complete dataset and also as divided by logical groupings.
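
For readers who want to reach RefSeq records through E-utilities, here is a minimal sketch of building an ESearch request URL in Python. It only constructs the URL and does not contact NCBI. The helper name `build_esearch_url` and the example query (a BRCA1 search restricted to human RefSeq proteins) are our own illustrative assumptions, not part of the release announcement.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term: str, db: str = "protein", retmax: int = 20) -> str:
    """Return an ESearch URL for the given Entrez query (hypothetical helper)."""
    params = {
        "db": db,           # Entrez database to search
        "term": term,       # Entrez query string
        "retmax": retmax,   # maximum number of IDs to return
        "retmode": "json",  # ask for a JSON response
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Illustrative query: the srcdb_refseq[PROP] filter limits results to RefSeq.
query = "BRCA1[Gene Name] AND Homo sapiens[Organism] AND srcdb_refseq[PROP]"
url = build_esearch_url(query)
print(url)
```

The resulting URL can be fetched with any HTTP client. For regular use, NCBI's usage guidelines on request rates and API keys apply.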

Foreign contamination screening
Introducing the new Foreign Contamination Screen (FCS) tool! If you produce assembled genomes, check out FCS, a tool you can run yourself to improve your genome assemblies and facilitate high-quality data submissions to GenBank. FCS is part of the NIH Comparative Genomics Resource (CGR), an NLM project to establish an ecosystem to facilitate reliable comparative genomics analyses for all eukaryotic organisms. See our previous blog post to learn how FCS enhances contaminant detection sensitivity.

Join NCBI virtually at the Biodiversity Genomics 2022 conference

Learn about the NIH Comparative Genomics Resource (CGR) Project

The Biodiversity Genomics conference will take place virtually, October 2-7, 2022. This event is hosted by the Earth BioGenome Project and is open and free for all to attend.

NCBI staff will present a variety of recorded talks and posters highlighting various elements of the NIH Comparative Genomics Resource (CGR), including NCBI Datasets and the Comparative Genome Viewer (CGV). CGR is a multi-year National Library of Medicine (NLM) project to maximize the impact of eukaryotic research organisms and their genomic data resources to biomedical research. NCBI is charged with leading CGR development and engaging genomics communities. The CGR project will facilitate reliable comparative genomics analyses for all eukaryotic organisms in collaboration with the genomics community.

Check out NCBI’s schedule of activities to learn more about CGR:

NCBI hidden Markov models (HMM) release 10.0 now available!

Release 10.0 of the NCBI hidden Markov models (HMM) used by the Prokaryotic Genome Annotation Pipeline (PGAP) is now available for download. You can search this collection against your favorite prokaryotic proteins to identify their function using the HMMER sequence analysis package.
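
One way to script such a search is sketched below, assuming HMMER is installed and the HMM collection has been downloaded as a single file. The sketch builds an `hmmsearch` command line and parses HMMER's tabular (`--tblout`) output. The file names, the E-value threshold, and the helper names are hypothetical choices of ours. The command is only printed, not executed.

```python
import shlex
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    target: str    # protein sequence name
    query: str     # HMM name
    evalue: float  # full-sequence E-value
    score: float   # full-sequence bit score


def hmmsearch_command(hmm_file: str, protein_fasta: str, tblout: str,
                      evalue: float = 1e-5) -> List[str]:
    """Build an hmmsearch argument list (file names are placeholders)."""
    return [
        "hmmsearch",
        "--noali",           # omit alignments from the main output
        "-E", str(evalue),   # report sequences at or below this E-value
        "--tblout", tblout,  # write a per-sequence hit table
        hmm_file,            # HMM collection (query profiles)
        protein_fasta,       # protein sequences to search
    ]


def parse_tblout(lines: Iterable[str]) -> List[Hit]:
    """Parse per-sequence hits from hmmsearch --tblout output."""
    hits = []
    for line in lines:
        # Skip blank lines and the '#' comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        # Columns: target, target acc, query, query acc, E-value, score, ...
        hits.append(Hit(fields[0], fields[2], float(fields[4]), float(fields[5])))
    return hits


# Hypothetical file names; substitute your own downloads.
cmd = hmmsearch_command("ncbi_hmms.hmm", "my_proteins.faa", "hits.tbl")
print(shlex.join(cmd))
```

After running the printed command (for example with `subprocess.run(cmd, check=True)`), the hit table can be read with `parse_tblout(open("hits.tbl"))` and sorted by score or E-value.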

The 10.0 release contains 15,360 models maintained by NCBI, including 228 that are new since 9.0, 99 that were modified significantly, and 205 that were assigned better names, EC numbers, Gene Ontology (GO) terms, gene symbols or publications. You can search and view the details for these in the Protein Family Model collection, which also includes conserved domain architectures and BlastRules, and find all RefSeq proteins they name.

GO terms associated with HMMs are now propagated to CDSs and proteins annotated with PGAP. In case you missed it, see our previous blog post on this topic.