Announcing GenBank release 252.0

Now over 3 billion records!

GenBank release 252.0 (10/17/2022) is now available on the NCBI FTP site. This release has 20.35 trillion bases and 3.10 billion records. The current release has 240,539,282 traditional records containing 1,562,963,366,851 base pairs of sequence data. There are also 2,167,900,306 WGS records containing 18,231,960,808,828 base pairs of sequence data, 574,020,080 bulk-oriented TSA records containing 511,476,787,957 base pairs of sequence data, and 115,123,306 bulk-oriented TLS records containing 43,860,512,749 base pairs of sequence data. 
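As a quick sanity check, the headline totals can be reproduced from the per-division figures quoted above. The snippet below is only an illustrative sketch; the dictionary keys are labels chosen here, and the numbers are copied from the announcement.

```python
# Record and base-pair counts per division, copied from the announcement text.
divisions = {
    "traditional": (240_539_282, 1_562_963_366_851),
    "WGS": (2_167_900_306, 18_231_960_808_828),
    "TSA": (574_020_080, 511_476_787_957),
    "TLS": (115_123_306, 43_860_512_749),
}

# Sum records and bases across all four divisions.
total_records = sum(records for records, _ in divisions.values())
total_bases = sum(bases for _, bases in divisions.values())

print(f"records: {total_records:,} (~{total_records / 1e9:.2f} billion)")
print(f"bases:   {total_bases:,} (~{total_bases / 1e12:.2f} trillion)")
```

The sums come to 3,097,582,974 records and 20,350,261,476,385 bases, which round to the 3.10 billion records and 20.35 trillion bases stated in the release note.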

New version of PGAP now available!

We are happy to announce a new version of the stand-alone Prokaryotic Genome Annotation Pipeline (PGAP). This version helps you interpret your results by providing an estimate of the completeness and contamination of your PGAP-annotated genome assembly using CheckM.

CheckM bases its estimates on the presence of a set of lineage-specific genes for the species you provide or for the species returned by the taxonomy check (--taxcheck, --auto-correct-tax). The higher the completeness and the lower the contamination, the better the assembly. If contamination is a concern, please try FCS-GX, a highly sensitive tool for detecting foreign contaminants in prokaryotic and eukaryotic genome assemblies.
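To give a feel for what completeness and contamination mean, here is a deliberately simplified sketch of marker-gene scoring. It is not CheckM's actual algorithm (CheckM groups markers into collocated sets and weights them accordingly); the function name `estimate_quality` and the four-gene marker list are hypothetical and chosen only for illustration.

```python
from collections import Counter


def estimate_quality(expected_markers, observed_hits):
    """Return (completeness %, contamination %) from marker-gene hits.

    expected_markers: set of marker names expected once per genome.
    observed_hits: marker names found in the assembly, one entry per copy.
    Simplified illustration only; not CheckM's real scoring.
    """
    if not expected_markers:
        raise ValueError("expected_markers must not be empty")
    # Count copies of each expected marker; ignore unrelated hits.
    counts = Counter(h for h in observed_hits if h in expected_markers)
    # Completeness: share of expected markers seen at least once.
    present = sum(1 for m in expected_markers if counts[m] >= 1)
    # Contamination: extra copies beyond the first, relative to marker count.
    extra_copies = sum(counts[m] - 1 for m in expected_markers if counts[m] > 1)
    n = len(expected_markers)
    return 100.0 * present / n, 100.0 * extra_copies / n


# Toy example: one marker missing (dnaK), one marker duplicated (gyrA).
markers = {"rpoB", "gyrA", "recA", "dnaK"}
hits = ["rpoB", "gyrA", "gyrA", "recA"]
completeness, contamination = estimate_quality(markers, hits)
print(f"completeness={completeness:.1f}% contamination={contamination:.1f}%")
```

In this toy case, three of the four expected markers are found and one is present twice, so the sketch reports 75% completeness and 25% contamination. Real assemblies are scored against hundreds of lineage-specific markers.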

This new release also contains code changes that improve prediction of some long genes, especially in low complexity regions. And, as with every release, PGAP incorporates incremental improvements from expert curators of the Protein Family Model collection that increase the precision of PGAP’s structural and functional annotation.

Please try this new version and share your experience with us!

Now Available! Updated NCBI Datasets Command-Line Tools 

NLM’s NCBI Datasets announces the release of version 14 of our command-line (CLI) tools, datasets and dataformat. This release (CLI v14.0.0) contains many improvements inspired by your feedback. It’s now easier than ever to browse and format metadata, generate customized tables, and download data packages. We hope these updates will improve your experience!

NCBI Datasets CLI v14 includes changes to the command syntax, data package contents, and data report schemas that are not backwards-compatible. Commands written for CLI versions prior to version 14 may fail after the latest update. For more details, see our FAQs.

Now available: Updated prokaryote representative genomes collection

An updated bacterial and archaeal representative genomes collection is available! We selected a total of 16,665 of the 262,000 prokaryotic assemblies in RefSeq to represent their respective species. For the first time, more complete assemblies (as calculated by CheckM) were ranked higher than less complete assemblies. See the ranked list of criteria for selecting representative assemblies here.

New! NIH Genetic Testing Registry (GTR) API

Want to automate submitting genetic test-related information to the NIH Genetic Testing Registry? Now you can! In September 2022, GTR released a submission API that supports fully automated submission of test data to GTR. The new API is one more way, in addition to the Submission Portal wizard and bulk submission using a spreadsheet template, to submit test data to GTR.

Why an API?

An API will allow you to programmatically generate and deposit your latest information into GTR, especially for a large volume of genetic tests. Our customers rely on your up-to-date information to make accurate decisions for their patients. The API creates a one-time setup, multiple-time reuse pathway for timely updates.

How to get started

To start the new submission process:

  1. If you haven’t already, register your lab with GTR
  2. Request an API service account from the GTR staff
  3. Once we’ve established your service account, create an API key

Connect with NCBI at ASHG 2022

Join us October 25-29 in Los Angeles, CA

We are looking forward to seeing you in person at the American Society of Human Genetics (ASHG) annual meeting, October 25-29, 2022, in Los Angeles, California.

We will present a variety of talks and posters featuring our clinical and human genetic resources, as well as genome products and tools. We are excited to introduce the NIH Comparative Genomics Resource (CGR), a multi-year National Library of Medicine (NLM) project to maximize the impact of eukaryotic research organisms and their genomic data resources to biomedical research. If you’re interested in providing feedback that will be used to help drive CGR forward, consider joining our round table discussion.  

Check out NCBI’s schedule of activities and events: 

New Upcoming NCBI Virtual Workshops!

Apply to attend October 2022 interactive, hands-on workshops

Want to learn more about NCBI resources and how to implement our cutting-edge tools in your research? NCBI offers a variety of educational opportunities, including workshops, webinars, codeathons, tutorials, and more!

We are excited to announce our upcoming virtual workshop series for October 2022. Our interactive, hands-on workshops are taught by experienced NCBI Education Faculty. Applications are open to the public; however, each workshop will accept a limited number of participants to facilitate the best possible educational experience.

NCBI Workshop at the ASM NGS 2022 Meeting

NCBI Microbial Pathogen and SARS-CoV-2 Resources in the Cloud

Get hands-on experience with NCBI Pathogen Detection and SARS-CoV-2 Surveillance data in the cloud. No prior cloud experience necessary!

NCBI staff are presenting a workshop at the American Society for Microbiology Next-Generation Sequencing (ASM NGS) 2022 Meeting on Sunday, October 16, 2022, from 10 am to 3 pm ET (with a 1-hour break). The workshop will help conference attendees learn about two NCBI cloud-hosted resources: the Pathogen Detection and SARS-CoV-2 Genome Sequence datasets.

Stephen Sherry, PhD, is the new NCBI Director and NLM Associate Director for Scientific Data Resources

We are excited that our own Stephen Sherry, PhD, is now the new NCBI Director at the National Library of Medicine (NLM), and the NLM Associate Director for Scientific Data Resources. In these roles, Dr. Sherry will oversee the development and deployment of advanced computational solutions to meet life and health science information needs and facilitate open science and scholarship through a growing array of data, literature, and other information offerings and services from NLM.

Dr. Sherry brings a history of innovation and leadership to the NCBI Director position. Most recently, he served as Acting Director of NCBI, bringing a vision of customer engagement and of modular, interoperable, cloud-based approaches to the technical platforms for NLM offerings and services. He is also recognized for his inventiveness in leveraging research for public health emergency response. Dr. Sherry has been central to key innovations at NLM, including the ClinicalTrials.gov modernization effort and the development of the NIH Comparative Genomics Resource, ensuring public input and technical innovation in the process. Dr. Sherry positioned NCBI as a strong collaborative force across the NIH and in supporting major NLM projects, including the MEDLINE 2022 initiative, which resulted in 100% automated indexing of the biomedical literature available through NLM’s PubMed and PubMed Central (PMC).

“Dr. Sherry has the skills, knowledge, and insight to deliver creative, forward-thinking scientific and operational leadership for NLM and the communities we serve,” said NLM Director Patricia Flatley Brennan, RN, PhD. “His vast experience, expertise, and vision for NCBI is a great fit for NLM’s eye to the future and its commitment to drive innovation.”

Throughout his tenure at NCBI, Dr. Sherry has participated in many NIH efforts to characterize human genetic diversity and has served on numerous working groups across NIH to address a range of data science issues including the development of the genomic data sharing policy, privacy analysis for risk-sensitive data sets, and advances in scientific publications.

Dr. Sherry earned his PhD in Anthropology at the Pennsylvania State University in 1996 and completed a postdoctoral fellowship at the Louisiana State University Medical Center prior to joining NLM in 1998.

Coming soon! Changes to NCBI Datasets command-line tool in version 14 (CLIv14.0.0)

In October 2022, NCBI Datasets will release version 14 of our datasets and dataformat command-line tools. This release will contain breaking changes to the command syntax and to the content of the data packages and data reports. Thank you for your feedback that inspired these new features. We hope they will improve your experience!

We will continue to support CLI v13.x, although new features and improvements will be exclusive to CLI v14.0.0 and later releases.

NCBI Datasets supports the NIH Comparative Genomics Resource (CGR), an NLM project to establish an ecosystem to facilitate reliable comparative genomics analyses for all eukaryotic organisms. Join our mailing list to keep up to date with NCBI Datasets and other CGR news.

More details

How is version 14 of the Datasets command-line tools (CLI v14.x) different from CLI v13.x and previous versions?