In October 2022, NCBI Datasets will release version 14 of our datasets and dataformat command-line tools. This release will contain breaking changes to the command syntax and to the content of the data packages and data reports. Thank you for the feedback that inspired these new features. We hope they will improve your experience!
We will continue to support CLI v13.x, although new features and improvements will be exclusive to CLI v14.0.0 and later releases.
The NCBI Datasets SARS-CoV-2 taxonomy page brings you both SARS-CoV-2 genomes and proteins, basic information about SARS-CoV-2, and connections to related NCBI pages, all in one place (see Figures 1 and 2).
Figure 1. NCBI Datasets SARS-CoV-2 taxonomy page. For command-line access, try the datasets command-line tool (top box). For customized filtering options, check out NCBI Virus (bottom box).
If you scroll down the taxonomy page, you will find a table of SARS-CoV-2 proteins, each with “Actions” that provide the option to download a package of protein sequences from all annotated SARS-CoV-2 genomes (Figure 2), as well as links to NCBI Gene and the protein sequence from the reference genome.
Figure 2. NCBI Datasets SARS-CoV-2 taxonomy page (cont’d). Click the blue download button to download a package of all SARS-CoV-2 genomes (6 M and counting as of 7/15/22), or just the SARS-CoV-2 reference genome (top box). Below that you see a table of SARS-CoV-2 proteins, each with “Actions” available through the three-dot menu, which provides the option to download a package of protein sequences from all annotated SARS-CoV-2 genomes (bottom boxes).
We want to hear from you! Check out the new SARS-CoV-2 taxonomy page and let us know what you think. Contact us with questions or feedback.
Join us on June 15, 2022, at 12 PM US Eastern Time to learn about the NCBI Virus resource, a community portal for viral sequence data that has been important in supporting SARS-CoV-2 research and management of the COVID-19 pandemic. Enhancements to NCBI Virus that support these efforts include SARS-CoV-2-specific filters, a dedicated web interface that reports on the geotemporal prevalence of sequence records for SARS-CoV-2 lineages, and details on NCBI’s lineage-defining mutations.
Date and time: Wed, June 15, 2022 12:00 PM – 12:45 PM EDT
The first complete genome sequence of the current monkeypox virus (MPXV) outbreak (isolate name MPXV_USA_2022_MA001) is now available with accession ON563414 in GenBank, a public database of DNA sequences hosted by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM).
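For readers who prefer programmatic access, here is a minimal sketch of how a request for this record could be put together using NCBI's E-utilities efetch service. The helper name `build_efetch_url` is hypothetical and used only for illustration. The sketch only builds the request URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch service.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, rettype="fasta"):
    """Return an efetch URL for a nucleotide record (hypothetical helper)."""
    params = {
        "db": "nuccore",      # nucleotide database, which includes GenBank records
        "id": accession,      # accession from the announcement, e.g. ON563414
        "rettype": rettype,   # "fasta" for sequence only, "gb" for the full flat file
        "retmode": "text",
    }
    return f"{EFETCH_BASE}?{urlencode(params)}"


url = build_efetch_url("ON563414")
print(url)
```

The resulting URL can be opened in a browser or passed to `urllib.request.urlopen` to retrieve the sequence. If you script repeated requests, check NCBI's usage guidelines for E-utilities first.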