Do you need a smaller dataset for your analyses of virus data? In response to your feedback, NCBI Virus now allows you to download a randomized subset of your results for nucleotide, protein, or RefSeq genome sequences from any supported virus (Figure 1). This option is useful for viruses such as SARS-CoV-2 or Influenza A that have very large numbers of records, where the entire dataset may present a challenge. In such cases, a smaller representative sample is easier to work with to support your analysis. You can also reduce the bias in a dataset by getting a representative number of records for each country or host (Figure 2). Continue reading “Download Randomized Data Subset from NCBI Virus”
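To make the idea of a representative number of records per country or host concrete, here is a minimal sketch of stratified random sampling in plain Python. It is not NCBI Virus's implementation. The function name `stratified_sample`, the record fields, and the sample data are all hypothetical and exist only for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, per_group, seed=0):
    """Return up to `per_group` randomly chosen records for each value of `key`."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable

    # Bucket the records by the attribute we want to balance on.
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)

    # Draw the same maximum number of records from every bucket.
    sample = []
    for value in sorted(groups):
        members = groups[value]
        k = min(per_group, len(members))  # small groups contribute all they have
        sample.extend(rng.sample(members, k))
    return sample


# Made-up records: one country dominates the full dataset.
records = (
    [{"accession": f"A{i}", "country": "USA"} for i in range(1000)]
    + [{"accession": f"B{i}", "country": "Kenya"} for i in range(20)]
    + [{"accession": f"C{i}", "country": "Peru"} for i in range(3)]
)

subset = stratified_sample(records, key="country", per_group=5)
print(len(subset))
```

Capping each group at the same size keeps a heavily sequenced country from dominating the subset, which is the kind of bias reduction the post describes.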
Tag: NCBI Virus
NCBI Virus Extends Dashboard Visualizations to all Virus Sequences!
Do you want to be able to quickly filter your virus search results based on important attributes? Good news, now you can! We are pleased to announce the extension of Dashboard Visualizations for any virus in the NCBI Virus collection (Figure 1). Dashboard Visualizations allow data to be quickly visualized in a graphical presentation based on a few highly sought-after attributes to prefilter your dataset.
What are Dashboard Visualizations?
Dashboard Visualizations allow you to filter your search by geographic location, collection time, and release time. Each feature on the Dashboard is interactive, so applying a filter in one feature limits the data shown in the others. As you apply filters, the summary section at the top updates to give you a snapshot of the number of records in NCBI RefSeq, Nucleotide, and Protein that match the combined conditions of your search in the NCBI Virus database. Continue reading “NCBI Virus Extends Dashboard Visualizations to all Virus Sequences!”
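The way combined filters narrow a result set and update per-database counts can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Dashboard is built. The helper names `apply_filters` and `summarize`, the field names, and the records are hypothetical.

```python
from collections import Counter


def apply_filters(records, **conditions):
    """Keep only the records that match every field=value condition."""
    return [
        rec for rec in records
        if all(rec.get(field) == value for field, value in conditions.items())
    ]


def summarize(records):
    """Count the remaining records per database, like a summary panel."""
    return Counter(rec["database"] for rec in records)


# Made-up records for illustration only.
records = [
    {"database": "Nucleotide", "country": "USA", "collection_year": 2021},
    {"database": "Nucleotide", "country": "USA", "collection_year": 2022},
    {"database": "Protein", "country": "USA", "collection_year": 2022},
    {"database": "RefSeq", "country": "China", "collection_year": 2019},
    {"database": "Protein", "country": "Kenya", "collection_year": 2022},
]

by_country = apply_filters(records, country="USA")
combined = apply_filters(records, country="USA", collection_year=2022)
summary = summarize(combined)
print(dict(summary))
```

Each added condition can only shrink the result set, so the counts in the summary always reflect the intersection of every active filter.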
Coming soon! Changes to NCBI Datasets command-line tool in version 14 (CLIv14.0.0)
In October 2022, NCBI Datasets will release version 14 of our datasets and dataformat command-line tools. This release will contain breaking changes to the command syntax, content of the data packages and data reports. Thank you for your feedback that inspired these new features. We hope they will improve your experience!
We will continue to support CLI v13.x, although new features and improvements will be exclusive to CLI v14.0.0 and later releases.
NCBI Datasets supports the NIH Comparative Genomics Resource (CGR), an NLM project to establish an ecosystem to facilitate reliable comparative genomics analyses for all eukaryotic organisms. Join our mailing list to keep up to date with NCBI Datasets and other CGR news.
How is version 14 of the Datasets command-line tools (CLI v14.x) different from CLI v13.x and previous versions? Continue reading “Coming soon! Changes to NCBI Datasets command-line tool in version 14 (CLIv14.0.0)”
Announcing the NCBI Datasets SARS-CoV-2 taxonomy page
Need SARS-CoV-2 assembled genome sequences or specific SARS-CoV-2 protein sequences? You can find them on the new SARS-CoV-2 taxonomy page brought to you by NCBI Datasets.
The NCBI Datasets SARS-CoV-2 taxonomy page brings you both SARS-CoV-2 genomes and proteins, basic information about SARS-CoV-2, and connections to related NCBI pages, all in one place (see Figures 1 and 2).
Figure 1. NCBI Datasets SARS-CoV-2 taxonomy page. For command-line access, try the datasets command-line tool (top box). For customized filtering options, check out NCBI Virus (bottom box).
If you scroll down the taxonomy page you will find a table of SARS-CoV-2 proteins, each with “Actions” that provide the option to download a package of protein sequences from all annotated SARS-CoV-2 genomes (Figure 2), as well as links to NCBI Gene and the protein sequence from the reference genome.
Figure 2. NCBI Datasets SARS-CoV-2 taxonomy page (cont’d). Click the blue download button to download a package of all SARS-CoV-2 genomes (6 M and counting as of 7/15/22), or just the SARS-CoV-2 reference genome (top box). Below that you see a table of SARS-CoV-2 proteins, each with “Actions” available through the three-dot menu that provides the option to download a package of protein sequences from all annotated SARS-CoV-2 genomes (bottom boxes).
We want to hear from you! Check out the new SARS-CoV-2 taxonomy page and let us know what you think. Contact us with questions or feedback.
Join our mailing list to keep up to date with Datasets and other NCBI news.
June 15 Webinar: What’s new with NCBI Virus?
Join us on June 15, 2022, at 12 PM US Eastern Time to learn about the NCBI Virus resource, a community portal for viral sequence data that has been important in supporting SARS-CoV-2 research and management of the COVID-19 pandemic. Enhancements to NCBI Virus that support these efforts include SARS-CoV-2-specific filters, a dedicated web interface that reports on the geotemporal prevalence of sequence records for SARS-CoV-2 lineages, and details on NCBI’s lineage-defining mutations.
- Date and time: Wed, June 15, 2022 12:00 PM – 12:45 PM EDT
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI webinars playlist on the NLM YouTube channel. You can learn about future webinars on the NCBI Outreach Events page.
Monkeypox virus: Complete genome from the current outbreak now available in GenBank
The first complete genome sequence of the current monkeypox virus (MPXV) outbreak (isolate name MPXV_USA_2022_MA001) is now available with accession ON563414 in GenBank, a public database of DNA sequences hosted by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM).
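If you prefer programmatic access, one option is the NCBI E-utilities `efetch` endpoint. The sketch below only builds the request URL for this accession in FASTA format and does not download anything. The helper name `genbank_fasta_url` is hypothetical, and the parameter choices (`db=nuccore`, `rettype=fasta`) assume you want the nucleotide record as plain-text FASTA.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities efetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_fasta_url(accession):
    """Build an efetch URL that requests a nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",      # GenBank nucleotide records
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


url = genbank_fasta_url("ON563414")
print(url)
```

You can open the printed URL in a browser or pass it to any HTTP client. Please follow NCBI's usage guidelines for E-utilities if you script many requests.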
Several cases of monkeypox have been identified in geographically widespread countries. Monkeypox is classified as a zoonotic disease, in which transmission of the virus is usually due to animal-human contact. Genetically, monkeypox viruses cluster into two groups: the Congo Basin clade and the West African clade. This particular outbreak has been attributed to a virus from the West African clade, which is often associated with milder disease, and in this case human-to-human spread is suspected. Continue reading “Monkeypox virus: Complete genome from the current outbreak now available in GenBank”