Tag: SARS-CoV-2

Four new options to simplify your SARS-CoV-2 submissions

We have recently added several exciting improvements to the SARS-CoV-2 GenBank submission process based on community feedback. To save you time, NCBI completes feature annotation for you, which means SARS-CoV-2 GenBank submission only requires a FASTA file and source metadata. Here are other new features to ease and simplify your submission workflow.

Automatically remove failed sequences from a submission: On the web, a single click lets you opt in to automatic removal of failed sequences (Figure 1) so that the rest of your sequences can be swiftly accessioned! A report provided after the submission lists your failed sequences and points out potential sequence problems, so you can take a closer look after your error-free sequences are released. This option is also available for submission via FTP.

Need to set up FTP submissions? The NCBI team is here to help. Contact gb-admin@ncbi.nlm.nih.gov.

Figure 1. GenBank submission page showing the option to remove sequences with processing errors.
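The idea of setting failed sequences aside so the rest can proceed can be sketched locally before you submit. This is a minimal illustration only: the post does not describe the checks NCBI actually runs, so the two rules below (an empty sequence, or characters outside the IUPAC nucleotide codes) and the function names `read_fasta` and `partition_records` are assumptions made for the example.

```python
import io

# IUPAC nucleotide codes; treating anything else as a failure is an
# assumption for this sketch, not NCBI's actual validation logic.
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def partition_records(records):
    """Split records into those that pass the checks and those that fail."""
    passed, failed = [], []
    for header, seq in records:
        if not seq:
            failed.append((header, "empty sequence"))
        elif set(seq.upper()) - IUPAC_NUCLEOTIDES:
            failed.append((header, "non-IUPAC characters"))
        else:
            passed.append((header, seq))
    return passed, failed


# Made-up records for illustration only.
example = io.StringIO(">sample1\nACGTNNACGT\n>sample2\nACGT##XX\n>sample3\n")
passed, failed = partition_records(read_fasta(example))
print([header for header, _ in passed])
print(failed)
```

Here `passed` holds the records that could go forward, while `failed` pairs each rejected header with a reason, much like the report described above.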

Continue reading “Four new options to simplify your SARS-CoV-2 submissions”

June 30 Webinar: Using NCBI Datasets to download sequence and annotation for genomes and genes

Join us on June 30, 2021 at 12 PM Eastern Time to learn how to use the new NCBI Datasets resource to find and download sequence and annotation data for genes, genomes, and SARS-CoV-2. You will learn how to access these datasets through either the web interface or the new command-line tools, which let you incorporate these data into your bioinformatic workflows.

  • Date and time: Wed, June 30, 2021 12:00 PM – 12:45 PM EDT
  • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI webinars playlist on the NLM YouTube channel. You can learn about future webinars on the Webinars and Courses page.

Structure viewer iCn3D version 3 featuring analysis of 3D structures!

The NCBI structure viewer iCn3D version 3 is now available on the NCBI web site and from GitHub.

Analysis of 3D Structures

You can use the current version with the icn3d package at npm to write scripts that call functions in iCn3D. For example, this script on GitHub can calculate the change in interactions due to a mutation. The results of this analysis for the structure (6M0J) of the SARS-CoV-2 spike protein bound to the ACE2 receptor are displayed in Figure 1. These show the predicted changes in interactions with other residues in the SARS-CoV-2 spike protein and in the ACE2 receptor when the asparagine (N) at position 501 of the spike protein is changed to a tyrosine (Y). You can also run these scripts from the command line to process a list of 3D structures and to retrieve and analyze their annotations.

Figure 1. iCn3D viewer showing the predicted interactions with other residues in the spike protein and in the ACE2 target when the asparagine (N) at position 501 of the SARS-CoV-2 spike protein is substituted with tyrosine (Y), highlighted in yellow. Interactions were calculated using the script interactions2.js.

Continue reading “Structure viewer iCn3D version 3 featuring analysis of 3D structures!”

A dedicated SARS-CoV-2 BioSample submission package in the NCBI Submission Portal

During the COVID-19 pandemic, it is critical to collect descriptive information about the provenance and attributes of SARS-CoV-2 genomic samples so that the course of the virus may be tracked and analyzed. The NCBI Submission Portal now includes a dedicated BioSample submission package to help further improve the quality and richness of submitted SARS-CoV-2 sample metadata. The SARS-CoV-2 clinical or host-associated package presents a framework and standardized fields for submitters to provide attributes considered useful for the rapid analysis and surveillance of SARS-CoV-2 clinical and host-associated cases. For example, mandatory attributes include collection date and geographic location, while suggested but optional attributes include date of SARS-CoV-2 vaccination, vaccine received, and host disease outcome.

Figure 1. This mock-up shows mandatory attributes like collection date and geographic location, as well as suggested attributes like date of vaccination and host disease outcome.
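The split between mandatory and optional attributes can be illustrated with a small pre-submission check. This is a sketch under stated assumptions: the field names below (`collection_date`, `geo_loc_name`, and the optional ones) are placeholders for this example rather than the package's confirmed attribute names, and the YYYY-MM-DD date rule is likewise an assumption, not the Submission Portal's own validation.

```python
from datetime import datetime

# Placeholder field names for illustration; check the package template
# for the actual attribute names before relying on these.
MANDATORY = ("collection_date", "geo_loc_name")
OPTIONAL = (
    "date_of_sars_cov_2_vaccination",
    "vaccine_received",
    "host_disease_outcome",
)


def check_sample(sample):
    """Return a list of problems found in one sample's metadata dict."""
    problems = []
    for field in MANDATORY:
        if not str(sample.get(field, "")).strip():
            problems.append(f"missing mandatory attribute: {field}")
    raw = str(sample.get("collection_date", "")).strip()
    if raw:
        try:
            # Assumed date format for this sketch.
            datetime.strptime(raw, "%Y-%m-%d")
        except ValueError:
            problems.append(f"collection_date is not a valid YYYY-MM-DD date: {raw}")
    return problems


# Made-up samples for illustration only.
good = {"collection_date": "2021-05-01", "geo_loc_name": "USA: Maryland"}
bad = {"collection_date": "05/01/2021"}
print(check_sample(good))
print(check_sample(bad))
```

Optional attributes are listed but never required by `check_sample`, which mirrors the "suggested but optional" status described above.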

Continue reading “A dedicated SARS-CoV-2 BioSample submission package in the NCBI Submission Portal”

Data for SARS-CoV-2 variants now available at NCBI

Looking for genomes for the B.1.1.7 SARS-CoV-2 variant? NCBI now supports searches for SARS-CoV-2 variant names such as B.1.1.7, B.1.351, or P.1. For example, search for B.1.1.7 (Figure 1) and you’ll see a virus classification box with an option to download a SARS-CoV-2 data package. SARS-CoV-2 data packages include genome and protein sequences and a detailed data report for all SARS-CoV-2 genomes classified as that variant. SARS-CoV-2 genome lineages are classified by pangolin, using the pangoLEARN algorithm.

Figure 1. SARS-CoV-2 variant search result with button to download a data package containing data for all SARS-CoV-2 genomes matching that variant lineage, B.1.1.7 in this case.

Continue reading “Data for SARS-CoV-2 variants now available at NCBI”

New NCBI Datasets home and documentation pages provide easier access

NCBI Datasets, the new set of services for downloading genome assembly and annotation data (previous Datasets posts), has redesigned and reorganized web pages to make it easier to find and access the services and documentation you need.

NCBI Datasets has a fresh new homepage (Figure 1) highlighting the types of data available through our tools. Available data include genome assemblies, genes, and SARS-CoV-2 genomic and protein data.  You can easily access these from the new page or learn more with our new documentation pages.

Figure 1. Features of the new Datasets homepage with quick access to help documentation including the Quickstart and How-to guides as well as access to Genome, Gene, and Coronavirus Data, and the Datasets and Dataformat command-line tools.

Continue reading “New NCBI Datasets home and documentation pages provide easier access”

NCBI on YouTube: RAPT and BLAST+ on the Cloud, SARS-CoV-2 genome data in Datasets

It’s time we do another roundup of what’s been happening on YouTube!

First up, the NCBI YouTube channel has merged with the NLM YouTube channel. You’ll now be able to find diverse content all on one channel, from tips on using resources to fascinating moments in the history of medicine and more!

Continue reading “NCBI on YouTube: RAPT and BLAST+ on the Cloud, SARS-CoV-2 genome data in Datasets”

The latest in COVID-19 related human gene annotation now in NCBI RefSeq and Gene

Interested in human genes involved in COVID-19 biology? NCBI’s RefSeq group has been hard at work compiling a set of human genes with roles in coronavirus infection and disease. You can now see and search for these genes and their regulatory elements in NCBI Gene and RefSeq.

Figure 1. Top section of the human ACE2 record in the Gene database. COVID-19 information can be found in the Summary and Annotation information sections.

Continue reading “The latest in COVID-19 related human gene annotation now in NCBI RefSeq and Gene”

Coronavirus host gene regulatory elements now annotated by RefSeq Functional Elements

The COVID-19 pandemic has drawn attention to the human host genes associated with SARS-CoV-2 entry and to the elements that regulate expression of these genes. At NCBI, we have prioritized curation of experimentally validated regulatory elements for these genes in the RefSeq Functional Elements project. Our annotations include several enhancers, promoters, cis-regulatory elements and protein binding sites, among other feature types. We have annotated 236 regulatory features for 27 distinct biological regions in the latest human Annotation Release (109.20200522), including regulatory elements for the ABO, ACE2, ANPEP, CD209, CLEC4G, CLEC4M, CTSL, DPP4, and TMPRSS2 genes.

You can view our regulatory element to target gene linkages in the regulatory interactions track using our new track hub that we recently announced.  You can also see the biological regions and features tracks. These have functional and descriptive metadata, including biological region summaries, experimental evidence types, publication support and more.

The example in Figure 1 shows RefSeq Functional Element feature annotation in NCBI’s Genome Data Viewer (GDV) for the ABO gene region (GRCh38, NW_009646201.1: 73,864-103,789), the determiner of the human ABO blood group. A genome-wide association study recently identified non-coding ABO variants associated with COVID-19 disease severity (PMID:32558485), which map to some of the RefSeq Functional Elements in this region.

Figure 1. The human ABO gene region in the NCBI GDV displaying the RefSeq Functional Element features. The biological regions aggregate track shows underlying feature annotation for an ABO upstream enhancer (LOC112637023), a promoter region (LOC112679202), a +5.8 intron 1 enhancer (LOC112679198), a 3′ regulatory region (LOC112639999), and a +36.0 downstream enhancer (LOC112637025). Functional Element features include numerous enhancers, promoters, cis-regulatory elements and protein / transcription factor binding sites.

We have more information about RefSeq Functional Elements on our website, including data download and extraction options. Stay tuned to NCBI Insights and other NCBI social media for future announcements about RefSeq Functional Elements!