ClinVar is a freely available submission-driven database for information about genomic variation and its relationship to human health. ClinVar holds more than 1.5 million variants, and is powered by submitters around the world, who provide us with their assessments, the evidence, and the criteria they use to guide their interpretation process and come to their conclusions. To streamline the ClinVar submission process, we are simplifying how submitters provide their assertion criteria. In the past, assertion criteria were provided for each variant. Moving forward, one single set of assertion criteria will be associated with an entire submission regardless of the number of variants.
Continue reading “NEW! Streamlining ClinVar Submission of Assertion Criteria”
Tag: Submissions
Ten reasons to submit to ClinVar
#1: Every deposit can help a patient
The healthcare community relies on the standardized view offered by ClinVar variant reports, which include interpretations of clinical significance in relation to Mendelian disease, cancer and pharmacogenetics; an aggregated view of interpretations highlighting those in consensus, conflict or reviewed by expert panel; and detailed views of submitter data, including supporting evidence for the interpretation such as phenotype, assertion criteria and references.
Nov 3 Webinar: dbGaP submission improvements and GaPTools
Attention dbGaP submitters! Join us on November 3, 2021 at 12 PM US Eastern time to learn about data submission and processing improvements to dbGaP, NIH’s database of Genotypes and Phenotypes, which contains individual-level data associated with human research studies. You will see how we have made submission easier through the Submission Portal using automated preliminary validation and how you can use GaPTools, a stand-alone data validation tool, on your own submission to expedite the submission process. Join us to discover how dbGaP ensures integrity and high quality in the genomic data that scientists can access to further their research.
- Date: Wed, November 3, 2021
- Time: 12:00 PM – 12:45 PM EDT
- Register
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI webinars playlist on the NLM YouTube channel. You can learn about future webinars on the Webinars and Courses page.
Four new options to simplify your SARS-CoV-2 submissions
We have recently added several exciting improvements to the SARS-CoV-2 GenBank submission process based on community feedback. To save you time, NCBI completes feature annotation for you, which means SARS-CoV-2 GenBank submission only requires a FASTA file and source metadata. Here are other new features to ease and simplify your submission workflow.
Automatically remove failed sequences from a submission: On the web, a single click lets you opt in to automatic removal of failed sequences (Figure 1) so that the rest of your sequences can be swiftly accessioned! A report provided after the submission lists your failed sequences and points out potential sequence problems so that you can take a closer look after your error-free sequences are released. This option is also available for submission via FTP.
Need to set up FTP submissions? The NCBI team is here to help. Contact gb-admin@ncbi.nlm.nih.gov.
Figure 1. GenBank submission page showing the option to remove sequences with processing errors.
Continue reading “Four new options to simplify your SARS-CoV-2 submissions”
NCBI on YouTube: ClinVar API, check data with GaPTools, get genetic context with Sequence Viewer
Every so often, we gather our most recent videos in one post on the blog, for your convenience. Scroll down – and don’t forget to subscribe to our channel!
Introducing GaPTools for dbGaP Submitters
This video introduces new standalone software called GaPTools, which you can use to check your data before submitting to dbGaP. GaPTools uses the same preliminary validation checks as the dbGaP submission portal.
July 28 Webinar: An update on native NCBI password retirement
The password you set at NCBI to log in to My NCBI, SciENcv, or My Bibliography, or to submit data to NCBI, will be going away. You will soon have to link a third-party login (e.g. eRA Commons, Google, Microsoft, or a university or institutional login) to access your account. Join us on July 28, 2021 at 12 PM Eastern time to learn what you need to do to link a third-party login using our Wizards, get an updated timeline for the transition to third-party logins, and get answers to your questions.
- Date and time: Wed, July 28, 2021 12:00 PM – 12:45 PM EDT
- Register
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI webinars playlist on the NLM YouTube channel. You can learn about future webinars on the Webinars and Courses page.
A dedicated SARS-CoV-2 BioSample submission package in the NCBI Submission Portal
During the COVID-19 pandemic, it is critical to collect descriptive information about the provenance and attributes of SARS-CoV-2 genomic samples so that the course of the virus may be tracked and analyzed. The NCBI Submission Portal now includes a dedicated BioSample submission package to help further improve the quality and richness of submitted SARS-CoV-2 sample metadata. The SARS-CoV-2 clinical or host-associated package presents a framework and standardized fields for submitters to provide attributes considered useful for the rapid analysis and surveillance of SARS-CoV-2 clinical and host-associated cases. For example, mandatory attributes include collection date and geographic location, while suggested but optional attributes include date of SARS-CoV-2 vaccination, vaccine received, and host disease outcome.
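As a rough illustration of the mandatory versus optional split described above, here is a minimal Python sketch that checks a sample record before submission. The attribute names (`collection_date`, `geo_loc_name`, and the optional ones) and the helper `missing_mandatory` are assumptions made for this example; the authoritative field names and rules are those defined by the package in the Submission Portal.

```python
from typing import Dict, List

# Assumed attribute names, for illustration only; check the package
# definition in the Submission Portal for the real field names.
MANDATORY = ("collection_date", "geo_loc_name")
OPTIONAL = ("vaccination_date", "vaccine_received", "host_disease_outcome")


def missing_mandatory(record: Dict[str, str]) -> List[str]:
    """Return the mandatory attributes that are absent or blank in a record."""
    return [name for name in MANDATORY if not (record.get(name) or "").strip()]


# Hypothetical sample record with an empty geographic location.
sample = {
    "sample_name": "example-001",
    "collection_date": "2021-05-14",
    "geo_loc_name": "",
}

print(missing_mandatory(sample))
```

A check like this only catches blank or missing values; it does not validate date formats or location vocabulary, which the portal itself checks.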

Continue reading “A dedicated SARS-CoV-2 BioSample submission package in the NCBI Submission Portal”
March 3 Webinar: Changes are coming to the way you log in to your NCBI account
Join us on March 3, 2021 to learn about changes to NCBI account logins that will affect those of you who sign in directly to your NCBI account. After June 1, 2021, you will need to log in using your institution, social media, Google, Microsoft or login.gov account username and password. In this webinar, you will learn how to register for a free login.gov account and how to link this to an existing NCBI account. You’ll also see where to find the most up-to-date information and FAQs on this topic.
We will answer a few questions from our mail bag on these changes. If you would like to submit a question in advance, please send an email to info@ncbi.nlm.nih.gov with the subject line “Changes to my NCBI Log In” by February 24th.
- Date and time: Wed, March 3, 2021 12:00 PM – 12:45 PM EST
- Register
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NLM YouTube channel. You can learn about future webinars on the Webinars and Courses page.
Genome Workbench Submission Wizard to replace Sequin for prokaryotic and eukaryotic genome submissions in January 2021
If you use Sequin to submit prokaryotic or eukaryotic genome sequences to GenBank, you need to be aware that Sequin will be retired in January 2021. Genome Workbench’s Submission Wizard, which is already available for submitting annotated genomes, will be the submission tool to use for annotated genomes going forward.
Genome Workbench is desktop software that offers a rich set of integrated tools for studying and analyzing genetic data. You can explore and compare data from multiple sources, including the NCBI databases or your own private data. The Submission Wizard, available since 2019, allows you to prepare submissions of single genomes where all sequences come from the same organism. This interface (Figure 1) is particularly valuable for:
- Eukaryotic genomes with annotations, for example those prepared with tbl2asn
- Prokaryotic genomes annotated by non-NCBI tools including Prokka and RAST.
Please register to attend our webinar on November 18 to see how to use Genome Workbench to prepare a submission.
(Note: You should continue to submit organelle and viral genomes using BankIt. Please visit the Submission Portal page for information on other submission options.)
Figure 1. Genome Workbench and Submission Wizard. Once the Sequence Editing package is enabled, the Submission menu can open the Genome Submission Wizard, which prompts you to upload sequence data and presents a tabbed set of forms for entering information about the submission. The Wizard validates the submission and provides editing capabilities for correcting errors.
Continue reading “Genome Workbench Submission Wizard to replace Sequin for prokaryotic and eukaryotic genome submissions in January 2021”
INSDC Statement on SARS-CoV-2 sequence data sharing during COVID-19
The National Library of Medicine and its partners in the International Nucleotide Sequence Database Collaboration (INSDC) have joined together to issue a statement encouraging the scientific community to submit their SARS-CoV-2 sequences to INSDC databases. The databases offer broad open access and integrated data, literature and tools – features that we believe are critical as the research community works together to understand and combat COVID-19. Read the full statement below.
The databases of the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org/) capture, organize, preserve and present nucleotide sequence data as part of the open scientific record. INSDC member institutions – the EMBL European Bioinformatics Institute (EMBL-EBI), the NIG DNA Data Bank of Japan (NIG-DDBJ) and the National Library of Medicine’s National Center for Biotechnology Information at NIH (NCBI) – are committed to the continued delivery of this critical element of scientific infrastructure.
The global COVID-19 crisis has brought an urgent need for the rapid open sharing of data relating to the outbreak. Most importantly, access to sequence data from the SARS-CoV-2 viral genome is essential for our understanding of the biology and spread of COVID-19. To aid in that effort, all three INSDC members have prioritized processing of SARS-CoV-2 sequence data and have streamlined the submission process.
Availability of data through INSDC databases provides:
- Rapid open access – INSDC quickly makes submitted data freely available to everyone, without restrictions on reuse
- Linkage of raw sequence read data to genome assemblies, providing researchers with the ability to validate the integrity of assemblies and investigate asserted mutations and changes in genome sequences
- Integration of SARS-CoV-2 sequences with the entirety of INSDC data, including related coronavirus genome sequences, enabling comparison across species
- Linkage of sequences to the published literature
- Tools – INSDC partners provide integrated data analysis tools, such as BLAST, enhancing the discovery process
In support of the global response to the COVID-19 crisis, the INSDC calls upon the research community to:
- Submit raw SARS-CoV-2 data to the databases of the INSDC
- Submit consensus/assembled SARS-CoV-2 data to the databases of the INSDC
- Provide information relating to the sequenced isolate or sample as part of the sequence submission; minimally the time and place of isolation/sampling and an isolate/sample identifier should be provided to maximize the value of the sequences.
- In cases where scientists have already established submissions to other databases, these submissions should continue in parallel to the INSDC submission
The integration of INSDC databases with the global bioinformatics data infrastructure, including tools, secondary databases, compute capacity and curation processes, assures the rapid dissemination of data and drives its maximal impact.
In addition to these fundamental roles of INSDC member institutions in the sharing of viral sequence data, each institution has rapidly established COVID-19-specific programs and resources: the European COVID-19 Data Platform from EMBL-EBI, the DDBJ’s Research Data Resources on New Coronavirus and the NCBI SARS-CoV-2 Resources. These resources both demonstrate the connectedness of INSDC databases to broader bioinformatics initiatives and serve to add immediate value to COVID-19 research.
Guy Cochrane (EMBL-EBI), Ilene Karsch-Mizrachi (NCBI-NLM-NIH), & Masanori Arita (DDBJ) on behalf of the International Nucleotide Sequence Database Collaboration