“Database resources of the National Center for Biotechnology Information”
by Eric W Sayers, Jeff Beck, J Rodney Brister, Evan E Bolton, Kathi Canese et al. (PMID: 31602479)
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts published in life science journals. The Entrez system provides search and retrieval operations for most of these data from 38 distinct databases. This article provides a brief overview of the NCBI Entrez system of databases, followed by a summary of resources that were either introduced or significantly updated in the past year, including PubMed, PMC, Bookshelf, BLAST databases and more!
We have added a new feature to ClinVar that allows you to follow a particular variant and be notified if the overall clinical interpretation in ClinVar changes, for example from a pathogenic category to a non-pathogenic one. This service will let you know about changes that may require you to update your analysis reports and contact your patients and ordering physicians. The new feature allows you to follow a variant from the variation page (Figure 1). Simply click the “Follow” button to begin receiving notifications.
Figure 1. A ClinVar variant page (VCV000541155.1) showing the ‘Follow’ button. The text on the button changes to ‘Following’ after you add it to your followed variants. Clicking ‘Following’ presents the option to ‘Unfollow’, which removes the variant from the followed list when clicked.
Check out the latest videos on YouTube to learn how to best use NCBI graphical viewers, SRA, PGAP, and other resources.
Genome Data Viewer: Analyzing Remote BAM Alignment Files and Other Tips
This video shows you how to upload remote BAM files, and succinctly demonstrates handy viewer settings, such as Pileup display options, and highlights the very helpful tooltips in the Genome Data Viewer (GDV). There’s also a brief blog post on the same topic.
ClinVar is proud to announce the submission of the one millionth record to its database.
The millionth submission was published on Friday, December 20, 2019, a milestone achievement for providing open access to human variant data with asserted consequence to the clinical genetics and research communities.
ClinVar extends its thanks to the many laboratories, partners, and members of the community whose efforts and adoption of the practice of data-sharing paved the way for this achievement. All organizations that contributed to ClinVar’s genetics resources share in this accomplishment, with special recognition reserved for ClinGen and several of their members, including EGL Genetic Diagnostics/Eurofins Clinical Diagnostics, GeneDx, Invitae, and Laboratory for Molecular Medicine/Partners HealthCare Personalized Medicine, whose early submissions helped jump-start ClinVar’s database.
Have you ever searched for a variant in ClinVar with a gene symbol and a c. expression (the coding DNA part of an HGVS name), and wondered why you got no result? Is the variant not in ClinVar, or was something wrong with your search?
Wonder no more – we’ve improved searching in ClinVar so you get results for a gene symbol and c. more often!
While a gene symbol and c. make an ambiguous query and a full HGVS expression is always the best search term, this new service will help you find the variant when gene symbol and c. are all the information that you have.
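If you want to run this kind of search from a script, one possible route is the NCBI E-utilities ESearch endpoint with db=clinvar. The sketch below only builds the request URL for a gene symbol plus c. query. The helper name clinvar_search_url and the BRCA1 example values are illustrative placeholders, and it is an assumption that ESearch applies the same improved matching as the ClinVar web search.

```python
from urllib.parse import urlencode

# Public E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def clinvar_search_url(gene_symbol: str, c_expression: str) -> str:
    """Build an ESearch URL for a 'gene symbol + c.' ClinVar query.

    Hypothetical helper: it joins the two pieces with a space, the way
    you would type them into the search box, and requests JSON output.
    """
    term = f"{gene_symbol.strip()} {c_expression.strip()}"
    params = {"db": "clinvar", "term": term, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder values chosen only to illustrate the query shape.
    print(clinvar_search_url("BRCA1", "c.68_69del"))
```

Because the query is ambiguous without a reference sequence, a full HGVS expression remains the better search term whenever you have one.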
Now it’s easier than ever to access all data in ClinVar for a variant or set of variants across all reported diseases. ClinVar’s new XML is organized by variant only (Variation ID), instead of the variant-disease pair. This reduces redundancy, for example in cases where a variant is related to several disease concepts, and makes the XML consistent with the ClinVar web pages. You can get ClinVarVariationRelease XML from the /xml/clinvar_variation/ directory on the ClinVar FTP site. New features in ClinVarVariationRelease XML shown in Figure 1 include:
Explicit elements to distinguish between variants that were directly interpreted and “included” variants, those that were interpreted only as part of a Haplotype or Genotype. The clinical significance for included variants is indicated as “no interpretation for the single variant”.
Explicit elements to distinguish records for simple alleles, haplotypes, and genotypes.
The Replaces element that provides a history and indicates accessions that were merged into the current accession.
A section that maps the submitted name or identifier for the interpreted condition to the corresponding name used in ClinVar and the MedGen Concept Identifier (CUI).
Figure 1. ClinVar variant-centric XML showing a variant record for a haplotype (VCV000236230) that comprises two included variations (SimpleAlleles) that are marked as “no interpretation for the single variant”. The record includes all the condition records (RCVList) with names and identifiers from MedGen, OMIM and other sources.
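As a rough illustration of working with a variant-centric record, the sketch below walks a small, hand-written XML fragment and reports whether each record is interpreted or included, and whether it describes a simple allele, haplotype, or genotype. The fragment is not real ClinVar data. The element names SimpleAllele, Haplotype, Genotype, and RCVList come from the description above, while VariationArchive, InterpretedRecord, IncludedRecord, RCVAccession, and all attribute names are assumptions made for this example; check the ClinVar documentation for the actual schema.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up fragment; the real release file is far richer.
SAMPLE_XML = """
<ClinVarVariationRelease>
  <VariationArchive Accession="VCV000000001">
    <InterpretedRecord>
      <Haplotype>
        <SimpleAllele AlleleID="11"/>
        <SimpleAllele AlleleID="12"/>
      </Haplotype>
      <RCVList>
        <RCVAccession Accession="RCV000000001"/>
      </RCVList>
    </InterpretedRecord>
  </VariationArchive>
  <VariationArchive Accession="VCV000000002">
    <IncludedRecord>
      <SimpleAllele AlleleID="11"/>
    </IncludedRecord>
  </VariationArchive>
</ClinVarVariationRelease>
"""

VARIATION_KINDS = ("SimpleAllele", "Haplotype", "Genotype")


def summarize(xml_text):
    """Return one summary dict per variant record in the fragment."""
    root = ET.fromstring(xml_text)
    rows = []
    for archive in root.iter("VariationArchive"):
        # A record is either directly interpreted or only "included".
        record = archive.find("InterpretedRecord")
        status = "interpreted"
        if record is None:
            record = archive.find("IncludedRecord")
            status = "included"
        if record is None:
            continue
        # The first matching child tells us the kind of variation.
        kind = next(
            (child.tag for child in record if child.tag in VARIATION_KINDS),
            None,
        )
        rcvs = [r.get("Accession") for r in record.iter("RCVAccession")]
        rows.append(
            {
                "accession": archive.get("Accession"),
                "status": status,
                "kind": kind,
                "rcv": rcvs,
            }
        )
    return rows


if __name__ == "__main__":
    for row in summarize(SAMPLE_XML):
        print(row)
```

For the full release file, a streaming parser such as ET.iterparse would be a more memory-friendly choice than loading the whole document at once.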
To learn more about how to use this data, read our documentation.
We’ve expanded the catalog of clinically relevant structural variants (SV) in dbVar by adding 57,520 ClinVar records. You can access the newly added data through study nstd102.
The updated collection includes:
20,000 new SVs, and more than 37,000 copy number variants (CNV) observed in ClinGen laboratories during routine cytogenomic laboratory testing that were previously accessioned separately at dbVar
15,000 SVs asserted as ‘Pathogenic’ or ‘Likely pathogenic’ for thousands of clinical genetic disorders including breast, ovarian, and colon cancers; hypercholesterolemia; schizophrenia; Duchenne Muscular Dystrophy; autism spectrum disorders; and many others
You can browse dbVar studies on the web or download the data. We provide dbVar data in a number of standard formats (VCF, GVF, and TSV) mapped to assemblies GRCh38, GRCh37, and NCBI36, allowing you to perform analysis using standard tools and integrate the data into your bioinformatic workflows.
Visit our Walkthrough page to learn how to use these new dbVar data to help interpret structural variation in your favorite gene or genomic region.
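To give a sense of how the VCF downloads can slot into a workflow, here is a minimal sketch that reads VCF text and counts records by structural variant type. It relies only on the standard VCF column layout and the standard SVTYPE and END INFO keys. The sample lines and the nsv identifiers are invented for illustration and are not taken from study nstd102, and which INFO keys the dbVar files actually carry is an assumption you should verify against a downloaded file.

```python
from collections import Counter

# Invented sample records in standard VCF layout (tab-separated).
SAMPLE_VCF = """\
##fileformat=VCFv4.1
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
1\t1000\tnsv0000001\tN\t<DEL>\t.\t.\tSVTYPE=DEL;END=5000
1\t9000\tnsv0000002\tN\t<DUP>\t.\t.\tSVTYPE=DUP;END=12000;IMPRECISE
2\t200\tnsv0000003\tN\t<DEL>\t.\t.\tSVTYPE=DEL;END=900
"""


def parse_info(info_field):
    """Turn 'K1=V1;K2=V2;FLAG' into a dict; bare flags map to True."""
    info = {}
    for item in info_field.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def iter_records(lines):
    """Yield one dict per data line, skipping headers and blank lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, var_id, _ref, _alt, _qual, _filter, info = line.split("\t")[:8]
        yield {
            "chrom": chrom,
            "pos": int(pos),
            "id": var_id,
            "info": parse_info(info),
        }


def count_by_svtype(lines):
    """Count records per SVTYPE value ('unknown' when the key is absent)."""
    counts = Counter()
    for record in iter_records(lines):
        counts[record["info"].get("SVTYPE", "unknown")] += 1
    return counts


if __name__ == "__main__":
    print(count_by_svtype(SAMPLE_VCF.splitlines()))
```

For real analyses, an established VCF library or command-line toolkit is usually a safer choice than hand-rolled parsing.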
In about a week, NCBI staff will join GeneReviews® on their home turf, Seattle, at the Annual Clinical Genetics Meeting hosted by the American College of Medical Genetics and Genomics (ACMG). While there, we will have an exhibit booth (#531) where you can meet our staff, get answers to your questions, and pick up informative handouts on our various resources for clinical practice.
Also, be sure to visit our two posters on Friday, April 5 from 10:30 AM to 12 PM.
If you’ve been searching in ClinVar, you might have noticed search improvements introduced in December that reliably connect you with information on your variant of interest. ClinVar has broadened its search capability to accept many different ways of expressing the same variation, including variation described on RefSeq transcripts and proteins. If your variant expression is not reported in ClinVar, we alert you to other variants at the same genomic location or link you to related information in other NCBI resources such as dbSNP, LitVar, and PubMed. ClinVar will also now interpret expressions that contain minor errors or warn you about improper syntax that it cannot interpret.
Figure 1. Improved search results in ClinVar showing mapping of an HGVS expression to the equivalent variant in ClinVar.
Here are some example queries that show the improved search results.
NM_001318787.1:c.2258G>A – an HGVS expression that is not in ClinVar, but ClinVar has an alternate expression for a variant (Figure 1).
NM_004958.3:c.7365C>A – a variant not in ClinVar, but another variant at the same genomic location is in ClinVar.
NM_002113.2:c.19delG – a variant not in ClinVar, but there is additional information for the variant in other databases.
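The example queries above all follow the same basic shape: a versioned reference sequence accession, a colon, a coordinate type such as c., and the described change. The sketch below splits an expression of that shape into its parts with a regular expression. It is a deliberately simplified pattern written for this illustration, not ClinVar's own parser and not a full HGVS validator; the function name split_hgvs is hypothetical.

```python
import re

# Simplified pattern: ACCESSION.VERSION:TYPE.CHANGE
# Covers RefSeq-style accessions like NM_004958.3 only.
HGVS_PATTERN = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+)"   # e.g. NM_004958
    r"\.(?P<version>\d+)"             # e.g. .3
    r":(?P<kind>[cgmnpr])\."          # coordinate type, e.g. c.
    r"(?P<change>\S+)$"               # e.g. 7365C>A
)


def split_hgvs(expression):
    """Return the parts of a simple HGVS expression, or None if it
    does not match the simplified pattern above."""
    match = HGVS_PATTERN.match(expression.strip())
    if match is None:
        return None
    parts = match.groupdict()
    parts["version"] = int(parts["version"])
    return parts


if __name__ == "__main__":
    for query in (
        "NM_001318787.1:c.2258G>A",
        "NM_004958.3:c.7365C>A",
        "NM_002113.2:c.19delG",
    ):
        print(query, "->", split_hgvs(query))
```

A check like this can catch obviously malformed input before you submit a query, but it says nothing about whether the variant exists or is valid on that transcript.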
We welcome your feedback on your search experience and any additional ideas on how to improve searching in ClinVar.