NCBI and EBI have been hard at work on our joint MANE collaboration, providing a set of representative transcripts for human protein-coding genes that are identically annotated in the NCBI RefSeq and Ensembl/GENCODE annotation sets and exactly match the GRCh38 reference assembly. We’re pleased to announce MANE v0.92, now covering 16,865 genes or ~88% of known human protein-coding genes.
In particular, we’ve focused on clinically relevant genes, and MANE Select now includes 99% of genes with high gene-disease validity. This release also includes 43 extra transcripts labeled “MANE Plus Clinical” that we’ve chosen to aid in clinical reporting, for example, when there are additional pathogenic variants not covered in the MANE Select transcript. While it’s critical to consider other alternatively spliced transcripts for variant interpretation or functional analyses, the MANE Select and MANE Plus Clinical transcripts provide a common foundation for clinical reporting and for other analyses that benefit from using just one well-supported transcript or protein per gene.
You, as a submitter, are the beating heart of ClinVar. Your contributions help thousands of genetic counselors and clinicians, as well as their patients and patients’ family members. We have added validation to the online file submission portal so that you have more control over how to deal with errors in your submitted files.
You now have two options when submitting data. You can submit any data that passes validation and receive a report of the data that failed. The failed data can be reviewed and resubmitted when it’s convenient for you.
In support of data sharing efforts, NCBI’s ClinVar and Genetic Testing Registry (GTR) now accept submissions that use MONDO IDs to identify conditions.
To submit to ClinVar, download our updated spreadsheet templates and enter MONDO as the Condition ID type. Note: The updated template is necessary only if you identify the condition by MONDO ID, not by name.
GTR submitters can use MONDO IDs to identify phenotypes in the clinical tests submitted via spreadsheet, and Mondo phenotype names in both clinical and research test submissions.
“Database resources of the National Center for Biotechnology Information”
by Eric W Sayers, Jeff Beck, J Rodney Brister, Evan E Bolton, Kathi Canese et al. (PMID: 31602479)
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts published in life science journals. The Entrez system provides search and retrieval operations for most of these data from 38 distinct databases. This article provides a brief overview of the NCBI Entrez system of databases, followed by a summary of resources that were either introduced or significantly updated in the past year, including PubMed, PMC, Bookshelf, BLAST databases and more!
We have added a new feature to ClinVar that allows you to follow a particular variant and be notified if the overall clinical interpretation in ClinVar changes, for example from a pathogenic category to a non-pathogenic one. This service will let you know about changes that may require you to update your analysis reports and contact your patients and ordering physicians. The new feature allows you to follow a variant from the variation page (Figure 1). Simply click the “Follow” button to begin receiving notifications.
Figure 1. A ClinVar variant page (VCV000541155.1) showing the ‘Follow’ button. The text on the button changes to ‘Following’ after you add it to your followed variants. Clicking ‘Following’ presents the option to ‘Unfollow’, which removes the variant from the followed list when clicked.
Check out the latest videos on YouTube to learn how to best use NCBI graphical viewers, SRA, PGAP, and other resources.
Genome Data Viewer: Analyzing Remote BAM Alignment Files and Other Tips
This video shows you how to upload remote BAM files, and succinctly demonstrates handy viewer settings, such as Pileup display options, and highlights the very helpful tooltips in the Genome Data Viewer (GDV). There’s also a brief blog post on the same topic.
ClinVar is proud to announce the submission of the one millionth record to its database.
The millionth submission was published on Friday, December 20, 2019, a milestone in providing the clinical genetics and research communities with open access to human variant data with asserted consequences.
ClinVar extends its thanks to the many laboratories, partners, and members of the community whose efforts and adoption of the practice of data-sharing paved the way for this achievement. All organizations that contributed to ClinVar’s genetics resources share in this accomplishment, with special recognition reserved for ClinGen and several of their members, including EGL Genetic Diagnostics/Eurofins Clinical Diagnostics, GeneDx, Invitae, and Laboratory for Molecular Medicine/Partners HealthCare Personalized Medicine, whose early submissions helped jump-start ClinVar’s database.
Have you ever searched for a variant in ClinVar with a gene symbol and a c., and wondered why you got no result? Is the variant not in ClinVar, or was something wrong with your search?
Wonder no more – we’ve improved searching in ClinVar so you get results for a gene symbol and c. more often!
While a gene symbol and c. make an ambiguous query and a full HGVS expression is always the best search term, this new service will help you find the variant when gene symbol and c. are all the information that you have.
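If you search programmatically rather than on the web page, the same kind of query can be sent through the NCBI E-utilities ESearch endpoint with db=clinvar. The sketch below only builds the request URL. The helper name clinvar_search_url and the example gene symbol and c. expression are made up for illustration, and this post does not say whether the improved matching also applies to E-utilities searches, so treat that as an assumption.

```python
from urllib.parse import urlencode

# Public E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def clinvar_search_url(gene_symbol, c_expression, retmax=20):
    """Build an ESearch URL for a 'gene symbol + c.' query (hypothetical helper)."""
    # The query is the same free text you would type in the ClinVar search box.
    term = f"{gene_symbol} {c_expression}"
    params = {"db": "clinvar", "term": term, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # "GENE1" and "c.100A>G" are placeholders, not a real ClinVar record.
    print(clinvar_search_url("GENE1", "c.100A>G"))
```

An empty result from such a query still leaves the question above open, so retry with a full HGVS expression when you have one.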
Now it’s easier than ever to access all data in ClinVar for a variant or set of variants across all reported diseases. ClinVar’s new XML is organized by variant only (Variation ID), instead of the variant-disease pair. This reduces redundancy, for example in cases where a variant is related to several disease concepts, and makes the XML consistent with the ClinVar web pages. You can get ClinVarVariationRelease XML from the /xml/clinvar_variation/ directory on the ClinVar FTP site. New features in ClinVarVariationRelease XML shown in Figure 1 include:
Explicit elements to distinguish between variants that were directly interpreted and “included” variants, those that were interpreted only as part of a Haplotype or Genotype. The clinical significance for included variants is indicated as “no interpretation for the single variant”.
Explicit elements to distinguish records for simple alleles, haplotypes, and genotypes.
The Replaces element that provides a history and indicates accessions that were merged into the current accession.
A section that maps the submitted name or identifier for the interpreted condition to the corresponding name used in ClinVar and the MedGen Concept Identifier (CUI).
Figure 1. ClinVar variant-centric XML showing a variant record for a haplotype (VCV000236230) that comprises two included variations (SimpleAlleles) that are marked as “no interpretation for the single variant”. The record includes all the condition records (RCVList) with names and identifiers from MedGen, OMIM and other sources.
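As a rough illustration of working with variant-centric records, here is a minimal Python sketch that reads a tiny inline sample and reports, for each record, whether it was directly interpreted, which simple alleles it contains, and which conditions are listed. Only the names ClinVarVariationRelease, SimpleAllele, Haplotype and RCVList come from the description above. The VariationArchive, InterpretedRecord, IncludedRecord and RCVAccession elements, all attribute names, and the accessions in the sample are assumptions made for this example, so check them against the documentation before relying on them.

```python
import xml.etree.ElementTree as ET

# Tiny made-up sample; element and attribute names are partly assumed.
SAMPLE = """\
<ClinVarVariationRelease>
  <VariationArchive Accession="VCV000000001">
    <InterpretedRecord>
      <Haplotype>
        <SimpleAllele VariationID="101"/>
        <SimpleAllele VariationID="102"/>
      </Haplotype>
      <RCVList>
        <RCVAccession Title="condition A"/>
        <RCVAccession Title="condition B"/>
      </RCVList>
    </InterpretedRecord>
  </VariationArchive>
  <VariationArchive Accession="VCV000000002">
    <IncludedRecord>
      <SimpleAllele VariationID="101"/>
    </IncludedRecord>
  </VariationArchive>
</ClinVarVariationRelease>
"""


def summarize(xml_text):
    """Return one summary dict per variant record in the XML text."""
    rows = []
    for archive in ET.fromstring(xml_text).iter("VariationArchive"):
        rows.append({
            "accession": archive.get("Accession"),
            # Assumed convention: included-only variants lack InterpretedRecord.
            "directly_interpreted": archive.find("InterpretedRecord") is not None,
            "alleles": [a.get("VariationID") for a in archive.iter("SimpleAllele")],
            "conditions": [r.get("Title") for r in archive.iter("RCVAccession")],
        })
    return rows


if __name__ == "__main__":
    for row in summarize(SAMPLE):
        print(row)
```

Because each record carries all of its conditions, one pass over the records is enough to collect everything reported for a variant. For the full release file, a streaming parser such as xml.etree.ElementTree.iterparse would probably be a better fit than loading the whole document at once.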
To learn more about how to use this data, read our documentation.