NCBI at ASHG 2019: Two Data CoLabs Demonstrate How to Analyze NextGen Sequence Data and Access Genetic Variation Population Data


NCBI will be attending the American Society of Human Genetics (ASHG) 2019 meeting in Houston, Texas, on Oct 15-19.

This year, we will be presenting two CoLabs – interactive sessions where you can learn about new NCBI tools and resources. Read on below for a description of each CoLab and join us at ASHG next week!


dbSNP celebrates 20 years!


dbSNP was established in August 1999 as a collaboration between NCBI and the National Human Genome Research Institute (NHGRI) as a database of small-scale nucleotide variants. The database includes both common and rare single-nucleotide variations (SNVs), short (≤ 50 bp) deletion/insertion polymorphisms, and other classes of small genetic variations.


Structural Variant Hackathon


NCBI is pleased to announce a Structural Variant Hackathon at the Baylor College of Medicine in Houston, Texas, on October 11-13, 2019, immediately before ASHG.

We’re specifically looking for folks who have experience working with structural variants, complex disease, precision medicine, and similar genomic analysis. If this describes you, please apply! This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for large-scale genomic analyses from high-throughput experiments (please note that the event itself will focus on open-access public human data).

Potential topics include:

  • Mapping structural variants to public databases
  • Calculating the heritability of different types of structural variants
  • CNV effect on isoform expression
  • Assembly accuracy for metagenomics
  • Quality assessment in large cohorts

The hackathon runs from 9 am to 6 pm each day, with the potential to extend into the evening, and there will be optional social events at the end of each day. Participants will be organized into five to eight teams of five to six individuals with varied backgrounds and expertise, each with an experienced leader. These teams will build pipelines and tools to analyze large datasets within a cloud infrastructure. Each day, we will come together to discuss progress on each of the topics, bioinformatics best practices, coding styles, etc.

There will be no registration fee associated with attending this event.

Note: Participants will need to bring their own laptop to this program. No financial support for travel, lodging, or meals is available for this event.


ClinVar’s new XML aggregated by Variation ID


Now it’s easier than ever to access all data in ClinVar for a variant or set of variants across all reported diseases.  ClinVar’s new XML is organized by variant only (Variation ID), instead of the variant-disease pair. This reduces redundancy, for example in cases where a variant is related to several disease concepts, and makes the XML consistent with the ClinVar web pages. You can get ClinVarVariationRelease XML from the /xml/clinvar_variation/ directory on the ClinVar FTP site.  New features in ClinVarVariationRelease XML shown in Figure 1 include:

  • Explicit elements to distinguish between variants that were directly interpreted and “included” variants, those that were interpreted only as part of a Haplotype or Genotype. The clinical significance for included variants is indicated as “no interpretation for the single variant”.
  • Explicit elements to distinguish records for simple alleles, haplotypes, and genotypes.
  • The Replaces element that provides a history and indicates accessions that were merged into the current accession.
  • A section that maps the submitted name or identifier for the interpreted condition to the corresponding name used in ClinVar and the MedGen Concept Identifier (CUI).

Figure 1. ClinVar variant-centric XML showing a variant record for a haplotype (VCV000236230) that comprises two included variations (SimpleAlleles) that are marked as “no interpretation for the single variant”. The record includes all the condition records (RCVList) with names and identifiers from MedGen, OMIM and other sources.
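As a rough illustration of what a variant-centric record makes possible, here is a minimal Python sketch that reads one record and lists its included alleles and condition records. The XML fragment is a hypothetical, heavily simplified stand-in: the element names are modeled on those mentioned above (Haplotype, SimpleAllele, RCVList), while VariationArchive, InterpretedRecord, RCVAccession, the attribute names, and all placeholder values are assumptions that should be checked against the actual ClinVarVariationRelease schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment. Element and attribute names are
# assumptions for illustration; the real release XML is much richer.
SAMPLE = """
<VariationArchive Accession="VCV000236230" VariationType="Haplotype">
  <InterpretedRecord>
    <Haplotype>
      <SimpleAllele AlleleID="ALLELE_A" />
      <SimpleAllele AlleleID="ALLELE_B" />
    </Haplotype>
    <RCVList>
      <RCVAccession Accession="RCV_PLACEHOLDER_1" Title="placeholder condition 1" />
      <RCVAccession Accession="RCV_PLACEHOLDER_2" Title="placeholder condition 2" />
    </RCVList>
  </InterpretedRecord>
</VariationArchive>
"""


def summarize(xml_text):
    """Collect the accession, included alleles, and condition records."""
    root = ET.fromstring(xml_text)
    return {
        "accession": root.get("Accession"),
        "included_alleles": [a.get("AlleleID") for a in root.iter("SimpleAllele")],
        "conditions": [r.get("Accession") for r in root.iter("RCVAccession")],
    }


summary = summarize(SAMPLE)
print(summary)
```

Because everything for the variant sits under one record, a single pass over the element tree is enough; with the older variant-disease organization the same information would have to be gathered from several records.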

To learn more about how to use this data, read our documentation.

Tell us how ClinVar has helped you by writing to us at clinvar@ncbi.nlm.nih.gov.

50,000 new clinically relevant structural variation calls in dbVar


We’ve expanded the catalog of clinically relevant structural variants (SV) in dbVar by adding 57,520 ClinVar records.  You can access the newly added data through study nstd102.

The updated collection includes:

  • 20,000 new SVs, and more than 37,000 copy number variants (CNV) observed in ClinGen laboratories during routine cytogenomic laboratory testing that were previously accessioned separately at dbVar
  • 15,000 SVs asserted as ‘Pathogenic’ or ‘Likely pathogenic’ for thousands of clinical genetic disorders including breast, ovarian, and colon cancers; hypercholesterolemia; schizophrenia; Duchenne Muscular Dystrophy; autism spectrum disorders; and many others
  • links to more than 1,600 related PubMed articles and thousands of related data records in ClinVar, OMIM, GeneReviews, MedGen, MeSH, etc.

You can browse dbVar studies on the web or download the data. We provide dbVar data in a number of standard formats (VCF, GVF, and TSV) mapped to assemblies GRCh38, GRCh37, and NCBI36, allowing you to perform analysis using standard tools and integrate the data into your bioinformatic workflows.
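For example, a downloaded VCF file could be tallied by variant type with a few lines of Python. The sketch below is only illustrative: it assumes the records carry the conventional `SVTYPE` key in the INFO column, and the sample records and their IDs are made up rather than taken from nstd102.

```python
from collections import Counter

# Made-up records in VCF layout (tab-separated); not real dbVar data.
SAMPLE_VCF = """\
##fileformat=VCFv4.1
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
1\t1000\texample_sv_1\tN\t<DEL>\t.\t.\tSVTYPE=DEL;END=5000
1\t9000\texample_sv_2\tN\t<DUP>\t.\t.\tSVTYPE=DUP;END=12000
2\t300\texample_sv_3\tN\t<DEL>\t.\t.\tSVTYPE=DEL;END=800
"""


def count_sv_types(lines):
    """Count records per SVTYPE, skipping header and blank lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # INFO is the eighth column: semicolon-separated key=value pairs.
        info = dict(item.split("=", 1) for item in fields[7].split(";") if "=" in item)
        counts[info.get("SVTYPE", "unknown")] += 1
    return counts


counts = count_sv_types(SAMPLE_VCF.splitlines())
print(dict(counts))
```

To run this on a real download, pass an open file handle instead of `SAMPLE_VCF.splitlines()` (for a gzipped file, open it with `gzip.open(path, "rt")`).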

Visit our Walkthrough page to learn how to use these new dbVar data to help interpret structural variation in your favorite gene or genomic region.

Improved ClinVar search quickly connects you to information about variants


If you’ve been searching in ClinVar, you might have noticed search improvements introduced in December that reliably connect you with information on your variant of interest. ClinVar has broadened its search capability to accept many different ways of expressing the same variation, including variation described on RefSeq transcripts and proteins. If your variant expression is not reported in ClinVar, we alert you to other variants at the same genomic location or link you to related information in other NCBI resources such as dbSNP, LitVar, and PubMed. ClinVar will also now interpret expressions that contain minor errors or warn you about improper syntax that it cannot interpret.

Figure 1. Improved search results in ClinVar showing mapping of an HGVS expression to the equivalent variant in ClinVar.

Here are some example queries that show the improved search results.

NM_001318787.1:c.2258G>A – an HGVS expression that is not in ClinVar, although ClinVar has an alternate expression for the same variant (Figure 1).

NM_004958.3:c.7365C>A – a variant that is not in ClinVar, although another variant at the same genomic location is.

NM_002113.2:c.19delG – a variant that is not in ClinVar, although other databases have additional information about it.

We welcome your feedback on your search experience and any additional ideas on how to improve searching in ClinVar.

February 6 Webinar: New Variation Services for Normalizing, Remapping, and Annotating Variants


Join us on Wednesday, February 6, 2019, when NCBI staff will show you how to use a new set of NCBI variation services that rely on a variant data model called SPDI (Sequence Position Deletion Insertion). These services and data model allow you to interconvert, map, and disambiguate variants in standard formats (RefSNP accessions, HGVS, and VCF). Unlike many current variant notation systems, SPDI provides unambiguous, machine-readable definitions of variants. SPDI powers not only the SNP build and mapping procedures at NCBI but also the variant sensors that are active in the global search and ClinVar. These services and notation system provide valuable new tools for people who work with sequence variants.
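To give a feel for the data model, here is a minimal Python sketch that splits a colon-delimited SPDI string into its four parts and renders a single-base substitution as an HGVS-style genomic expression. It is a toy under stated assumptions, not the NCBI variation services: the helper names are hypothetical, the example variant is made up, the position is treated as 0-based, only the form with explicit deleted and inserted bases is handled, and the `g.` prefix assumes a genomic reference sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spdi:
    sequence: str   # reference sequence accession
    position: int   # 0-based start of the deleted sequence
    deletion: str   # bases removed from the reference
    insertion: str  # bases put in their place


def parse_spdi(text):
    """Split 'sequence:position:deletion:insertion' into an Spdi record."""
    sequence, position, deletion, insertion = text.split(":")
    return Spdi(sequence, int(position), deletion, insertion)


def to_hgvs_substitution(variant):
    """Render a single-base substitution in HGVS-like genomic notation."""
    if len(variant.deletion) != 1 or len(variant.insertion) != 1:
        raise ValueError("this sketch only handles single-base substitutions")
    # HGVS coordinates are 1-based, so shift the 0-based SPDI position by one.
    return f"{variant.sequence}:g.{variant.position + 1}{variant.deletion}>{variant.insertion}"


# Made-up example variant, for illustration only.
example = parse_spdi("NC_000001.11:100:G:A")
hgvs = to_hgvs_substitution(example)
print(example)
print(hgvs)
```

Stating the deleted and inserted sequences explicitly at a given position is what lets a program compare two descriptions of a variant field by field rather than by parsing free-form notation.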

Date and time: Wed, Feb 6, 2019 12:00 PM – 12:30 PM EST

Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

Update single records easily with ClinVar’s Single SCV Update


The ClinVar Team is happy to announce a new online form in the ClinVar Submission Portal, the Single SCV Update, which makes it easier for you to update a single record.

The new ClinVar Single SCV Update form showing the sections for editing the evaluation date, clinical significance, condition, and citations.


November 14 Webinar: Variant Interpretation using NCBI Resources


Next Wednesday, November 14, 2018, NCBI staff will show you how to use NCBI’s genome browsers and other resources to interpret variants. The graphical displays of Genome Data Viewer (GDV) and Variation Viewer offer an interactive experience that allows you to explore NCBI’s rich collection of annotations, datasets, and literature for deciphering your variant-associated data. In this presentation, we’ll step through case studies and show you how to quickly display relevant NCBI track sets, including the new RefSeq Functional Elements track; upload a file or remotely hosted dataset and display it as a track; and use browser tracks to identify known variants and then assess their functional and clinical significance and allele frequency. You will also learn how to navigate from the browsers to NCBI resources such as ClinVar, dbSNP, and PubMed for additional variant information.

Date and time: Wed, Nov 14, 2018 12:00 PM – 12:45 PM EST

Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.