From February 25-27, 2019, NCBI will help with a data science hackathon at USF in Tampa, Florida!
The hackathon will focus on the genomics of Iron-linked Rare Diseases as well as large scale RNA-Seq indexing and analysis. This event is for researchers, including students and postdocs, who have already engaged in the use of large datasets or in the development of pipelines for analyses from high-throughput experiments. Some projects are available to other non-scientific developers, mathematicians, or librarians.
The event is open to anyone selected for the hackathon and willing to travel to Tampa.
Participants will be organized into five to eight teams of five to six people each. These teams will build or expand on pipelines and tools to analyze large datasets within a cloud infrastructure. Example subjects for such hackathons include:
Integrative pipelines to analyze large scale RNA-Seq experiments
Visualization tools for mapping phenotypes to genotypes
Rapid clinical diagnostics tools
Structural variant mining with single molecule sequencing data
Please see the application form for more details and additional projects. The project list will continue to evolve and will be updated on the application form.
We’ve recently added save and share options to PubMed Labs. From your PubMed Labs search results list, you can now use the ‘Save’ button to save a selection of results in a variety of formats, including Summary and Abstract. You can also use the ‘Email’ button to share a selection of results, including abstracts, with colleagues.
If you’ve been searching in ClinVar, you might have noticed search improvements introduced in December that reliably connect you with information on your variant of interest. ClinVar has broadened its search capability to accept many different ways of expressing the same variation, including variation described on RefSeq transcripts and proteins. If your variant expression is not reported in ClinVar, we alert you to other variants at the same genomic location or link you to related information in other NCBI resources such as dbSNP, LitVar, and PubMed. ClinVar will also now interpret expressions that contain minor errors or warn you about improper syntax that it cannot interpret.
Figure 1. Improved search results in ClinVar showing mapping of an HGVS expression to the equivalent variant in ClinVar.
Here are some example queries that show the improved search results.
NM_001318787.1:c.2258G>A – an HGVS expression that is not in ClinVar, but ClinVar has an alternate expression for a variant (Figure 1).
NM_004958.3:c.7365C>A – a variant that is not in ClinVar, but another variant at the same genomic location is in ClinVar.
NM_002113.2:c.19delG – a variant that is not in ClinVar, but there is additional information for the variant in other databases.
We welcome your feedback on your search experience and any additional ideas on how to improve searching in ClinVar.
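To make the structure of the example queries above concrete, here is a minimal Python sketch that splits an HGVS-style expression into its accession, version, coordinate type, and change description. The pattern and the `split_hgvs` helper are illustrative assumptions that cover only simple expressions like the three examples; they are not the full HGVS grammar and not how ClinVar itself interprets queries.

```python
import re

# Simplified, hypothetical pattern: accession, version, one coordinate-type
# letter, then the change description. Real HGVS syntax is much richer.
HGVS_PATTERN = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+)\.(?P<version>\d+)"
    r":(?P<coord_type>[cgmnpr])\.(?P<change>\S+)$"
)


def split_hgvs(expression):
    """Split a simple HGVS-style expression into named parts."""
    match = HGVS_PATTERN.match(expression.strip())
    if match is None:
        raise ValueError(f"not a recognized simple HGVS expression: {expression!r}")
    parts = match.groupdict()
    parts["version"] = int(parts["version"])
    return parts


if __name__ == "__main__":
    for query in (
        "NM_001318787.1:c.2258G>A",
        "NM_004958.3:c.7365C>A",
        "NM_002113.2:c.19delG",
    ):
        print(split_hgvs(query))
```

A stricter parser would validate the change description as well; this sketch only separates the pieces so they can be inspected or compared.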
Join us on Wednesday, February 6, 2019, when NCBI staff will show you how to use a new set of NCBI variation services that rely on a variant data model called SPDI (Sequence Position Deletion Insertion). These services and data model allow you to inter-convert, map, and disambiguate variants in standard formats (RefSNP accessions, HGVS, and VCF). Unlike many current variant notation systems, SPDI provides unambiguous, machine-readable definitions of variants. SPDI powers not only SNP build and mapping procedures at NCBI but also our variant sensors that are active in the global search and ClinVar. These services and notation system provide valuable new tools for people who work with sequence variants.
Date and time: Wed, Feb 6, 2019 12:00 PM – 12:30 PM EST
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
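As a rough illustration of the SPDI idea described above, the following Python sketch parses a colon-separated Sequence:Position:Deletion:Insertion string and applies it to a reference string. It assumes a 0-based position and an explicitly spelled-out deleted sequence; the `Spdi` type, the `parse_spdi` and `apply_spdi` helpers, and the `toy_seq` identifier are hypothetical names for this example and are not part of the NCBI variation services.

```python
from typing import NamedTuple


class Spdi(NamedTuple):
    sequence: str   # sequence identifier
    position: int   # assumed 0-based start of the deleted span
    deletion: str   # bases removed from the reference (may be empty)
    insertion: str  # bases put in their place (may be empty)


def parse_spdi(text):
    """Parse 'sequence:position:deletion:insertion' into an Spdi tuple."""
    fields = text.strip().split(":")
    if len(fields) != 4:
        raise ValueError(f"expected 4 colon-separated fields, got {len(fields)}")
    sequence, position, deletion, insertion = fields
    if not position.isdigit():
        raise ValueError(f"position must be a non-negative integer: {position!r}")
    return Spdi(sequence, int(position), deletion, insertion)


def apply_spdi(reference, variant):
    """Return the reference with the deletion replaced by the insertion."""
    start = variant.position
    end = start + len(variant.deletion)
    if reference[start:end] != variant.deletion:
        raise ValueError("deleted bases do not match the reference at that position")
    return reference[:start] + variant.insertion + reference[end:]


if __name__ == "__main__":
    reference = "ACGTACGT"  # toy reference, not a real sequence
    print(apply_spdi(reference, parse_spdi("toy_seq:3:T:G")))  # substitution
    print(apply_spdi(reference, parse_spdi("toy_seq:1:CG:")))  # pure deletion
```

Because every variant is reduced to the same four fields, substitutions, deletions, and insertions are all handled by one replace operation, which is what makes the notation easy for programs to process.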
RefSeq release 92 is accessible online, via FTP and through NCBI’s Entrez programming utilities, E-utilities.
This full release incorporates genomic, transcript, and protein data available as of January 4, 2019, and contains 185,738,687 records, including 130,366,644 proteins, 25,088,890 RNAs, and sequences from 86,867 organisms. The release is provided in several directories as a complete dataset and as divided by logical groupings.
dbSNP build 152 is a small incremental update from build 151 provided for you to begin testing and integrating the new build products into your workflow. Build 152 uses the new system with SPDI variant notation and is now available on FTP and the new RefSNP webpage.
From February 4-6, 2019, the NCBI will help with a data science hackathon at the Fred Hutchinson Cancer Research Center in Seattle. To apply, complete this form (approximately 10 minutes to complete). Initial applications are due Friday, January 11th by 11 pm ET.
The hackathon will focus on genomics as well as general data science. This event is for researchers, including students and postdocs, who have already engaged in the use of large datasets or in the development of pipelines for analyses from high-throughput experiments. Some projects are available to other non-scientific developers, mathematicians, or librarians.