This blog post is directed toward people who use dbSNP and dbVar, particularly those who submit non-human data to the two databases.
dbSNP and dbVar archive, process, display and report information related to germline and somatic variations from multiple species. These two databases have grown rapidly as sequencing and other discovery technologies have evolved, and now contain nearly two billion variants from over 360 species.
Based on projected growth and the resources required to archive and distribute the data, continued support for all organisms will become unsustainable for NCBI in the near future. Therefore, NCBI will phase out support for all non-human organisms in dbSNP and dbVar, and will support only human variation.
This blog post is directed toward Assembly users.
A new “Download assemblies” button is now available in the Assembly database. This makes it easy to download data for multiple genomes without having to write scripts.
For example, you can run a search in Assembly and use check boxes (see left side of screenshot below) to refine the set of genome assemblies of interest. Then, just open the “Download assemblies” menu, choose the source database (GenBank or RefSeq), choose the file type, and start the download. An archive file will be saved to your computer; you can expand it into a folder containing your selected genome data files.
dbSNP’s Human Build 150 includes a large number of new submissions from Human Longevity, Inc. (HLI) and TopMed, increasing the total number of Human RefSNPs in the database from 154 million to 324 million. TopMed has also provided new allele frequency data for 163 million RefSNPs.
Central Bearded Dragon (Pogona vitticeps)
(Credit: Mark Sum, USGS. Public domain.)
In April, the NCBI Eukaryotic Genome Annotation Pipeline released new annotations in RefSeq for the following eleven organisms:
From June 19-21, 2017, the NCBI will assist in a bioinformatics hackathon at the New York Genome Center (NYGC). This hackathon will focus on advanced bioinformatics analysis of next-generation sequencing (NGS) data, proteomics and metadata. To apply for this hackathon, complete this application (it takes approximately 10 minutes). Applications are due Monday, May 22, 2017 by 5 PM ET.
This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for bioinformatics analyses from high-throughput experiments. Some projects are also open to non-scientific developers, mathematicians or librarians.
The event is open to anyone selected for the hackathon and able to travel to the NYGC (see address below).
GenBank release 219.0 (4/14/2017) has 200,877,884 traditional records containing 231,824,951,552 base pairs of sequence data. In addition, there are 451,840,147 WGS records containing 2,035,032,639,807 base pairs of sequence data, 165,068,542 TSA records containing 149,038,907,599 base pairs of sequence data, as well as 1,438,349 TLS records containing 636,923,295 base pairs of sequence data.
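To put the release figures in perspective, the short Python sketch below adds up the four divisions and computes each division's share of the total base pairs. The counts are copied from the paragraph above; the names `RELEASE_219`, `totals` and `base_share` are made up for this illustration and are not part of any NCBI tool.

```python
# Record and base-pair counts copied from the GenBank release 219.0 summary.
# The dictionary and function names are illustrative, not an NCBI API.
RELEASE_219 = {
    "traditional": (200_877_884, 231_824_951_552),
    "WGS": (451_840_147, 2_035_032_639_807),
    "TSA": (165_068_542, 149_038_907_599),
    "TLS": (1_438_349, 636_923_295),
}


def totals(divisions):
    """Return (total records, total base pairs) across all divisions."""
    records = sum(rec for rec, _ in divisions.values())
    bases = sum(bp for _, bp in divisions.values())
    return records, bases


def base_share(divisions):
    """Return each division's fraction of the total base pairs."""
    _, total_bases = totals(divisions)
    return {name: bp / total_bases for name, (_, bp) in divisions.items()}


if __name__ == "__main__":
    records, bases = totals(RELEASE_219)
    print(f"All divisions: {records:,} records, {bases:,} base pairs")
    for name, share in base_share(RELEASE_219).items():
        print(f"{name:>12}: {share:.1%} of base pairs")
```

Running the script prints the combined totals followed by one line per division, which makes it easy to see how much of the release the WGS division accounts for.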
At the March 2017 NCBI Genomics Hackathon, participants developed six functional software prototypes, several of which are still under active development. Software is available from the NCBI-Hackathons GitHub site.
In the past month, the NCBI Eukaryotic Genome Annotation Pipeline has released new annotations in RefSeq for the following organisms:
Next week, NCBI staff will show you how to quickly find and download human genome annotations from both the web and the command line for incorporation into your workflows. We will also show you how to convert the accessions in these files to those used in other bioinformatics databases, as well as how to visualize these annotations on our Genome Data Viewer.
Date and time: Wednesday, May 10, 2017 12:00 PM – 12:30 PM EDT
After registering, you will receive a confirmation email with information about attending the webinar.
After the live presentation, the webinar will be uploaded to the NCBI YouTube channel. Any related materials will be accessible from the Webinars and Courses page; you can also learn about future webinars there.