NCBI to assist in Southern California genomics hackathon in January


From January 10-12, 2018, the NCBI will assist with a bioinformatics hackathon in Southern California hosted by San Diego State University (SDSU). The hackathon will focus on advanced bioinformatics analysis of next-generation sequencing data, proteomics, and metadata. This event is for researchers, including students and postdocs, who already work with bioinformatics data or develop pipelines for analyzing data from high-throughput experiments. Some projects are also open to non-scientific developers, mathematicians, or librarians.

The event is open to anyone selected for the hackathon and willing to travel to SDSU (see below). Applications are due Monday, December 11, 2017, by 3 PM PT (6 PM ET).


August 23 NCBI Minute: Using the Run Selector to Find Relevant Next-Generation Sequencing (NGS) Datasets


Do you have trouble searching the NCBI website for relevant datasets? Wish you could filter the search results more precisely? You can with the SRA Run Selector.

In this NCBI Minute, you’ll learn how to filter the SRA database using the metadata captured for each submitted dataset. The Run Selector displays all recorded metadata for each SRA Run in a spreadsheet-like table. Its user-friendly interface lets you narrow the datasets down to those most relevant to your research question and then export the results as a spreadsheet.
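Once a run table has been exported, the same kind of metadata filtering can be repeated offline. The following is a minimal Python sketch, not the Run Selector's own method: the column names (`Run`, `Organism`, `LibraryStrategy`, `Platform`), the run identifiers, and the helper functions `filter_runs` and `to_csv` are all assumptions made for illustration, and the sample rows are made up rather than real SRA records.

```python
import csv
import io

# Made-up rows standing in for an exported run table. The column names
# and values are illustrative assumptions, not real SRA records.
SAMPLE_TABLE = """Run,Organism,LibraryStrategy,Platform
RUN0001,Sarcophilus harrisii,RNA-Seq,ILLUMINA
RUN0002,Sarcophilus harrisii,WGS,ILLUMINA
RUN0003,Homo sapiens,RNA-Seq,ILLUMINA
"""


def filter_runs(table_text, **criteria):
    """Return the rows whose columns exactly match every given criterion."""
    reader = csv.DictReader(io.StringIO(table_text))
    return [
        row for row in reader
        if all(row.get(column) == value for column, value in criteria.items())
    ]


def to_csv(rows, fieldnames):
    """Serialize the selected rows back to CSV text for export."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Keep only the devil RNA-Seq runs, then write them out again.
selected = filter_runs(
    SAMPLE_TABLE,
    Organism="Sarcophilus harrisii",
    LibraryStrategy="RNA-Seq",
)
print(to_csv(selected, ["Run", "Organism", "LibraryStrategy", "Platform"]))
```

To use this with a real export, replace `SAMPLE_TABLE` with the contents of the downloaded file and adjust the column names to match its header row.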

Date and time: Wednesday, August 23, 2017 12:00 PM – 12:30 PM EDT

After registering, you will receive a confirmation email with information about attending the webinar. After the live presentation, the webinar will be uploaded to the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

The Tasmanian Devil 2: The tumor and Tasmanian devil mitochondrial genomes


The Tasmanian devil (Sarcophilus harrisii), the last remaining large marsupial carnivore, now faces extinction because of a strange and deadly infection: a transmissible cancer known as Devil Facial Tumor Disease (DFTD). In a previous NCBI Insights post, we discussed gene expression data from the tumors that established their neural origin and showed that the tumors were likely derived from Schwann cells. In this post, we’ll consider some of the genome sequencing projects in the NCBI databases and explore evidence that the tumor originated in a different individual than the affected animal, supporting the idea that the tumor cells themselves are infectious agents.

The Human Reference Genome – Understanding the New Genome Assemblies


What is a genome assembly?

The haploid human genome consists of 22 autosomal chromosomes plus the X and Y sex chromosomes. Each chromosome is a single DNA molecule, a sequence of millions of nucleotide bases. These molecules are linear, so one might expect each chromosome to be represented by a single, continuous sequence.

Unfortunately, this is not the case for two main reasons: 1) because of the nature of genomic DNA and the limitations of our sequencing methods, some parts of the genome remain unsequenced, and 2) emerging evidence suggests that some regions of the genome vary so much between individual people that they cannot be represented as a single sequence.

In response to this, modern genomic data sets present a model of the genome known as a genome assembly. This post will introduce the basic concepts of how we produce such assemblies as well as some basic vocabulary.
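To make the first reason concrete, here is a toy Python sketch of how unsequenced stretches keep a chromosome from being one continuous sequence. It assumes the common convention of writing unknown bases as `N`; the terms "contig" (a continuous run of sequenced bases) and "scaffold" (contigs joined across gaps), the helper names, and the toy sequences are illustrative choices, not taken from this post or from any real assembly.

```python
GAP_CHAR = "N"  # assumed placeholder for bases that remain unsequenced


def build_scaffold(contigs, gap_lengths):
    """Join contigs in order, filling each gap between them with N's."""
    if len(gap_lengths) != len(contigs) - 1:
        raise ValueError("need exactly one gap length between each pair of contigs")
    parts = [contigs[0]]
    for gap, contig in zip(gap_lengths, contigs[1:]):
        parts.append(GAP_CHAR * gap)
        parts.append(contig)
    return "".join(parts)


def sequenced_fraction(scaffold):
    """Fraction of positions in the scaffold that are actual sequenced bases."""
    return 1 - scaffold.count(GAP_CHAR) / len(scaffold)


# Toy example: three short contigs separated by gaps of 4 and 5 bases.
toy_contigs = ["ACGTAC", "GGCTA", "TTAGC"]
toy_gaps = [4, 5]

scaffold = build_scaffold(toy_contigs, toy_gaps)
print(scaffold)                                # ACGTACNNNNGGCTANNNNNTTAGC
print(round(sequenced_fraction(scaffold), 2))  # 0.64
```

This sketch covers only gaps. It does not model the second reason, regions that vary too much between individuals to be captured by a single sequence.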


The Tasmanian Devil and Cancer as an Infectious Disease: Analysis of transcriptome data


The Tasmanian devil (Sarcophilus harrisii), the last remaining large marsupial carnivore, now faces extinction because of a strange and deadly infection: a transmissible cancer known as Devil Facial Tumor Disease.  These tumor infections are apparently passed to other devils through bites during mating or during squabbles over carrion when devils gather to feed. In this unusual situation, the cancer cells themselves are the infectious agent.

The failure of devil immune systems to recognize and destroy the foreign tumor cells may be related to a decline in genetic diversity and may serve as a warning about the vulnerability of species with reduced gene pools. The advent of next-generation sequencing has provided an unprecedented opportunity to track the spread and identify the origin of this unusual disease, as well as to examine the population structure of an endangered mammal and generate a complete genome sequence for this unique marsupial.
