From June 19 to 21, 2017, the NCBI will assist in a bioinformatics hackathon at the New York Genome Center (NYGC). This hackathon will focus on advanced bioinformatics analysis of next generation sequencing (NGS) data, proteomics and metadata. To apply, complete this application (it takes approximately 10 minutes). Applications are due Monday, May 22, 2017 by 5 PM ET.
This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for bioinformatics analyses from high-throughput experiments. Some projects are available to other non-scientific developers, mathematicians or librarians.
The event is open to anyone selected for the hackathon and able to travel to the NYGC (see address below).
The Tasmanian devil (Sarcophilus harrisii), the last remaining large marsupial carnivore, now faces extinction because of a strange and deadly infection, a transmissible cancer known as Devil Facial Tumor Disease (DFTD). In a previous NCBI Insights post, we discussed gene expression data from the tumors that established their neural origin and showed the tumors were likely derived from Schwann cells. In this post, we’ll consider some of the genome sequencing projects in the NCBI databases and explore evidence that the tumor originated in a different individual than the affected animal, supporting the idea that the tumor cells themselves are infectious agents.
What is a genome assembly?
The haploid human genome consists of 22 autosomal chromosomes and the Y and the X chromosomes. Each of the chromosomes represents a single DNA molecule, a sequence of millions of nucleotide bases. These molecules are linear, so one might expect that we should represent each chromosome by a single, continuous sequence.
Unfortunately, this is not the case for two main reasons: 1) because of the nature of genomic DNA and the limitations of our sequencing methods, some parts of the genome remain unsequenced, and 2) emerging evidence suggests that some regions of the genome vary so much between individual people that they cannot be represented as a single sequence.
In response to this, modern genomic data sets present a model of the genome known as a genome assembly. This post will introduce the basic concepts of how we produce such assemblies as well as some basic vocabulary.
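To make the first of those two reasons concrete, here is a minimal sketch of a chromosome modeled as an ordered series of sequenced stretches and unsequenced gaps rather than as one continuous string. The class and function names (SequencedSegment, Gap, to_sequence, sequenced_fraction) and the toy bases are hypothetical, chosen only for illustration; they are not taken from any NCBI tool or file format. Writing gaps as runs of "N" follows a common convention and is likewise an assumption here.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class SequencedSegment:
    """A stretch of the chromosome whose bases are known (hypothetical type)."""
    bases: str


@dataclass
class Gap:
    """A stretch that remains unsequenced; only an estimated length is kept."""
    estimated_length: int


# A chromosome model is an ordered list of known segments and gaps.
ChromosomeModel = List[Union[SequencedSegment, Gap]]


def to_sequence(model: ChromosomeModel) -> str:
    """Flatten the model into one string, writing each gap as a run of 'N'."""
    parts = []
    for piece in model:
        if isinstance(piece, Gap):
            parts.append("N" * piece.estimated_length)
        else:
            parts.append(piece.bases)
    return "".join(parts)


def sequenced_fraction(model: ChromosomeModel) -> float:
    """Fraction of the modeled length that is covered by known bases."""
    known = sum(len(p.bases) for p in model if isinstance(p, SequencedSegment))
    gaps = sum(p.estimated_length for p in model if isinstance(p, Gap))
    total = known + gaps
    return known / total if total else 0.0


# Toy data: two sequenced stretches separated by a gap of estimated length 4.
toy_chromosome: ChromosomeModel = [
    SequencedSegment("ACGTACGT"),
    Gap(4),
    SequencedSegment("TTGACC"),
]

if __name__ == "__main__":
    print(to_sequence(toy_chromosome))
    print(round(sequenced_fraction(toy_chromosome), 3))
```

The sketch covers only unsequenced regions. Regions that vary too much between individuals to fit a single sequence would need more than one candidate sequence for the same location, which this toy model does not attempt.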
The Tasmanian devil (Sarcophilus harrisii), the last remaining large marsupial carnivore, now faces extinction because of a strange and deadly infection: a transmissible cancer known as Devil Facial Tumor Disease. These tumor infections are apparently passed to other devils through bites during mating or during squabbles over carrion when devils gather to feed. In this unusual situation, the cancer cells themselves are the infectious agent.
The failure of devil immune systems to recognize and destroy the foreign tumor cells may be related to a decline in genetic diversity and may serve as a warning about the vulnerability of species with reduced gene pools. The advent of next-generation sequencing has provided an unprecedented opportunity to track the spread and identify the origin of this unusual disease, as well as to examine the population structure of an endangered mammal and generate a complete genome sequence for this unique marsupial.