NIH Biomedical Data Science Codeathon in Pittsburgh, Jan 8-10

NCBI is pleased to announce a Biomedical Data Science Codeathon in collaboration with Carnegie Mellon in Pittsburgh, PA on January 8-10, 2020.

We’re specifically seeking people with experience working with complex diseases, precision medicine, and genomic analyses. If this describes you, please apply! This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for large-scale genomic analyses from high-throughput experiments. The event is open to anyone selected for the codeathon and willing to travel to Pittsburgh.

Potential topics include:

  • Virus Genome Graph tools
  • Image analysis pipelines
  • RNAseq pipelines
  • Cancer graph genomes
  • Complex Disease Analysis

Codeathon Logistics

The event runs from 9 am to 5 pm each day, with an optional social event on the evening of the second day. We will form five to eight teams of five to six individuals with various backgrounds and expertise, each with an experienced leader. These teams will build pipelines and tools to analyze large datasets within a cloud infrastructure.

There is no registration fee for this event.

Note: Participants will need to bring their own laptop to this program. No financial support for travel, lodging, or meals is available for this event.


After a brief organizational session, teams will spend three days addressing a challenging set of scientific problems related to a group of datasets.  Participants will analyze and combine datasets in order to work on these problems. Throughout the three days, we will come together to discuss progress on each of the topics, bioinformatics best practices, coding styles, etc.


Datasets will come from public repositories such as the cloud-hosted sequence read archive data and contigs derived from these data.  Image stacks and phenotype data may also be available from a variety of labs.


We will make all pipelines, other scripts, software, and programs generated in this codeathon available on a dedicated public GitHub repository.

Each team may submit manuscripts describing the design and use of the software tools they created to an appropriate journal such as the F1000Research hackathons channel, BMC Bioinformatics, GigaScience, Genome Research, or PLoS Computational Biology.


Please fill out the application form. Initial applications are due December 15th, 2019 by 3 pm ET. We will select participants based on their experience and their motivation to attend.

We encourage prior participants and prior applicants to apply. We will notify the first round of accepted applicants on December 18th. Accepted applicants have until December 18th at noon ET to confirm their participation. International applicants or those with particular skillsets may be accepted early. If you confirm, please make sure that you can attend, as confirming and not attending prevents other scientists from attending this event.


Entrants retain ownership of all intellectual property rights (including moral rights) in the code submitted to, as well as developed in, the codeathon. Employees of the U.S. Government attending as part of their official duties retain no copyright on their work, and their work is in the public domain in the U.S.

The Government disclaims any rights in the code submitted or developed in the codeathon.

Participants agree to publish the code and any related data on GitHub.

For more information, please contact Ben Busby.
