Pangenomics in the Cloud hackathon, March 25-27, 2019

We are pleased to announce the first ever pangenomics, graphs and haplotypes hackathon.

From March 25-27, 2019, the NCBI will help run a bioinformatics hackathon in Santa Cruz, California, hosted by the University of California, Santa Cruz (UCSC).  Potential topics include:

  • Building large scale graphs from pangenomes using several assembly methods
  • Simplification of mapping
  • Resolving haplotypes
  • Identification of population-specific structural variants
  • Defining haplotype-specific expression, visualization, and coordination with the GRC

We’re specifically looking for people who have experience in working with graph-building, mapping, isoform definition, and similar genomic analysis. If this describes you, please apply! This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for large scale genomic analyses from high-throughput experiments (please note that the event itself will focus on human). The event is open to anyone selected for the hackathon and willing to travel to UCSC (see below).


Participants will be organized into five to eight teams of five or six people each. These teams will build pipelines to analyze large datasets within a cloud infrastructure. The projects will be announced before the hackathon starts and will build on previous NCBI hackathons and community projects.


After a brief organizational session, teams will spend three days addressing a challenging set of scientific problems related to a group of datasets. Participants will analyze and combine datasets to work on these problems. Throughout the three days, we will come together to discuss progress on each of the topics, bioinformatics best practices, coding styles, etc.


Datasets will come from public repositories. The focus will be on a number of trios produced by long read sequencing, which will serve as a base graph; short read datasets in the Sequence Read Archive that have been ported to cloud infrastructure; and contigs derived from both.


All pipelines and other scripts, software, and programs generated in this hackathon will be added to a public GitHub repository designed for that purpose.

Manuscripts describing the design and usage of the software tools constructed by each team may be submitted to an appropriate journal such as the F1000Research hackathons channel, BMC Bioinformatics, GigaScience, Genome Research or PLoS Computational Biology. Ideally, we will present a graph genome, several protocols for associating short read omics data with it, and some derived datasets (e.g. variant calls) from such protocols.


To apply, please complete this form (approximately 5 minutes to complete). Initial applications are due Wednesday, February 27th, 2019 by 3 pm PT. Participants will be selected based on the experience and motivation they describe on the form.

Prior participants and applicants are especially encouraged to apply. The first round of accepted applicants will be notified on March 1st by 3 pm PT and have until March 4th at noon PT to confirm their participation. If you confirm, please make sure it is highly likely you can attend, as confirming and not attending prevents other data scientists from attending this event. Also, please note that UCSC will be charging a small fee to defray their costs for provided meals. Upon acceptance, there will be a link to remit this fee to UCSC. Please include a monitored email address, in case there are follow-up questions.

Note: Participants will need to bring their own laptop to this program. A working knowledge of scripting (e.g., Shell, Python, R) is useful but not necessary to be successful in this event. Experience with higher level scripting or programming languages may also be useful.

Applicants must be willing to commit to all three days of the event.

It’s unlikely that financial support for travel, lodging or meals is available for this event. Also, note that the hackathon may extend into the evening hours each day. Please make any necessary arrangements to accommodate this possibility.

There will be no registration fee associated with attending this event.

For more information, or with any questions, please contact Ben Busby.
