Petabyte-Scale Sequence Search: Metagenomics Benchmarking Codeathon

This codeathon concluded on October 1st, 2021. To see what teams accomplished, check out their GitHub repositories.

Codeathon Description:

The National Institutes of Health (NIH) Office of Data Science Strategy, the National Center for Biotechnology Information at the National Library of Medicine, and the Department of Energy’s (DOE) Office of Biological and Environmental Research invite you to apply to the virtual Petabyte-Scale Sequence Search: Metagenomics Benchmarking Codeathon. NIH’s Sequence Read Archive now makes more than 14 petabytes of data available in the cloud. To make full use of this data, the scientific community needs high-performance search tools that can work efficiently up to the petabyte scale.

The focus of the codeathon is creating publicly available resources that make it easy for scientists to compare sequence search methods across a standardized set of benchmarks and datasets. This event is part of a series that brings together a diverse group of collaborators (biologists, bioinformaticians, statisticians, mathematicians, computer scientists, and engineers) to develop and test new approaches for sequence search. Codeathon projects will lay the groundwork for future events focused on methods development.

