Do you work with human-derived sequence data? Do you struggle to determine whether your data is free of human sequence and therefore suitable for public distribution? We encourage submitters to screen for and remove contaminating human reads from data files prior to submission to SRA. To support investigators in this effort, we offer a tool to remove human sequence contamination from your SRA submissions!
Human Read Removal Tool (HRRT)
The Human Read Removal Tool (HRRT; also known as the Human Scrubber) is available on GitHub and DockerHub. The HRRT is based on the SRA Taxonomy Analysis Tool (STAT). It takes a fastq file as input and produces a fastq.clean file as output, in which all reads identified as potentially of human origin are masked with ‘N’.
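To make the masking step concrete, here is a minimal Python sketch of what "masked with ‘N’" could look like for a four-line-per-record fastq file. It is not the HRRT's own code: the function name `mask_flagged_reads` and the `flagged_ids` set are hypothetical, and the step that actually decides which reads are of human origin (done by STAT in the real tool) is not reproduced. The sketch also assumes that quality lines are left unchanged.

```python
from typing import Iterable, Iterator, Set


def mask_flagged_reads(fastq_lines: Iterable[str], flagged_ids: Set[str]) -> Iterator[str]:
    """Yield fastq lines, replacing the bases of flagged reads with 'N'.

    Hypothetical helper for illustration only. `flagged_ids` stands in for
    the output of a human-read classifier, which is not implemented here.
    """
    lines = [line.rstrip("\n") for line in fastq_lines]
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per fastq record")

    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or len(header) < 2:
            raise ValueError(f"malformed fastq header: {header!r}")

        # The read ID is the first whitespace-delimited token after '@'.
        read_id = header[1:].split()[0]
        if read_id in flagged_ids:
            # Keep the record and its length; hide only the bases.
            seq = "N" * len(seq)

        yield header
        yield seq
        yield plus
        yield qual  # assumption: quality scores are left as they were


if __name__ == "__main__":
    records = [
        "@read1 sample", "ACGT", "+", "IIII",
        "@read2 sample", "GGCCA", "+", "IIIII",
    ]
    for line in mask_flagged_reads(records, {"read2"}):
        print(line)
```

Masking rather than deleting keeps the record count and read lengths intact, so paired files stay in sync and downstream tools still see a well-formed fastq file.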