We are excited to announce two improvements to the Read assembly and Annotation Pipeline Tool (RAPT), which allows you to assemble genomic reads for bacterial or archaeal isolates and annotate their genes at the click of a button.
Improved taxonomic assignment
Now RAPT verifies the scientific name you provide with the reads, and corrects it as needed with the Average Nucleotide Identity (ANI) tool, which compares your genome to type strain assemblies in GenBank to place it in the taxonomic tree. So, even if you only have a rough idea of the species you have sequenced, input datasets tailored to your genome will be used for the annotation and you will get the best possible gene set from RAPT.
Gene Ontology (GO) terms
Genes are now assigned Gene Ontology (GO) terms when possible. The terms are derived from the collection of Protein Family Models (hidden Markov models, BlastRules and conserved domain architectures) used by RAPT to name the proteins. On average, a third of Coding Sequences (CDSs) annotated by RAPT will get at least one GO term. We are actively working on mapping more GO terms to our Protein Family Models, so this percentage will grow with time. See more information in this blog post.
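If you would like to check what share of CDSs received a GO term in your own annotation, the following Python sketch shows one way to count them in a GFF3 file. It assumes the GO terms appear in the standard GFF3 `Ontology_term` attribute on CDS rows, which you should confirm against your own RAPT output. The function name `go_term_fraction` and the example records are made up for illustration and are not part of RAPT.

```python
def go_term_fraction(gff_lines):
    """Return (cds_with_go, total_cds, fraction) for an iterable of GFF3 lines.

    Assumption: GO terms are stored in the GFF3 'Ontology_term' attribute
    of CDS features. Adjust the attribute name if your file differs.
    """
    total = set()
    with_go = set()
    for line in gff_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue  # only well-formed CDS rows are counted
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        # Rows sharing an ID describe one CDS, so count by ID.
        cds_id = attrs.get("ID", f"{cols[0]}:{cols[3]}-{cols[4]}")
        total.add(cds_id)
        if "GO:" in attrs.get("Ontology_term", ""):
            with_go.add(cds_id)
    fraction = len(with_go) / len(total) if total else 0.0
    return len(with_go), len(total), fraction


# Made-up records for illustration only; not real RAPT output.
EXAMPLE_GFF = [
    "##gff-version 3",
    "contig1\texample\tCDS\t1\t300\t.\t+\t0\tID=cds-1;product=hypothetical protein",
    "contig1\texample\tgene\t400\t900\t.\t-\t.\tID=gene-2",
    "contig1\texample\tCDS\t400\t900\t.\t-\t0\tID=cds-2;Ontology_term=GO:0005524;product=ATP-binding protein",
    "contig1\texample\tCDS\t1000\t1500\t.\t+\t0\tID=cds-3;product=transporter",
]

if __name__ == "__main__":
    n_go, n_cds, frac = go_term_fraction(EXAMPLE_GFF)
    print(f"{n_go} of {n_cds} CDSs have a GO term ({frac:.0%})")
```

To run it on a real file, pass an open file handle instead of the example list, for instance `go_term_fraction(open("annot.gff"))`, where `annot.gff` is a placeholder for your own file name.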
If you prefer to run RAPT on your own machine or in the cloud, please visit our GitHub site to get started.
New to RAPT?
RAPT is an easy-to-use pilot service for the de novo assembly and gene annotation of public or private Illumina genomic reads sequenced from bacterial or archaeal isolates. RAPT consists of three major components: the genome assembler SKESA, the taxonomic assignment tool ANI, and the Prokaryotic Genome Annotation Pipeline (PGAP). Together they produce an annotated genome of quality comparable to RefSeq in a couple of hours.
Find more information about RAPT here and check out our video!