Tag: NCBI Prokaryotic Genome Annotation Pipeline (PGAP)

NCBI Hidden Markov Models (HMM) Release 13.0 Now Available!

Release 13.0 of the NCBI protein profile Hidden Markov models (HMMs) used by the Prokaryotic Genome Annotation Pipeline (PGAP) is now available for download. You can search this collection against your favorite prokaryotic proteins to identify their function using the HMMER sequence analysis package.
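As a rough sketch of how such a search might be post-processed: HMMER's hmmsearch can write a per-sequence hit table with its --tblout option, and the Python below keeps the highest-scoring model for each protein. The file names in the comment and the function name best_hits are illustrative assumptions, not part of the NCBI release.

```python
# Assumed invocation (file names are placeholders, not from the release notes):
#   hmmsearch --tblout hits.tbl <hmm_library> <proteins.faa>

def best_hits(tblout_lines):
    """Return {protein_id: (model_name, bit_score)} for the top-scoring model.

    Expects lines in HMMER's --tblout layout: whitespace-separated columns
    where column 1 is the target (protein) name, column 3 is the query
    (model) name, and column 6 is the full-sequence bit score.
    """
    best = {}
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the commented header/footer
        fields = line.split()
        if len(fields) < 6:
            continue  # not a complete hit row
        protein, model, score = fields[0], fields[2], float(fields[5])
        if protein not in best or score > best[protein][1]:
            best[protein] = (model, score)
    return best
```

Calling best_hits(open("hits.tbl")) would give one candidate model per protein. Ranking by bit score alone is a simplification; a real analysis would also apply whatever score cutoffs the models define.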

What’s new?

The 13.0 release contains:

  • 16,143 HMMs maintained by NCBI
  • 315 new HMMs since release 12.0
  • 286 HMMs with better names, EC numbers, Gene Ontology (GO) terms, gene symbols or publications

New! May 2023 Release of Stand-Alone PGAP

We are happy to announce the release of a new version of the stand-alone Prokaryotic Genome Annotation Pipeline (PGAP) with many exciting new features.

Improved user interface

This version has an improved user interface that takes the genome FASTA file and associated organism name directly on the command line. For example, to annotate a Vibrio cholerae genome sequence in the file Vchol.fasta:

pgap.py -r -g Vchol.fasta -s 'Vibrio cholerae' -o Vchol.annot

For more details visit our Quick Start page.
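When the same call is scripted over many genomes, it can help to build the command as an argument list so that organism names containing spaces need no manual quoting. The following is a minimal sketch that uses only the flags shown in the example above; the helper name pgap_command is hypothetical.

```python
import shlex

def pgap_command(fasta, organism, outdir):
    """Build the argument list for the pgap.py call shown in the post."""
    return ["pgap.py", "-r", "-g", fasta, "-s", organism, "-o", outdir]

cmd = pgap_command("Vchol.fasta", "Vibrio cholerae", "Vchol.annot")
# shlex.join renders the list as a shell-safe string for logging.
print(shlex.join(cmd))
```

The list can be passed directly to subprocess.run(cmd, check=True), assuming pgap.py is on the PATH.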

NCBI Hidden Markov Models (HMM) Release 12.0 Now Available!

Release 12.0 of the NCBI protein profile Hidden Markov models (HMMs) used by the Prokaryotic Genome Annotation Pipeline (PGAP) is now available for download. You can search this collection against your favorite prokaryotic proteins to identify their function using the HMMER sequence analysis package.

What’s new?

The 12.0 release contains:

  • 15,849 HMMs maintained by NCBI
  • 271 new HMMs since release 11.0
  • 1,248 HMMs with better names, EC numbers, Gene Ontology (GO) terms, gene symbols or publications

NCBI hidden Markov models (HMM) release 11.0 now available!

Release 11.0 of the NCBI protein profile Hidden Markov models (HMMs) used by the Prokaryotic Genome Annotation Pipeline (PGAP) is now available for download. You can search this collection against your favorite prokaryotic proteins to identify their function using the HMMER sequence analysis package.

New version of PGAP now available!

We are happy to announce a new version of the stand-alone Prokaryotic Genome Annotation Pipeline (PGAP). This version helps you interpret your results by providing an estimate of the completeness and contamination of your PGAP-annotated genome assembly using CheckM.

CheckM uses the presence of a set of lineage-specific genes for the species provided or the species returned by the taxonomy check (--taxcheck, --auto-correct-tax). The higher the completeness and the lower the contamination, the better the assembly is! If contamination is a concern, please try FCS-GX, a highly sensitive tool for detecting foreign contaminants in prokaryotic and eukaryotic genome assemblies.
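To give a feel for what these two numbers mean, here is a deliberately simplified sketch, not CheckM's actual algorithm (which groups markers into collocated sets before scoring). It assumes completeness is the share of expected marker genes found at least once and contamination is the share of extra copies; the function name and the input dictionary are hypothetical.

```python
def estimate_quality(marker_counts):
    """Return (completeness %, contamination %) from marker copy numbers.

    marker_counts maps each expected lineage-specific marker gene to the
    number of copies found in the assembly (0 if absent).
    """
    total = len(marker_counts)
    if total == 0:
        raise ValueError("no marker genes supplied")
    # Markers seen at least once count toward completeness.
    present = sum(1 for n in marker_counts.values() if n >= 1)
    # Every copy beyond the first counts toward contamination.
    extra = sum(n - 1 for n in marker_counts.values() if n > 1)
    return 100.0 * present / total, 100.0 * extra / total
```

Under this simplification, an assembly with three of four markers present and one marker duplicated would score 75% completeness and 25% contamination.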

This new release also contains code changes that improve prediction of some long genes, especially in low complexity regions. And, as with every release, PGAP incorporates incremental improvements from expert curators of the Protein Family Model collection that increase the precision of PGAP’s structural and functional annotation.

Please try this new version and share your experience with us!

NCBI hidden Markov models (HMM) release 10.0 now available!

Release 10.0 of the NCBI Hidden Markov models (HMM) used by the Prokaryotic Genome Annotation Pipeline (PGAP) is now available for download. You can search this collection against your favorite prokaryotic proteins to identify their function using the HMMER sequence analysis package.

The 10.0 release contains 15,360 models maintained by NCBI, including 228 that are new since 9.0, 99 that were modified significantly, and 205 that were assigned better names, EC numbers, Gene Ontology (GO) terms, gene symbols or publications. You can search and view the details for these in the Protein Family Model collection, which also includes conserved domain architectures and BlastRules, and find all RefSeq proteins they name.

GO terms associated with HMMs are now propagated to CDSs and proteins annotated with PGAP. In case you missed it, see our previous blog post on this topic.

Come see NCBI at the ASM Microbe Conference 2022

The American Society for Microbiology (ASM) Microbe conference is back and is scheduled to take place in person, June 9th-13th, in Washington, D.C.

NCBI staff member Dr. Michael Feldgarden will be recognized by ASM with an award for his research. Other NCBI staff will present posters on NCBI resources and will also be available at our booth (#1128) to address your questions. Drop by to see what’s new and provide your feedback. We hope to see you there! Check out NCBI’s schedule of activities.

New in RAPT: Better taxonomic assignment and GO annotation

We are excited to announce two improvements to the Read assembly and Annotation Pipeline Tool (RAPT), which allows you to assemble genomic reads for bacterial or archaeal isolates and annotate their genes at the click of a button.

Improved taxonomic assignment

Now RAPT verifies the scientific name you provide with the reads, and corrects it as needed with the Average Nucleotide Identity (ANI) tool, which compares your genome to type strain assemblies in GenBank to place it in the taxonomic tree. So, even if you only have a rough idea of the species you have sequenced, input datasets tailored to your genome will be used for the annotation and you will get the best possible gene set from RAPT.

NCBI hidden Markov models (HMM) release 8.0 now available!

Release 8.0 of the NCBI Hidden Markov models (HMM), used by the Prokaryotic Genome Annotation Pipeline (PGAP), is now available for download. You can search this collection against your favorite prokaryotic proteins to identify their function using the HMMER sequence analysis package.

The 8.0 release contains 15,358 models, including 160 that are new since 7.0. In addition, we have added better names, EC numbers, Gene Ontology (GO) terms, gene symbols or publications to over 550 existing HMMs. You can search and view the details for these in the Protein Family Model collection, which also includes conserved domain architectures and BlastRules, and find all RefSeq proteins they name.

GO terms associated with HMMs are now propagated to coding sequences and proteins annotated with PGAP. In case you missed it, see our previous blog post on this topic.

New version of PGAP available now!

We are happy to announce the release of a new version of the stand-alone Prokaryotic Genome Annotation Pipeline (PGAP).

This version of PGAP offers a more streamlined experience to users who are uncertain about the taxonomic classification of the genomes they wish to annotate. Adding one flag to the command (--auto-correct-tax) overrides the species name provided as input if the taxonomy verification process predicts a different organism with high confidence.
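The override rule described above can be pictured with a small sketch. This is an illustration of the decision logic only, not PGAP's code: the function name, its parameters, and the boolean standing in for "high confidence" are all assumptions.

```python
def resolve_species(provided, predicted, high_confidence, auto_correct=False):
    """Pick the species name to annotate with.

    provided        -- name supplied by the user
    predicted       -- name suggested by the taxonomy check (or None)
    high_confidence -- True if the check is confident in its prediction
    auto_correct    -- True when the override flag is switched on
    """
    if auto_correct and high_confidence and predicted and predicted != provided:
        return predicted  # the confident prediction replaces the user's name
    return provided       # otherwise the user's name is kept
```

For example, resolve_species("Vibrio cholerae", "Vibrio mimicus", True, auto_correct=True) returns "Vibrio mimicus", while the same call without auto_correct keeps "Vibrio cholerae".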