NCBI’s Pathogen Detection resource now provides selected data on the Google Cloud Platform (GCP), giving you better access to data from over 1 million bacterial isolates.
Data on GCP include:
- The tables from the Pathogen Isolates Browser and from MicroBIGG-E, the database of antimicrobial resistance (AMR), stress response, and virulence genes and genomic elements. Both sets of tables are accessible through Google BigQuery.
- The MicroBIGG-E sequences in FASTA format, available from Google Cloud Storage.
Features & Benefits
Pathogen Detection data on GCP give you larger-scale access than is currently available through the web or FTP. Notably, there is no FTP access to MicroBIGG-E, the web interface is limited to 100K rows, and sequence downloads are restricted. There are no such restrictions on GCP. MicroBIGG-E on BigQuery also allows you to download all AMRFinderPlus results. Currently there are more than 20 million rows covering antimicrobial resistance, virulence, and stress response genes, as well as point mutations, identified in more than 1 million pathogen isolates.
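As an illustration of what unrestricted access through BigQuery can look like, here is a minimal Python sketch that counts MicroBIGG-E rows per organism for a single gene symbol. The table ID (ncbi-pathogen-detect.pdbrowser.microbigge) and the column names scientific_name and element_symbol are assumptions that do not appear in this post, so check them against the schema shown in the BigQuery console. The function names are hypothetical. Running the query also assumes the google-cloud-bigquery package and your own GCP project and credentials.

```python
"""Sketch: count MicroBIGG-E rows per organism for one gene symbol."""

# ASSUMPTION: table ID and column names are not stated in this post;
# verify them in the BigQuery console before relying on this query.
MICROBIGGE_TABLE = "ncbi-pathogen-detect.pdbrowser.microbigge"


def build_gene_count_query(table: str = MICROBIGGE_TABLE) -> str:
    """Return SQL that counts rows per organism for a gene symbol.

    The gene symbol is supplied as the named query parameter @symbol,
    so it is never interpolated into the SQL text.
    """
    return (
        "SELECT scientific_name, COUNT(*) AS n_rows\n"
        f"FROM `{table}`\n"
        "WHERE element_symbol = @symbol\n"
        "GROUP BY scientific_name\n"
        "ORDER BY n_rows DESC"
    )


def run_gene_count(symbol: str, project: str):
    """Run the query and return (organism, row count) pairs.

    Requires google-cloud-bigquery and working GCP credentials.
    """
    from google.cloud import bigquery  # lazy import: optional dependency

    client = bigquery.Client(project=project)
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("symbol", "STRING", symbol)
        ]
    )
    query_job = client.query(build_gene_count_query(), job_config=job_config)
    return [(row["scientific_name"], row["n_rows"]) for row in query_job.result()]


if __name__ == "__main__":
    # Print the SQL only; no network access or credentials are needed for this.
    print(build_gene_count_query())
```

To run it against live data, call run_gene_count with a gene symbol of interest and the ID of a GCP project you can bill queries to. Because the query aggregates inside BigQuery, only the per-organism counts are returned rather than the full set of matching rows.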
Here are two examples where researchers have used MicroBIGG-E and AMRFinderPlus data to advance research on antimicrobial resistance:
- Identifying conserved functional regions in erythromycin resistance methyltransferases (PMID: 34795028).
- Assessing the health risks of antibiotic resistance genes (PMCID: PMC8346589).
Continue reading “Full-scale access to microbial Pathogen Detection data in the Cloud!”
On Wednesday, December 11, 2019, at 12 PM, NCBI staff will present a webinar that will show you how to use NCBI’s PGAP (https://github.com/ncbi/pgap) on your own data to predict genes on bacterial and archaeal genomes using the same inputs and applications used inside NCBI. You can run PGAP on your own machine, on a compute farm, or in the Cloud. Plus, you can now submit genome sequences annotated by your copy of PGAP to GenBank. Attend the webinar to learn more!
- Date and time: Wed, Dec 11, 2019 12:00 PM – 12:45 PM EST
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
The latest improvement in the NCBI search experience is designed to help you quickly find microbial proteins. Now when you search for a prokaryotic protein name such as recombinase RecA in NCBI’s sequence databases or in the All databases search, a high-quality representative protein sequence is highlighted in a panel at the top of the results page (Figure 1).
The result panel also allows you to quickly link to related resources such as NCBI’s new pages for protein family models, Identical Protein Groups, and SPARCLE, NCBI’s protein domain architecture resource. We also provide as-you-type suggestions so you don’t have to type out some of the long names.
Figure 1. The result for a search with recombinase RecA. The panel provides access to analysis tools, downloads, and relevant links to the protein family, the RefSeq protein, the identical protein group, and citations in PubMed.
Try these protein name searches, or your own, and use the as-you-type suggestions to assist your searches.
Please let us know how you like these results!
We are now showing the curated evidence used for assigning names and, where possible, gene symbols, publications, and Enzyme Commission numbers for nearly 70% (83 million) of microbial RefSeq proteins. This evidence includes a hierarchical collection of curated Hidden Markov Model (HMM)-based and BLAST-based protein families, and conserved domain architectures.
Continue reading “Evidence for naming the protein now on non-redundant RefSeq records (WP_ accessions)”
From August 13 to 15, 2019, NCBI will run a bioinformatics hackathon on the NIH campus!
We’re specifically looking for people with experience in computational microbial genomics, evolutionary biology, antimicrobial resistance, and similar genomic analysis. If this describes you, please apply! This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for large-scale genomic analyses from high-throughput experiments (please note that the event itself will focus on open access public human).
Continue reading “Microbial Virulence in the Cloud hackathon August 13 – 15 2019”
As of March 2018, there were 141,000 prokaryotic genomes in the Assembly database. As this database grows, misassigned prokaryotic genomes become a serious problem. Taxonomic misassignment can occur through simple submission error or can accumulate as new information adds greater specificity to the taxonomic tree.
A paper in the International Journal of Systematic and Evolutionary Microbiology presents the method NCBI scientists used to verify taxonomic identities in prokaryotic genomes. The authors used an Average Nucleotide Identity (ANI) method with optimum threshold ranges for prokaryotic taxa to review all prokaryotic genome assemblies in GenBank. This method relies on type strain information and is one outcome of a 2015 workshop involving several important parties in the bacteriology community.
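To give a rough sense of how an identity threshold can flag a questionable assignment, here is a small Python sketch. It is not the procedure from the paper. It assumes that a genome has already been aligned against a type-strain genome and that each aligned block reports its length and number of identical bases. The names AlignedBlock, average_nucleotide_identity, and assignment_status are hypothetical, and the 96% cutoff used in the example is only a placeholder, since the paper describes threshold ranges that vary by taxon.

```python
"""Sketch: length-weighted average nucleotide identity and a threshold check."""
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class AlignedBlock:
    """One aligned region between a query genome and a type-strain genome."""
    aligned_length: int   # number of aligned positions in the block
    identical_bases: int  # positions where both genomes carry the same base


def average_nucleotide_identity(blocks: Iterable[AlignedBlock]) -> float:
    """Return percent identity over all aligned positions (length-weighted)."""
    blocks = list(blocks)
    total_length = sum(b.aligned_length for b in blocks)
    if total_length == 0:
        raise ValueError("no aligned positions; identity is undefined")
    total_identical = sum(b.identical_bases for b in blocks)
    return 100.0 * total_identical / total_length


def assignment_status(ani_percent: float, threshold_percent: float) -> str:
    """Compare an ANI value to a taxon-specific threshold (placeholder logic)."""
    if ani_percent >= threshold_percent:
        return "consistent"
    return "needs review"


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    example_blocks = [AlignedBlock(1000, 980), AlignedBlock(1000, 940)]
    ani = average_nucleotide_identity(example_blocks)
    # ASSUMPTION: 96.0 is a placeholder cutoff, not a value from this post.
    print(f"ANI = {ani:.2f}%, status = {assignment_status(ani, 96.0)}")
```

Weighting by aligned length keeps short, highly divergent blocks from dominating the result. A real check would also need to consider how much of each genome was aligned, which this sketch ignores.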
NCBI will be attending and presenting at ASM Microbe 2018 this June. Read on for all NCBI activities.
Continue reading “NCBI attending ASM Microbe 2018 June 7-11”
This blog post is for researchers, students, and postdocs, as well as non-scientific developers, mathematicians and librarians.
This summer, we were quite busy running and cohosting hackathons. These events educate participants, allow for networking among computational biologists, and produce bioinformatics software prototypes. Read on for a review of products from our Summer 2017 hackathons.
Continue reading “Summer 2017 NCBI Hackathon Products”