Full-scale access to microbial Pathogen Detection data in the Cloud!

NCBI’s Pathogen Detection resource now provides selected data on the Google Cloud Platform (GCP), giving you better access to data from more than 1 million bacterial isolates.

Data on GCP include:

  1. Tables from the Pathogen Isolates Browser and from MicroBIGG-E, the database of antimicrobial resistance (AMR), stress response, and virulence genes and genomic elements. Both sets of tables are accessible through Google BigQuery.
  2. The MicroBIGG-E sequences in FASTA format, available from Google Cloud Storage.

Features & Benefits

Pathogen Detection data on GCP give you larger-scale access than is currently available through the web or FTP. Notably, MicroBIGG-E has no FTP access, its web interface is limited to 100,000 rows, and its sequence downloads are restricted. GCP has none of these restrictions. MicroBIGG-E in BigQuery also lets you download all AMRFinderPlus results: currently more than 20 million rows of antimicrobial resistance, virulence, and stress response genes and point mutations, identified in more than 1 million pathogen isolates.
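As a rough illustration of what querying MicroBIGG-E in BigQuery might look like, here is a minimal Python sketch. The table name `ncbi-pathogen-detect.pdbrowser.microbigge` and the column `element_symbol` are assumptions that should be checked against the Pathogen Detection documentation, and the helper names `build_element_count_query` and `run_query` are hypothetical. Running the query requires the `google-cloud-bigquery` package and your own GCP credentials.

```python
# Assumed table name; confirm it in the Pathogen Detection documentation.
MICROBIGGE_TABLE = "ncbi-pathogen-detect.pdbrowser.microbigge"


def build_element_count_query(table: str = MICROBIGGE_TABLE, limit: int = 20) -> str:
    """Build SQL that counts rows per element symbol.

    `element_symbol` is an assumed column name, used here for illustration.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        "SELECT element_symbol, COUNT(*) AS n_rows\n"
        f"FROM `{table}`\n"
        "GROUP BY element_symbol\n"
        "ORDER BY n_rows DESC\n"
        f"LIMIT {limit}"
    )


def run_query(sql: str) -> list:
    """Run the SQL in BigQuery and return the rows as dictionaries."""
    # Imported here so the query builder works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses your default GCP project and credentials
    return [dict(row.items()) for row in client.query(sql).result()]
```

Calling `run_query(build_element_count_query(limit=5))` would return the five most frequent element symbols, assuming the table and column exist as named. Aggregating in BigQuery rather than downloading rows is one way to work with the full table without hitting the row limit of the web interface.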

Here are two examples where researchers have used MicroBIGG-E and AMRFinderPlus data to advance research on antimicrobial resistance:

    • Identifying conserved functional regions in erythromycin resistance methyltransferases (PMID: 34795028).
    • Assessing the health risks of antibiotic resistance genes (PMCID: PMC8346589).

Learn more

For more information on Pathogen Detection resources on Google Cloud, see our documentation:


Please contact us with any questions or feedback.

Also let us know if you are interested in the full set of assemblies currently in MicroBIGG-E, as the current GCP data do NOT include the full assemblies.

Please note: Pathogen Detection data on Google Cloud Platform are updated daily. Updated results may lag behind results presented in the browsers or on FTP by up to one day.

The Cloud gives you access to flexible and scalable data exploration and analysis.
It also serves as a platform to support FAIR (findable, accessible, interoperable, and reusable) data practices.

As part of NIH’s Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability (STRIDES) Initiative, NCBI makes sequence, metadata, and analysis results available in the cloud for select high-value projects.


