The BLAST programs and databases are available in Docker and cloud-ready


In modern biomedical research, you often need to analyze very large datasets. This may require computing and storage capacity that exceeds what you have available locally. Working in a cloud environment where you can provision nearly limitless computing power, gain access to enormous datasets, and pay for only what you need is a great option in these cases.

To help with these tasks, NCBI is now providing a Docker version of NCBI BLAST that you can use on the cloud. This implementation will help you work with large volumes of sequence data and the set of NCBI BLAST databases. The BLAST Docker image makes using BLAST on the cloud much more convenient.

  • Installation and maintenance of the BLAST programs and databases are all handled by Docker.
  • Integration with other tools in your pipelines is easier.
  • NCBI BLAST databases are pre-loaded on the Google Cloud, providing fast access.

While we have tested the Docker image on the Google Cloud, it will allow BLAST to run equally well on any Docker-enabled platform, such as another cloud platform or your local computer, and you can still use the cloud-installed BLAST databases.
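As a rough illustration of running BLAST through Docker on any Docker-enabled machine, the sketch below only assembles a `docker run` command; it does not execute it. The image name `ncbi/blast`, the container mount points `/blast/blastdb` and `/work`, and the file and database names are assumptions chosen for illustration, so check the BLAST in the Cloud documentation for the recommended invocation.

```python
import os
import shlex


def blast_docker_command(program, query, db, db_dir, work_dir, image="ncbi/blast"):
    """Build (but do not run) a `docker run` command for one BLAST search.

    Assumptions, not taken from the announcement: the image name, the
    container paths /blast/blastdb and /work, and the output file name.
    """
    # Docker bind mounts need absolute host paths.
    db_dir = os.path.abspath(db_dir)
    work_dir = os.path.abspath(work_dir)
    return [
        "docker", "run", "--rm",
        # Mount the host directory holding BLAST databases read-only.
        "-v", f"{db_dir}:/blast/blastdb:ro",
        # Mount a working directory for the query and the results.
        "-v", f"{work_dir}:/work",
        "-w", "/work",
        image,
        program,
        "-query", query,
        "-db", f"/blast/blastdb/{db}",
        "-out", f"{program}.out",
    ]


if __name__ == "__main__":
    # Hypothetical inputs: a protein query file and a database named "swissprot".
    cmd = blast_docker_command(
        "blastp", "query.fsa", "swissprot", "/data/blastdb", "/data/work"
    )
    # Print a shell-quoted command line that could be pasted into a terminal.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps quoting explicit and lets the same function feed `subprocess.run` in a pipeline once Docker and the databases are in place.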

See the BLAST in the Cloud and database information documentation to get started.

Run the Prokaryotic Genome Annotation Pipeline (PGAP) on your own machine


You can now download PGAP from GitHub and run it on your machine, compute farm, or the cloud, on any publicly or privately owned genome. PGAP predicts genes on bacterial and archaeal genomes using the same inputs and applications used inside NCBI. This is a great opportunity to try it and send us comments (please use GitHub issues).


Pangenomics in the Cloud hackathon, March 25-27, 2019


We are pleased to announce the first ever pangenomics, graphs and haplotypes hackathon.

From March 25-27, 2019, the NCBI will help run a bioinformatics hackathon in Santa Cruz, California, hosted by the University of California, Santa Cruz (UCSC). Potential topics include:

  • Building large scale graphs from pangenomes using several assembly methods
  • Simplification of mapping
  • Resolving haplotypes
  • Identification of population-specific structural variants
  • Defining haplotype-specific expression, visualization, and coordination with the GRC


NCBI at ASHG 2018: “Storage and use of dbGaP data in the cloud”


With the American Society of Human Genetics (ASHG) conference around the corner, NCBI staff are preparing their presentations in San Diego. Here is some background on dbGaP’s poster about improving data storage and accessibility.

Visit Poster 1435T “Storage and use of dbGaP data in the cloud” on Thursday, October 18, from 2 PM to 3 PM (Exhibit Hall, Ground Floor).
