As we announced, the new default database version for BLAST+ is dbV5. To complete the transition to the new version, we will modify the directory structure and naming conventions in the BLAST FTP database directory. We expect to make this change around February 4, 2020.
Here is a list of what we will change:
All databases at the base of the blastdb directory (/blast/db/) will be the dbV5 versions.
The version 5 databases will no longer have “_v5” as part of the archive or database names.
We will move the dbV4 databases to a v4 subdirectory (/blast/db/v4/).
The now legacy dbV4 database archives will have “_v4” in their names (e.g., nr_v4.00.tar.gz); we will not rename the files within the archive.
We will no longer update the dbV4 databases.
We will freeze the cloud directory (/blast/db/cloud/) with no new entries after January 13, 2020.
We will provide only nr, nt, swissprot, and pdbaa files in the FASTA directory (/blast/db/FASTA/).
Please adjust your scripts or procedures to accommodate the changes!
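As a starting point for adjusting download scripts, here is a minimal Python sketch of the naming rules listed above. The function names are hypothetical, and the two-digit volume numbering is inferred from the single example in the post (nr_v4.00.tar.gz); the handling of archives without a volume number is an assumption, so check the FTP listing before relying on it.

```python
import re
from typing import Optional

FTP_BASE = "/blast/db"  # base directory named in the announcement


def archive_path(db: str, volume: Optional[int] = None, version: int = 5) -> str:
    """Return the expected FTP path of a database archive after the change.

    volume=None models an archive without a volume number (an assumption).
    """
    vol = "" if volume is None else f".{volume:02d}"
    if version == 5:
        # dbV5 archives sit at the base and carry no version suffix.
        return f"{FTP_BASE}/{db}{vol}.tar.gz"
    if version == 4:
        # Legacy dbV4 archives move to v4/ and gain a "_v4" suffix.
        return f"{FTP_BASE}/v4/{db}_v4{vol}.tar.gz"
    raise ValueError(f"unsupported database version: {version}")


def drop_v5_suffix(name: str) -> str:
    """Map an old dbV5 archive name (e.g. 'nr_v5.00.tar.gz') to its new name."""
    return re.sub(r"_v5(?=\.)", "", name, count=1)


print(archive_path("nr", 0))             # dbV5 archive at the base
print(archive_path("nr", 0, version=4))  # legacy dbV4 archive
print(drop_v5_suffix("nr_v5.00.tar.gz"))
```

Keeping the rules in one small function means a script that mirrors several databases only needs to change in one place if the layout shifts again.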
If you have any questions or concerns, please contact us.
IgBLAST is a popular NCBI package for classifying and analyzing immunoglobulin (IG) and T cell receptor (TCR) variable domain sequences. We’ve released a new version (1.15.0) of IgBLAST with four improvements and bug fixes:
Support for the new framework region 4 (FWR4) annotation feature in the standard alignment formats and AIRR format.
Renamed the previous “-penalty” parameter to -V_penalty to be consistent with other IgBLAST penalty options.
Restored constant internal BLAST search parameters for domain annotation (i.e., FWR/CDR) so that this process is not influenced by user-provided parameters.
Corrected FWR/CDR annotations for certain mouse VK and rat VH germline genes.
IgBLAST 1.15 is available for download from the BLAST FTP area. See the manual on GitHub for information about setting up and running IgBLAST.
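If you have wrapper scripts that still pass the old penalty option, a small helper like the one below can update stored argument lists. This is only a sketch: the function name, the query file name, and the example penalty value are hypothetical, and it assumes arguments are kept as a list of separate tokens.

```python
from typing import List


def migrate_igblast_args(args: List[str]) -> List[str]:
    """Replace the old '-penalty' flag with '-V_penalty'.

    Only an exact '-penalty' token is rewritten, so other options whose
    names merely end in 'penalty' are left untouched.
    """
    return ["-V_penalty" if arg == "-penalty" else arg for arg in args]


# Hypothetical pre-1.15 command line, stored as a token list.
old_args = ["igblastn", "-query", "reads.fasta", "-penalty", "-1"]
print(migrate_igblast_args(old_args))
```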
The BLAST+ 2.10.0 release is now available from our FTP site. The new version offers the following improvements:
updated composition-based statistics for protein-protein (including translated BLAST) comparisons to provide stable results when you request fewer than the default number of results
an experimental Adaptive Composition Based Statistics option that increases the likelihood of finding novel results. To enable this option set the environment variable ADAPTIVE_CBS to 1. We welcome your feedback on this new option.
See the release notes for details on more improvements and bug fixes with this release.
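For those who launch BLAST+ from Python, one way to try the experimental option is to set ADAPTIVE_CBS only for the child process rather than for the whole shell session. The sketch below assumes blastp is on your PATH; the helper names and file names are hypothetical, and the actual run is left as a comment so the example does not depend on a local installation.

```python
import os
from typing import Dict, List, Mapping, Optional


def adaptive_cbs_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with ADAPTIVE_CBS set to 1."""
    env = dict(os.environ if base is None else base)
    env["ADAPTIVE_CBS"] = "1"  # the variable named in the release announcement
    return env


def blastp_command(query: str, db: str, out: str) -> List[str]:
    """Build a basic blastp command line (file and database names are placeholders)."""
    return ["blastp", "-query", query, "-db", db, "-out", out]


cmd = blastp_command("query.fasta", "swissprot", "results.txt")
env = adaptive_cbs_env()
# To actually run the search:
#   import subprocess
#   subprocess.run(cmd, env=env, check=True)
print(cmd, env["ADAPTIVE_CBS"])
```

Scoping the variable to one process makes it easy to compare results with and without the option in the same session.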
The new version fully supports the version 5 (v5) databases with built-in taxonomy and other improvements. For more information on v5 databases (download), see the previous NCBI Insights article and the recording of our webinar. If you are still using the older version 4 (v4) databases, we recommend you begin using the v5 version as soon as possible. We will discontinue updates to the older v4 databases in early 2020.
Read the recent publication (PMID: 31427293) on the AMRFinder, a tool that identifies antimicrobial resistance (AMR) genes in bacterial genome sequences using a high-quality curated AMR gene reference database. We use the AMRFinder to identify AMR genes in the hundreds of bacterial genomes that NCBI receives every day, and the results of AMRFinder are used in NCBI’s Isolates Browser to provide accurate assessments of AMR gene content. You can install AMRFinder locally and run it yourself. Follow the instructions on our GitHub site.
Since the publication we have upgraded AMRFinder to AMRFinderPlus. The enhanced tool now
supports searches based on protein annotations, nucleotide sequences, or both for best results
identifies point mutations in Campylobacter, E. coli, Shigella, and Salmonella
optionally identifies many genes involved in biocide, heat, metal, and stress resistance, as well as many antigenicity and virulence genes
provides information about gene function, including resistance to individual antibiotics and other phenotypes
As you may know, we have been offering a new BLAST results page (Figure 1) as a test page since April. In response to your positive reception and after incorporating many improvements that you suggested, we made the new results the default today, August 1, 2019.
You will still be able to access the traditional results for several months. This will give you additional time, if you need it, to adjust your workflows or teaching materials to the new display.
In modern biomedical research, you often need to analyze very large datasets. This may require computing and storage capacity that exceeds what you have available locally. Working in a cloud environment where you can provision nearly limitless computing power, gain access to enormous data sets, and pay for only what you need is a great option in these cases.
To help with these tasks, NCBI is now providing a Docker version of NCBI BLAST that you can use on the cloud. This implementation will help you work with large volumes of sequence data and the set of NCBI BLAST databases. The BLAST Docker image makes using BLAST on the cloud much more convenient.
Docker handles installation and maintenance of the BLAST programs and databases.
Integration with other tools in your pipelines is easier.
NCBI BLAST databases are pre-loaded on the Google Cloud, providing fast access.
While we have tested the Docker image on the Google Cloud, the Docker image will allow BLAST to run equally well on any Docker-enabled platform, such as another cloud platform or your local computer, and you can still use the cloud-installed BLAST databases.
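To give a rough idea of what running BLAST through Docker looks like from a script, the sketch below assembles a docker run command in Python. Everything specific here is an assumption rather than something stated in this post: the image name ncbi/blast, the mount points inside the container, and the host directories. Consult the image documentation for the actual values.

```python
from typing import List

IMAGE = "ncbi/blast"  # assumed image name; not given in this post


def docker_blast_command(
    db_dir: str, query_dir: str, blast_args: List[str], image: str = IMAGE
) -> List[str]:
    """Build a 'docker run' command that mounts databases and queries read-only.

    The container-side paths (/blast/blastdb, /blast/queries) are assumptions.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{db_dir}:/blast/blastdb:ro",
        "-v", f"{query_dir}:/blast/queries:ro",
        image,
        *blast_args,
    ]


cmd = docker_blast_command(
    "/data/blastdb",   # hypothetical host directory holding the databases
    "/data/queries",   # hypothetical host directory holding query files
    ["blastp", "-query", "/blast/queries/query.fasta", "-db", "swissprot"],
)
print(" ".join(cmd))
```

Because the command is just a list of tokens, the same helper works on a cloud VM or a local machine; only the host directories change.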
As you probably know, BLAST has been offering a new results page as an option for standard BLAST for you to test and provide feedback since April. See our post from earlier this spring for more details. We have just added new results pages (Figure 1) for the following four specialized BLAST services for you to test.
We recently updated the version 5 BLAST protein databases (dbV5) on our FTP site to be completely accession-based. As we described in a previous post, this means they now contain the gi-less proteins from the NCBI Pathogen Project and other high-throughput projects. The v5 databases are also compatible with proteins from PDB structures with multi-character chain identifiers and will include these as they become available in our other protein systems. Only the latest version of BLAST+ (2.9.0, download) will work with the updated v5 databases and allow you to access all of the most recent protein data. At the end of September 2019, we will stop updating the version 4 BLAST databases and offer the v5 databases as the default for download.
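Since the updated v5 databases need BLAST+ 2.9.0, a pipeline can check the installed version before downloading them. The sketch below parses a version string and compares it with that minimum. The sample strings only illustrate the kind of text a version query might return, the helper names are hypothetical, and treating releases newer than 2.9.0 as compatible is an assumption.

```python
import re
from typing import Tuple

MIN_V5_VERSION = (2, 9, 0)  # minimum BLAST+ version named in the post


def parse_blast_version(text: str) -> Tuple[int, ...]:
    """Extract the first 'major.minor.patch' number found in the text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in: {text!r}")
    return tuple(int(part) for part in match.groups())


def supports_updated_v5(text: str) -> bool:
    """True if the reported version is at least 2.9.0."""
    return parse_blast_version(text) >= MIN_V5_VERSION


# Illustrative version strings (assumed format).
for sample in ("blastp: 2.8.1+", "blastp: 2.9.0+", "blastp: 2.10.0+"):
    print(sample, supports_updated_v5(sample))
```

Comparing integer tuples rather than raw strings avoids ranking 2.10.0 below 2.9.0.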
We have been offering the new BLAST results page (Figure 1) for you to try out since April and have been collecting your comments and feedback. Thank you all for your input on this new results display. Over 90% of your comments have been positive. We have made several changes to the page that address problems you pointed out, and we are working on adding several features you suggested in future releases.
At this time, 96% of you who have tried the new page have kept it as your default results page. We are planning to make the new page the default for everyone on August 1, 2019. We will still provide access to the old results for some time to give people time to adjust their workflows or teaching materials to the new display.
Figure 1. The New BLAST Results with filters directly on the page and a more concise tabbed output that includes the taxonomy report.
Please view our video introduction to the new results to see highlights of the improved display. As always, we will continue to incorporate your feedback into the design and features on the new page, so please test it out and let us know what you think.