We recently updated the version 5 BLAST protein databases (dbV5) on our FTP site to be completely accession-based. As we described in a previous post, this means they now contain the gi-less proteins from the NCBI Pathogen Project and other high-throughput projects. The v5 databases are also compatible with proteins from PDB structures with multi-character chain identifiers and will include these as they become available in our other protein systems. Only the latest version of BLAST+ (2.9.0, download) will work with the updated v5 databases and allow you to access all of the most recent protein data. At the end of September 2019, we will stop updating the version 4 BLAST databases and offer the v5 databases as the default for download.
We have been offering the new BLAST results page (Figure 1) for you to try out since April and have been collecting your comments and feedback. Thank you all for your input on this new results display. Over 90% of your comments have been positive. We have made several changes to the page that address problems you pointed out, and we are working on adding several features you suggested in future releases.
At this time, 96% of you who have tried the new page have kept it as your default results page. We are planning to make the new page the default for everyone on August 1, 2019. We will still provide access to the old results for some time to allow people who have workflows or teaching materials to adjust to the new display.
Figure 1. The New BLAST Results with filters directly on the page and a more concise tabbed output that includes the taxonomy report.
Please view our video introduction to the new results to see highlights of the improved display. As always, we will continue to incorporate your feedback into the design and features on the new page, so please test it out and let us know what you think.
IgBLAST is a popular NCBI package for classifying and analyzing immunoglobulin (IG) and T cell receptor (TCR) variable domain sequences. We’ve released a new version (1.14.0) of IgBLAST with three new improvements / bug fixes:
- Adaptive Immune Receptor Repertoire (AIRR) format output is more consistent with the AIRR specifications: undefined values (NON, N/A) are now empty strings, “reversed” is no longer appended to the sequence identifier when the query is in reverse orientation, and standard locus names such as IGH and TRB replace the traditional VH, VB, etc.
- The logic for showing CDR3 end of TCR sequences is improved.
- The sequence identifier is restored in the case of no results in AIRR rearrangement format.
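For scripts that read the AIRR rearrangement output, the changes listed above mean that undefined values arrive as empty strings and loci use standard names. Here is a minimal Python sketch, assuming a tab-separated file with the AIRR columns `sequence_id`, `locus`, and `v_call`. The sample rows and the helper name `read_airr_rows` are invented for illustration and are not actual IgBLAST output.

```python
import csv
import io


def read_airr_rows(handle):
    """Yield one dict per rearrangement row, mapping empty strings to None."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        yield {key: (value if value != "" else None) for key, value in row.items()}


# Invented sample data (assumption): one annotated query and one with no result.
SAMPLE = (
    "sequence_id\tlocus\tv_call\n"
    "query1\tIGH\tIGHV1-2*02\n"
    "query2\t\t\n"
)

rows = list(read_airr_rows(io.StringIO(SAMPLE)))
for row in rows:
    print(row["sequence_id"], row["locus"], row["v_call"])
```

Mapping empty strings to `None` is only one way to represent missing values; a script could just as well keep the empty strings as they are.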
Next Wednesday, May 15, 2019 at 11AM, NCBI staff will show you how to use the latest version of standalone BLAST+ (2.9.0) and the new accession-based DBv5 databases with built-in taxonomy information. You will learn how to limit searches to taxonomic groups and to retrieve sequences from the database by taxonomy without having to download an identifier list. You will also learn about additional improvements in the BLAST databases and programs that make them compatible with the new PDB identifiers and gi-less proteins from the Pathogen Detection Project.
Date and time: Wed, May 15, 2019 11:00 AM – 11:30 AM EDT
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
We have made some recent improvements to the BLAST+ applications that take full advantage of the version 5 BLAST databases (BLASTDBv5), which include built-in taxonomic information for sequences and no longer rely on the integer sequence identifiers (gi numbers).
With the latest version of BLAST, you can now:
NCBI Labs is showcasing an experiment to improve the BLAST results page. The goal is to provide a more useful BLAST output that better meets your needs and integrates with your workflows. The new results incorporate feedback from surveys and interviews with BLAST users. We think you’ll find the new results are more compact, easier to navigate, and expose useful formatting and other features that you may not have known about.
The results page has organism, percent identity, and E value filters in plain view and easily accessible. The Descriptions and Graphic Summary are on separate tabs, and the popular taxonomy view is on a fourth tab rather than on a separate web page. These changes along with other enhancements make the display more concise and easier to navigate. The figure below shows the new output format.
Figure 1. The New BLAST Results with filters directly on the page and a more concise tabbed output that includes the taxonomy report. The Back to Traditional Results Page link re-loads the results in the standard format.
- The 2.9.0 programs handle the new four-character identifiers for chains of 3D structure records from the RCSB Protein Data Bank (PDB). Previous versions of the BLAST databases and programs do not support these identifiers. See the MMDB News for additional details about the PDB change and the impact on NCBI Structure resources.
- Another important improvement in 2.9.0 is the ability to configure the output separator for tabular and CSV output formats. See the BLAST Manual for details.
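A custom separator also changes how downstream scripts need to split each result line. The Python sketch below parses tabular output with a caller-supplied delimiter. It assumes the search was run with a three-column custom format (`qaccver saccver pident`) and a semicolon separator. To our understanding the separator is set with a `delim=` token inside the `-outfmt` string, but treat that syntax as an assumption and confirm it in the BLAST Manual. The function name `parse_tabular` and the sample rows are invented for illustration.

```python
def parse_tabular(lines, columns, delimiter="\t"):
    """Split delimited BLAST tabular lines into dicts keyed by column name."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines (such as the headers in commented tabular output).
        if not line or line.startswith("#"):
            continue
        fields = line.split(delimiter)
        if len(fields) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(fields)}: {line!r}")
        records.append(dict(zip(columns, fields)))
    return records


# Assumed column layout and invented rows, for illustration only.
COLUMNS = ["qaccver", "saccver", "pident"]
SAMPLE = [
    "# invented example rows",
    "query1;subjectA;98.5",
    "query1;subjectB;91.2",
]

hits = parse_tabular(SAMPLE, COLUMNS, delimiter=";")
for hit in hits:
    print(hit["qaccver"], hit["saccver"], float(hit["pident"]))
```

Choose a separator that cannot occur inside any of the requested fields, otherwise the field count check above will reject the line.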
More improvements and a few bug fixes with this release are detailed in the release notes.
For more information on the new database version, BLASTDBv5 (download), see the previous NCBI Insights article and the recording of our webinar. We will continue to update the BLAST databases in their current version (BLASTDBv4) until September 2019.
IgBLAST is a popular NCBI package for classifying and analyzing immunoglobulin and T cell receptor variable domain sequences. We’ve released a new version of IgBLAST with three new improvements:
- The new release determines the V gene reading frame from the end of FWR3 region instead of end of V gene. This helps identify the correct reading frames for rearrangements that have insertions or deletions near the V gene end.
- The allowed distance between V gene end and J gene start has been increased to 225 bp to allow detection of ultra long D/N regions.
- The standalone program and files have been repackaged to make them easier to install.
Next week, NCBI staff will attend the Plant and Animal Genome (PAG) Conference. We have several activities planned, including 1 booth (#223), 4 workshops, 1 talk and 2 posters.
Read on to learn more about what you can look forward to if you’re attending PAG this year. (Note: The listed times are Pacific time.)
BLAST+ 2.8.1 is now available for download from our FTP site. This is the first production release of standalone BLAST to support the new BLAST v5 databases (BLASTDBv5), which are also now available. The new databases have taxonomy information for the database sequences built in. This gives you the following important advantages over the v4 databases.
- The ability to limit your search by taxonomic group — species level as well as higher taxa.
- Improved performance when limiting a BLAST search by accession.
- Retrieval of sequences by taxonomic group from a BLAST database with blastdbcmd.
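The taxonomy features listed above can be sketched as command lines assembled in Python. This is a minimal sketch, not a definitive recipe: it assumes that both `blastp` and `blastdbcmd` accept a `-taxids` option taking a comma-separated list of NCBI taxonomy IDs, so please check each program's `-help` output for your installed version. The database name `my_protein_db`, the file names, and the helper function names are hypothetical placeholders.

```python
import shlex


def blastp_taxid_command(query, db, taxids, out):
    """Build a blastp command line limited to the given taxonomy IDs."""
    taxid_arg = ",".join(str(taxid) for taxid in taxids)
    return ["blastp", "-query", query, "-db", db, "-taxids", taxid_arg, "-out", out]


def blastdbcmd_taxid_command(db, taxids, out):
    """Build a blastdbcmd command line that retrieves sequences by taxonomy ID."""
    taxid_arg = ",".join(str(taxid) for taxid in taxids)
    return ["blastdbcmd", "-db", db, "-taxids", taxid_arg, "-out", out]


# 9606 is human and 10090 is mouse; the database and file names are placeholders.
search_cmd = blastp_taxid_command("query.fasta", "my_protein_db", [9606, 10090], "hits.txt")
fetch_cmd = blastdbcmd_taxid_command("my_protein_db", [9606], "human.fasta")

# Print the commands for review; pass the lists to subprocess.run to execute them.
print(shlex.join(search_cmd))
print(shlex.join(fetch_cmd))
```

Building the arguments as a list rather than a single string avoids shell quoting problems if a file or database path contains spaces.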
There are some additional enhancements to the search program options.
- A new option (-subject_besthit) culls HSPs on a per subject sequence basis by removing HSPs that are completely enveloped by another HSP. This is an experimental option and is subject to change.
- Use of the -max_target_seqs option for formats 0-4 is now allowed. The number of alignments and descriptions will be set to the max_target_seqs.
- BLAST now issues a warning about the possibility of not seeing all equivalent matches if -max_target_seqs is set to less than five.
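A wrapper script can mirror the options described above before it launches a search. The Python sketch below builds the extra arguments and raises its own warning when `-max_target_seqs` is below five, echoing the warning BLAST itself now issues. It assumes `-subject_besthit` is a flag that takes no value. The function name `search_options` is hypothetical.

```python
import warnings


def search_options(max_target_seqs=None, subject_besthit=False):
    """Return extra BLAST+ arguments, warning when max_target_seqs is below five."""
    args = []
    if max_target_seqs is not None:
        if max_target_seqs < 1:
            raise ValueError("max_target_seqs must be at least 1")
        if max_target_seqs < 5:
            warnings.warn(
                "max_target_seqs below 5 may hide equivalent matches",
                stacklevel=2,
            )
        args += ["-max_target_seqs", str(max_target_seqs)]
    if subject_besthit:
        # Experimental option per the release notes; assumed here to take no value.
        args.append("-subject_besthit")
    return args


print(search_options(max_target_seqs=10, subject_besthit=True))
```

Because `-subject_besthit` is described as experimental and subject to change, keeping it behind an explicit keyword argument makes it easy to drop later.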