We have made some recent improvements to the BLAST+ applications that take full advantage of the version 5 BLAST databases (BLASTDBv5), which include built-in taxonomic information for sequences and no longer rely on the integer sequence identifiers (gi numbers).
NCBI Labs is showcasing an experiment to improve the BLAST results page. The goal is to provide a more useful BLAST output that better meets your needs and integrates with your workflows. The new results incorporate feedback from surveys and interviews with BLAST users. We think you’ll find the new results are more compact, easier to navigate, and expose useful formatting and other features that you may not have known about.
The results page has organism, percent identity, and E value filters in plain view and easily accessible. The Descriptions and Graphic Summary are on separate tabs, and the popular taxonomy view is on a fourth tab rather than on a separate web page. These changes along with other enhancements make the display more concise and easier to navigate. The figure below shows the new output format.
Figure 1. The New BLAST Results with filters directly on the page and a more concise tabbed output that includes the taxonomy report. The Back to Traditional Results Page link re-loads the results in the standard format.
The BLAST+ 2.9.0 release is now available from our FTP site. This latest release has enhanced support for the new BLAST database version (BLASTDBv5).
The 2.9.0 programs handle the new four-character identifiers for chains of 3D structure records from the RCSB Protein Data Bank (PDB). Previous versions of the BLAST databases and programs do not support these identifiers. See the MMDB News for additional details about the PDB change and the impact on NCBI Structure resources.
Another important improvement in 2.9.0 is the ability to configure the output separator for tabular and CSV output formats. See the BLAST Manual for details.
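As a rough illustration of the separator option, the Python sketch below builds an -outfmt value with a custom delimiter and then parses output that uses it. The delim= keyword is an assumption about how the separator is written, and the query file, database name, field selection, and sample hit line are placeholders made up for this example. The BLAST Manual remains the authoritative source for the syntax.

```python
import csv
import io
import shlex

# Field specifiers to request; this particular selection is only an example.
FIELDS = ["qaccver", "saccver", "pident", "evalue"]


def outfmt_with_delimiter(fields, delimiter, fmt=6):
    """Return an -outfmt value such as '6 delim=; qaccver saccver'.

    Assumption: the separator is given with a `delim=` token placed
    right after the format number (6 = tabular, 10 = CSV).
    """
    return f"{fmt} delim={delimiter} " + " ".join(fields)


def parse_hits(text, fields, delimiter):
    """Parse delimited BLAST output into one dict per hit."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=fields,
                            delimiter=delimiter)
    return list(reader)


# Placeholder query file and database name.
command = ["blastn", "-query", "query.fa", "-db", "my_db_v5",
           "-outfmt", outfmt_with_delimiter(FIELDS, ";")]
print(shlex.join(command))

# A made-up hit line in the requested layout, not real BLAST output.
sample = "query1;ABC12345.1;98.50;1e-50\n"
hits = parse_hits(sample, FIELDS, ";")
print(hits[0]["saccver"], float(hits[0]["pident"]))
```

Choosing a separator that cannot occur in accessions or titles keeps downstream parsing simple, which is the main reason to change it from the default.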
More improvements and a few bug fixes with this release are detailed in the release notes.
For more information on the new database version, BLASTDBv5 (download), see the previous NCBI Insights article and the recording of our webinar. We will continue to update the BLAST databases in their current version (BLASTDBv4) until September 2019.
IgBLAST is a popular NCBI package for classifying and analyzing immunoglobulin and T cell receptor variable domain sequences. We’ve released a new version of IgBLAST with three improvements:
The new release determines the V gene reading frame from the end of the FWR3 region instead of the end of the V gene. This helps identify the correct reading frames for rearrangements that have insertions or deletions near the V gene end.
The allowed distance between the V gene end and the J gene start has been increased to 225 bp to allow detection of ultra-long D/N regions.
The standalone program and files have been repackaged to make them easier to install.
The new release is available from the BLAST FTP area, along with a new manual on GitHub.
BLAST+ 2.8.1 is now available for download from our FTP site. This is the first production release of standalone BLAST to support the new BLAST v5 databases (BLASTDBv5), which are also now available. The new databases have taxonomy information for the database sequences built in. This gives you the following important advantages over the v4 databases:
The ability to limit your search by taxonomic group — species level as well as higher taxa.
Improved performance when limiting BLAST search with accessions.
Retrieval of sequences by taxonomic group from a BLAST database with blastdbcmd.
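To make these advantages concrete, here is a minimal Python sketch that assembles command lines for a taxonomy-limited search and for taxonomy-based retrieval with blastdbcmd. It assumes the -taxids and -negative_taxids options of the search programs and the -taxids option of blastdbcmd. The helper names, the query file, and the database name are hypothetical, and the commands are only printed, not run.

```python
import shlex


def join_taxids(taxids):
    """Format NCBI taxonomy IDs as the comma-separated string BLAST expects."""
    return ",".join(str(int(t)) for t in taxids)


def taxid_limited_search(program, query, db, taxids, exclude=False):
    """Build a search command restricted to (or excluding) the given taxids.

    Assumption: -taxids keeps only the listed taxa and -negative_taxids
    removes them; both need a v5 database.
    """
    flag = "-negative_taxids" if exclude else "-taxids"
    return [program, "-query", query, "-db", db, flag, join_taxids(taxids)]


def sequences_for_taxids(db, taxids):
    """Build a blastdbcmd command that retrieves sequences by taxid."""
    return ["blastdbcmd", "-db", db, "-taxids", join_taxids(taxids)]


# 9606 is human and 10090 is mouse; file and database names are placeholders.
search = taxid_limited_search("blastn", "query.fa", "my_db_v5", [9606, 10090])
dump = sequences_for_taxids("my_db_v5", [9606])

print(shlex.join(search))
print(shlex.join(dump))
```

Building the argument list separately from running it makes the command easy to inspect or log before handing it to something like subprocess.run.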
There are some additional enhancements to the search program options.
A new option (-subject_besthit) culls HSPs on a per subject sequence basis by removing HSPs that are completely enveloped by another HSP. This is an experimental option and is subject to change.
Use of the -max_target_seqs option for formats 0-4 is now allowed. The number of alignments and descriptions will be set to the value of -max_target_seqs.
BLAST now issues a warning about the possibility of not seeing all equivalent matches if -max_target_seqs is set to less than five.
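The options above can be combined as in the following Python sketch, which assembles a command line and mirrors the low -max_target_seqs warning on the Python side. The helper name, query file, and database name are hypothetical, and the warning text is our own wording, not the message BLAST prints. Because -subject_besthit is described as experimental, its behavior may differ in later releases.

```python
import shlex
import warnings


def search_args(program, query, db, max_target_seqs=None,
                subject_besthit=False, outfmt=None):
    """Assemble a BLAST+ command line using the options described above."""
    args = [program, "-query", query, "-db", db]
    if outfmt is not None:
        args += ["-outfmt", str(outfmt)]
    if max_target_seqs is not None:
        if max_target_seqs < 1:
            raise ValueError("max_target_seqs must be a positive integer")
        if max_target_seqs < 5:
            # Echo the condition BLAST itself warns about (wording is ours).
            warnings.warn("max_target_seqs below 5 may hide equivalent matches")
        args += ["-max_target_seqs", str(max_target_seqs)]
    if subject_besthit:
        # A switch with no value; experimental according to the release notes.
        args.append("-subject_besthit")
    return args


# Pairwise output (format 0) with a cap on the number of target sequences.
command = search_args("blastp", "query.fa", "my_db_v5",
                      max_target_seqs=10, subject_besthit=True, outfmt=0)
print(shlex.join(command))
```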
Going to the ASCB | EMBO meeting? Stop by the NCBI booth (#327) to learn about all that NCBI has to offer, ask questions, and provide feedback on how we can better meet your needs for research and teaching.
Booth #327, Exhibit Hall:
Sunday, December 9, 9:30 AM – 4:00 PM
Monday, December 10, 9:30 AM – 4:00 PM
Tuesday, December 11, 9:30 AM – 4:00 PM
Visit the booth anytime during exhibit hours to discuss any topic or just to say hello. We’re also offering specific times at the booth for focused conversations about using specific sets of NCBI resources in your research and teaching.
12:30 PM NCBI BLAST in research and teaching
12:30 PM Jupyter notebooks to teach scripting and NCBI resources
12:30 PM EDirect for command-line access to NCBI databases
2:00 PM Jupyter notebooks to teach scripting and NCBI resources
To stay up-to-date about NCBI at ASCB or in general, follow us on Twitter at @NCBI.
Next Wednesday, October 3, 2018, the lead of the NCBI BLAST group will show you how to be more effective with NCBI’s standalone BLAST applications. You will learn how to optimize database selection and output formats, work with taxonomy information, and use our next-gen alignment program, Magic-BLAST. You can also use many of these strategies to improve your web BLAST searches.
Date and time: Wed, Oct 3, 2018 12:00 PM – 12:45 PM EDT
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
What is Magic-BLAST and why are we excited about it?
Magic-BLAST is a BLAST tool, but it’s unlike any other.
It aligns next-generation sequencing reads, both DNA and RNA-seq. It implements the aligner algorithm from MAGIC, a trusted pipeline, but uses the well-tested and supported BLAST infrastructure. We think it’s like putting two great things together, like having your favorite ice cream in your morning coffee.
We’re so excited about it that we even wrote an article that compares Magic-BLAST to a few other aligners on several data sets.
If you look at the figures in our article, we think you’ll see that Magic-BLAST excels at finding introns and processing ultra-long sequences. It can also handle high levels of mismatches as well as compositionally biased DNA. Finally, you’ll see that Magic-BLAST works in a lot of relevant situations in which current aligners won’t. If our results got your attention, here is our documentation, which includes a cookbook with a few examples.
Update: NCBI is now in the process of merging EST and GSS records into the Nucleotide database, and we expect to complete this process in early 2019. Accession.version and GI identifiers will not change during this process.
As of December 1, 2018, all records from the databases for Expressed Sequence Tags (EST) and Genome Survey Sequences (GSS) will reside in NCBI’s Nucleotide database. This change will provide a single point of access for all GenBank sequence data with a common look and feel.
Read more to learn about how this change affects these resources: