BLAST (Basic Local Alignment Search Tool) is a popular tool for finding sequences in a given database that are similar to a query sequence. Traditionally, BLAST displays these results as a sorted list of matches between the query and each database sequence. While this display is useful for examining how each subject sequence matches the query, it treats all subject sequences the same, regardless of the quality of the sequence data or its annotation, and also does not allow easy comparisons between different subject sequences.
For example, the subject sequences may fall into multiple groups of similar sequences, or all of the subject sequences may be more similar to each other than to the query. A common way to obtain this information is to construct a multiple sequence alignment of the query and some or all of the subject sequences, but until now BLAST has not provided such alignments directly.
Enter SmartBLAST! SmartBLAST is a new and experimental NCBI tool that makes it easier to complete common sequence analysis tasks, such as finding a candidate protein name for a sequence, locating regions of high sequence conservation, or identifying regions covered by database sequences but missing from the query.
To do this, SmartBLAST performs the following tasks in much less time than it takes to run a typical BLASTp search:
- a BLASTp comparison of the query with the closest matching sequences available;
- a parallel BLASTp search to find the closest matches to high quality sequences from model organisms;
- a multiple alignment between the query and five of the closest matching sequences (usually including two high quality sequences);
- an analysis that produces a phylogenetic tree from the multiple sequence alignment.
SmartBLAST then presents the results of the above tasks in a graphical view similar to that of a regular BLASTp search.
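To make the third step more concrete, here is a minimal sketch of how five sequences might be chosen for the multiple alignment from two ranked hit lists. This is not SmartBLAST's actual code: the `Hit` class, the `pick_alignment_set` function, the idea of reserving up to two slots for high quality ("landmark") sequences, and the accessions and scores in the example are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    accession: str      # hypothetical identifier, not a real record
    bit_score: float    # higher means a closer match to the query
    is_landmark: bool   # True if the hit comes from the high quality set


def pick_alignment_set(hits, total=5, landmark_slots=2):
    """Pick `total` hits by score, reserving up to `landmark_slots`
    places for the best-scoring high quality hits (an assumed policy)."""
    ranked = sorted(hits, key=lambda h: h.bit_score, reverse=True)
    # Best high quality hits first, capped at the reserved number of slots.
    landmarks = [h for h in ranked if h.is_landmark][:landmark_slots]
    chosen = {h.accession for h in landmarks}
    # Fill the remaining places with the best of everything else.
    remaining = total - len(landmarks)
    others = [h for h in ranked if h.accession not in chosen][:remaining]
    return sorted(landmarks + others, key=lambda h: h.bit_score, reverse=True)


# Made-up example data: five close matches plus three landmark hits.
example_hits = [
    Hit("A", 900.0, False), Hit("B", 850.0, False), Hit("C", 800.0, False),
    Hit("D", 750.0, False), Hit("E", 700.0, False),
    Hit("L1", 600.0, True), Hit("L2", 500.0, True), Hit("L3", 400.0, True),
]
print([h.accession for h in pick_alignment_set(example_hits)])
```

If there are fewer than two high quality hits, the sketch simply fills the free places with the next closest matches, which is one way to read "usually including two high quality sequences."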
Figure 1 presents a sample SmartBLAST search with a predicted protein (XP_006115387.1) as the query. The display includes a phylogenetic tree in which every leaf-node is labeled with either the common name of the organism (if available) or the scientific name. This tree displays a wide taxonomic diversity that includes reptiles, mammals, and birds. The display also clearly indicates that the query contains N-terminal sequence that is not found in the matching sequences, and C-terminal sequence that BLAST could not align in the top three matches.
SmartBLAST can be accessed at http://blast.ncbi.nlm.nih.gov/smartblast/. It is still in an experimental phase and will change with little or no notice.
Please give it a try and let us know what you think by commenting on this post!
8 thoughts on “SmartBLAST: Faster BLASTp search results in a graphical view”
Looks really nice. Please provide access to SmartBlast via the URL API.
I like it! It was lightning fast, and gives a lot of useful information that one usually wants when dealing with a new sequence. I can see this being my new ‘go to’ protein blast tool to use as a jumping-off point for new proteins. I must say that it did over-emphasize some poor hits in the example I tried; I’m not sure why.
I found SmartBLAST very useful. How should I cite it if I am going to use it in a publication?
Here is the NLM style citation for SmartBLAST:
SmartBLAST [Internet]. Bethesda (MD): National Library of Medicine (US), National Center for Biotechnology Information; 2016 – [cited 2016 Jul 12]. Available from: http://blast.ncbi.nlm.nih.gov/blast/smartblast/
If you’re talking about SmartBLAST as a resource and need to cite a paper about it, it is mentioned in this year’s NAR Database Issue, in a paper called “Database resources of the National Center for Biotechnology Information”: https://www.ncbi.nlm.nih.gov/pubmed/26615191
Hi, SmartBlast is great so far. Now I would love to get SmartBLAST results for more than one query, or at least to be able to select which landmark database sequences I want to use.