BLAST (Basic Local Alignment Search Tool) is a popular tool for finding sequences in a given database that are similar to a query sequence. Traditionally, BLAST displays these results as a sorted list of matches between the query and each database sequence. While this display is useful for examining how each subject sequence matches the query, it treats all subject sequences the same, regardless of the quality of the sequence data or its annotation, and it does not allow easy comparisons between different subject sequences.
For example, the subject sequences may fall into multiple groups of similar sequences, or all of the subject sequences may be more similar to each other than to the query. A common way to obtain this information is to construct a multiple sequence alignment of the query and some or all of the subject sequences, but until now BLAST has not provided such alignments directly.
Enter SmartBLAST! SmartBLAST is a new and experimental NCBI tool that makes it easier to complete common sequence analysis tasks, such as finding a candidate protein name for a sequence, locating regions of high sequence conservation, or identifying regions covered by database sequences but missing from the query.
To do this, SmartBLAST performs the following tasks in much less time than it takes to run a typical BLASTp search:
- a BLASTp comparison of the query with the closest matching sequences available;
- a parallel BLASTp search to find the closest matches to high-quality sequences from model organisms;
- a multiple alignment between the query and five of the closest matching sequences (usually including two high-quality sequences);
- an analysis that produces a phylogenetic tree from the multiple sequence alignment.
SmartBLAST then presents the results of the above tasks in a graphical view similar to that of a regular BLASTp search.
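To make the selection step in the list above concrete, here is a minimal Python sketch of how five sequences might be chosen so that two of them come from a high-quality set. It is an illustration only, not SmartBLAST's actual method: the `Hit` class, the `pick_for_alignment` function, the use of bit score for ranking, and the rule of reserving two slots are all assumptions, and the accessions are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    accession: str    # placeholder identifier, not a real database record
    bit_score: float  # higher means a closer match to the query
    curated: bool     # True if the hit is in the assumed high-quality set


def pick_for_alignment(hits, total=5, curated_slots=2):
    """Pick `total` hits, reserving up to `curated_slots` for curated hits.

    Assumed rule for illustration; SmartBLAST's real selection logic
    is not described in this post.
    """
    ranked = sorted(hits, key=lambda h: h.bit_score, reverse=True)
    # Best-scoring curated hits fill the reserved slots first.
    curated = [h for h in ranked if h.curated][:curated_slots]
    reserved = set(curated)
    # The remaining slots go to the best-scoring hits not already chosen.
    others = [h for h in ranked if h not in reserved][: total - len(curated)]
    return sorted(curated + others, key=lambda h: h.bit_score, reverse=True)


# Hypothetical hits with made-up scores.
example_hits = [
    Hit("A", 900, False),
    Hit("B", 850, False),
    Hit("C", 800, False),
    Hit("D", 780, False),
    Hit("E", 700, True),
    Hit("F", 650, False),
    Hit("G", 600, True),
]
selected = [h.accession for h in pick_for_alignment(example_hits)]
print(selected)
```

In this made-up example, the two curated hits (E and G) are kept even though D and F score higher than G, which mirrors the idea of anchoring the alignment with well-annotated sequences.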
Figure 1 presents a sample SmartBLAST search with a predicted protein (XP_006115387.1) as the query. The display includes a phylogenetic tree in which every leaf node is labeled with the common name of the organism if one is available, or otherwise with the scientific name. The tree shows wide taxonomic diversity, including reptiles, mammals, and birds. The display also clearly indicates that the query contains N-terminal sequence that is not found in the matching sequences, and C-terminal sequence that BLAST could not align in the top three matches.
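The idea of spotting query regions that no match covers can also be checked programmatically. The following Python sketch assumes you already have the aligned query ranges for a set of matches as 1-based inclusive coordinates; the function name `uncovered_regions` and the numbers in the example are hypothetical and are not taken from Figure 1.

```python
def uncovered_regions(query_length, aligned_ranges):
    """Return 1-based inclusive query ranges not covered by any aligned range."""
    gaps = []
    next_uncovered = 1  # first query position not yet known to be covered
    for start, end in sorted(aligned_ranges):
        if start > next_uncovered:
            # Positions before this alignment starts are uncovered.
            gaps.append((next_uncovered, start - 1))
        next_uncovered = max(next_uncovered, end + 1)
    if next_uncovered <= query_length:
        # Anything after the last alignment is an uncovered C-terminal tail.
        gaps.append((next_uncovered, query_length))
    return gaps


# Made-up example: a 500-residue query with two overlapping alignments.
example_gaps = uncovered_regions(500, [(41, 300), (60, 450)])
print(example_gaps)
```

Here the first gap corresponds to an unmatched N-terminal stretch and the second to an unaligned C-terminal stretch, the same two kinds of region the graphical view highlights.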
SmartBLAST can be accessed at http://blast.ncbi.nlm.nih.gov/smartblast/. It is still in an experimental phase and will change with little or no notice.
Please give it a try and let us know what you think by commenting on this post!