Introducing the new Virus Sequence Search Interface
BLAST is a powerful search tool, but often a search is just the beginning of the journey. We put ourselves in the shoes of a researcher who has just sequenced a handful of samples from the latest viral outbreak and tried to understand what information would be most useful. We also reached out to researchers in the field and asked: a) what questions do they really want to answer? and b) how can NCBI best provide the answers? Based on insights from those questions and answers, we developed the new Virus Sequence Search Interface (Fig. 1). The Search Interface is an NCBI Labs project, which means it is an experimental project, and we may modify the resource based on your feedback and experiences.
This tool provides rapid insight into query sequences by presenting Blastn and Blastp results alongside normalized metadata, when available. The metadata include isolation source, host, country, and date, as well as genetic attributes such as completeness and, when applicable, segment or protein names. The normalized metadata is generated by an internal, curator-guided data-processing pipeline that maps sequence-record attributes to standardized vocabularies, giving a consistent, user-friendly view of the data.
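To illustrate what mapping record attributes to a standardized vocabulary can look like, here is a minimal Python sketch. It is not the internal NCBI pipeline: the vocabularies, field names, and the functions `normalize_value` and `normalize_record` are all hypothetical and chosen only for illustration.

```python
# Hypothetical controlled vocabularies (assumptions for illustration only;
# the real pipeline's terms and curation rules are not described here).
HOST_VOCAB = {
    "human": "Homo sapiens",
    "homo sapiens": "Homo sapiens",
    "h. sapiens": "Homo sapiens",
    "chicken": "Gallus gallus",
    "gallus gallus": "Gallus gallus",
}
COUNTRY_VOCAB = {
    "usa": "USA",
    "united states": "USA",
    "u.s.a.": "USA",
    "viet nam": "Vietnam",
    "vietnam": "Vietnam",
}


def normalize_value(raw, vocab):
    """Return the standardized term for a raw value, or None if unmapped."""
    if raw is None:
        return None
    # Lowercase and collapse whitespace so spelling variants share one key.
    key = " ".join(raw.strip().lower().split())
    return vocab.get(key)


def normalize_record(record):
    """Map the raw host and country fields of one record to standard terms."""
    country_raw = record.get("country")
    if country_raw is not None:
        # Assumes a "Country: region" style value; keep only the country part.
        country_raw = country_raw.split(":", 1)[0]
    return {
        "accession": record.get("accession"),
        "host": normalize_value(record.get("host"), HOST_VOCAB),
        "country": normalize_value(country_raw, COUNTRY_VOCAB),
    }
```

In this sketch, values that have no entry in the vocabulary come back as `None`, which corresponds to metadata being shown only "when available" rather than guessed.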
The interface currently supports BLAST searches for influenza viruses, rotavirus A, dengue viruses, West Nile virus, Zika virus, ebolaviruses, and MERS coronavirus sequences.
Select the type of BLAST search you want to perform: the Nucleotide tab for Blastn or the Protein tab for Blastp.
Enter a single query sequence in the search box (the interface currently supports only one query per search). Accepted formats are accession number, FASTA, and bare sequence. You can also use the example included in the interface.
Select the virus of interest from the dropdown menu.
The results of the search will appear below the search box (Fig. 2).
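If you prepare queries in a script before pasting them into the search box, it can help to check which of the three accepted formats a query falls into. The Python sketch below is one rough way to do that. The function `classify_query` is hypothetical, and the accession pattern is a deliberately loose simplification, not the interface's actual validation logic.

```python
import re

# Loose, simplified accession shape (an assumption): 1-3 letters, an optional
# underscore, 5-9 digits, and an optional ".version" suffix.
ACCESSION_RE = re.compile(r"^[A-Z]{1,3}_?\d{5,9}(?:\.\d+)?$", re.IGNORECASE)

# Letters plus "*" and "-" are treated as sequence characters in this sketch.
SEQUENCE_RE = re.compile(r"^[A-Za-z*\-]+$")


def classify_query(text):
    """Classify a query as 'fasta', 'accession', or 'bare_sequence'."""
    text = text.strip()
    if not text:
        raise ValueError("empty query")
    if text.startswith(">"):
        # A FASTA record begins with a ">" definition line.
        return "fasta"
    if ACCESSION_RE.match(text):
        return "accession"
    # Remove whitespace and line breaks before checking residue characters.
    compact = "".join(text.split())
    if SEQUENCE_RE.match(compact):
        return "bare_sequence"
    raise ValueError("query is not an accession, FASTA, or bare sequence")
```

For example, `classify_query(">seq1\nACGT")` returns `"fasta"`, while a string of residues split over several lines is classified as `"bare_sequence"`.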
The results table can be customized by adding or removing columns from the “Select Columns” menu (Fig. 3).
Not only are BLAST results presented alongside normalized metadata, but the results can also be refined by filtering on these metadata terms (Fig. 4).
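Conceptually, filtering on normalized metadata amounts to keeping only the hits whose fields match the chosen terms. The Python sketch below shows that idea on a few made-up hit records. The field names, values, and the function `filter_hits` are hypothetical and do not reflect the interface's implementation or any real search results.

```python
def filter_hits(hits, **criteria):
    """Keep only the hits whose metadata equals every given criterion."""
    return [
        hit
        for hit in hits
        if all(hit.get(field) == wanted for field, wanted in criteria.items())
    ]


# Made-up hit records for illustration; not real search output.
example_hits = [
    {"accession": "HIT_A", "host": "Homo sapiens", "country": "USA"},
    {"accession": "HIT_B", "host": "Gallus gallus", "country": "USA"},
    {"accession": "HIT_C", "host": "Homo sapiens", "country": "Vietnam"},
]

# Combine two filters: human host and a single country.
human_usa = filter_hits(example_hits, host="Homo sapiens", country="USA")
```

Because the metadata values are standardized, an exact match is enough here; with free-text values, spelling variants of the same host or country would slip past a filter like this.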
Additionally, to help you quickly place your sequence of interest in a biological context, the results can be viewed as a phylogenetic tree or as a multiple sequence alignment (Figs. 5, 6, and 7).
We invite you to try out this tool and send us your feedback!