Visualize and Interpret Alignment Data with the Multiple Sequence Alignment Viewer

The NCBI Multiple Sequence Alignment Viewer (MSAV) is a versatile web application that helps you visualize and interpret multiple sequence alignments (MSAs) of both nucleotide and amino acid sequences. You can display alignment data from many sources, and you can embed the viewer in your own web pages with customizable options. An even simpler option is to go to our MSAV page, upload your data, and share the link to a fully functional viewer displaying your results.

Display a variety of data sources

Here are the current data sources that you can upload to the viewer:

  • MUSCLE output, including FASTA text and ClustalW
  • NCBI BLAST request IDs (RIDs), including COBALT RIDs
  • URLs pointing to a data file
  • [Text annotation – track]
  • NCBI ASN.1
  • NCBI Genome Workbench project files
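
For FASTA text in particular, it can help to confirm that the file really is an alignment before you upload it: every row of an aligned FASTA file should have the same length once gaps are counted. The sketch below is only an illustration of that check. The function names and the toy sequences are made up for this example, and the code is not part of MSAV or any NCBI API.

```python
def parse_aligned_fasta(text):
    """Parse FASTA text into an ordered list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_alignment(records):
    """Return the shared alignment width, or raise if rows differ in length."""
    if not records:
        raise ValueError("no sequences found")
    widths = {len(seq) for _, seq in records}
    if len(widths) != 1:
        raise ValueError(f"rows have different lengths: {sorted(widths)}")
    return widths.pop()


# Toy aligned FASTA (hypothetical sequences; '-' marks a gap).
example = """>seq1
MKT-AYIAK
>seq2
MKTQAYI-K
>seq3
MK--AYIAK
"""

records = parse_aligned_fasta(example)
width = check_alignment(records)
print(f"{len(records)} sequences, alignment width {width}")
```

If the check raises an error, the file most likely holds unaligned sequences and needs to go through an alignment program first.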

Try it out

The viewer serves a wide variety of uses, from exploring protein families to overlapping short sequence reads to tracking bacterial or viral strains. We've placed several example alignments with links to the viewer on NCBI's MSAV page. The examples include both protein and nucleotide alignments, and the page also offers the API's upload function so you can experiment with your own data.

Two such examples are below. In Figures 1a and 1b, you see a protein MSA of carbohydrate kinases, primarily ribulokinases, from a broad taxonomic range – bacteria to human.

Figure 1a covers the full extent of the master sequence, the top entry, and points out how insertions are presented in the viewer at this level.

Figure 1a. Zoomed out view of ribulokinase proteins. (You can click on this picture and the others in this post to view them at full size.)

Figure 1b is zoomed to the sequence level. It shows an expanded row that reveals labels for the ribulokinase conserved domains and small-scale features such as active site residues, and it illustrates how insertions are presented at the sequence level. Mousing over an inserted residue in the viewer provides more information.

Figure 1b. Zoomed in view of Figure 1a.

Figures 2a and 2b show an alignment of polymerase PB2 proteins from avian influenza A isolates, focusing on the E627K variant known to affect pathogenicity in mammals.

Figure 2a illustrates the Rasmol amino acid coloring scheme.

Figure 2a. Rasmol amino acid coloring of aligned polymerase PB2 proteins.

Figure 2b shows the Frequency-based differences coloring scheme.

Figure 2b. Frequency-based differences coloring of aligned polymerase PB2 proteins.
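
To give a rough feel for what a frequency-based view highlights, the sketch below finds, for each alignment column, the most common residue and then lists the rows that differ from it. This is only a simplified illustration of the general idea. The viewer's actual coloring algorithm is not described in this post and may work differently, and the function name and the short toy rows are invented for the example (they are not real PB2 sequences).

```python
from collections import Counter


def column_differences(rows):
    """For each column, report the most common residue and the rows that differ.

    Returns a list of (column_index, most_common_residue, differing) tuples,
    where differing is a list of (row_index, residue). Gaps ('-') are ignored,
    and columns where every residue agrees are skipped.
    """
    if not rows:
        return []
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")

    result = []
    for col in range(width):
        counts = Counter(row[col] for row in rows if row[col] != "-")
        if not counts:
            continue  # column contains only gaps
        common, _ = counts.most_common(1)[0]
        differing = [
            (i, row[col])
            for i, row in enumerate(rows)
            if row[col] not in ("-", common)
        ]
        if differing:
            result.append((col, common, differing))
    return result


# Toy alignment (hypothetical): one row carries K where the others carry E.
rows = ["QPEAL", "QPEAL", "QPKAL", "QPEAL"]
diffs = column_differences(rows)
for col, common, differing in diffs:
    print(f"column {col}: most common {common}, differs in {differing}")
```

In an alignment like the one in Figure 2b, a substitution such as E627K would stand out in the same way: one column where a minority of rows carry a residue different from the majority.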

Learn more about coloring schemes, navigation, and other MSAV functions in the Getting Started tutorial and a short introductory video. We welcome your feedback on the MSA Viewer; see the link in the upper right of the images above.
