NCBI’s Genome Data Viewer (GDV) now supports visualization and analysis of nearly 400 submitter-annotated chromosome-level assemblies from the INSDC (GenBank/ENA/DDBJ). These submitter-annotated assemblies join more than 1,200 NCBI RefSeq-annotated assemblies available in GDV for hundreds of eukaryotes, spanning fungi, plants, fish, insects, and all major model organisms.
Figure 1. Submitter-annotated Malus domestica (apple) assembly displayed in GDV. GDV displays the submitter-provided gene annotation along with additional tracks, including interspersed repeats identified by RepeatMasker and six-frame translations (not shown). Red boxes indicate useful tools and panels, including a search box, an exon navigator, and interfaces to add user data and conduct NCBI BLAST searches.
Analyzing INSDC assemblies in GDV
GDV provides a number of tools for genome navigation. You can use the Search box to go to a coordinate on a chromosome, or to search for and navigate to locus_tags or accession identifiers in the submitter-annotated gene track. Switch chromosomes using the ideogram or drop-down selectors. You can also browse through genes and their exons in the annotation track using the exon navigator (Figure 1).
In addition to submitter-provided INSDC gene annotation tracks, you may find NCBI-supplied tracks that can aid in analysis of these assemblies in the browser. Submitter-annotated assemblies in GDV may display assembly scaffolds, six-frame translation, and repeat annotation (Figure 1). You can conduct BLAST searches of the genome assembly from within GDV and add the BLAST alignments to the view. You can also add your own data, either as uploads or as remote files. GDV supports track data in a variety of common bioinformatics formats, including GFF, BAM, BED/BigBED, Wig/BigWig, and VCF.
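If your own features are stored as 1-based, inclusive coordinates (as in GFF), they need to be shifted before they can be written as BED, which uses a 0-based start and an exclusive end. The following is a minimal sketch of that conversion for preparing a small custom track file. The function names, the sequence identifier, and the feature names are hypothetical placeholders, and the sketch assumes the identifier in the first column matches one the assembly in GDV recognizes.

```python
def to_bed_line(chrom, start_1based, end_1based, name):
    """Turn a 1-based inclusive interval into one tab-separated BED line.

    BED uses a 0-based start and an exclusive end, so only the start shifts.
    """
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("expected 1 <= start <= end")
    return f"{chrom}\t{start_1based - 1}\t{end_1based}\t{name}"


def write_bed(path, features):
    """Write (chrom, start, end, name) tuples to a BED file, one per line."""
    with open(path, "w") as handle:
        for chrom, start, end, name in features:
            handle.write(to_bed_line(chrom, start, end, name) + "\n")


# Hypothetical example features; replace with identifiers from your assembly.
example_features = [
    ("chr1", 101, 200, "my_feature_1"),
    ("chr1", 501, 650, "my_feature_2"),
]
```

Calling `write_bed("my_track.bed", example_features)` would produce a two-line file that could then be added to the browser through the user data interface shown in Figure 1.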
How to find and display an INSDC assembly in GDV
There are several ways to find and display an available INSDC assembly in GDV. These include the GDV home page search, the NCBI ‘All Databases’ search, and a direct search of the Assembly database.
Here are a few suggestions:
1. Start on the GDV home page, and search or browse for your organism of interest. You can use the table view (Figure 2) to see submitter-annotated assemblies, which have the assembly accession prefix ‘GCA_’. RefSeq assemblies have the prefix ‘GCF_’. If you prefer not to see submitter-annotated assemblies in this table, you can filter them out by selecting the ‘RefSeq Only’ option.
Figure 2. The table view from the GDV search page with a search for Saccharomyces. You can switch between the table view and the dendrogram view on the page using the ‘switch view’ button at the upper left. The submitted assemblies are labeled as GenBank and have ‘GCA_’ prefixed accessions. You can filter the table to show only RefSeq assemblies using the ‘Filter assemblies’ feature at the top of the table.
2. If you already know the GenBank accession number (prefix GCA_) of your assembly of interest, you can search ‘All Databases’ on the NCBI home page and link to GDV from the search results box (Figure 3).
Figure 3. The panel that appears when you search ‘All Databases’ with an assembly accession number. The panel provides a link to ‘see a graphical view in Genome Data Viewer’ that loads the assembly into GDV.
3. You can also search NCBI’s Assembly resource directly using a species name or GenBank accession number, and link to GDV from the sidebar of the assembly details page (Figure 4).
Figure 4. The assembly record for the Saccharomyces assembly GCA_001413975 retrieved by a direct search in the Assembly resource with the accession. You can also easily find submitted and RefSeq assemblies by searching with a taxon name. The ‘Genome Data Viewer’ link on the right-hand side loads the assembly into GDV.
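If you work with lists of accessions, the prefix convention described above (‘GCA_’ for GenBank, ‘GCF_’ for RefSeq) is easy to check in a script before you search. The sketch below is illustrative only: the function names are hypothetical, the pattern assumes the usual form of a prefix, nine digits, and an optional version suffix, and the URL builder assumes you want to query the Assembly database through NCBI's E-utilities `esearch` endpoint rather than through the web pages described here. It builds the URL without sending a request.

```python
import re
from urllib.parse import urlencode

# Assumed accession shape: GCA_ or GCF_, nine digits, optional ".version".
ACCESSION_RE = re.compile(r"^(GCA|GCF)_(\d{9})(\.\d+)?$")

# E-utilities search endpoint; db=assembly targets the Assembly database.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def classify_assembly(accession):
    """Return 'GenBank', 'RefSeq', or 'unknown' based on the accession prefix."""
    match = ACCESSION_RE.match(accession.strip())
    if match is None:
        return "unknown"
    return "GenBank" if match.group(1) == "GCA" else "RefSeq"


def assembly_search_url(term):
    """Build (but do not send) an esearch URL for an accession or species name."""
    query = urlencode({"db": "assembly", "term": term, "retmode": "json"})
    return f"{ESEARCH}?{query}"


print(classify_assembly("GCA_001413975"))
print(assembly_search_url("GCA_001413975"))
```

The accession used here is the Saccharomyces assembly from Figure 4. The search results identify matching Assembly records, from which you can follow the ‘Genome Data Viewer’ link as described in step 3.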
Please contact us if you’d like information on how to make sure your favorite annotated eukaryotic assemblies are available for viewing at NCBI. You can reach us via the green Feedback button on the GDV pages, or by writing to the NCBI Help Desk.