If you’ve ever tried searching for a genomic location in NCBI’s Genome Data Viewer (GDV) or Variation Viewer and found that your search term didn’t work, it’s time to try again! We recently expanded support for searches in our genome browsers using non-NCBI identifiers such as HGVS patterns (e.g. NM_001318787.2:c.2258G>A) and Ensembl IDs. You can also search by chromosome coordinates, cytogenetic band, assembly scaffold/component, disease/phenotype, dbSNP identifier, or RefSeq transcript/protein accession. We’ve gathered example searches in the table below.
When you search by a single coordinate, a SNP or dbVar ID, or an HGVS expression, the browser view zooms to the location of the search result. A marker is automatically created to identify the searched position. For HGVS, the marker is labelled with the corresponding rsID, if there is one.
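To make the different kinds of search terms concrete, here is a minimal Python sketch that guesses the type of a term from its shape. The regular expressions and the function name `classify_search_term` are simplified assumptions for illustration only; they are not the parsing rules GDV or Variation Viewer actually use, and real identifiers come in more variants than these patterns cover.

```python
import re

# Hypothetical, simplified patterns for illustration; not the browsers' real rules.
PATTERNS = [
    # e.g. NM_001318787.2:c.2258G>A (versioned accession, colon, HGVS description)
    ("hgvs", re.compile(r"^[A-Z]{2}_\d+\.\d+:[cgnpmr]\.\S+$")),
    # e.g. rs328
    ("dbsnp_id", re.compile(r"^rs\d+$", re.IGNORECASE)),
    # e.g. ENSG00000139618 (optional species letters, then G/T/P and 11 digits)
    ("ensembl_id", re.compile(r"^ENS[A-Z]*[GTP]\d{11}(\.\d+)?$")),
    # e.g. chr7:117,120,017-117,308,718 or 7:117120017
    ("coordinates", re.compile(r"^(chr)?(\d{1,2}|[XYM]|MT):[\d,]+(-[\d,]+)?$", re.IGNORECASE)),
    # e.g. 7q31.2
    ("cytogenetic_band", re.compile(r"^(\d{1,2}|[XY])[pq]\d+(\.\d+)?$")),
    # e.g. NM_000492.4 or NP_000483.3
    ("refseq_accession", re.compile(r"^[NX][MPR]_\d+(\.\d+)?$")),
]


def classify_search_term(term):
    """Return a label for the first pattern the term matches, else 'free_text'."""
    term = term.strip()
    for label, pattern in PATTERNS:
        if pattern.match(term):
            return label
    # Anything else (for example a disease or phenotype name) is treated as free text.
    return "free_text"


if __name__ == "__main__":
    for example in ["NM_001318787.2:c.2258G>A", "rs328", "7q31.2", "cystic fibrosis"]:
        print(example, "->", classify_search_term(example))
```

A classifier like this is only useful for pre-sorting a list of terms before you paste them into the search box; the browser itself decides how each term is interpreted.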
As always, please contact us if you have additional questions or suggestions about this or any other feature in GDV or Variation Viewer. You can use the Feedback button on the page or write to the NCBI Help Desk directly.
NCBI’s Genome Data Viewer (GDV) now supports visualization and analysis of nearly 400 submitter-annotated chromosome-level assemblies from the INSDC (GenBank/ENA/DDBJ). These submitter-annotated assemblies join more than 1,200 NCBI RefSeq-annotated assemblies available in GDV for hundreds of eukaryotes, spanning fungi, plants, fish, insects, and all major model organisms.
Figure 1. Submitter-annotated Malus domestica (apple) assembly displayed in GDV. GDV displays the submitter-provided gene annotation along with additional tracks, including interspersed repeats identified by RepeatMasker and six-frame translations (not shown). Red boxes indicate useful tools and panels including a search box, an exon navigator, and interfaces to add user data and conduct NCBI BLAST searches.
The Genome Data Viewer (GDV) is now the comprehensive NCBI genome browser. The development of GDV led to a few different types of genome browsers along the way, each one originally delivering visual displays for particular datasets. We developed the 1000 Genomes Browser for variation data from the 1000 Genomes project, the dbGaP Data Browser for controlled-access sequence read alignment data, and the GeT-RM browser for Genome in a Bottle (GIAB) data.
The data displayed in these three browsers is now obsolete or can largely be accessed from GDV or other NCBI resources. Moreover, unlike GDV, these older browsers are no longer under active development, and their data has not been updated to meet the changing needs of the communities they were developed to serve. For these reasons, we will retire these browsers in April 2022. Please see the details below for more information on the data displayed in these browsers and how to access and display these data now through GDV and other means.
Did you know that you can see epigenomic or other experimental data in NCBI’s Genome Data Viewer (GDV)?
You can easily add aligned study results from GEO, SRA, and dbGaP as data tracks to the GDV browser view. Just go to the Tracks button on the toolbar and select the Configure Tracks menu option. Then navigate to the ‘Find Tracks’ tab on the pop-up Configure panel (Figure 1).
The Bulk Sequence-Cytogenetic Conversion Service tool at NCBI will be retired in April 2022. This tool obtained cytogenetic locations for a list of annotated genes, SNPs, or assembly coordinates from human, fruit fly, mouse, or rat genomes. It also obtained sequence coordinates for cytogenetic locations for these genomes. This web service will be retired due to low usage and obsolescence.
The underlying CGI (bp2band) will be retained and continues to drive the Ideogram service within the Genome Data Viewer (GDV) and the Genome Decoration Page. Researchers interested in understanding where features are located relative to chromosome cytogenetic banding should check out the Genome Decoration Page, where you can enter a file of genome annotations and display them on an ideogram of your assembly of interest. You can also go directly to a cytogenetic location on a genome using the search box in the GDV genome browser.
Do you need to know which of the many NCBI dbSNP variants annotated near your region of interest are likely to be functionally or clinically significant? Figure it out with the track labelled ‘ClinVar variants with precise endpoints’, available on sequence display viewers at NCBI, including the Genome Data Viewer (GDV) and Variation Viewer!
This track shows variation annotation, including single nucleotide variants and other short variants (e.g. insertions and deletions) in the NCBI ClinVar database, and provides pathogenicity and other metadata. The ClinVar track is displayed next to the default NCBI and Ensembl gene annotation tracks and other NCBI-provided dbSNP and RNA-seq expression tracks.
Every so often, we gather our most recent videos in one post on the blog, for your convenience. Scroll down – and don’t forget to subscribe to our channel!
Introducing GaPTools for dbGaP Submitters
This video introduces new standalone software called GaPTools, which you can use to check your data before submitting to dbGaP. GaPTools uses the same preliminary validation checks as the dbGaP submission portal.
We are excited to announce new track display options for gene annotation tracks in the NCBI Genome Data Viewer genome browser and other instances of the NCBI Sequence Viewer!
Now, you can simplify gene annotation tracks to show only the genes and transcripts that you care about most. For instance, you can choose to hide non-coding transcripts, including pseudogenes, so that only protein-coding transcript variants are shown in your view. You can also hide any transcript models predicted using NCBI’s Gnomon algorithm.
The new reference assembly for sheep is now annotated! Assembly ARS-UI_Ramb_v2.0 is made of 142 scaffolds, a drop from 2,640 in the 2017 assembly Oar_rambouillet_v1.0. With a contig N50 of 43 Mb, ARS-UI_Ramb_v2.0 is 15 times more contiguous than the first assembly of the Rambouillet breed.
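For readers unfamiliar with the statistic, contig N50 is the contig length at which contigs of that length or longer contain at least half of the assembly's total bases. The sketch below shows one common way to compute it in Python. The function name `n50` and the contig lengths are made-up toy values for illustration; they are not taken from either sheep assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy contig lengths (arbitrary units), not real assembly data.
    toy_contigs = [80, 70, 50, 40, 30, 20, 10]
    print(n50(toy_contigs))  # prints 70: 80 + 70 = 150, half of the 300 total
```

A higher N50 means that more of the assembly sits in long contigs, which is why the statistic is used as a shorthand for contiguity.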
Annotation Release 104 (AR 104) of ARS-UI_Ramb_v2.0 reflects these improvements. Nearly 200 more coding genes have a 1:1 ortholog in the human genome than in the annotation of Oar_rambouillet_v1.0 (AR 103). The number of coding models annotated as partial is down 35% from 165 to 107, and the number of coding models labeled low quality due to suspected indels or base substitutions in the underlying genomic sequence decreased by 51% (1646 to 796). Based on BUSCO analysis, 99.1% of the models (cetartiodactyla_odb10) are complete in AR 104 versus 98.8% in AR 103. Details of this annotation, including statistics on the annotation products, the input data used in the pipeline and intermediate alignment results, can be found here.
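The percentage decreases quoted above can be reproduced with a line of arithmetic each. This short Python sketch uses only the counts given in the text; the helper name `percent_decrease` is an illustrative choice.

```python
def percent_decrease(old, new):
    """Percentage by which `new` is lower than `old`."""
    return (old - new) / old * 100


if __name__ == "__main__":
    partial = percent_decrease(165, 107)       # partial coding models
    low_quality = percent_decrease(1646, 796)  # low-quality coding models
    print(f"partial: {partial:.1f}%")          # prints "partial: 35.2%"
    print(f"low quality: {low_quality:.1f}%")  # prints "low quality: 51.6%"
```

Both results are consistent with the 35% and 51% figures in the text if those were rounded down to whole percentages.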
The genomes table (Figure 1) now offers filters for:
Reference genomes — switch it on to only show reference or representative genomes
Annotated — switch it on to only show annotated genomes
Assembly level — use the assembly level slider to select higher-quality genomes
Year released — use the slider to limit your results to recent genomes
In addition, the new Actions column connects you to NCBI’s Genome Data Viewer, BLAST, and Assembly. The Text filter box lets you search by the name of the assembly, species/infraspecies, or submitter.
Figure 1. The new Datasets Genomes page with primate assemblies showing the STATUS switches (reference genomes, annotated); expanded filters section with ASSEMBLY LEVEL and YEAR RELEASED sliding selectors; and the Actions column menu with access to Assembly details, BLAST, the Genome Data Viewer, and Download options.
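If you keep your own table of assemblies, the same kinds of filters can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `GenomeRecord` fields, the ordering of assembly levels in `LEVELS`, and the toy records are hypothetical and do not reflect how the Datasets Genomes page is implemented or any real assembly.

```python
from dataclasses import dataclass

# Assumed ordering of assembly levels, from least to most complete.
LEVELS = ["Contig", "Scaffold", "Chromosome", "Complete"]


@dataclass
class GenomeRecord:
    assembly_name: str
    species: str
    submitter: str
    is_reference: bool
    is_annotated: bool
    assembly_level: str
    year_released: int


def filter_genomes(records, reference_only=False, annotated_only=False,
                   min_level="Contig", min_year=None, text=""):
    """Return the records that pass every active filter."""
    min_rank = LEVELS.index(min_level)
    needle = text.strip().lower()
    kept = []
    for rec in records:
        if reference_only and not rec.is_reference:
            continue
        if annotated_only and not rec.is_annotated:
            continue
        if LEVELS.index(rec.assembly_level) < min_rank:
            continue
        if min_year is not None and rec.year_released < min_year:
            continue
        # The text filter looks at assembly name, species, and submitter.
        haystack = " ".join([rec.assembly_name, rec.species, rec.submitter]).lower()
        if needle and needle not in haystack:
            continue
        kept.append(rec)
    return kept


# Toy records with placeholder names; not real assemblies.
TOY_RECORDS = [
    GenomeRecord("toy_asm_A", "Species one", "Lab X", True, True, "Chromosome", 2021),
    GenomeRecord("toy_asm_B", "Species one", "Lab Y", False, False, "Scaffold", 2015),
    GenomeRecord("toy_asm_C", "Species two", "Lab X", False, True, "Contig", 2019),
]

if __name__ == "__main__":
    recent_annotated = filter_genomes(TOY_RECORDS, annotated_only=True, min_year=2018)
    print([rec.assembly_name for rec in recent_annotated])  # prints ['toy_asm_A', 'toy_asm_C']
```

Each filter is independent, so switching one on only narrows the result set, which mirrors how the switches and sliders combine on the page.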