Author: NCBI Staff

NCBI genome browsers: search and you will find!

If you’ve ever tried searching for a genomic location in NCBI’s Genome Data Viewer (GDV) or Variation Viewer and found that your search term didn’t work, it’s time to try again! We recently expanded support for searches in our genome browsers using non-NCBI identifiers such as HGVS patterns (e.g. NM_001318787.2:c.2258G>A) and Ensembl IDs. You can also search by chromosome coordinates, cytogenetic band, assembly scaffold/component, disease/phenotype, dbSNP identifier, or RefSeq transcript/protein accession. We’ve gathered example searches in the table below.

Search term                           Example(s)
Chromosome coordinate                 chr1:1,500,000-2,000,000
                                      chr2: 1.5M-2,540.2K
                                      3: 21.335M..21.337M
                                      chr5
Cytogenetic band                      1p36.21
                                      2q13
Assembly scaffold                     NT_005403.18
                                      NW_021159987.1
Assembly component                    AC106865.4
                                      AC018680.4
Gene/protein name                     PTEN
                                      protease
Disease/phenotype                     diabetes
                                      eye color
SNP rsID                              rs863223352
dbVar ID                              rs863223352
RefSeq transcript/protein accession   NM_017551.3
                                      XP_011538173.1
Ensembl gene/transcript identifier    ENSG00000233258
                                      ENST00000404547
HGVS                                  NM_001318787.2:c.2258G>A
                                      NP_001289617: p.Arg272Cys
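
If you script against these browsers, it can help to know which kind of search term you are about to submit. The sketch below guesses the category of a term from its shape alone. The regular expressions and the classify function are our own illustrative assumptions, written to match the examples in the table; they are not the rules the browsers actually apply, and dbVar IDs are left out.

```python
import re

# Illustrative patterns only (assumptions based on the example terms above);
# they are NOT the parsing rules used by GDV or Variation Viewer.
NUM = r"[\d.,]+[KkMm]?"  # e.g. 1,500,000 or 1.5M or 2,540.2K
PATTERNS = [
    ("HGVS", r"[A-Z]{2}_\d+(\.\d+)?:\s*[cgmnpr]\.\S+"),
    ("SNP rsID", r"rs\d+"),
    ("Ensembl identifier", r"ENS[A-Z]*[GTP]\d{11}"),
    ("RefSeq transcript/protein accession", r"[NX][MRP]_\d+(\.\d+)?"),
    ("assembly scaffold", r"N[TW]_\d+(\.\d+)?"),
    ("assembly component", r"[A-Z]{1,2}\d{5,6}(\.\d+)?"),
    ("cytogenetic band", r"(\d{1,2}|[XY])[pq]\d+(\.\d+)?"),
    ("chromosome coordinate",
     rf"(chr)?(\d{{1,2}}|[XY]|MT?)(:\s*{NUM}(\s*(-|\.\.)\s*{NUM})?)?"),
]


def classify(term: str) -> str:
    """Return the first category whose pattern matches the whole term."""
    term = term.strip()
    for label, pattern in PATTERNS:
        if re.fullmatch(pattern, term):
            return label
    # Anything else is treated as free text.
    return "free text (gene/protein name or disease/phenotype)"


if __name__ == "__main__":
    for example in ["chr2: 1.5M-2,540.2K", "2q13", "NT_005403.18",
                    "rs863223352", "ENST00000404547",
                    "NP_001289617: p.Arg272Cys", "eye color"]:
        print(f"{example!r:32} -> {classify(example)}")
```

The HGVS pattern is checked first because an HGVS expression begins with an accession and would otherwise be mistaken for one.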

When you search by a single coordinate, SNP or dbVar ID, or HGVS pattern, the browser view zooms to the location of the search result and automatically creates a marker at the searched position. For HGVS searches, the marker is labelled with the corresponding rsID, if there is one.

Figure 1. Variation Viewer showing results of search by an HGVS pattern, NP_001289617.1: p.Arg272Cys.

As always, please contact us if you have additional questions or suggestions about this or any other feature in GDV or Variation Viewer. You can use the Feedback button on the page or write to the NCBI Help Desk directly.

NCBI on YouTube: Customize MSA Viewer, SciENcv, plants and RNA-Seq data, Datasets and PubMed

Missed a few videos on YouTube? Here’s the latest from our channel.

Customize the MSA Viewer to Make Your Analysis Easier

We’re constantly improving the Multiple Sequence Alignment (MSA) Viewer. This video demonstrates several new and popular features, including the ability to change data columns, hide selected rows, analyze polymorphisms, and more.


Web IgBLAST can now determine immunoglobulin isotypes

We have added a new function to IgBLAST on the Web. You can now search immunoglobulin (Ig) nucleotide sequences against the Constant region (C) gene database (Figure 1) to determine the Ig isotypes including subtypes (IgM, IgG, IgA1, etc.). The isotype information is reported in the rearrangement summary table, and the C gene region is displayed in the alignment section. This feature is now available on the IgBLAST web service for human and mouse sequences with possible expansion to other organisms in the future.  The feature is not yet implemented for the standalone IgBLAST package.

Figure 1. IgBLAST constant region database selection and rearrangement summary table showing the top C gene match, IgHM in this case. The NCBI C genes database is based on the current NCBI human Reference Genome annotation.

GenBank release 246.0


GenBank release 246.0 (11/2/2021) is now available on the NCBI FTP site. This release has 16.1 trillion bases and 2.57 billion records.

The current release has 233,642,893 traditional records containing 1,014,763,752,113 base pairs of sequence data. There are also 1,721,064,101 WGS records containing 14,599,101,574,547 base pairs of sequence data, 508,319,391 bulk-oriented TSA records containing 449,891,016,597 base pairs of sequence data, and 107,569,935 bulk-oriented TLS records containing 40,168,874,815 base pairs of sequence data.
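
As a quick check, the four divisions listed above add up to the headline totals of 16.1 trillion bases and 2.57 billion records. The short sketch below does the arithmetic; the variable names are ours, and the figures are copied from this post.

```python
# (records, base pairs) for each division, copied from the release figures above.
divisions = {
    "traditional": (233_642_893, 1_014_763_752_113),
    "WGS": (1_721_064_101, 14_599_101_574_547),
    "TSA": (508_319_391, 449_891_016_597),
    "TLS": (107_569_935, 40_168_874_815),
}

# Sum each column across the four divisions.
total_records = sum(records for records, _ in divisions.values())
total_bases = sum(bases for _, bases in divisions.values())

print(f"{total_records:,} records = {total_records / 1e9:.2f} billion")
print(f"{total_bases:,} bases = {total_bases / 1e12:.1f} trillion")
# prints:
# 2,570,596,320 records = 2.57 billion
# 16,103,925,218,072 bases = 16.1 trillion
```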


RefSeq Release 209 is available


RefSeq release 209 is now available online, from the FTP site, and through NCBI’s Entrez programming utilities (E-utilities).

This full release incorporates genomic, transcript, and protein data available as of November 1, 2021, and contains 296,293,486 records, including 215,655,378 proteins, 41,751,205 RNAs, and sequences from 114,396 organisms. The release is provided in several directories as a complete dataset and also as divided by logical groupings.

A more modern PMC is on its way – there’s still time to give us feedback!

In June, we announced the arrival of PMC Labs, where you can test drive the work underway to create a more modern PMC website. Since then, we’ve continued to talk to users, gather input, and make ongoing adjustments based on your feedback.

Figure 1. The PMC Labs page has a green feedback button at the bottom right of the page (outlined here). Click that to let us know what you think.

We hope that the planned updates will create an easier navigation and reading experience, while keeping all the features you use most within PMC. If you haven’t had a chance to try out the changes, there’s still time to give input using the green feedback button in the lower right-hand corner of the site.


NCBI’s Genome Data Viewer now displays both NCBI RefSeq and submitted assemblies

NCBI’s Genome Data Viewer (GDV) now supports visualization and analysis of nearly 400 submitter-annotated chromosome-level assemblies from the INSDC (GenBank/ENA/DDBJ). These submitter-annotated assemblies join more than 1,200 NCBI RefSeq-annotated assemblies available in GDV for hundreds of eukaryotes, spanning fungi, plants, fish, insects, and all major model organisms.

Figure 1 shows a GenBank apple assembly (GCA_004115385) displayed in GDV.

Figure 1. Submitter-annotated Malus domestica (apple) assembly displayed in GDV. GDV provides submitter-provided gene annotation, as well as some additional tracks including interspersed repeats identified by RepeatMasker and six-frame translations (not shown). Red boxes indicate useful tools and panels including a search box, an exon navigator, and interfaces to add user data and conduct NCBI BLAST searches. 


Three outdated browsers (1000 Genomes, dbGaP Data, and GeT-RM) to retire in April 2022. Data available in GDV

The Genome Data Viewer (GDV) is now the comprehensive NCBI genome browser. The  development of GDV led to a few different types of genome browsers along the way, each one originally delivering visual displays for particular datasets. We developed the 1000 Genomes Browser for variation data from the 1000 Genomes project, the dbGaP Data Browser for controlled-access sequence read alignment data, and the GeT-RM browser for Genome in a Bottle (GIAB) data.

The data displayed in these three browsers are now either obsolete or largely accessible from GDV or other NCBI resources. Moreover, unlike GDV, these older browsers are no longer under active development, and their data have not been updated to meet the changing needs of the communities they were developed to serve. For these reasons, we will retire these browsers in April 2022. Please see the details below for more information on the data displayed in these browsers and how to access and display these data now through GDV and other means.


NCBI will assign 64-bit numeric GIs by November 15th. Update affected software!

As announced last month, NCBI will begin assigning larger (64-bit) numeric ‘GIs’ to the remaining sequence types that still receive these identifiers. This change is expected as soon as Nov. 15th, 2021, but could occur earlier if data submission volumes are unexpectedly high. This is a reminder that all organizations and developers using our products should review their software for any remaining reliance on GIs and for compatibility with these larger identifiers.

How do you know if your software or organization may be impacted?

If you have built custom software to interface with NCBI data and it consumes a sequence database UID (i.e., a GI), processes the GI from an ASN.1 or XML product, or processes the GI from any tabular product on FTP, you should review all code to ensure that the new, longer, 64-bit GIs will be handled properly. To ensure a smooth transition and the best overall experience, please update to the latest versions of NCBI-provided programmatic and command line tools. Alternatively, you could update your code to use accession.version identifiers instead of GIs.
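
To illustrate the kind of problem to look for, here is a minimal sketch that assumes your software stores GIs in a signed 32-bit field, which is only one of several ways code can depend on the old range. The function name and the two GI values are hypothetical, chosen only to fall on either side of the 32-bit limit.

```python
import struct

INT32_MAX = 2**31 - 1  # 2,147,483,647, the largest signed 32-bit value


def fits_signed_32(gi: int) -> bool:
    """Return True if gi can be packed into a signed 32-bit field."""
    try:
        struct.pack("<i", gi)  # "<i" = little-endian signed 32-bit integer
        return True
    except struct.error:
        return False


# Hypothetical GI values, for illustration only.
small_gi = 2_000_000_000  # still within the signed 32-bit range
large_gi = 5_000_000_000  # needs more than 32 bits

print(fits_signed_32(small_gi))  # True
print(fits_signed_32(large_gi))  # False

# A signed 64-bit field ("<q") holds the larger value in 8 bytes.
packed = struct.pack("<q", large_gi)
print(len(packed), struct.unpack("<q", packed)[0] == large_gi)  # 8 True
```

The same review applies to database column types, fixed-width binary formats, and typed variables in compiled languages, where an oversized value may be silently truncated rather than rejected.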

NCBI is here to help the community as we make this change. Stay tuned here or follow NCBI Twitter where we will share updates and additional information, such as a final confirmation of the projected cutover date.

Please contact info@ncbi.nlm.nih.gov with any questions about this change or to determine if any software you are using is affected.

Nov 3 Webinar: dbGaP submission improvements and GaPTools


Attention dbGaP submitters! Join us on November 3, 2021 at 12 PM US Eastern Time to learn about data submission and processing improvements to dbGaP, NIH’s database of Genotypes and Phenotypes, which contains individual-level data associated with human research studies. You will see how we have made submission easier through the Submission Portal using automated preliminary validation, and how you can run GaPTools, a stand-alone data validation tool, on your own submission to expedite the process. Join us to discover how dbGaP ensures the integrity and high quality of the genomic data that scientists can access to further their research.

    • Date: Wed, November 3, 2021
    • Time: 12:00 PM – 12:45 PM EDT
    • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI webinars playlist on the NLM YouTube channel. You can learn about future webinars on the Webinars and Courses page.