Tag: Single Nucleotide Polymorphism Database (dbSNP)

dbSNP Enhances Scalability, Data Diversity, and Accessibility

As part of the Human Genome Project, NCBI, part of the National Library of Medicine, and the National Human Genome Research Institute (NHGRI) established the Single Nucleotide Polymorphism database (dbSNP) in 1998. Over the last 25 years, dbSNP has evolved into a reliable central public repository for genetic variation data. dbSNP is a community-accepted reference data set for genetic research, analysis pipelines, and both open-source and commercial tools. It is also an essential part of genetic research and discovery. For example, dbSNP data are used in nearly all human genetic variation research workflows and serve as the foundation for commercially available ancestry testing products.

Current dbSNP statistics include:
  • 3,800 submitters from all over the world 
  • 3.3 billion submitted SNP records
  • 1.1 billion Reference SNP records 
  • 1.0 billion Reference SNP records with population frequency 
  • More than 65,000 publications citing dbSNP accessions

NCBI ALFA Project at Bio-IT World 2022 Hackathon

Announcing the Allele Frequency Aggregator (ALFA) Project as part of the Bio-IT World 2022 Hackathon: Visualization of NCBI ALFA Variants

Join NCBI at the Bio-IT World 2022 Hackathon on May 4-5, 2022 to learn about and work with data from our ALFA project! The primary goal of this hackathon project is to develop a novel tool, app, or approach to explore and visualize NCBI ALFA variants and allele frequency for 12 different human populations. We aspire to create a new helpful variant interpretation resource for the clinical and research communities.

We hope to see you there! More information and registration here.

Using NCBI resources to research, detect, and treat genetic phenotypes

Clinical Genetics Information at Your Fingertips

NCBI offers a portfolio of medical genetics resources to help you research, diagnose, and treat diseases and conditions. You can easily access our data and tools through the Medical Genetics and Human Variation page of the NCBI website. We also encourage you to join our community of thousands of submitters and share your germline and/or somatic data to advance discovery and optimize clinical care. 

How and why should you use our resources? Consider the example below. 

Your patient is a 40-year-old mother of two presenting with changes in bathroom habits, bleeding, and belly pain. She has a medical history of colonic polyps. Her family history reveals that her maternal grandmother, mother, and uncle had several forms of cancer, including colon, breast, and endometrial cancer.

NCBI genome browsers: search and you will find!

If you’ve ever tried searching for a genomic location in NCBI’s Genome Data Viewer (GDV) or Variation Viewer and found that your search term didn’t work, it’s time to try again! We recently expanded support for searches in our genome browsers using non-NCBI identifiers such as HGVS patterns (e.g. NM_001318787.2:c.2258G>A) and Ensembl IDs. You can also search by chromosome coordinates, cytogenetic band, assembly scaffold/component, disease/phenotype, dbSNP identifier, or RefSeq transcript/protein accession. We’ve gathered example searches in the table below.

Search term                            Example(s)
Chromosome coordinate                  chr1:1,500,000-2,000,000
                                       chr2: 1.5M-2,540.2K
                                       3: 21.335M..21.337M
                                       chr5
Cytogenetic band                       1p36.21
                                       2q13
Assembly scaffold                      NT_005403.18
                                       NW_021159987.1
Assembly component                     AC106865.4
                                       AC018680.4
Gene/protein name                      PTEN
                                       protease
Disease/phenotype                      diabetes
                                       eye color
SNP rsID                               rs863223352
dbVar ID                               rs863223352
RefSeq transcript/protein accession    NM_017551.3
                                       XP_011538173.1
Ensembl gene/transcript identifier     ENSG00000233258
                                       ENST00000404547
HGVS                                   NM_001318787.2:c.2258G>A
                                       NP_001289617: p.Arg272Cys
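
The coordinate examples in the table mix plain numbers, thousands separators, K/M suffixes, and two range separators ("-" and ".."). As a rough illustration of how such terms map to numeric ranges, here is a minimal Python sketch. The function names parse_position and parse_region are hypothetical, and this is not the parser the browsers actually use; it only covers the formats shown in the table.

```python
import re
from decimal import Decimal

# Multipliers for the K/M suffixes that appear in the example searches.
_SUFFIX = {"": 1, "K": 1_000, "M": 1_000_000}


def parse_position(text):
    """Convert '1,500,000', '1.5M' or '2,540.2K' to an integer base position."""
    match = re.fullmatch(r"([\d,]*\.?\d+)\s*([KkMm]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized position: {text!r}")
    # Decimal avoids floating-point rounding surprises for values like 2540.2K.
    number = Decimal(match.group(1).replace(",", ""))
    return int(number * _SUFFIX[match.group(2).upper()])


def parse_region(term):
    """Return (chromosome, start, end); start and end are None for a whole chromosome."""
    chrom, sep, span = term.partition(":")
    chrom = chrom.strip()
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    if not sep:
        return chrom, None, None
    # The examples separate the two positions with either '-' or '..'.
    parts = re.split(r"\.\.|-", span.strip())
    if len(parts) == 1:
        position = parse_position(parts[0])
        return chrom, position, position
    if len(parts) != 2:
        raise ValueError(f"unrecognized range: {term!r}")
    start, end = (parse_position(part) for part in parts)
    return chrom, start, end


for example in ["chr1:1,500,000-2,000,000", "chr2: 1.5M-2,540.2K",
                "3: 21.335M..21.337M", "chr5"]:
    print(example, "->", parse_region(example))
```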

When you search by single coordinate, SNP or dbVar ID, or HGVS, the browser view zooms to the location of the search result. A marker is automatically created to identify the searched position.  For HGVS, the marker is labelled with the corresponding rsID, if there is one.

Figure 1. Variation Viewer showing results of search by an HGVS pattern, NP_001289617.1: p.Arg272Cys.
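
If you want to pre-sort a list of mixed search terms before pasting them into a browser, a simple pattern-based classifier can help. The sketch below is an illustration only: the labels, the classify function, and the regular expressions are assumptions written to fit the examples in the table, not the rules GDV or Variation Viewer apply. The dbVar pattern assumes the nsv/esv/nssv/essv accession prefixes.

```python
import re

# Ordered (label, pattern) pairs; the first pattern that matches the whole term wins.
PATTERNS = [
    ("snp_rsid", r"rs\d+"),
    # Assumption: dbVar variant accessions use nsv/esv/nssv/essv prefixes.
    ("dbvar_id", r"[ne]s?sv\d+"),
    ("hgvs", r"[A-Z]{2}_\d+(?:\.\d+)?:\s*[cgmnpr]\.\S+"),
    ("ensembl_id", r"ENS[A-Z]*[GTP]\d{11}(?:\.\d+)?"),
    ("refseq_accession", r"[NX][MRP]_\d+(?:\.\d+)?"),
    ("assembly_scaffold", r"N[TW]_\d+\.\d+"),
    ("assembly_component", r"[A-Z]{1,2}\d{5,6}\.\d+"),
    ("cytogenetic_band", r"(?:\d{1,2}|[XY])[pq]\d+(?:\.\d+)?"),
    ("chromosome_coordinate", r"(?:chr)?(?:\d{1,2}|[XY]|MT)(?:\s*:.*)?"),
]


def classify(term):
    """Return a label for the search term, or 'free_text' if no pattern matches."""
    term = term.strip()
    for label, pattern in PATTERNS:
        if re.fullmatch(pattern, term):
            return label
    # Gene names, protein names, and phenotypes fall through to here.
    return "free_text"


for example in ["rs863223352", "NM_001318787.2:c.2258G>A", "ENSG00000233258",
                "NT_005403.18", "1p36.21", "chr5", "eye color"]:
    print(f"{example!r}: {classify(example)}")
```

Terms that fall through to free_text (PTEN, diabetes, eye color) are the ones the browsers would treat as gene, protein, or phenotype searches.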

As always, please contact us if you have additional questions or suggestions about this or any other feature in GDV or Variation Viewer. You can use the Feedback button on the page or write to the NCBI Help Desk directly.

View GEO, SRA, or dbGaP data tracks in NCBI’s Genome Data Viewer

Did you know that you can see epigenomic or other experimental data in NCBI’s Genome Data Viewer (GDV)?

You can easily add aligned study results from GEO, SRA, and dbGaP as data tracks to the GDV browser view. Just go to the Tracks button on the toolbar and select the Configure Tracks menu option. Then navigate to the ‘Find Tracks’ tab on the pop-up Configure panel (Figure 1).

Figure 1. Go to the ‘Tracks’ menu on the browser toolbar and select the ‘Configure Tracks’ option. This will launch a panel where you can add, configure, remove, and search for data tracks. Go to the ‘Find Tracks’ tab to search for tracks to add to your browser view. Note: spaces act as AND operators in the search, and wildcards are accepted.
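
To make the search note concrete, here is a small Python sketch of what "spaces act as AND operators, and wildcards are accepted" could look like. It is only an illustration of the semantics: the find_tracks function, the case-insensitive substring matching, and the track names are all hypothetical and do not describe how GDV implements its search.

```python
from fnmatch import fnmatchcase


def find_tracks(query, track_names):
    """Return the track names that match every space-separated term in the query.

    Each term may contain shell-style wildcards ('*', '?') and is matched
    case-insensitively anywhere in the track name.
    """
    terms = query.lower().split()
    matches = []
    for name in track_names:
        lowered = name.lower()
        # AND semantics: a track is kept only if all terms match it.
        if all(fnmatchcase(lowered, f"*{term}*") for term in terms):
            matches.append(name)
    return matches


# Hypothetical track names, for illustration only.
tracks = [
    "H3K27ac ChIP-seq liver",
    "H3K4me3 ChIP-seq liver",
    "RNA-seq brain",
]

print(find_tracks("H3K* liver", tracks))
print(find_tracks("chip brain", tracks))
```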

NCBI Presents Two Online CoLabs at ASHG 2020!

Two up-and-coming NCBI resources will be featured in videos, surveys, and live events at the American Society of Human Genetics (ASHG) 2020 Annual Meeting. Come and watch on-demand videos in the CoLab Theater. Then let us know what you think and how you use, or might use, these resources by either taking an online survey or joining us for the CoLab Live! events on Thursday, October 29, 2020.

dbSNP human build 154 release + ALFA data

dbSNP human build 154, now available, includes new ALFA (Allele Frequency Aggregator) variants and allele frequency data. This build contains over two billion Submitted SNP (ss) records and 730 million Reference SNP (rs) records.

New features include:

See the release notes for more information about what’s new in build 154.

April 22 Webinar on NCBI’s ALFA: allele frequency data for variant analysis and interpretation

On Wednesday, April 22, 2020 at 12 PM, join NCBI staff to learn how results from the Allele Frequency Aggregator (ALFA) project will help you interpret the biological impact of common and rare sequence variants. ALFA’s initial release includes analysis of genotype data from ~100K unrestricted dbGaP subjects and provides high-quality allele frequency data now displayed on relevant dbSNP records. In this webinar, you will learn about the data in the recent ALFA release, see how to access the data from the web and FTP, and learn how to programmatically retrieve data by positions, genes, and other attributes using E-utilities and the Variation Services API in Python.

  • Date and time: Wed, Apr 22, 2020 12:00 PM – 12:45 PM EDT
  • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
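
As a taste of the programmatic access covered in the webinar, the sketch below builds request URLs for a single rsID using the Python standard library. The helper names (rs_number, esummary_url, alfa_frequency_url, fetch_json) are hypothetical, and the endpoint paths reflect our reading of the public E-utilities and Variation Services documentation; please check the current documentation before relying on them. The example rsID is the lactase variant rs4988235.

```python
import json
import re
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URLs; verify against the current NCBI documentation.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"
VARIATION_BASE = "https://api.ncbi.nlm.nih.gov/variation/v0"


def rs_number(rsid):
    """Return the numeric part of an rsID given as 'rs4988235' or '4988235'."""
    match = re.fullmatch(r"(?:rs)?(\d+)", str(rsid).strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"not an rsID: {rsid!r}")
    return match.group(1)


def esummary_url(rsid):
    """URL for an E-utilities ESummary request against the snp database."""
    params = {"db": "snp", "id": rs_number(rsid), "retmode": "json"}
    return f"{EUTILS_BASE}/esummary.fcgi?{urlencode(params)}"


def alfa_frequency_url(rsid):
    """URL for the Variation Services frequency endpoint (assumed path)."""
    return f"{VARIATION_BASE}/refsnp/{rs_number(rsid)}/frequency"


def fetch_json(url, timeout=30):
    """Download a URL and decode the JSON body. Requires network access."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


print(esummary_url("rs4988235"))
print(alfa_frequency_url("rs4988235"))
# To actually retrieve data: record = fetch_json(alfa_frequency_url("rs4988235"))
```

The fetch is left commented out so the example runs offline; the structure of the returned JSON is not shown here because it should be taken from the API documentation rather than guessed.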

The ALFA dataset: New aggregated allele frequency from dbGaP and dbSNP now available

NIH’s data sharing policy now allows unrestricted access to genomic summary results for data from NCBI’s Database of Genotypes and Phenotypes (dbGaP). Pooled allele frequency data from dbSNP and the dbGaP summary results are available as the new Allele Frequency Aggregator (ALFA) dataset. The ALFA dataset includes aggregated and harmonized array chip genotyping, exome, and genome sequencing data. The ALFA data are open access and freely available for you to incorporate into your workflows and applications from the dbSNP web pages (Figure 1), through FTP, and through the Variation Services API. dbGaP currently has data for more than 2 million study subjects, approximately 1 million of whom have genotype data that are suitable for input into the ALFA dataset. The first release of ALFA contains data on about 100,000 subjects, and we hope to complete processing of data on the other 925,000 subjects within the next year. This volume and variety of data promises unprecedented opportunities to identify genetic factors that influence health and disease. Register to attend our April 22 webinar and read on to learn more.

Figure 1. ALFA allele frequencies for a variant (rs4988235) in the promoter of the lactase gene, showing frequency differences across populations.
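
To show what "pooled" allele frequency means in the simplest case, the Python sketch below sums allele counts across studies before dividing, rather than averaging per-study frequencies. This is a generic illustration, not a description of the ALFA pipeline, which also harmonizes the data; the pooled_frequency function and all counts are made up for the example.

```python
def pooled_frequency(study_counts):
    """Pool allele counts across studies.

    study_counts is a list of (allele_count, total_alleles) pairs, one per study.
    The pooled frequency is the summed allele count over the summed total.
    """
    allele_count = sum(count for count, _ in study_counts)
    total_alleles = sum(total for _, total in study_counts)
    if total_alleles == 0:
        raise ValueError("no alleles observed")
    return allele_count / total_alleles


# Hypothetical counts: a small study and a larger one.
studies = [(30, 200), (50, 800)]

pooled = pooled_frequency(studies)
naive_mean = sum(count / total for count, total in studies) / len(studies)

print(f"pooled frequency: {pooled:.4f}")
print(f"unweighted mean of study frequencies: {naive_mean:.4f}")
```

Pooling by counts weights each study by its sample size, so the larger study contributes more to the final frequency than it would in an unweighted average.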

NCBI on YouTube: Get the most out of NCBI resources with these videos

Check out the latest videos on YouTube to learn how to best use NCBI graphical viewers, SRA, PGAP, and other resources.

Genome Data Viewer: Analyzing Remote BAM Alignment Files and Other Tips

This video shows you how to upload remote BAM files, and succinctly demonstrates handy viewer settings, such as Pileup display options, and highlights the very helpful tooltips in the Genome Data Viewer (GDV). There’s also a brief blog post on the same topic.
