Tag: Database of Genotypes and Phenotypes (dbGaP)

New dbGaP Subject Sample Telemetry Report Now Available

What is it and why does it matter? 

The Database of Genotypes and Phenotypes (dbGaP) has been used for over a decade to safely store and provide access to anonymized patient-level data related to research studies. Now you can get a Subject Sample Telemetry Report (SSTR) that gives you more details about a dbGaP submission.

With a growing database of over 2,300 studies with billions of demographic, phenotypic, and exposure measurements, we want to ensure you can easily access publicly available information for data submitted to us. 

What information is included in this report?

The SSTR is a one-stop shop for:

Submit your data to dbGaP in 3 easy steps!

Do you have human genetic data from a large-scale study? Submit your data to NCBI’s Database of Genotypes and Phenotypes (dbGaP) to contribute to meaningful discoveries about health. dbGaP contains data from more than 2.8 million study participants who have provided over 3.3 million molecular samples.

How do I submit data to dbGaP?

Step 1: Register your study

Step 2: Submit your data and get your study accession (phs#)

Step 3: Release your data

dbGaP: Data and analyses from millions of study participants, samples, and trillions of genotypes!

Are you familiar with the well-known Framingham Heart Study, a multi-generation study of residents of Framingham, Massachusetts, begun in 1948? Much of what is now known about the impact of genetics, lifestyle, and diet on cardiovascular health and disease has come from this research study. (See PMC4159698 for a historical perspective.) Did you know that data from this study and over 2,000 other studies that demonstrate the relationship between genetics and medical outcomes and other phenotypes are available from NCBI’s Database of Genotypes and Phenotypes (dbGaP)?

dbGaP was established in 2007 as a repository of human data from large-scale studies. You can access data from more than 2.8 million study participants who have provided over 3.3 million molecular samples. You can retrieve patient-level phenotypic (e.g., demographic, clinical, exposure) data, molecular (e.g., called genotypes, omics, sequence) data, and the results of association analyses from genome-scale case-control and longitudinal studies of heritable diseases.

What types of studies and data are available in dbGaP?

dbGaP contains a wide range of studies and types of data, all relating to human genetic and phenotypic measurements. Most dbGaP data are from NIH-funded research, but recently we have expanded to include non-NIH funded studies. An easy way to find dbGaP Studies, Phenotype and Molecular Datasets, Variables, Analyses and Documents is through the dbGaP Advanced Search (Figure 1). The interface allows you to filter results by different characteristics depending on the tab you choose.

Figure 1. The dbGaP Advanced Search interface. Tabs that appear at the top of the web interface allow you to select the studies, datasets, analyses, etc. of interest. Filters (facets) appear on the left (see inset). Click on filters to select values. Links on the study summary pages provide direct access to data. Top panel: Studies tab and the corresponding filter categories. Bottom panel: Molecular data tab results with Study (Framingham SHARe) and Markerset Source (Affymetrix) filters applied.

Connect with NCBI at ASHG 2022

Join us October 25-29 in Los Angeles, CA

We are looking forward to seeing you in-person at the American Society of Human Genetics (ASHG) annual meeting, October 25-29, 2022, in Los Angeles, California.

We will present a variety of talks and posters featuring our clinical and human genetic resources, as well as genome products and tools. We are excited to introduce the NIH Comparative Genomics Resource (CGR), a multi-year National Library of Medicine (NLM) project to maximize the impact of eukaryotic research organisms and their genomic data resources to biomedical research. If you’re interested in providing feedback that will be used to help drive CGR forward, consider joining our round table discussion.  

Check out NCBI’s schedule of activities and events: 

Using NCBI resources to research, detect, and treat genetic phenotypes

Clinical Genetics Information at Your Fingertips

NCBI offers a portfolio of medical genetics resources to help you research, diagnose, and treat diseases and conditions. You can easily access our data and tools through the Medical Genetics and Human Variation page of the NCBI website. We also encourage you to join our community of thousands of submitters and share your germline and/or somatic data to advance discovery and optimize clinical care. 

How and why should you use our resources? Consider the example below. 

Your patient is a 40-year-old mother of two presenting with changes in bathroom habits, bleeding, and belly pain. She has a medical history of colonic polyps. Her family history reveals that her maternal grandmother, mother and uncle had several forms of cancers including colon, breast, and endometrium. 

Three outdated browsers (1000 Genomes, dbGaP Data, and GeT-RM) to retire in April 2022. Data available in GDV

The Genome Data Viewer (GDV) is now the comprehensive NCBI genome browser. The development of GDV led to a few different types of genome browsers along the way, each one originally delivering visual displays for particular datasets. We developed the 1000 Genomes Browser for variation data from the 1000 Genomes project, the dbGaP Data Browser for controlled-access sequence read alignment data, and the GeT-RM browser for Genome in a Bottle (GIAB) data.

The data displayed in these three browsers are now either obsolete or can largely be accessed from the GDV browser or other NCBI resources. Moreover, unlike GDV, these older browsers are no longer under active development, and their data have not been updated to meet the changing needs of the communities they were developed to serve. For these reasons we will retire these browsers in April 2022. Please see the details below for more information on the data displayed in these browsers and how to access and display these data now through GDV and other means.

Nov 3 Webinar: dbGaP submission improvements and GaPTools

Attention dbGaP submitters! Join us on November 3, 2021 at 12 PM US Eastern time to learn about data submission and processing improvements to dbGaP, NIH’s Database of Genotypes and Phenotypes, which contains individual-level data associated with human research studies. You will see how we have made submission easier through the Submission Portal using automated preliminary validation and how you can use GaPTools, a stand-alone data validation tool, on your own submission to expedite the submission process. Join us to discover how dbGaP ensures the integrity and high quality of the genomic data that scientists can access to further their research.

    • Date: Wed, November 3, 2021
    • Time: 12:00 PM – 12:45 PM EDT
    • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI webinars playlist on the NLM YouTube channel. You can learn about future webinars on the Webinars and Courses page.

View GEO, SRA, or dbGaP data tracks in NCBI’s Genome Data Viewer

Did you know that you can see epigenomic or other experimental data in NCBI’s Genome Data Viewer (GDV)?

You can easily add aligned study results from GEO, SRA, and dbGaP as data tracks to the GDV browser view. Just go to the Tracks button on the toolbar and select the menu option to Configure Tracks. Navigate to the ‘Find Tracks’ tab on the pop-up Configure panel (Figure 1).

Figure 1. Go to the ‘Tracks’ menu on the browser toolbar and select the ‘Configure Tracks’ option. This will launch a panel where you can add, configure, remove, and search for data tracks. Go to the ‘Find Tracks’ tab to search for tracks to add to your browser view. Note: spaces act as AND operators in the search, and wildcards are accepted.
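
To make the search note concrete, here is a minimal Python sketch of how a query with space-separated terms (treated as AND) and wildcards could be evaluated against track names. It only illustrates the semantics described in the caption and is not GDV's actual search code; the function name `matches`, the shell-style `*` wildcard syntax, and the sample track names are all assumptions made for this example.

```python
from fnmatch import fnmatchcase


def matches(query: str, track_name: str) -> bool:
    """Return True if every space-separated query term matches some word
    of the track name. Terms may contain shell-style wildcards such as '*'.
    (Hypothetical helper; GDV's real matching rules may differ.)"""
    words = track_name.lower().split()
    terms = query.lower().split()
    # AND semantics: all terms must match; each term may match any word.
    return all(any(fnmatchcase(word, term) for word in words) for term in terms)


# Made-up track names, for illustration only.
tracks = [
    "H3K27ac ChIP-seq liver",
    "H3K4me3 ChIP-seq heart",
    "RNA-seq liver",
]

# Both terms must match, so only the liver histone-mark track is returned.
hits = [name for name in tracks if matches("H3K* liver", name)]
print(hits)
```

Adding a term can only narrow the result set under AND semantics, which is why a more specific query returns fewer tracks.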

Introducing GaPTools, a stand-alone data validation tool for dbGaP submissions

We have just launched GaPTools, a stand-alone data validation tool for submissions to NCBI’s Database of Genotypes and Phenotypes (dbGaP). You can use GaPTools to validate your dbGaP submissions or submissions to other genomic data repositories. GaPTools checks for common data inconsistency and integrity issues and validates subject-sample ID mapping, subject consents, data dictionaries, and phenotype and genotype data. GaPTools is available as a Docker image on Docker Hub.
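
As a rough illustration of the kind of consistency check described above, the following Python sketch flags two problems in a subject-sample mapping: a sample assigned to more than one subject, and a subject that has no consent record. This is not GaPTools code and does not reproduce its rules; the function name `check_mapping`, the CSV layout, and the column names `SUBJECT_ID`, `SAMPLE_ID`, and `CONSENT` are assumptions chosen for the example.

```python
import csv
import io


def check_mapping(mapping_csv: str, consent_csv: str) -> list[str]:
    """Return human-readable problems found in a subject-sample mapping.
    (Hypothetical check; the file layout here is an assumption.)"""
    consented = {
        row["SUBJECT_ID"] for row in csv.DictReader(io.StringIO(consent_csv))
    }
    sample_owner: dict[str, str] = {}  # sample ID -> first subject seen
    problems: list[str] = []
    for row in csv.DictReader(io.StringIO(mapping_csv)):
        subject, sample = row["SUBJECT_ID"], row["SAMPLE_ID"]
        if subject not in consented:
            problems.append(
                f"sample {sample}: subject {subject} has no consent record"
            )
        # Remember the first subject for each sample; flag any conflict.
        previous = sample_owner.setdefault(sample, subject)
        if previous != subject:
            problems.append(
                f"sample {sample}: mapped to both {previous} and {subject}"
            )
    return problems


# Made-up data, for illustration only.
consent = "SUBJECT_ID,CONSENT\nS1,1\nS2,1\n"
mapping = "SUBJECT_ID,SAMPLE_ID\nS1,A\nS2,B\nS3,C\nS2,A\n"

problems = check_mapping(mapping, consent)
for line in problems:
    print(line)
```

Running checks like these locally before a formal submission is the general idea behind pre-validation: problems surface while they are still cheap to fix.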

Why Use GaPTools?

GaPTools will validate files before you submit (see Figure 1). This means that by the time you formally submit, some of the pre-validation steps are already addressed. This tool allows you to prepare your data quickly and ensures a faster processing cycle and a faster release of your individual-level research data.

Figure 1: Flow chart depicting data submission and GaPTools validation

New feature in the dbGaP submission portal: Automated study metadata

dbGaP has recently released a new feature to simplify submissions and provide study accessions faster. This video provides a quick overview of the new feature. 

Our new study config webform enables a study submitter to enter important study summary information online, including study description, inclusion/exclusion criteria, history, attribution, and associated publications, and to instantly preview the study config content and study accession on their dbGaP study report page. The fields for study design and type, PMIDs, Genes, MeSH terms, and associated Clinical Trials have built-in help and validation to ensure that the information provided is complete and searchable by users looking for that data.

The Database of Genotypes and Phenotypes (dbGaP) provides controlled access to the data and results from studies that have investigated the interaction of genotype and phenotype in humans. dbGaP assigns stable, unique identifiers to studies and subsets of information from those studies, including documents, individual phenotypic variables, tables of trait data, sets of genotype data, computed phenotype-genotype associations, and groups of study subjects who have given similar consents for use of their data.
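
For readers who handle these identifiers in scripts, here is a small Python sketch that splits an accession into its parts. It assumes identifiers shaped like phs000007.v32.p13: a three-letter prefix starting with "ph", a number, a version (.v), and optional participant-set (.p) and consent-group (.c) suffixes. That pattern, the names `Accession` and `parse_accession`, and the sample values are assumptions for illustration, not an official dbGaP specification.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: prefix, digits, .v<version>, optional .p<set>, optional .c<group>
ACCESSION = re.compile(r"^(ph[a-z])(\d+)\.v(\d+)(?:\.p(\d+))?(?:\.c(\d+))?$")


class Accession(NamedTuple):
    prefix: str                      # e.g. "phs" for a study
    number: int
    version: int
    participant_set: Optional[int]
    consent_group: Optional[int]


def parse_accession(text: str) -> Accession:
    """Parse an accession string; raise ValueError if it does not fit
    the assumed pattern. (Hypothetical helper.)"""
    match = ACCESSION.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognized accession: {text!r}")
    prefix, number, version, pset, consent = match.groups()
    return Accession(
        prefix,
        int(number),
        int(version),
        int(pset) if pset else None,
        int(consent) if consent else None,
    )


study = parse_accession("phs000007.v32.p13")
print(study.prefix, study.number, study.version, study.participant_set)
```

Keeping the version and participant-set numbers as separate fields makes it easy to compare two releases of the same study.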

Figure 1. dbGaP summary statistics

The submissions made to dbGaP represent the best and latest research in topic areas such as cardiovascular diseases, diabetes, autism spectrum disorders, precision medicine and many more. Submitters are central to the success of dbGaP and sharing of genomic research across the broader scientific community. Our submission portal serves as a central place to collect multiple components of a research study, including the metadata/summary and associated phenotype, genotype, and sequence data.