GRAF, a tool for finding duplicates and closely related samples in large genomic datasets


NCBI’s Genetic Relationship and Fingerprinting (GRAF) tool is a quality assurance tool that can quickly find duplicates and closely related subjects in your data using SNP genotypes.

The population tool GRAF-pop included in GRAF computes subject ancestries using genotypes and normalizes ancestry prediction in large datasets collected across different genotyping platforms, making it possible to generate population frequency based on more than a million dbGaP samples.

Who can use this?

GRAF is a tool for researchers; it is not designed to assess an individual’s ancestry or to find relatives.

You can use this tool against your own large datasets with results generated within hours or minutes, even when there is a very high genotype missing rate to the order of 99%. This tool can check genotype datasets obtained using different chips or platforms, plotting them in the same picture for comparison purposes.

Continue reading