In late December 2013, the Genome Reference Consortium (GRC) released an updated version of the human reference genome assembly, GRCh38, and submitted these new sequences to GenBank. This is the first time in four years that a new major version of the human genome has become available to the genomics community.
Perhaps you’ve been working on data mapped to the previous assembly (GRCh37), which became available in March 2009, or maybe you are still using an even earlier version, such as NCBI36 from March 2006. Is there a way to reduce the amount of time and effort required to reanalyze your data in the context of the new assembly?
Yes! It’s NCBI’s Genome Remapping Service, or NCBI Remap for short.
NCBI Remap is a tool that allows you to convert annotation data from one coordinate system to another, such as from GRCh37 to GRCh38. This remapping uses genomic alignments to project features from one sequence to the other. In a nutshell, you provide your own data based on the coordinates of a specific assembly, tell NCBI Remap which assembly you’d like to convert the coordinates to, and get back coordinate mapping files for your data.
The Remap tool is particularly helpful when you want to compare your data with NCBI RefSeq annotations. The new annotation (version 106) corresponding to this new assembly is anticipated to be available later this month. In the meantime, data submitted with RefSeq identifiers will be mapped onto the corresponding GenBank record; for example, NC_000019.9:g.45411941T>C maps to CM000681.2:44908684.
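To make the idea of projecting a feature through an alignment concrete, here is a minimal Python sketch. It is not NCBI Remap's actual implementation: the `AlignedBlock` class, the `remap_position` function, and the block boundaries are all hypothetical, chosen only so that the single coordinate pair quoted above comes out as stated. Real assembly alignments contain many blocks, gaps, and strand changes that this sketch ignores.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class AlignedBlock:
    """One ungapped, same-strand aligned segment (1-based, inclusive)."""
    src_seq: str    # sequence identifier on the source assembly
    src_start: int  # first aligned base on the source
    src_end: int    # last aligned base on the source
    dst_seq: str    # sequence identifier on the target assembly
    dst_start: int  # target base aligned to src_start


def remap_position(
    blocks: List[AlignedBlock], seq: str, pos: int
) -> Optional[Tuple[str, int]]:
    """Project a source position onto the target, or return None if unaligned."""
    for block in blocks:
        if block.src_seq == seq and block.src_start <= pos <= block.src_end:
            # Within an ungapped block the offset from the block start is preserved.
            return block.dst_seq, block.dst_start + (pos - block.src_start)
    return None


# Hypothetical block: the extents are invented for illustration; only the
# mapped pair 45411941 -> 44908684 comes from the example in the text.
blocks = [
    AlignedBlock("NC_000019.9", 45_400_000, 45_420_000, "CM000681.2", 44_896_743),
]

print(remap_position(blocks, "NC_000019.9", 45_411_941))
print(remap_position(blocks, "NC_000019.9", 50_000_000))
```

A position that falls outside every aligned block has no counterpart in this model, so the function returns `None` rather than guessing a coordinate.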
If you have a small amount of data, you can copy and paste it into the large text box labeled ‘Paste data here’ (see figure 1). Otherwise, you can upload a data file. NCBI Remap accepts several file formats that are commonly used in the bioinformatics community, for example:
For more information about NCBI’s Remapping Service, take a look at the following: