In October last year, we announced the launch of an exciting new collaboration between NCBI and EMBL-EBI called MANE (Matched Annotation from the NCBI and EMBL-EBI). As a first step, we began generating the MANE Select set, comprising a matched representative transcript for every human protein-coding gene. Now that our genome resources are integrated into a high-quality transcript set, you don’t need to choose between RefSeq and Ensembl/GENCODE datasets for genomic analyses.
Not only does the MANE Select set make it easier for you to exchange data or translate coordinates between RefSeq and Ensembl annotation results, but you’ll also be able to use the set with next-generation sequencing (NGS) technologies and other resources that rely on the latest and highest-quality human reference genome assembly available.
You can now test a beta version of MANE Select that covers 53% of protein-coding genes. You can access these data on NCBI’s FTP website. The data are released in multiple file formats, including GFF, GTF and FASTA (fna). A new display track, “Genes, MANE Project (version 0.5)”, in NCBI’s Genome Data Viewer (GDV) displays the MANE Select transcripts for the genes included in the current version. Additionally, we have set up a track hub that can display MANE Select data in genome browsers, including GDV.
NCBI and EMBL-EBI co-hosted a webinar in December to introduce the MANE project and provide information about the data model and methodology associated with the MANE Select set. This webinar is now available on NCBI’s YouTube channel for everyone to view and learn more about this collaborative project.
In building the MANE Select set, both NCBI and EMBL-EBI updated many transcripts in their respective annotation sets to create matching annotation. Bulk consumers of RefSeq and Ensembl/GENCODE annotations may want to take note of these updates, because each updated transcript now carries a new version number on its accession.
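For bulk consumers who want to check which of their stored transcript identifiers are affected, here is a minimal Python sketch. It assumes identifiers follow the usual `accession.version` pattern used by both RefSeq and Ensembl, and that you already hold two lists of identifiers: one from your existing data and one from a newer annotation release. The function names and the example identifiers are hypothetical and chosen only for illustration. They are not part of any NCBI or EMBL-EBI tool.

```python
def split_accession(identifier):
    """Split an 'accession.version' string into (accession, version)."""
    base, _, version = identifier.rpartition(".")
    if not base or not version.isdigit():
        raise ValueError(f"not an accession.version identifier: {identifier!r}")
    return base, int(version)


def version_changes(old_ids, new_ids):
    """Return {accession: (old_version, new_version)} for accessions
    present in both lists whose version number differs."""
    old = dict(split_accession(i) for i in old_ids)
    new = dict(split_accession(i) for i in new_ids)
    shared = sorted(old.keys() & new.keys())
    return {acc: (old[acc], new[acc]) for acc in shared if old[acc] != new[acc]}


if __name__ == "__main__":
    # Made-up identifiers for illustration only; not real MANE Select data.
    stored = ["NM_000001.2", "NM_000002.1", "ENST00000000001.4"]
    latest = ["NM_000001.3", "NM_000002.1", "ENST00000000001.5"]
    for acc, (old_v, new_v) in version_changes(stored, latest).items():
        print(f"{acc}: version {old_v} -> {new_v}")
```

Accessions that appear in only one of the two lists are ignored here. A real workflow would probably want to report those separately, since they may indicate added or retired transcripts.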
Our goal for this year is to expand the MANE Select set to include more protein-coding genes. As we work towards this goal, we expect to make improvements to the MANE Select pipeline. The expert curators at NCBI and EMBL-EBI will help in quality control of the data and in the review of difficult cases where a computational choice of the MANE Select transcript is not ideal. Both annotation groups plan to incorporate the MANE Select data in their annotation updates in the spring of this year.
We’re designing the MANE project to help a wide range of NCBI and Ensembl/GENCODE users who are looking for high-value, consistent annotations that provide a framework for clinical reporting, comparative genomics, and other scientific pursuits. We welcome your comments about this dataset at MANEemail@example.com or firstname.lastname@example.org.