New human genome annotation release with MANE Select and other improvements!


There’s a new RefSeq annotation available for the human genome, and it’s quite an update!

About the release

Annotation release 109.20190607 is the first release of our new bimonthly annotation schedule as announced in a previous post.   The annotated sequences are  the latest sequences for the GRCh38, patch 13 assembly, GRCh38.p13 (GCF_000001405.39). The chromosome backbone sequences remain the  same, but we’ve added 45 patch sequences representing novel and improved sequences that the Genome Reference Consortium will incorporate into the primary assembly in the future. The new annotation places the latest curated RefSeq transcripts and functional elements on the genome but keeps the same model dataset as in annotation release 109 except when the models have been replaced by curated RefSeqs or other review. We are also flagging MANE and other RefSeq Select transcripts.  Continue reading for more details on these improvements below. You can download the updated annotation here!

Continue reading

Human genome annotation will be updated every 2 months


NCBI will be updating the human genome RefSeq annotation more frequently to incorporate improvements made to genes and transcripts by RefSeq curation experts. Faster updates will allow us to include the latest datasets.

In the past, we’ve produced a full re-annotation of the human genome about once a year. The last full annotation, Homo sapiens Annotation Release 109, was in March 2018. A full annotation is produced by two main processes:

Continue reading

Human annotation release 109 for GRCh38.p12 is available in RefSeq


You can now download human annotation release 109 on FTP or explore it in the Genome Data Viewer, in the Gene database, and with BLAST.

Highlights in release 109:

  • A total of 20,203 protein-coding genes and 17,871 non-coding genes were annotated.
  • The number of annotated curated transcripts increased by 17% and genes with two or more curated alternative variants increased by 8%.
  • The annotation includes 6,862 features and 2,075 GeneIDs for non-genic functional elements, such as regulatory regions and known structural elements. For example, see the opsin locus control region (OPSIN-LCR).

Continue reading

5 NCBI articles in 2018 Nucleic Acids Research database issue


The 2018 Nucleic Acids Research database issue features several papers from NCBI staff that cover the status and future of databases including CCDS, ClinVar, GenBank and RefSeq. These papers are also available on PubMed. To read an article, click on the PMID number listed below.

Continue reading

dbSNP architecture redesign supports future human variation data expansion; changes to be introduced over the next year


To continue providing efficient and timely processing, annotation, and dissemination of data, dbSNP’s architecture and process flow have been redesigned. The technical redesign prepares the database for increasing data volumes and providing timely, effective and trustworthy reference SNP results as submission rates continue to increase.

Highlights of the new system include:

  • Use of data objects instead of a relational database
  • Improved algorithms for clustering data into unique Reference SNPs
  • Automation of the entire process to provide timely releases
  • Guaranteed data consistency across dbSNP data accessed using web-based products or downloaded content, such as VCF and FTP files

Continue reading

RefSeq Functional Elements now public


NCBI is pleased to announce the initial data release of RefSeq Functional Elements, a resource that provides RefSeq and Gene records for experimentally validated human and mouse non-genic functional elements. Data can be accessed via GeneNucleotideBLASTBioProjectGraphical Displays and FTP.

Continue reading

Designing exon-specific primers for the human genome


A common task facing geneticists is to assay for sequence changes at particular locations in genes. These assays are often looking for changes in the coding exon of genes, and the target sequences are typically amplified using PCR from genomic DNA using a pair of specific primers. In this article, we will show you how to use NCBI Reference Sequences and Primer-BLAST, NCBI’s primer designer and specificity checker, to design a pair of primers that will amplify a single exon (exon 15) of the human breast cancer 1 (BRCA1) gene.

Here are the steps to follow to design primers to amplify exon 15 from human BRCA1:

Continue reading

Sequence updates in human assembly GRCh38: improving gene annotation


In an earlier blog post, we discussed how sequence updates in GRCh38, the most recent version of the human reference genome, filled in a gap in human chromosome 17 near position 21,300K and expanded the region by 500K (500,000 base pairs). In this post, we will again consider this same region, but with an emphasis now on how GRCh38 also improved the gene annotation.

"Figure

Figure 1. Annotation of a region of chromosome 17 near the KCNJ12 and KCNJ18 genes. Top panel: Annotation release 105 on GRCh37.p13 represented by a configured graphic display of sequence record NC_000017.10. Bottom panel: Annotation release 106 on assembly GRCh38 represented by a configured graphic display of sequence record NC_000017.11. New gene models are circled. 

Continue reading

Sequence updates in human genome assembly GRCh38: filling in the gaps


In a previous blog post, we explained several important concepts about the human reference genome.  We presented a region of human chromosome 17 as an example of a location where the genome sequence was not fully assembled.  In this post, we are going to revisit the same gapped region to see how the Genome Reference Consortium (GRC) changed this part of the genome in GRCh38, the updated human reference assembly released in December 2013.  This region represents just one of the more than 1,000 changes and improvements that the GRC introduced in GRCh38.

Continue reading

NCBI’s Genome Remapping Service assists in the transition to the new human genome reference assembly (GRCh38)


In late December 2013, the Genome Reference Consortium (GRC) released an updated version of the human reference genome assembly, GRCh38, and submitted these new sequences to GenBank. This is the first time in four years that a new major version of the human genome has become available to the genomics community.

Perhaps you’ve been working on data mapped to the previous assembly (GRCh37) that became available in March 2009, or maybe you are still using an even earlier version, such as NCBI36 from March 2006. Is there a way to reduce the amount of time and effort required to reanalyze your data in the context of the new assembly?

Yes! It’s NCBI’s Genome Remapping Service, or NCBI Remap for short.

Continue reading