Designing exon-specific primers for the human genome


A common task facing geneticists is to assay for sequence changes at particular locations in genes. These assays often look for changes in the coding exons of genes, and the target sequences are typically amplified from genomic DNA by PCR with a pair of specific primers. In this article, we will show you how to use NCBI Reference Sequences and Primer-BLAST, NCBI’s primer designer and specificity checker, to design a pair of primers that will amplify a single exon (exon 15) of the human breast cancer 1 (BRCA1) gene.

Here are the steps to follow to design primers to amplify exon 15 from human BRCA1.

Sequence updates in human assembly GRCh38: improving gene annotation


In an earlier blog post, we discussed how sequence updates in GRCh38, the most recent version of the human reference genome, filled in a gap in human chromosome 17 near position 21,300K and expanded the region by 500K (500,000 base pairs). In this post, we will again consider this same region, but with an emphasis now on how GRCh38 also improved the gene annotation.

Figure 1. Annotation of a region of chromosome 17 near the KCNJ12 and KCNJ18 genes. Top panel: Annotation release 105 on GRCh37.p13 represented by a configured graphic display of sequence record NC_000017.10. Bottom panel: Annotation release 106 on assembly GRCh38 represented by a configured graphic display of sequence record NC_000017.11. New gene models are circled. 

Figure 1 shows a narrower area that corresponds to components AC068418.5 and AC233702.5 on GRCh38. The graphic display is configured so that it shows annotated gene models without the corresponding transcripts and proteins. The two assemblies share component AC068418.5 along with the five gene models annotated on it.  That the same sequence would have the same annotation over time might seem an obvious outcome, but this is not always the case. Annotations on the same sequence (same assembly) can change from one annotation release to another if new transcript data support a new gene model, and this process of gathering and presenting new evidence for gene models is one of the major purposes of new annotation releases on a given assembly.

Progressing from AC068418.5 towards the gap in GRCh37.p13, the gene annotation diverges. Obviously, nothing can be annotated within the GRCh37.p13 gap. But in GRCh38/Annotation release 106, where this gap has been filled by AC233702.5 (along with other new sequences), a new gene designated as KCNJ18 now appears. KCNJ18 (Gene ID: 100134444) is a member of the inwardly-rectifying channel subfamily J that Devon Ryan and his colleagues from the University of California recently discovered. They also reported evidence that the gene is associated with a muscle disorder (PMCID: PMC2885139). The transcript sequence of the gene was deposited in GenBank in 2008 and updated in 2010 (FJ434338.2). Improvements in the new assembly now allow this transcript sequence to align to the assembled genome sequence, and thus KCNJ18 has found its place on the human genome.

One might predict from the above discussion that if a researcher had searched for FJ434338.2 in GRCh37.p13, they would have found nothing because of the gap in the assembly. In fact, the genomic sequence was available in GRCh37.p13 on NW_003315950.2, a separate sequence record that is one of the fix patches that the GRC released during the five-year period between the releases of GRCh37 and GRCh38. A fix patch is a region where the sequence has been improved or corrected between assemblies. Now, in GRCh38, the sequence of the fix patch has been integrated into chromosome 17 in the region that we have just examined.
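As a minimal sketch of how one might retrieve the two records mentioned above for a closer look, the snippet below builds request URLs for NCBI's E-utilities EFetch service. It only constructs the URLs and makes no network request. The helper name `efetch_fasta_url` is hypothetical and not part of any NCBI library, and the choice of the `nuccore` database and FASTA output is an assumption about what a reader would want.

```python
from urllib.parse import urlencode

# Base address of the E-utilities EFetch service.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession: str, db: str = "nuccore") -> str:
    """Build an EFetch URL requesting one record as plain-text FASTA.

    Hypothetical helper for illustration; it does not contact NCBI.
    """
    params = {
        "db": db,            # nucleotide database (assumed appropriate here)
        "id": accession,     # accession.version, e.g. "FJ434338.2"
        "rettype": "fasta",  # return the sequence in FASTA format
        "retmode": "text",   # as plain text rather than XML
    }
    return f"{EFETCH_BASE}?{urlencode(params)}"


# The KCNJ18 transcript and the GRCh37.p13 fix patch named in the post.
for acc in ("FJ434338.2", "NW_003315950.2"):
    print(efetch_fasta_url(acc))
```

Opening either URL in a browser, or passing it to a download tool, should return the corresponding sequence, which could then be aligned against the chromosome 17 region discussed here.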

The KCNJ18 gene is one of numerous genes where a gap closure allowed placements of new gene models on the genome.  The following are other examples:

Gaps, however, are not the only problem in genomic assemblies. While small-scale deletions or insertions usually allow gene model placement on the genome, they often cause misalignments between the genome and transcript sequences.  Some examples where correcting deletions or insertions improved gene annotations in GRCh38 are the following:

We hope this provides a starting point for exploring the improvements in GRCh38. You can find more information about the new release at the links below.

NCBI’s 3 Newest Medical Genetics Resources: GTR, MedGen & ClinVar


NCBI has three relatively new online resources for information about genetic tests, genetic conditions, and genetic variations:

  • The Genetic Testing Registry, or GTR – a registry of genetic tests for heritable and somatic changes in humans
  • MedGen – a medical genetics portal that focuses on information about medical conditions with a genetic component
  • ClinVar – an archival database that contains reported assertions about the relationship between genetic variations and phenotypes

This blog post provides a very brief overview of the three resources by outlining some of their content features. For a more thorough introduction to the three resources, including the types of information available in each and how to use them, we recommend viewing the approximately hour-long webinar that we conducted in June 2014.

The GTR, MedGen and ClinVar databases are all integrated, making it simple to navigate between them to find related information. They are also integrated with a number of other databases, such as OMIM, GeneReviews, PubMed, Genetics Home Reference, and others.  This integration provides a rich information space for exploration, but it is nonetheless helpful to know where you might want to start based on the type of information you are seeking.

Advice for NIH Grantees: How to comply with the NIH Public Access Policy


“The NIH public access policy requires scientists to submit final peer-reviewed journal manuscripts that arise from NIH funds to PubMed Central immediately upon acceptance for publication.” – http://publicaccess.nih.gov/

To comply with NIH Public Access Policy, here are the steps you should take:

Determine if the Public Access Policy applies to your publication

Generally, the NIH Public Access Policy applies to any peer-reviewed journal article that was accepted for publication on or after April 7, 2008 and that arose from NIH funding in Fiscal Year 2008 or later.

Determine Applicability for Your Publication

What does the NIH consider to be a ‘journal’?

Review your publication agreement

Before you sign a publication agreement or similar copyright transfer agreement, first make sure that the agreement allows the paper to be posted to PubMed Central (PMC) in accordance with the NIH Public Access Policy.


New SciENcv Features Allow Users To Create and Download Multiple Biosketches


NCBI’s recent update to the SciENcv feature in MyNCBI gives researchers the ability to create multiple biosketches for grants from federal agencies engaged in scientific research, allowing a more tailored and convenient approach to the grant application process.

What is SciENcv?

SciENcv (Science Experts Network Curriculum Vitae) is designed to help researchers assemble an NIH biosketch by extracting information from NIH eRA Commons and PubMed. The SciENcv interagency working group includes NIH, as well as DOD, DOE, EPA, NSF, USDA and the Smithsonian. You can access SciENcv if you have a My NCBI account. My NCBI accounts are free and offer many useful features, such as saving searches, automated e-mail alerts and My Bibliography.

 Create your biosketch

Based on user suggestions, we’ve made it possible to create biosketches in three ways: from scratch, from an external source, or by duplicating an existing profile (see Figure 1). While the eRA Commons data feed is currently the only external data option, we plan on adding other external data sources in a future release of SciENcv.

Figure 1. Three ways to create your NIH biosketches in SciENcv



The Second Offering of “A Librarian’s Guide to NCBI” at NIH


NCBI, in collaboration with NLM and the National Network of Libraries of Medicine NLM Training Center (NTC) at the University of Utah, recently presented the second offering of A Librarian’s Guide to NCBI. Health Sciences Librarians from 17 universities and two federal agencies attended the five-day intensive course on the NIH campus. This second offering of the training continues to prepare health science librarians for supporting NCBI molecular databases and tools, and training patrons in the use of NCBI resources at their own institutions.

Participants and instructors from the 2014 “A Librarian’s Guide to NCBI” outside the National Library of Medicine.


As before, all the course materials are available online. Feel free to learn from them, adapt them for your own teaching, and share them with others. You can use the links below to access the updated 2014 course materials. These include the slide sets with demonstrations and practice problems.


Sequence updates in human genome assembly GRCh38: filling in the gaps


In a previous blog post, we explained several important concepts about the human reference genome.  We presented a region of human chromosome 17 as an example of a location where the genome sequence was not fully assembled.  In this post, we are going to revisit the same gapped region to see how the Genome Reference Consortium (GRC) changed this part of the genome in GRCh38, the updated human reference assembly released in December 2013.  This region represents just one of the more than 1,000 changes and improvements that the GRC introduced in GRCh38.


The Tasmanian Devil 2: The tumor and Tasmanian devil mitochondrial genomes


The Tasmanian devil (Sarcophilus harrisii), the last remaining large marsupial carnivore, now faces extinction because of a strange and deadly infection, a transmissible cancer known as Transmissible Devil Facial Tumor Disease (TDFTD).  In a previous NCBI Insights post, we discussed gene expression data from the tumors that established their neural origin and showed the tumors were likely derived from Schwann cells.  In this post, we’ll consider some of the genome sequencing projects in the NCBI databases and explore evidence that the tumor originated in a different individual than the affected animal, supporting the idea that the tumor cells themselves are infectious agents.

NCBI’s Genome Remapping Service assists in the transition to the new human genome reference assembly (GRCh38)


In late December 2013, the Genome Reference Consortium (GRC) released an updated version of the human reference genome assembly, GRCh38, and submitted these new sequences to GenBank. This is the first time in four years that a new major version of the human genome has become available to the genomics community.

Perhaps you’ve been working on data mapped to the previous assembly (GRCh37) that became available in March 2009, or maybe you are still using an even earlier version, such as NCBI36 from March 2006. Is there a way to reduce the amount of time and effort required to reanalyze your data in the context of the new assembly?

Yes! It’s NCBI’s Genome Remapping Service, or NCBI Remap for short.
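To illustrate why coordinates need remapping at all, here is a toy sketch, not a description of how NCBI Remap works. Real remapping relies on alignments between assemblies; this example only shows that when sequence is added at one site, every annotation downstream of that site moves. The function name `shift_position` is hypothetical, and the constants are the rounded chromosome 17 figures quoted elsewhere on this page (a gap near position 21,300K, expanded by about 500K), used purely as an assumption.

```python
def shift_position(pos: int, insert_at: int, insert_len: int) -> int:
    """Map a position on the old sequence to the new sequence.

    Toy model: insert_len bases were added immediately after insert_at,
    so positions at or before insert_at stay put and later ones move.
    """
    return pos if pos <= insert_at else pos + insert_len


# Rounded, illustrative values; not exact assembly coordinates.
GAP_SITE = 21_300_000
ADDED_BASES = 500_000

for old_pos in (21_000_000, 22_000_000):
    new_pos = shift_position(old_pos, GAP_SITE, ADDED_BASES)
    print(f"{old_pos:,} -> {new_pos:,}")
```

A feature upstream of the filled gap keeps its coordinate, while one downstream shifts by the amount of added sequence. Multiply that by the many changes between assemblies and an automated service becomes the practical option.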


A Librarian’s Guide to NCBI — an intensive training course for medical librarians to be offered April 2014


The NCBI in partnership with the National Library of Medicine Training Center (NTC) will offer the Librarian’s Guide to NCBI course on the NIH campus in April 2014. This will be the second presentation of the course; it was previously offered in the spring of 2013 (NCBI Insights April 11 and May 6, 2013). After the course, we will post lecture slides and hands-on practical exercises on the education area of the NCBI FTP site and video tutorials of the course lectures will be available on the NCBI YouTube channel. Materials from the 2013 course are available, as well as lecture videos for the expression module.