If you’ve been searching in Gene, Nucleotide, Protein, Genome or Assembly databases, you’ve probably noticed the new search experience we introduced in September to interpret several common language searches and offer improved results. We’re excited to announce we’ve added as-you-type suggestions to the search bar in these databases.
Here’s a peek at the new menu in the NCBI Gene database.
Figure 1. Typing into the search box brings up automatic suggestions of the most popular queries.
Next Wednesday, November 14, 2018, NCBI staff will show you how to use NCBI’s genome browsers and other resources to interpret variants. The graphical displays of Genome Data Viewer (GDV) and Variation Viewer offer an interactive experience that allows you to explore NCBI’s rich collection of annotations, datasets and literature for deciphering your variant-associated data. In this presentation, we’ll step through case studies and show you how to quickly display relevant NCBI track sets, including the new RefSeq Functional Elements track; upload a file or remotely hosted dataset and display it as a track; and use browser tracks to identify known variants, then assess their functional and clinical significance and allele frequency. You will also learn how to navigate from the browsers to NCBI resources such as ClinVar, dbSNP and PubMed for additional variant information.
Date and time: Wed, Nov 14, 2018 12:00 PM – 12:45 PM EST
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
The RefSeq project at the NCBI and the Ensembl/GENCODE project at EMBL-EBI have provided independent high-quality human reference gene datasets to biologists since the sequencing of the human genome.
Now we’re joining together on an exciting new project called Matched Annotation from the NCBI and EMBL-EBI (MANE). Its aim is to provide a matched set of well-supported transcripts for human protein-coding genes and to define one representative transcript for each gene. Both RefSeq and Ensembl will continue to provide a rich set of alternate transcripts per gene.
Earlier this year, we announced the release of a new and improved search feature that interprets plain language to give better results for common searches. This feature, originally developed in NCBI Labs and later released on the NCBI All Databases search, is now available across several NCBI resources: Nucleotide, Protein, Gene, Genome, and Assembly. Whether you are searching for a specific gene or for a whole genome, you will now retrieve NCBI’s best results regardless of the database you search.
The image below shows the results of a search for human INS in the Nucleotide database. Even though this is a Nucleotide search, the results include relevant information from Gene, Protein, and Taxonomy, plus links to the NCBI reference sequences (RefSeq) as well as access to BLAST and the insulin gene region in NCBI’s genome browser, the Genome Data Viewer.
Figure 1. The new natural language search result in the Nucleotide database from a search for human INS.
Try out this new search capability and let us know what you think. And keep visiting the NCBI Labs search page to try our latest experiments, which we’ll also announce here on NCBI Insights.
Professors, we know you’re busy — really, really busy. You have to develop and teach your courses and labs, coordinate and run your journal clubs and seminars, direct your lab’s research efforts, write grants and publications, counsel and mentor your students, and stay current on everything related to your teaching and research topics.
NCBI has information that can help with all of this, but there are so many interesting records and so little time to organize them. Sign up (Help) for or log in (Help) to your free NCBI Account and let us help you get started and get organized!
Read on – or watch the video embedded below – to learn more about what you can do with your NCBI Account.
The CCDS project is a collaborative effort to identify a core set of human and mouse protein coding regions that are consistently annotated and of high quality. The long-term goal is to support convergence towards a standard set of gene annotations.
A total of 20,203 protein-coding genes and 17,871 non-coding genes were annotated.
The number of annotated curated transcripts increased by 17% and genes with two or more curated alternative variants increased by 8%.
The annotation includes 6,862 features and 2,075 GeneIDs for non-genic functional elements, such as regulatory regions and known structural elements. For example, see the opsin locus control region (OPSIN-LCR).
A study (PMID: 28158543) published in the July 2017 issue of Bioinformatics collects, classifies and analyzes single nucleotide variants (SNVs) that may affect response to currently approved drugs. The authors identified 2,640 SNVs of interest, most of which occur rarely in populations (minor allele frequency <0.01).
The researchers used protein sequence alignment tools and mined open data from multiple information resources accessed through E-utilities including PubChem Compound (Kim et al., 2016 PMID: 26400175), NCBI Gene (Maglott et al., 2014 PMID: 25355515), NCBI Protein (Sayers, 2013), MMDB (Madej et al., 2012 PMID: 22135289), PDB (Berman et al., 2000 PMID: 10592235), dbSNP (Sherry et al., 2001 PMID: 11125122), and ClinVar (Landrum et al., 2016 PMID: 26582918).
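For readers who have not used E-utilities, here is a minimal Python sketch of how a query against one of these databases might be set up. It only builds an ESearch request URL and does not contact NCBI. The helper name `build_esearch_url` and the example search term are our own illustrations, not taken from the study, and the sketch says nothing about how the authors actually structured their queries.

```python
from urllib.parse import urlencode

# Public base URL for the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=20):
    """Return an ESearch URL for an Entrez database (hypothetical helper).

    db     -- Entrez database name, e.g. "gene", "snp" or "clinvar"
    term   -- Entrez query string
    retmax -- maximum number of record IDs to return
    """
    params = {
        "db": db,
        "term": term,
        "retmode": "json",  # ask for JSON rather than the default XML
        "retmax": retmax,
    }
    # urlencode percent-encodes brackets and spaces in the query term.
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Illustrative query only: the human CYP2C19 gene in the Gene database.
    print(build_esearch_url("gene", "CYP2C19[gene] AND human[orgn]", retmax=5))
```

The resulting URL can be fetched with any HTTP client. NCBI asks heavy users to limit their request rate and to register for an API key, so a real pipeline would add both.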
Questions, comments, and other feedback may be sent to Yanli Wang.
Last February, we added gene expression data to Gene. Now, you can access these data in a few new ways.
Figure 1. The expression teaser text from the human CYP2C19 gene record. CYP2C19 is a phase-one drug-metabolism gene expressed in liver and other organs/tissues involved in metabolizing drugs and other xenobiotics.
Expression pattern “teasers” in Summary
We’ve added a brief sentence describing the expression pattern to the Summary section. This teaser sentence describes tissue-specific expression of the gene, with a link to the complete description that appears in the Expression section.