We have a new and improved search experience for viral genes from select human pathogens. When you search for a virus such as HIV-1 (more examples below), you now get an interactive graphical representation of the viral genome where you can see all the annotated viral proteins in context. Clicking on the gene/protein objects allows you to access sequences, publications, and analysis tools for the selected protein. This new feature is designed to help you quickly find information relevant to your research on clinically important viruses.
Figure 1. Top: The virus genome graphic result for a search with HIV-1 with access to analysis tools, downloads, and relevant results in the Genome and Virus resources. Bottom: The result obtained by clicking the env gene graphic, which provides links to protein and nucleotide sequences, the literature, analysis tools, and downloads.
Try it out using the following example searches and let us know what you think!
NCBI is testing a new way to find and retrieve orthologous vertebrate genes. To find orthologs, enter a gene symbol (e.g., RAG1) or a gene symbol combined with a taxonomic group (e.g., primate RAG1). Select the matching entry from the suggestions menu, or select the orthologs option (e.g., Rag1 orthologs) to see all orthologs. Your search will return a results link to the set of orthologs provided by NCBI’s Gene resource. Click on the results link to see information for that ortholog group (Figure 1).
Figure 1. Search for Rag1 orthologs showing the link to the set of RAG1 genes from vertebrates.
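If you would rather script a similar lookup than type it into the search box, the sketch below builds a query URL for the Gene database using NCBI's E-utilities ESearch endpoint. This is not the ortholog search feature described above, only a rough programmatic analog. The function name `gene_search_url` is hypothetical, and combining the `[sym]` and `[orgn]` field tags is our assumption about a reasonable way to express "gene symbol plus taxonomic group".

```python
from urllib.parse import urlencode

# Public E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_search_url(symbol, taxon=None, retmax=20):
    """Build an ESearch URL for the Gene database (hypothetical helper).

    symbol: gene symbol, e.g. "RAG1"
    taxon:  optional taxonomic group, e.g. "primates" (assumed usage)
    """
    # Restrict the match to the gene symbol field.
    term = f"{symbol}[sym]"
    if taxon:
        # Narrow the results to a taxonomic group.
        term += f" AND {taxon}[orgn]"
    params = {"db": "gene", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Prints the URL only; no network request is made here.
    print(gene_search_url("RAG1", "primates"))
```

Fetching the printed URL should return a list of matching Gene IDs, not the curated ortholog set that the results link in Figure 1 points to.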
If you’ve been searching in Gene, Nucleotide, Protein, Genome or Assembly databases, you’ve probably noticed the new search experience we introduced in September to interpret several common language searches and offer improved results. We’re excited to announce we’ve added as-you-type suggestions to the search bar in these databases.
Here’s a peek at the new menu in the NCBI Gene database.
Figure 1. Typing into the search box brings up automatic suggestions of the most popular queries.
Next Wednesday, November 14, 2018, NCBI staff will show you how to use NCBI’s genome browsers and other resources to interpret variants. The graphical displays of Genome Data Viewer (GDV) and Variation Viewer offer an interactive experience that allows you to explore NCBI’s rich collection of annotations, datasets, and literature for deciphering your variant-associated data. In this presentation, we’ll step through case studies and show you how to quickly display relevant NCBI track sets, including the new RefSeq Functional Elements track; upload a file or remotely hosted dataset and display it as a track; and use browser tracks to identify known variants and then assess their functional and clinical significance and allele frequency. You will also learn how to navigate from the browsers to NCBI resources such as ClinVar, dbSNP, and PubMed for additional variant information.
Date and time: Wed, Nov 14, 2018 12:00 PM – 12:45 PM EST
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
The RefSeq project at the NCBI and the Ensembl/GENCODE project at EMBL-EBI have provided independent high-quality human reference gene datasets to biologists since the sequencing of the human genome.
Now we’re joining together on an exciting new project called Matched Annotation from the NCBI and EMBL-EBI, or MANE, to provide a matched set of well-supported transcripts for human protein-coding genes and to define one representative transcript for each gene. Both RefSeq and Ensembl will continue to provide a rich set of alternate transcripts per gene.
Earlier this year, we announced the release of a new and improved search feature that interprets plain language to give better results for common searches. This feature, originally developed in NCBI Labs and later released on the NCBI All Databases search, is now available across several NCBI resources: Nucleotide, Protein, Gene, Genome, and Assembly. Whether you are searching for a specific gene or for a whole genome, you will now retrieve NCBI’s best results regardless of the database you search.
The image below shows the results for a search for human INS in the Nucleotide database. Even though this is a Nucleotide search, the results include relevant information from Gene, Protein, Taxonomy, plus links to the NCBI reference sequences (RefSeq) as well as access to BLAST and the insulin gene region in NCBI’s genome browser, the Genome Data Viewer.
Figure 1. The new natural language search result in the Nucleotide database from a search for human INS.
Try out this new search capability and let us know what you think. And keep visiting the NCBI Labs search page to try our latest experiments, which we’ll also announce here on NCBI Insights.
Professors, we know you’re busy — really, really busy. You have to develop and teach your courses and labs, coordinate and run your journal clubs and seminars, direct your lab’s research efforts, write grants and publications, counsel and mentor your students, and stay current on everything related to your teaching and research topics.
NCBI has information that can help with all of this, but there are so many interesting records and so little time to organize them. Sign up for or log in to your free NCBI Account and let us help you get started and get organized!
Read on – or watch the video embedded below – to learn more about what you can do with your NCBI Account.
The CCDS project is a collaborative effort to identify a core set of human and mouse protein coding regions that are consistently annotated and of high quality. The long-term goal is to support convergence towards a standard set of gene annotations.
A total of 20,203 protein-coding genes and 17,871 non-coding genes were annotated.
The number of annotated curated transcripts increased by 17% and genes with two or more curated alternative variants increased by 8%.
The annotation includes 6,862 features and 2,075 GeneIDs for non-genic functional elements, such as regulatory regions and known structural elements. For example, see the opsin locus control region (OPSIN-LCR).