Earlier this year, we announced the release of a new and improved search feature that interprets plain language to give better results for common searches. This feature, originally developed in NCBI Labs and later released on the NCBI All Databases search, is now available across several NCBI resources: Nucleotide, Protein, Gene, Genome, and Assembly. Whether you are searching for a specific gene or for a whole genome, you will now retrieve NCBI’s best results regardless of the database you search.
The image below shows the results for a search for human INS in the Nucleotide database. Even though this is a Nucleotide search, the results include relevant information from Gene, Protein, Taxonomy, plus links to the NCBI reference sequences (RefSeq) as well as access to BLAST and the insulin gene region in NCBI’s genome browser, the Genome Data Viewer.Figure 1. The new natural language search result in the Nucleotide database from a search for human INS.
Try out this new search capability and let us know what you think. And keep visiting the NCBI Labs search page to try our latest experiments, which we’ll also announce here on NCBI Insights.
Professors, we know you’re busy — really, really busy. You have to develop and teach your courses and labs, coordinate and run your journal clubs and seminars, direct your lab’s research efforts, write grants and publications, counsel and mentor your students, and stay current on everything related to your teaching and research topics.
NCBI has information that can help with all of this, but there are so many interesting records and so little time to organize them. Sign up (Help) for or log in (Help) to your free NCBI Account and let us help you get started and get organized!
Read on – or watch the video embedded below – to learn more about what you can do with your NCBI Account.
Consistent protein nomenclature is indispensable for communication, literature searching and entry retrieval. NCBI, the European Bioinformatics Institute (EMBL-EBI), the Protein Information Resource (PIR) and the Swiss Institute for Bioinformatics (SIB) revised and reorganized previous guidelines from UniProt and NCBI. This joint effort produced universal guidelines in nomenclature and protein naming to promote clarity in communication and improve consistency in data retrieval across databases.
These guidelines are exclusively focused on nomenclature, providing rules about universal formatting and protein naming choices; they do not include best practices for identifying or predicting function. They cover usage of language, abbreviations, symbols, punctuation, notation, terms and style. Sources of protein names and options for protein naming are also discussed.
During the 2018 INSDC annual meeting, the three collaborating sequence databases (DDBJ, EBI and GenBank) agreed to recommend these guidelines to their submitters. The Protein Naming Guidelines working group plans to write a peer-reviewed publication about protein naming and to track future changes to this document in GitHub.
Do you ever want to see the flanking genes of a protein match from a BLAST search? On June 20th, we’ll show you how to see the genomic context of bacterial proteins using the identical protein report and the graphical sequence viewer. You will also learn to use these reports in detail and how to get these genomic contexts in batch for a set of protein matches using the identical proteins report and EDirect .
Date and time: Wed, June 20, 2018 12:00 PM – 12:30 PM EDT
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
A study (PMID: 28158543) published in the July 2017 issue of Bioinformatics collects, classifies and analyzes single nucleotide variants (SNVs) that may affect response to currently approved drugs. They identified 2,640 SNVs of interest, most of which occur rarely in populations (minor allele frequency <0.01).
The researchers used protein sequence alignment tools and mined open data from multiple information resources accessed through E-utilities including PubChem Compound (Kim et al., 2016 PMID: 26400175), NCBI Gene (Maglott D, et al., 2014. PMID: 25355515), NCBI Protein (Sayers, 2013), MMDB (Madej et al., 2012 PMID: 22135289), PDB (Berman et al., 2000 PMID: 10592235), dbSNP (Sherry et al., 2001 PMID: 11125122), and ClinVar (Landrum et al., 2016 PMID: 26582918).
Questions, comments, and other feedback may be sent to Yanli Wang.
On Wednesday, February 14, 2018, NCBI will present a webinar that will show you how to quickly retrieve sequences in any format from NCBI.
Date & time: Wed, Feb 14, 2018 12:00 PM – 12:30 PM EST
Ever need to quickly grab a protein or nucleotide sequence in FASTA or another format from NCBI? This NCBI Minute will show you how to accomplish this using the nucleotide and protein web pages, an NCBI URL, and – the most flexible way – through the commandline EDirect client that accesses the E-Utilities API.
BLAST is a powerful search tool, but often a search is just the beginning of the journey. We put ourselves in the shoes of a researcher who has just sequenced a handful of samples from the latest viral outbreak and tried to understand what information would be most useful. We also reached out to researchers in the field and asked: a) what questions do they really want to answer? and b) how can NCBI best provide the answers? Based on insights from those questions and answers, we developed the new Virus Sequence Search Interface (Fig. 1). The Search Interface is an NCBI Labs project, which means it is an experimental project, and we may modify the resource based on your feedback and experiences.
Figure 1. The Virus Sequence Selection Interface. The Virus Sequence Selection Interface accepts as input nucleotide and protein accessions, as well as FASTA and plain-text formatted sequences. The user selects either “Nucleotide” or “Protein,” depending on the sequence type, and selects the virus type from the pull-down menu below the text entry field.
Sequence Viewer 3.23 has several new features, improvements and bug fixes, including performance optimization for alignment renderings and improved tooltips in uploaded VCF files. For a full list of changes, see the Sequence Viewer release notes.
Sequence Viewer is a graphical view of sequences and color-coded annotations on regions of sequences stored in the Nucleotide and Protein databases.