We know it’s not always easy to find the sequence data you’re after at NCBI. Maybe it’s because you’re no expert at constructing queries, and you end up with no results or too many results. Or maybe you’re an Entrez wizard, but creating a query full of Booleans and filters seems like overkill when you could just write a short natural language query, like you’re used to doing in Google. The next time you search for a gene, transcript or genome assembly for a given organism, try the new search experience we’re piloting in NCBI Labs.
In NCBI Labs, you can now search for sequences using natural language and get the best results.
Figure 1. The new interface for specified transcript search.
The improved search experience now available in NCBI Labs addresses 3 types of queries that commonly fail in searches at NCBI: organism-gene (e.g. human BRCA1), organism-transcript (e.g. Mouse p53 transcripts) and organism-assembly (e.g. dog reference genome). For each of these query types in NCBI Labs, we now return NCBI’s highest quality sequence sets or reference and representative assemblies in an easy-to-view panel.
Example queries are shown below to get you started.
On Wednesday, November 1, 2017, we will present a webinar on GDV, NCBI’s full-featured genome browser. In this webinar, you’ll learn how to explore and analyze sequences and annotations for eukaryotic RefSeq genome assemblies. We’ll show you how to:
- Search across the entire assembly for genes, products and other markers or jump to a specific position or range
- Display any of seven preselected track sets highlighting various aspects of the assembly or create and load your own custom track sets from your NCBI account.
- Load and display submitted alignment data from NCBI’s GEO or SRA.
- Upload your own annotation and variant data
- Display BLAST or Primer-BLAST results on the assembly in the browser.
Date and time: Wednesday, November 1, 2017 12:00-12:30PM EDT
After registering, you will receive a confirmation email with information about attending the webinar. After the live presentation, the webinar will be uploaded to the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
In June, the NCBI Eukaryotic Genome Annotation Pipeline released new annotations in RefSeq for the following organisms, including Danio rerio (zebrafish):
This blog post is directed toward Assembly users.
A new “Download assemblies” button is now available in the Assembly database. This makes it easy to download data for multiple genomes without having to write scripts.
For example, you can run a search in Assembly and use check boxes (see left side of screenshot below) to refine the set of genome assemblies of interest. Then, just open the “Download assemblies” menu, choose the source database (GenBank or RefSeq), choose the file type, and start the download. An archive file will be saved to your computer that can be expanded into a folder containing your selected genome data files.
NCBI is pleased to offer a direct entry point to the NCBI Genome Data Viewer (GDV) that supports the exploration, visualization and analysis of eukaryotic RefSeq genome assemblies.
The new GDV homepage includes an interactive interface for a quick overview of supported organisms, specific genome searches plus inter-connectivity to Assembly and RefSeq annotation resources. About 100 genome assemblies are now ready for GDV exploration with more on the way. Stay tuned!