Test drive a new sequence search experience at NCBI Labs

We know it’s not always easy to find the sequence data you’re after at NCBI. Maybe you’re no expert at constructing queries, and you end up with no results or too many. Or maybe you’re an Entrez wizard, but building a query full of Booleans and filters seems like overkill when you could just write a short natural-language query, as you would in Google. The next time you search for a gene, transcript, or genome assembly for a given organism, try the new search experience we’re piloting in NCBI Labs.

In NCBI Labs, you can now search for sequences using natural language and get the best results.

Figure 1. The new interface for specified transcript search.

The improved search experience now available in NCBI Labs addresses three types of queries that commonly fail in searches at NCBI: organism-gene (e.g. human BRCA1), organism-transcript (e.g. Mouse p53 transcripts), and organism-assembly (e.g. dog reference genome). For each of these query types, NCBI Labs now returns NCBI’s highest-quality sequence sets, or reference and representative assemblies, in an easy-to-view panel.

Example queries are shown below to get you started.

Genes (query by: species level organism name + gene symbol or alias)

Transcripts (query by: species level organism name + gene symbol or alias + mRNA/transcript/CDS)

Assemblies (query by: species level organism name + genome term OR assembly name)
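If you like to script your searches, the three patterns above can be sketched as plain string templates. This is only an illustration: the function names are hypothetical, the default terms are taken from the example queries in this post, and the code just builds query text to paste into the search box. It does not call any NCBI service.

```python
# Hypothetical helpers that assemble query text following the three patterns.
# They only build strings; nothing here contacts NCBI.

def gene_query(organism: str, gene: str) -> str:
    """Organism + gene symbol or alias, e.g. 'human BRCA1'."""
    return f"{organism.strip()} {gene.strip()}"


def transcript_query(organism: str, gene: str, term: str = "transcripts") -> str:
    """Organism + gene symbol or alias + a transcript term (mRNA, transcript, CDS)."""
    return f"{organism.strip()} {gene.strip()} {term.strip()}"


def assembly_query(organism: str, term: str = "reference genome") -> str:
    """Organism + a genome term, or an assembly name passed as `term`."""
    return f"{organism.strip()} {term.strip()}"


if __name__ == "__main__":
    print(gene_query("human", "BRCA1"))
    print(transcript_query("Mouse", "p53"))
    print(assembly_query("dog"))
```

Each helper keeps the organism name first, matching the order used in the example queries in this post.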

We’re also measuring interest in accessing a single ‘selected’ representative transcript for a gene versus accessing all transcripts for a gene. The RefSeq Select label identifies a preliminary representative transcript. RefSeq Select is currently provided only for human genes, and the data set is not yet finalized. NCBI (RefSeq) and EMBL-EBI (Ensembl) are working together to provide a common minimal set of identical transcripts per human gene. RefSeq will extend the selection logic to other organisms in the future. We look forward to your feedback on these experiments.

We want to know what you think!

What’s working well? What’s not? What’s missing? Click the “Feedback” button at the bottom right of the NCBI Labs page to let us know.

Stay tuned to this blog for future announcements about the roll-out of the new search functionality on the main NCBI sites, as well as the introduction of additional search improvements in NCBI Labs.
