As part of our ongoing effort to improve your search experience, we’ve made it easier for you to find the sequence of your favorite organelle genome plus all the information and data associated with it. To find organelle genomes, search for an organism name combined with an organelle description, for example human mitochondrion, tomato chloroplast or Toxoplasma gondii RH apicoplast.
A new results panel will appear with links to the organelle genome sequence, annotated genes, and related phylogenetic and population studies. The panel appears with these searches in an All Databases search or within any of NCBI’s sequence databases, including Gene, Nucleotide, Protein, Genome, and Assembly. For the human mitochondrial genome, a graphical schematic of the genome allows you to navigate to individual mitochondrially encoded genes (Figure 1).
Figure 1. The organelle genome results for a search with human mitochondrion. The panel provides access to analysis tools, downloads, and other relevant results. Clicking any of the gene objects on the genome graphic leads to the relevant Gene record, for example Gene ID: 4512 in the case of COX1.
Try it out using the following example searches and let us know what you think!
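The organism-plus-organelle searches described above can also be written as direct links. The following is a minimal sketch, not an official NCBI tool: it assumes that the All Databases search page accepts the query in a `term` parameter at the URL shown, and the function name `organelle_search_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: URL pattern of the NCBI All Databases search page.
ALL_DATABASES_URL = "https://www.ncbi.nlm.nih.gov/search/all/"


def organelle_search_url(organism: str, organelle: str) -> str:
    """Combine an organism name and an organelle description into a search link.

    Hypothetical helper; it only builds the URL and does not contact NCBI.
    """
    term = f"{organism.strip()} {organelle.strip()}"
    return f"{ALL_DATABASES_URL}?{urlencode({'term': term})}"


# The three example searches mentioned in this post.
EXAMPLES = [
    ("human", "mitochondrion"),
    ("tomato", "chloroplast"),
    ("Toxoplasma gondii RH", "apicoplast"),
]

if __name__ == "__main__":
    for organism, organelle in EXAMPLES:
        print(organelle_search_url(organism, organelle))
```

Opening any of the printed links in a browser should be equivalent to typing the same words into the search box, provided the assumed URL pattern holds.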
On Wednesday, September 11, 2019 at 12 PM, NCBI staff will present a webinar for people with limited experience working with gene and sequence information. You will learn about the kinds of data available for genes and sequences, how to select the most informative records, and how to find related genes and sequences using pre-computed information and the BLAST sequence search service.
Date and time: Wed, Sep 11, 2019 12:00 PM – 12:30 PM EDT
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
As you may know, we have been offering a new BLAST results page (Figure 1) as a test page since April. In response to your positive reception and after incorporating many improvements that you suggested, we made the new results the default today, August 1, 2019.
You will still be able to access the traditional results for several months. This will give you additional time, if you need it, to adjust your workflows or teaching materials to the new display.
We thank all past and present submitters of EST and GSS data for the invaluable benefit these data have provided to numerous genomic sequencing projects over the years. Please let us know if you have any questions or concerns about these changes!
Primer-BLAST, NCBI’s primer-designer and specificity-checker, now offers a way to help you deal with irrelevant off-target matches.
Sometimes Primer-BLAST can’t design specific primers for your target sequence because of similar non-target sequences in the database. In some cases, you may know that these non-target matches are not important to your research and are safe to ignore. Examples may include tissue-specific splice variants, redundant entries, and predicted sequences. To help in these cases, you can now choose to allow certain off-target matches. This gives Primer-BLAST greater freedom in primer selection and a better chance of finding highly specific primers.
In late May, we introduced a new type of search experience in NCBI Labs that uses natural language queries to make common tasks easier. The experience at NCBI Labs – where we experiment with potential new features and tools – proved successful. We’re pleased to announce that we added this simplified search capability to NCBI’s global search page. Some natural language queries now work in the “All Databases” search from the NCBI home page!
We know it’s not always easy to find the sequence data you’re after at NCBI. Maybe it’s because you’re no expert at constructing queries, and you end up with no results or too many results. Or maybe you’re an Entrez wizard, but creating a query full of Booleans and filters seems like overkill when you could just write a short natural language query, like you’re used to doing in Google. The next time you search for a gene, transcript or genome assembly for a given organism, try the new search experience we’re piloting in NCBI Labs.
In NCBI Labs, you can now search for sequences using natural language and get the best results.
The improved search experience now available in NCBI Labs addresses 3 types of queries that commonly fail in searches at NCBI: organism-gene (e.g. human BRCA1), organism-transcript (e.g. Mouse p53 transcripts) and organism-assembly (e.g. dog reference genome). For each of these query types in NCBI Labs, we now return NCBI’s highest quality sequence sets or reference and representative assemblies in an easy-to-view panel.
Example queries are shown below to get you started.
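For comparison, here is roughly what the "Entrez wizard" version of an organism-gene query looks like. This is a sketch under stated assumptions rather than a description of how the new search works internally: it builds an E-utilities ESearch URL against the Gene database using the `[Gene Name]` and `[Organism]` field tags, and the helper name `gene_esearch_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_esearch_url(organism: str, symbol: str, retmax: int = 5) -> str:
    """Build a fielded Gene query such as 'BRCA1[Gene Name] AND Homo sapiens[Organism]'.

    Hypothetical helper; it only constructs the URL and sends no request.
    """
    term = f"{symbol}[Gene Name] AND {organism}[Organism]"
    params = {"db": "gene", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Natural language form: "human BRCA1"
    print(gene_esearch_url("Homo sapiens", "BRCA1"))
```

The natural language form, human BRCA1, expresses the same intent without field tags or Boolean operators.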
A paper in the January 2018 issue of Database describes the NCBI BioCollections database, a curated dataset of metadata for culture collections, museums, herbaria and other natural history collections connected to sequence records in GenBank. The BioCollections database was established to allow the association of specimen vouchers and related sequence records to their home institutions. This process also allows back-linking from the home institution for quick identification of all records originating from each collection.
The rapidly growing set of GenBank submissions frequently includes records that are derived from specimen vouchers. Correct identification of the specimens studied, along with a method to associate the sample with its institution, is critical to the outcome of related studies and analyses.
New repository records are added to the database if they are submitted to the International Nucleotide Sequence Database Collaboration (INSDC) along with sequence data. Each record now provides information about the institution that houses the collection, including its standard Institution Code, mailing address, and associated webpage if available.
The BioCollections database is maintained and curated by the Taxonomy group at NCBI.
UniVec, NCBI’s non-redundant database of vector sequences, has been updated to build 10.0, which enables searches run using NCBI’s VecScreen tool to detect more of the foreign sequences introduced during the cloning or sequencing process. UniVec build 10.0 is also available via FTP.
This build added 174 complete vector sequences and 214 adapter, primer and other sequences, including 133 RNA Spike-In sequences, bringing the total number of sequences represented in the UniVec database to 3,039.