RefSeq Release 99 is public

RefSeq release 99 is accessible online, via FTP and through NCBI’s Entrez programming utilities, E-utilities.

This full release incorporates genomic, transcript, and protein data available as of March 2, 2020, and contains 231,402,293 records, including 167,278,920 proteins, 29,869,155 RNAs, and sequences from 99,842 organisms. The release is provided in several directories, both as a complete dataset and divided into logical groupings.


Improving the Display of Type Material in the NCBI TaxBrowser

Have you ever been confused by multiple taxonomic names for a single organism? You’re not alone! It’s one of the challenges in maintaining any biological database. Recently we updated the NCBI TaxBrowser to assist with this.

Let’s start with a brief word about how investigators name species in the first place. For any new species, the reporting author declares a “type.” They then deposit a specimen, or “type material,” in a publicly available biorepository. This type material is tied to the new species name and serves as a reference for future comparisons. Researchers can then use DNA sequences obtained from type material to identify other samples from the same species. NCBI currently uses such an approach to verify the taxonomic assignment of prokaryotic genomes.

Our Taxonomy group has been curating type material records in the Taxonomy database since 2013 using a common vocabulary accepted by our international partners (the INSDC). For example, the Entrez query “type material[prop]” in the Taxonomy database will return all type material at NCBI.
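The same query can be run programmatically through E-utilities. The following is a minimal sketch, assuming the standard ESearch endpoint and JSON output; the helper names `build_esearch_url` and `fetch_json` are hypothetical and chosen for illustration, and the request itself is left as an optional call so that the URL can be inspected first.

```python
import json
from urllib.parse import urlencode, urlparse, parse_qs
from urllib.request import urlopen

# Base address of NCBI's E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=20):
    """Build an ESearch URL for an Entrez query (hypothetical helper)."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def fetch_json(url):
    """Send the request and parse the JSON response (requires network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


# The query from the text, limited to five taxonomy IDs for a quick look.
type_material_url = build_esearch_url("taxonomy", "type material[prop]", retmax=5)
print(type_material_url)

# Uncomment to run the search against NCBI:
# result = fetch_json(type_material_url)
# print(result["esearchresult"]["count"], result["esearchresult"]["idlist"])
```

Building the URL separately from sending it makes it easy to check the encoded query in a browser before scripting larger retrievals.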

So what are the improvements to the TaxBrowser?


Novel coronavirus complete genome from the Wuhan outbreak now available in GenBank

Updated!

Get rapid access to Wuhan coronavirus (2019-nCoV) sequence data from the current outbreak as it becomes available. We will continue to update the page with newly released data.

The complete annotated genome sequence of the novel coronavirus associated with the outbreak of pneumonia in Wuhan, China is now available from GenBank for free and easy access by the global biomedical community. Figure 1 shows the relationship of the Wuhan virus to selected coronaviruses.


Figure 1. Phylogenetic tree showing the relationship of Wuhan-Hu-1 (circled in red) to selected coronaviruses. Nucleotide alignment was done with MUSCLE 3.8. The phylogenetic tree was estimated with MrBayes 3.2.6 with parameters for GTR+G+I. The scale bar indicates estimated substitutions per site, and all branch support values are 99.3% or higher.


NCBI on YouTube: new videos on PubMed, My Bibliography, sequence data and more

Here are the latest videos on our YouTube channel. Subscribe to get alerts for new videos.

Introducing the Genome Submission Wizard in Genome Workbench v3.0

Genome Workbench version 3 is a major upgrade that adds the Genome Submission Wizard. This video guides you through the wizard, from uploading your genome data file to completing the submitter report, which is ready to submit to GenBank using tools such as Submission Portal or BankIt. Note: An online tutorial is available under “Manuals” on the Genome Workbench home page.


Vector graphics downloads now available in NCBI genome browsers and sequence views

You can now download images in both PDF and Scalable Vector Graphics (SVG) formats from our Sequence Viewer and genome browsers such as the Genome Data Viewer! SVG files are ideal for editing in image editors and provide high-quality graphics for publications, posters, and presentations. Both the PDF and SVG files that you download contain vector graphics for high-fidelity images.

You can download image files by choosing the “Printer-Friendly PDF/SVG” option under the Tools menu from any Graphical Sequence Viewer application (Figure 1).

Figure 1. Printer-friendly download options from the graphical view in the Genome Data Viewer. You can download either PDF or SVG formats, which are easily edited in standard graphics applications.

New search helps you find prokaryotic proteins

The latest improvement in the NCBI search experience is designed to help you quickly find microbial proteins. Now when you search for a prokaryotic protein name such as recombinase RecA in NCBI’s sequence databases or in the All databases search, a high-quality representative protein sequence is highlighted in a panel at the top of the results page (Figure 1).

The result panel also allows you to quickly link to related resources such as NCBI’s new pages for protein family models, Identical Protein Groups, and SPARCLE, NCBI’s protein domain architecture resource. We also provide as-you-type suggestions so you don’t have to type out some of the long names.


Figure 1. The result for a search with recombinase RecA. The panel provides access to analysis tools, downloads, and relevant links to the protein family, the RefSeq protein, the identical protein group, and citations in PubMed.

Try these protein name searches, or your own, and use the as-you-type suggestions to assist your searches.

Please let us know how you like these results!

New results for organelle genome searches

As part of our ongoing effort to improve your search experience, we’ve made it easier for you to find the sequence of your favorite organelle genome plus all the information and data associated with it. To find organelle genomes, search for an organism name combined with an organelle description, for example human mitochondrion, tomato chloroplast, or Toxoplasma gondii RH apicoplast.

A new results panel will appear with links to the organelle genome sequence, annotated genes, and related phylogenetic and population studies. The panel appears with these searches in an All Databases search or within any of NCBI’s sequence databases, including Gene, Nucleotide, Protein, Genome, and Assembly. For the human mitochondrial genome, a graphical schematic of the genome allows you to navigate to individual mitochondrially encoded genes (Figure 1).


Figure 1. The organelle genome results for a search with human mitochondrion. The panel provides access to analysis tools, downloads, and other relevant results. Clicking any of the gene objects on the genome graphic leads to the relevant Gene record, for example Gene ID: 4512 in the case of COX1.
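If you prefer to reach the same Gene record from a script, a Gene ID such as 4512 can be turned into both a record link and an E-utilities summary request. This is a rough sketch, assuming the ESummary endpoint with JSON output; the function names `gene_record_url` and `build_gene_summary_url` are hypothetical and are not part of any NCBI library.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# ESummary endpoint of NCBI's E-utilities.
ESUMMARY_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def gene_record_url(gene_id):
    """Return the web address of a Gene record (hypothetical helper)."""
    return f"https://www.ncbi.nlm.nih.gov/gene/{gene_id}"


def build_gene_summary_url(gene_id):
    """Return an ESummary URL that requests the Gene document summary as JSON."""
    params = {"db": "gene", "id": str(gene_id), "retmode": "json"}
    return f"{ESUMMARY_URL}?{urlencode(params)}"


# Gene ID 4512 is the COX1 example mentioned in the figure caption.
cox1_record_url = gene_record_url(4512)
cox1_summary_url = build_gene_summary_url(4512)
print(cox1_record_url)
print(cox1_summary_url)
```

The summary URL can be opened in a browser or passed to any HTTP client; no request is sent by the code above.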

Try it out using the following example searches and let us know what you think!

September 11 Webinar: A beginner’s guide to genes and sequences at NCBI

On Wednesday, September 11, 2019 at 12 PM, NCBI staff will present a webinar for people with limited experience working with gene and sequence information. You will learn about the kinds of data available for genes and sequences, how to select the most informative records, and how to find related genes and sequences using pre-computed information and the BLAST sequence search service.

  • Date and time: Wed, Sep 11, 2019 12:00 PM – 12:30 PM EDT
  • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

The new BLAST results are now the default view

As you may know, we have been offering a new BLAST results page (Figure 1) as a test page since April. In response to your positive reception, and after incorporating many improvements that you suggested, we made the new results the default today, August 1, 2019.

You will still be able to access the traditional results for several months. This will give you additional time, if you need it, to adjust your workflows or teaching materials to the new display.


EST and GSS databases now retired

In July 2018, NCBI announced plans to retire the EST and GSS databases, and we have now implemented these changes. We will continue to accept submissions of EST and GSS sequences, but will no longer provide special processes for these sequence types. If you want to submit EST and GSS data, please use tbl2asn. For further details, please visit https://www.ncbi.nlm.nih.gov/genbank/dbest/ or https://www.ncbi.nlm.nih.gov/genbank/dbgss/ or contact gb-admin@ncbi.nlm.nih.gov.

We thank all past and present submitters of EST and GSS data for the invaluable benefit these data have provided to numerous genomic sequencing projects over the years. Please let us know if you have any questions or concerns about these changes!