August 14 Webinar: An updated PubMed is on its way!

On Wednesday, August 14, 2019 at 11AM, NCBI staff will show you PubMed Labs, a test site that will become the default PubMed early next year. You will get a preview of the new, modern interface, updated features including advanced search, clipboard, options for sharing results, and the new “cite” button. You’ll also learn about features that are still under development and how to give us your feedback on the new PubMed.

The August 14 webinar session is full. We will make the recording available and are offering an encore session on August 28, 2019. 

Register for the August 28 session.

Date: Wed, Aug 14, 2019
Time: 11:00 AM – 11:45 AM EDT

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

The UniGene web pages are now retired

As we previously announced, we planned to retire the UniGene web pages at the end of July 2019. All UniGene pages now redirect to this post. We have also removed links to UniGene from the NCBI home page and other resources.

Although the web pages are no longer available, you will still be able to download the final UniGene builds as static content from the FTP site.  You will also be able to match UniGene cluster numbers to Gene records by searching Gene with UniGene cluster numbers. For best results, restrict to the “UniGene Cluster Number” field rather than all fields in Gene.  For example, a search with Mm.2108[UniGene Cluster Number] finds the mouse  transthyretin Gene record (Ttr).  You can use the advanced search page to help construct these searches. Keep in mind that the Gene record contains selected Reference Sequences and GenBank mRNA sequences rather than the larger set of expressed sequences in the UniGene cluster.
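The field-restricted search described above can also be written as a query string for scripted use. The sketch below builds the search term and an E-utilities ESearch URL for it; it assumes that ESearch accepts the same "UniGene Cluster Number" field name as the Gene web search, and the helper names are hypothetical. It only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base URL of the E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def unigene_gene_query(cluster_id: str) -> str:
    """Return a Gene search term restricted to the UniGene Cluster Number field."""
    return f"{cluster_id}[UniGene Cluster Number]"


def esearch_url(cluster_id: str) -> str:
    """Build an ESearch URL against the Gene database for one UniGene cluster.

    Assumption: ESearch accepts the same field name as the Gene web search.
    """
    params = {
        "db": "gene",
        "term": unigene_gene_query(cluster_id),
        "retmode": "json",
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # The mouse transthyretin example from the text above.
    print(unigene_gene_query("Mm.2108"))
    print(esearch_url("Mm.2108"))
```

Pasting the printed term into the Gene search box should be equivalent to building the same search on the advanced search page.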

Please write to us with any comments or concerns, or if you need help using UniGene data.

Evidence for naming the protein now on non-redundant RefSeq records (WP_ accessions)

We are now showing the curated evidence used for assigning names and, if possible, gene symbols, publications, and Enzyme Commission numbers on nearly 70% (83 million) of microbial RefSeq proteins. This evidence includes a hierarchical collection of curated Hidden Markov Model (HMM)-based and BLAST-based protein families, and conserved domain architectures.

Continue reading

EST and GSS databases now retired

In July 2018, NCBI announced plans to retire the EST and GSS databases, and we have now implemented these changes. We will continue to accept submissions of EST and GSS sequences, but will no longer provide special processes for these sequence types. If you want to submit EST and GSS data, please use tbl2asn. For further details, please visit or or contact

We thank all past and present submitters of EST and GSS data for the invaluable benefit these data have provided to numerous genomic sequencing projects over the years. Please let us know if you have any questions or concerns about these changes!

A new way to find an expanded set of similar genes

We recently showed you a new way to search for and view sets of orthologous genes from vertebrates. You can now get an additional set of search results that we are calling similar genes. These are related through protein architecture to the orthologous gene set and include genes from all metazoans and selected plant, fungal, and protist species. You can quickly find related genes within a species, compare them to those from other annotated metazoan genomes, and have access to other useful gene resources. To find a set of similar genes, enter a gene symbol or select the gene symbol + orthologs option from the selections menu.

For example, if you search for ‘AGO2 orthologs‘, in addition to the link to orthologs from vertebrates, you’ll get a link to a set of similar genes (Genes with similar protein architectures) across a broad evolutionary spectrum that includes genes from invertebrates, fungi, and green plants (Figure 1).

Figure 1. Genes with similar protein architectures to AGO2. The original search was AGO2 orthologs, which brings up the suggestion box with the links to similar genes as well as the AGO2 vertebrate orthologs. The similar genes include entries from a broad taxonomic range of eukaryotic organisms.

If you search for ‘GH1‘, you’ll get a link to similar genes that includes members of the growth hormone family that are not part of NCBI’s vertebrate ortholog set.

Figure 2. The human subset of genes with similar protein architectures to GH1 showing other members (paralogs) of the GH1 gene family (GH2, CSH1, CSH2, CSHL1). These are not included in the ortholog set.

Try out the  following searches and follow the links to the Genes with similar protein architectures

Please  let us know what you think!

Attention GEO users: Use new GEO FTP subdirectories

On February 1, 2020, NCBI will decommission the following FTP subdirectories for GEO:


Continue reading

RefSeq release 95: naming evidence added to all relevant WP proteins

RefSeq release 95 is accessible online, via FTP and through NCBI’s Entrez programming utilities, E-utilities.

This full release incorporates genomic, transcript, and protein data available as of July 8, 2019, and contains 206,416,381 records, including 146,381,777 proteins, 27,212,750 RNAs, and sequences from 93,618 organisms.

Continue reading

Primer-BLAST now offers help with irrelevant off-target matches

Primer-BLAST, NCBI’s primer-designer and specificity-checker, now offers a way to help you with irrelevant off-target matches.

Sometimes Primer-BLAST can’t design specific primers for your target sequence because of similar non-target sequences in the database. In some cases, you may know that these non-target matches are not important to your research and are safe to ignore. Examples may include tissue-specific splice variants, redundant entries, and predicted sequences. To help in these cases, you can now choose to allow certain off-target matches. This gives Primer-BLAST greater freedom in primer selection and a better chance of finding highly specific primers.

Continue reading

Virus hunting in the cloud: A hackathon story at ASV 2019

Are you going to ASV 2019?

If you are, join us in a few days for a workshop on the virus hunting hackathon we helped run earlier this year.

Session: Workshop #19: Virus Discovery

Program Number: W-19-8

Time: Sunday, July 21, 7:00 PM CDT

Location: Mayo Auditorium

In this workshop, Dr. Rodney Brister will talk about how 41 scientists from 21 organizations worked to improve the usability of SRA data, identifying datasets that included known viruses and viral signals. Not only is that information now being integrated into a public search interface, but the approach used is also being refined in future hackathons so it can be applied to all SRA datasets.

We hope to see you there!

Have you tried OSIRIS, NCBI’s STR analysis tool?

More than 5 years ago, NCBI brought you OSIRIS (Open Source Independent Review and Interpretation System), a free, open-access tool for powerful and intelligent Short Tandem Repeat (STR) analysis.

Short Tandem Repeats (STRs) are repeated short stretches of DNA and are analyzed by measuring the length of the repeated region. They vary from individual to individual and are passed from parent to child. STR analysis is broadly used in medicine, research and law enforcement – for stem cell transplants, diseases like Huntington’s, verifying research cell lines and samples, determining family relationships, and in criminal cases. In this blog post, we explore how you use OSIRIS in the real world and how your feedback has helped us improve this product.

Continue reading
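To make the idea of "measuring the length of the repeated region" concrete, here is a toy sketch that counts back-to-back copies of a repeat motif in a DNA string. It is only an illustration of the concept and does not reflect how OSIRIS processes data; the function name, the GATA motif, and the two example sequences are made up for this example.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    # Match one or more consecutive copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


if __name__ == "__main__":
    # Two hypothetical alleles that differ only in repeat count.
    allele_a = "TTCA" + "GATA" * 7 + "CCTG"
    allele_b = "TTCA" + "GATA" * 10 + "CCTG"
    print(longest_tandem_run(allele_a, "GATA"))  # 7
    print(longest_tandem_run(allele_b, "GATA"))  # 10
```

Two individuals carrying these alleles would be distinguished by the repeat counts 7 and 10, which is the kind of length difference that STR analysis relies on.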