About NCBI Staff

The National Center for Biotechnology Information (NCBI), a division of the U.S. National Library of Medicine, provides access to scientific and biomedical databases and to software tools for analyzing molecular data, and performs research in computational biology.

How well do you know GeneReviews®?


You may know . . .

  • We offer expert-authored, peer-reviewed chapters on more than 750 genetic disorders.
  • Our standardized format enables busy clinicians to readily find the information they need.
  • Molecular genetic testing strategies are presented in the context of clinical care and genetic counseling implications.
  • Tables link specific molecular genetic information to entries in OMIM (Online Mendelian Inheritance in Man), ClinVar, and genomic databases.
  • Resource lists connect families to information and support.
  • Links point clinicians to actionable information on available clinical trials and on genetic tests in the NIH’s Genetic Testing Registry (GTR).
  • Chapters are continually updated to reflect changes in clinically relevant information, such as test availability and treatment protocols.

But do you also know . . .

  • You can volunteer to create a GeneReviews® chapter in your area of expertise. Start by reading the information for prospective authors.
  • Our Educational Materials, designed for health care professionals of varying experience with clinical genetics, augment our glossary to clarify genetics concepts.
  • For genetics professionals, we summarize the latest information on:
    • Imprinting errors and uniparental disomy (UPD) not detectable by sequence analysis
    • Disorders caused by nucleotide repeat expansions/contractions
    • Disorders with highly homologous gene family members or pseudogenes
  • Founder variant tables compile, for the first time in one place, data to inform testing recommendations and clinical decision making for disorders more common in Finnish, Ashkenazi Jewish, Inuit, Yup’ik, Cree/Ojibway, and Navajo populations.
  • A succinct, one-stop information page on direct-to-consumer genetic testing gives medical professionals information they need in order to advise patients who have pursued testing on their own.

Check our What’s New page for weekly new and updated postings.

The new BLAST results are now the default view


As you may know, we have been offering a new BLAST results page (Figure 1) as a test page since April. In response to your positive reception, and after incorporating many improvements that you suggested, we made the new results the default today, August 1, 2019.

You will still be able to access the traditional results for several months. This will give you additional time, if you need it, to adjust your workflows or teaching materials to the new display.

Thank you for the feedback on the new results.  We made several improvements to address issues or concerns that you pointed out.  You also told us about 17 additional features you would like us to add. We are working on incorporating these into the page and welcome additional suggestions.  Please let us know what you think.

Figure 1. The new BLAST results with filters directly on the page and a more concise tabbed output that includes the taxonomy report. The link at the upper right (circled) retrieves the traditional BLAST results.

August 14 Webinar: An updated PubMed is on its way!


On Wednesday, August 14, 2019 at 11AM, NCBI staff will show you PubMed Labs, a test site that will become the default PubMed early next year. You will get a preview of the new, modern interface, updated features including advanced search, clipboard, options for sharing results, and the new “cite” button. You’ll also learn about features that are still under development and how to give us your feedback on the new PubMed.

The August 14 webinar session is full. We will make the recording available and are offering an encore session on August 28, 2019. 

Register for the August 28 session.

Date: Wed, Aug 14, 2019
Time: 11:00 AM – 11:45 AM EDT

 

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

The UniGene web pages are now retired


As we previously announced, we retired the UniGene web pages at the end of July 2019. All UniGene pages now redirect to this post. We have also removed links to UniGene from the NCBI home page and other resources.

Although the web pages are no longer available, you will still be able to download the final UniGene builds as static content from the FTP site.  You will also be able to match UniGene cluster numbers to Gene records by searching Gene with UniGene cluster numbers. For best results, restrict to the “UniGene Cluster Number” field rather than all fields in Gene.  For example, a search with Mm.2108[UniGene Cluster Number] finds the mouse  transthyretin Gene record (Ttr).  You can use the advanced search page to help construct these searches. Keep in mind that the Gene record contains selected Reference Sequences and GenBank mRNA sequences rather than the larger set of expressed sequences in the UniGene cluster.
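The fielded search described above can also be expressed as a request to NCBI's E-utilities. The following is a minimal sketch, not an official recipe: it assumes the public ESearch endpoint URL and the `db`, `term`, and `retmode` parameters, and the function names are made up for illustration. It only builds the URL; fetch it with any HTTP client to run the search.

```python
from urllib.parse import urlencode

# Public E-utilities ESearch endpoint (an assumption; not given in this post).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def unigene_term(cluster_id: str) -> str:
    """Restrict the search to the UniGene Cluster Number field."""
    return f"{cluster_id}[UniGene Cluster Number]"


def gene_search_url(cluster_id: str) -> str:
    """Build an ESearch URL that looks up Gene records for one UniGene cluster."""
    params = {"db": "gene", "term": unigene_term(cluster_id), "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Prints the URL only; no network request is made here.
    print(gene_search_url("Mm.2108"))
```

Restricting the term to one field, rather than searching all fields, mirrors the advice above and avoids unrelated hits on the same text.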

Please write to us with any comments or concerns, or if you need help using UniGene data.

Evidence for naming the protein now on non-redundant RefSeq records (WP_ accessions)


We are now showing the curated evidence used for assigning names and, where possible, gene symbols, publications, and Enzyme Commission numbers for nearly 70% (83 million) of microbial RefSeq proteins. This evidence includes a hierarchical collection of curated Hidden Markov Model (HMM)-based and BLAST-based protein families, and conserved domain architectures.

On a protein record such as WP_004152100.1, you can follow the link (NF033727.1) in the Evidence Accession field of the Evidence-For-Name-Assignment comment block (Figure 1) to find out more about the naming evidence, including the thresholds used for defining a match and access to all the prokaryotic proteins that match the evidence (Figure 2).

Figure 1: The Evidence-For-Name-Assignment block on WP_004152100.1. The name “arsenite efflux transporter metallochaperone ArsD” is based on its match to the evidence NF033727.1, a Hidden Markov model that defines a family of arsenite efflux transporter metallochaperones. Proteins named for this evidence also inherit publications and a gene symbol (arsD) from NF033727.1.

Figure 2: Naming evidence NF033727.1, a Hidden Markov model. The top part of the page contains a short text description of the protein family defined by the evidence, the thresholds a protein must meet to be included in that family, and the publications associated with the family. The lower part of the page lists the RefSeq proteins in the family, either named by the present evidence (left tab) or named using evidence with a higher precedence (right tab). You can filter and download the list too!
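If you work with downloaded flat files rather than the web page, the comment block can be pulled out with a few lines of code. This is a rough sketch under stated assumptions: the `##...-START##` and `##...-END##` delimiter lines and the `::` separator are assumed from the usual structured-comment layout, and the sample text is a hypothetical excerpt, not copied from the actual record.

```python
import re

# Hypothetical flat-file excerpt for illustration only; the delimiters and
# the "::" separator are assumptions about the structured-comment layout.
SAMPLE_COMMENT = """\
COMMENT     ##Evidence-For-Name-Assignment-START##
            Evidence Category  :: HMM
            Evidence Accession :: NF033727.1
            ##Evidence-For-Name-Assignment-END##
"""

BLOCK = re.compile(
    r"##Evidence-For-Name-Assignment-START##(.*?)"
    r"##Evidence-For-Name-Assignment-END##",
    re.DOTALL,
)


def naming_evidence(flatfile_text: str) -> dict:
    """Return the key/value pairs of the evidence block, or {} if absent."""
    match = BLOCK.search(flatfile_text)
    if match is None:
        return {}
    fields = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.partition("::")
        if sep:  # skip blank lines and lines without a "::" separator
            fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    print(naming_evidence(SAMPLE_COMMENT))
```

An empty dictionary signals a record that is not yet covered by the evidence system.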

Sixty-nine percent of available prokaryotic RefSeq proteins now have the Evidence-For-Name-Assignment comment block. The remaining 31% are not yet covered by the evidence system and are named based on BLAST hits to a non-curated collection of protein cluster representatives.

What does this mean for you?

  • You can better differentiate proteins whose functional annotation is based on curated evidence from those named by BLAST hits to a non-curated database. The query “Evidence-For-Name-Assignment[Properties]” in the Protein resource returns all proteins with names based on curated evidence.
  • You can find and download all archaeal and bacterial proteins that are matched to the same evidence.
  • You can get your publication cited on protein records by providing NCBI with better names for a protein.
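To track how many proteins carry curated naming evidence, the Protein query quoted in the list above could be sent to E-utilities as a count-only search. The sketch below assumes the public ESearch endpoint, its `rettype=count` and `retmode=json` options, and a JSON reply with the count under `esearchresult`; the helper names are illustrative, and no request is made by the code itself.

```python
import json
from urllib.parse import urlencode

# Public E-utilities ESearch endpoint (an assumption; not given in this post).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
QUERY = "Evidence-For-Name-Assignment[Properties]"  # query quoted in the post


def count_url(term: str, db: str = "protein") -> str:
    """Build an ESearch URL that asks only for the number of matching records."""
    params = {"db": db, "term": term, "rettype": "count", "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


def parse_count(response_text: str) -> int:
    """Read the hit count from an ESearch JSON reply (assumed layout)."""
    return int(json.loads(response_text)["esearchresult"]["count"])


if __name__ == "__main__":
    print(count_url(QUERY))
```

Pass the body of the HTTP response to `parse_count` to get the number as an integer.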

We welcome your input! Please send your suggestions and feedback to the NCBI Help Desk.

EST and GSS databases now retired


In July 2018, NCBI announced plans to retire the EST and GSS databases, and we have now implemented these changes. We will continue to accept submissions of EST and GSS sequences, but will no longer provide special processes for these sequence types. If you want to submit EST and GSS data, please use tbl2asn. For further details, please visit https://www.ncbi.nlm.nih.gov/genbank/dbest/ or https://www.ncbi.nlm.nih.gov/genbank/dbgss/ or contact gb-admin@ncbi.nlm.nih.gov.

We thank all past and present submitters of EST and GSS data for the invaluable benefit these data have provided to numerous genomic sequencing projects over the years. Please let us know if you have any questions or concerns about these changes!

A new way to find an expanded set of similar genes


We recently showed you a new way to search for and view sets of orthologous genes from vertebrates. You can now get an additional set of search results that we are calling similar genes. These are related through protein architecture to the orthologous gene set and include genes from all metazoans and selected plant, fungal, and protist species. You can quickly find related genes within a species, compare them to those from other annotated metazoan genomes, and have access to other useful gene resources. To find a set of similar genes, enter a gene symbol or select the gene symbol + orthologs option from the selections menu.

For example, if you search for ‘AGO2 orthologs’, in addition to the link to orthologs from vertebrates, you’ll get a link to a set of similar genes (Genes with similar protein architectures) across a broad evolutionary spectrum that includes genes from invertebrates, fungi, and green plants (Figure 1).

Figure 1. Genes with similar protein architectures to AGO2. The original search was AGO2 orthologs, which brings up the suggestion box with the links to similar genes as well as the AGO2 vertebrate orthologs. The similar genes include entries from a broad taxonomic range of eukaryotic organisms.

If you search for ‘GH1’, you’ll get a link to similar genes that includes members of the growth hormone family that are not part of NCBI’s vertebrate ortholog set.

Figure 2. The human subset of genes with similar protein architectures to GH1 showing other members (paralogs) of the GH1 gene family (GH2, CSH1, CSH2, CSHL1). These are not included in the ortholog set.

Try out the  following searches and follow the links to the Genes with similar protein architectures

Please  let us know what you think!

Attention GEO users: Use new GEO FTP subdirectories


On February 1, 2020, NCBI will decommission the following FTP subdirectories for GEO:

  • ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/SOFT/
  • ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/
  • ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/MINiML/
  • ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/SeriesMatrix/
  • ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/annotation/
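To check whether your own scripts still point at the directories listed above, something like the following sketch may help. It only detects the retired prefixes; it does not rewrite them, because the replacement paths are not given in this excerpt. The function name and the sample script line are hypothetical.

```python
from typing import List

# Prefixes scheduled for decommissioning, copied from the list above.
DEPRECATED_PREFIXES = (
    "ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/SOFT/",
    "ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/",
    "ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/MINiML/",
    "ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/SeriesMatrix/",
    "ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/annotation/",
)


def find_deprecated_geo_urls(text: str) -> List[str]:
    """Return whitespace-separated tokens that begin with a retired prefix."""
    hits = []
    for token in text.split():
        candidate = token.strip("'\"")  # tolerate quoted URLs in shell scripts
        if candidate.startswith(DEPRECATED_PREFIXES):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # Hypothetical script line; the file name is made up for illustration.
    script = 'wget "ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/SOFT/example.soft.gz"'
    for url in find_deprecated_geo_urls(script):
        print("update before February 1, 2020:", url)
```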

RefSeq release 95: naming evidence added to all relevant WP proteins


RefSeq release 95 is accessible online, via FTP and through NCBI’s Entrez programming utilities, E-utilities.

This full release incorporates genomic, transcript, and protein data available as of July 8, 2019, and contains 206,416,381 records, including 146,381,777 proteins, 27,212,750 RNAs, and sequences from 93,618 organisms.
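As a quick sanity check on these figures, the arithmetic can be worked through as below. The counts come from the paragraph above; treating the remainder as genomic and other record types is an assumption, since the post does not break it down.

```python
# Record counts quoted for RefSeq release 95.
TOTAL_RECORDS = 206_416_381
PROTEINS = 146_381_777
RNAS = 27_212_750

# Assumption: whatever is neither protein nor RNA is genomic or another
# record type; the announcement does not itemize this remainder.
other_records = TOTAL_RECORDS - PROTEINS - RNAS

protein_share = 100 * PROTEINS / TOTAL_RECORDS
rna_share = 100 * RNAS / TOTAL_RECORDS

if __name__ == "__main__":
    print(f"other records: {other_records:,}")
    print(f"protein share: {protein_share:.1f}%")
    print(f"RNA share: {rna_share:.1f}%")
```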
