About NCBI Staff

The National Center for Biotechnology Information (NCBI), a division of the U.S. National Library of Medicine, provides access to scientific and biomedical databases and to software tools for analyzing molecular data, and performs research in computational biology.

Tour the NCBI’s Genome Data Viewer, Bookshelf, Pathogen Isolates Detection Browser and other resources on YouTube


Several of the latest videos on the NCBI YouTube channel highlight NCBI resources. Subscribe to the channel to see all our new videos.

NCBI’s Genome Data Viewer – Introducing the BLAST Widget

A brief introduction to how the BLAST widget, a new addition to the Genome Data Viewer, helps you see your BLAST results in the context of assembled genome sequences.


RefSeq release 88 available


RefSeq release 88 is now accessible online, via FTP and through NCBI’s programming utilities. This full release incorporates genomic, transcript, and protein data available as of May 14, 2018. It contains 160,224,355 records, including 110,333,800 proteins, 22,461,378 RNAs, and sequences from 79,448 organisms. The release is provided in several directories, both as a complete dataset and divided into logical groupings.

This release incorporates dbSNP release 151, which nearly doubles the number of SNPs annotated on the human GRCh38 genome, with matching increases in the size of the human nucleotide flatfile (.gbff) records.

Starting in November 2018, SNP variation features will no longer be in RefSeq genome assembly records.  The RefSeq release notes have information about this change.

IgBLAST 1.9.0 release includes AIRR rearrangement reporting


IgBLAST 1.9.0 supports the adaptive immune receptor repertoire (AIRR) standard for sequence analysis results. The AIRR format is available on web IgBLAST as well as in the standalone IgBLAST tool, with the -outfmt 19 option.
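
Because AIRR rearrangement output is a tab-separated table, it can be read with standard tools. The following is a minimal Python sketch, not an official NCBI example. It assumes the column names sequence_id, v_call, j_call, productive and junction_aa from the AIRR Rearrangement schema, and it assumes that true values are written as "T". The sample rows are made up for illustration; check both assumptions against your own -outfmt 19 output.

```python
import csv
import io

# Hypothetical two-row sample laid out like an AIRR rearrangement TSV.
# The values are invented for illustration, not real IgBLAST output.
SAMPLE_AIRR_TSV = (
    "sequence_id\tv_call\tj_call\tproductive\tjunction_aa\n"
    "read1\tIGHV1-2*02\tIGHJ4*02\tT\tCARDYW\n"
    "read2\tIGHV3-23*01\tIGHJ6*02\tF\t\n"
)


def productive_v_calls(handle):
    """Return the v_call value of every row flagged as productive."""
    reader = csv.DictReader(handle, delimiter="\t")
    # Assumption: booleans are encoded as "T"/"F"; "true" is accepted as well.
    return [
        row["v_call"]
        for row in reader
        if row["productive"].strip().lower() in ("t", "true")
    ]


v_calls = productive_v_calls(io.StringIO(SAMPLE_AIRR_TSV))
print(v_calls)
```

To run this on a real file, replace the io.StringIO object with an open file handle, for example open("igblast_airr.tsv", newline=""), where the file name is hypothetical.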

Supporting this new schema in IgBLAST will benefit the growing number of repertoire studies that use next-generation sequencing technology to generate very large sets of Ig/T-cell receptor rearrangement analysis data.

IgBLAST facilitates the analysis of immunoglobulin and T cell receptor variable domain sequences. Get IgBLAST on FTP. A new manual is on GitHub.

Test drive a new sequence search experience at NCBI Labs


We know it’s not always easy to find the sequence data you’re after at NCBI. Maybe it’s because you’re no expert at constructing queries, and you end up with no results or too many results. Or maybe you’re an Entrez wizard, but creating a query full of Booleans and filters seems like overkill when you could just write a short natural language query, like you’re used to doing in Google.  The next time you search for a gene, transcript or genome assembly for a given organism, try the new search experience we’re piloting in NCBI Labs.

In NCBI Labs, you can now search for sequences using natural language and get the best results.

NCBI Labs transcript search interface

Figure 1. The new interface for specified transcript search.

The improved search experience now available in NCBI Labs addresses three types of queries that commonly fail in searches at NCBI: organism-gene (e.g., human BRCA1), organism-transcript (e.g., mouse p53 transcripts) and organism-assembly (e.g., dog reference genome). For each of these query types in NCBI Labs, we now return NCBI’s highest quality sequence sets or reference and representative assemblies in an easy-to-view panel.

Example queries are shown below to get you started.


GenBank release 225: Over 1 billion sequence records stored!


GenBank release 225.0 (4/14/2018) has 208,452,303 traditional records (including non-bulk-oriented TSA) containing 260,189,141,631 base pairs of sequence data. In addition, there are 621,379,029 WGS records containing 2,784,740,996,536 base pairs of sequence data, 227,364,990 TSA records containing 205,232,396,043 base pairs of sequence data, and 14,782,654 TLS records containing 5,612,769,448 base pairs of sequence data.

During the 60 days between the close dates for GenBank releases 224.0 and 225.0, the traditional portion of GenBank grew by 6,558,433,533 base pairs and by 1,411,748 sequence records. During that same period, 86,960 records were updated – an average of 24,978 records added or updated per day.
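
The headline figure and the daily average can be checked directly from the numbers quoted above. The short Python sketch below uses only those figures; the variable names are ours.

```python
# Record and base-pair counts quoted for GenBank release 225.0.
records = {
    "traditional": 208_452_303,
    "WGS": 621_379_029,
    "TSA": 227_364_990,
    "TLS": 14_782_654,
}
base_pairs = {
    "traditional": 260_189_141_631,
    "WGS": 2_784_740_996_536,
    "TSA": 205_232_396_043,
    "TLS": 5_612_769_448,
}

total_records = sum(records.values())
total_base_pairs = sum(base_pairs.values())

# Growth of the traditional portion between releases 224.0 and 225.0.
days_between_releases = 60
records_added = 1_411_748
records_updated = 86_960
per_day = (records_added + records_updated) / days_between_releases

print(f"{total_records:,} records")        # 1,071,978,976 records
print(f"{total_base_pairs:,} base pairs")  # 3,255,775,303,658 base pairs
print(f"{int(per_day):,} records added or updated per day")  # 24,978
```

The four record categories together come to just over 1.07 billion records, which is where the "over 1 billion" in the title comes from.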


May 16 webinar: Improved Standalone BLAST database and programs: now with taxonomic information


Next Wednesday, May 16, 2018, we’ll show you how to download and use the latest standalone BLAST databases, dbv5. You’ll learn how to use BLASTdbv5 and the new BLAST programs to limit searches to taxonomic groups and to retrieve sequences from the database by taxonomy.
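
As a rough preview of what a taxonomy-limited search can look like, here is a Python sketch that only assembles the command lines and does not run them. It assumes a BLAST+ release whose blastn and blastdbcmd accept the -taxids option (a comma-separated list of taxonomy IDs) and a version 5 database. The database name nt_v5 and the file names are hypothetical placeholders, so confirm the options with blastn -help for your installation.

```python
import shlex


def blastn_taxid_command(query, db, taxids, out):
    """Build a blastn command restricted to the given taxonomy IDs."""
    taxid_arg = ",".join(str(t) for t in taxids)
    return ["blastn", "-db", db, "-query", query,
            "-taxids", taxid_arg, "-out", out]


def blastdbcmd_taxid_command(db, taxids):
    """Build a blastdbcmd command that retrieves sequences by taxonomy ID."""
    taxid_arg = ",".join(str(t) for t in taxids)
    return ["blastdbcmd", "-db", db, "-taxids", taxid_arg]


# Hypothetical inputs: 9606 is the NCBI taxonomy ID for human.
search_cmd = blastn_taxid_command("query.fa", "nt_v5", [9606], "hits.txt")
fetch_cmd = blastdbcmd_taxid_command("nt_v5", [9606])

print(shlex.join(search_cmd))
print(shlex.join(fetch_cmd))
```

Each list can be passed to subprocess.run once BLAST+ and a version 5 database are installed locally.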

Date and time: Wednesday, May 16, 2018 12:00 – 12:30 PM EDT

Register here: https://bit.ly/2qW7LLy

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

May 9 NCBI Minute: Integrating PubChem into Your Chemistry Teaching


Next Wednesday, May 9, 2018, NCBI staff will show you how to use PubChem as a cheminformatics education resource. In addition to learning about tools and services for chemical information search, analysis, and download, you will also see examples of how instructors incorporate PubChem in Cheminformatics OLCC (On-Line Chemistry Courses), an intercollegiate hybrid course.

Date and time: Wednesday, May 9, 2018 12:00 – 12:30 PM EDT

Register here: https://bit.ly/2q5wtsF

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.