GenBank will start using expanded accession formats by December 2018

By the end of 2018, GenBank and other INSDC members will expand the accession formats used for sequencing projects. We have assigned almost all the possible accession numbers using the current, shorter formats. Using these longer formats will allow us to expand accession ranges and give us greater capacity.

The expanded format for Whole Genome Shotgun (WGS), Transcriptome Shotgun Assembly (TSA), and Targeted Locus Study (TLS) sequencing projects will use a six-letter Project Code prefix and a two-digit Assembly-Version number followed by 7, 8, or 9 digits (for example, AAAAAA020000001).
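As a rough illustration of the layout described above, an expanded WGS/TSA/TLS accession can be split into its three parts with a regular expression. This is a minimal sketch, not an official NCBI tool; the names `EXPANDED_PROJECT_RE`, `ProjectAccession`, and `parse_expanded_project_accession` are hypothetical and chosen only for this example.

```python
import re
from typing import NamedTuple, Optional

# Six-letter Project Code, two-digit Assembly-Version, then 7 to 9 digits,
# following the description in the announcement.
EXPANDED_PROJECT_RE = re.compile(r"([A-Z]{6})(\d{2})(\d{7,9})")


class ProjectAccession(NamedTuple):
    project_code: str
    assembly_version: int
    serial: int


def parse_expanded_project_accession(accession: str) -> Optional[ProjectAccession]:
    """Return the parts of an expanded-format accession, or None if it does not match."""
    match = EXPANDED_PROJECT_RE.fullmatch(accession.strip().upper())
    if match is None:
        return None
    code, version, serial = match.groups()
    return ProjectAccession(code, int(version), int(serial))


print(parse_expanded_project_accession("AAAAAA020000001"))
# ProjectAccession(project_code='AAAAAA', assembly_version=2, serial=1)
```

The sketch matches only the expanded format; accessions in the current, shorter project format would need a separate pattern.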

Non-WGS/TLS/TSA nucleotide sequences currently use a “2+6” format: a two-letter prefix followed by six digits. This format will be expanded so that the two-letter prefix is followed by eight digits.

Protein sequences currently use a “3+5” accession format: a three-letter prefix followed by five digits. By the end of 2018, this format will be expanded to use seven digits.

You will need to adjust any processing methods to accommodate these new identifier formats. Please write to the helpdesk with any questions about the new formats.
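One way to adjust a processing pipeline is to accept both the current and the expanded formats during the transition. The following is a minimal sketch under that assumption. It covers only the formats named in this announcement, so older or other accession types are reported as unrecognized. The labels and the `classify_accession` function are hypothetical names for illustration.

```python
import re
from typing import Optional

# Patterns for the formats described in this announcement only.
ACCESSION_PATTERNS = {
    "nucleotide 2+6 (current)": re.compile(r"[A-Z]{2}\d{6}"),
    "nucleotide 2+8 (expanded)": re.compile(r"[A-Z]{2}\d{8}"),
    "protein 3+5 (current)": re.compile(r"[A-Z]{3}\d{5}"),
    "protein 3+7 (expanded)": re.compile(r"[A-Z]{3}\d{7}"),
    "WGS/TSA/TLS 6+2+7..9 (expanded)": re.compile(r"[A-Z]{6}\d{2}\d{7,9}"),
}


def classify_accession(accession: str) -> Optional[str]:
    """Return a label for the first matching format, or None if unrecognized."""
    candidate = accession.strip().upper()
    for label, pattern in ACCESSION_PATTERNS.items():
        if pattern.fullmatch(candidate):
            return label
    return None


for example in ["AB123456", "AB12345678", "ABC12345", "ABC1234567", "AAAAAA020000001"]:
    print(example, "->", classify_accession(example))
```

Using `fullmatch` rather than a prefix match matters here: a check that only tests the start of the string would accept an expanded “2+8” accession as a “2+6” one followed by stray digits.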

Improved Search Now Available Across NCBI Databases

Earlier this year, we announced the release of a new and improved search feature that interprets plain language to give better results for common searches. This feature, originally developed in NCBI Labs and later released on the NCBI All Databases search, is now available across several NCBI resources: Nucleotide, Protein, Gene, Genome, and Assembly. Whether you are searching for a specific gene or for a whole genome, you will now retrieve NCBI’s best results regardless of the database you search.

The image below shows the results of a search for human INS in the Nucleotide database. Even though this is a Nucleotide search, the results include relevant information from Gene, Protein, and Taxonomy, plus links to the NCBI reference sequences (RefSeq), as well as access to BLAST and the insulin gene region in NCBI’s genome browser, the Genome Data Viewer.

Figure 1. The new natural language search result in the Nucleotide database from a search for human INS.

Try out this new search capability and let us know what you think. And keep visiting the NCBI Labs search page to try our latest experiments, which we’ll also announce here on NCBI Insights.


September 12 NCBI Minute: Release Plan for NCBI API Keys

Update: Webinar is now on September 12!

If you already registered for the September 5 date, you are automatically registered for September 12. You do not need to re-register. We welcome anyone else who would like to register.

As previously announced, NCBI has introduced API keys for the E-utilities. You will soon want to start using API keys in your E-utilities calls, as these will allow the fastest access to NCBI databases. In this webinar, we will review how API keys work and provide a schedule of brief testing periods and the timing of the full release of API key functionality.
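For readers who want a concrete picture before the webinar, an API key is passed to the E-utilities as the `api_key` query parameter. The sketch below only builds an ESearch URL and makes no network request. The helper name `build_esearch_url` and the placeholder `YOUR_API_KEY` are assumptions for illustration; substitute the key from your own NCBI account.

```python
from typing import Optional
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_esearch_url(db: str, term: str, api_key: Optional[str] = None) -> str:
    """Build an ESearch URL, adding api_key only when one is supplied."""
    params = {"db": db, "term": term}
    if api_key:
        params["api_key"] = api_key
    return EUTILS_BASE + "esearch.fcgi?" + urlencode(params)


# YOUR_API_KEY is a placeholder, not a working key.
print(build_esearch_url("nucleotide", "human INS", api_key="YOUR_API_KEY"))
```

Keeping the key optional lets the same code run with or without a key while you test.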

Date and time: Wed, Sep 12, 2018 12:00 PM – 12:30 PM EDT

Register here:

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

(Webinar re-scheduled to September 12 because the presenter was called away unexpectedly.)