As announced last month, NCBI will begin assigning larger (64-bit) numeric ‘GIs’ to the remaining sequence types that still receive these identifiers. This change is expected as soon as Nov. 15th, 2021 but could occur earlier if data submission volumes are unexpectedly high. This is a reminder that all organizations and developers using our products should review software for any remaining reliance on GIs and compatibility with these larger identifiers.
How do you know if your software or organization may be impacted?
If you have built custom software that interfaces with NCBI data and consumes a sequence database UID (i.e., a GI), processes the GI from an ASN.1 or XML product, or processes the GI from any tabular product on FTP, you should review all code to ensure that the new, longer, 64-bit GIs will be handled properly. To ensure a smooth transition and the best overall experience, please update to the latest versions of NCBI-provided programmatic and command line tools. Alternatively, you could update your code to use accession.version identifiers instead of GIs.
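As a rough illustration of the kind of check this review involves, the Python sketch below shows how a GI that no longer fits in a 32-bit integer can be detected, what silent 32-bit truncation does to it, and how code might prefer an accession.version when one is available. The function names, the record layout (the `gi` and `accession_version` keys), the example GI value, and the assumption that GIs stay within the signed 64-bit range are all hypothetical and are not taken from any NCBI tool or file format.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold
INT64_MAX = 2**63 - 1  # assumed upper bound for the new, larger GIs


def parse_gi(text):
    """Parse a GI string, rejecting values outside the signed 64-bit range."""
    gi = int(text.strip())
    if not 0 < gi <= INT64_MAX:
        raise ValueError(f"GI out of signed 64-bit range: {text!r}")
    return gi


def exceeds_int32(gi):
    """Return True if this GI would overflow a signed 32-bit field."""
    return gi > INT32_MAX


def truncated_to_int32(gi):
    """Show the value a 32-bit field would silently end up storing."""
    return ctypes.c_int32(gi).value  # ctypes truncates without raising


def preferred_identifier(record):
    """Prefer accession.version over the GI when the record carries both."""
    accession = record.get("accession_version")
    if accession:
        return accession
    return str(parse_gi(record["gi"]))


if __name__ == "__main__":
    gi = parse_gi("5000000000")  # made-up GI larger than INT32_MAX
    print(exceeds_int32(gi))       # True
    print(truncated_to_int32(gi))  # 705032704, a different and wrong identifier
    print(preferred_identifier({"gi": "5000000000",
                                "accession_version": "XX000000.1"}))
```

Python integers do not overflow, so the risk in Python code usually lies at the boundaries: fixed-width database columns, binary formats, NumPy arrays with 32-bit dtypes, or compiled extensions. Those are the places to check first.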
NCBI is here to help the community as we make this change. Stay tuned here or follow NCBI Twitter where we will share updates and additional information, such as a final confirmation of the projected cutover date.
Please contact email@example.com with any questions about this change or to determine if any software you are using is affected.
We have updated the bacterial and archaeal representative genome collection! The current collection contains over 13,000 assemblies selected from the 203,000 prokaryotic RefSeq assemblies to represent their respective species. The collection has increased by 11% since August 2020. We’ve included about 1,400 species for the first time, have used better assemblies for 1,177 species, and have removed 65 species because of changes in NCBI Taxonomy or uncertainty in their species assignment.
Get rapid access to Wuhan coronavirus (2019-nCoV) sequence data from the current outbreak as it becomes available. We will continue to update the page with newly released data.
The complete annotated genome sequence of the novel coronavirus associated with the outbreak of pneumonia in Wuhan, China is now available from GenBank for free and easy access by the global biomedical community. Figure 1 shows the relationship of the Wuhan virus to selected coronaviruses.
Figure 1. Phylogenetic tree showing the relationship of Wuhan-Hu-1 (circled in red) to selected coronaviruses. Nucleotide alignment was done with MUSCLE 3.8. The phylogenetic tree was estimated with MrBayes 3.2.6 with parameters for GTR+g+i. The scale bar indicates estimated substitutions per site, and all branch support values are 99.3% or higher.
NCBI released the Probe database in 2005 as a registry of nucleic acid reagents for biomedical research. At that time, array-based assays were prevalent, but their use has since declined with the advent of short-read sequencing. As a result, NCBI will retire the web interface for the Probe database in April 2020. You can continue to access the content of the database on the NCBI FTP site, but it will no longer be updated. As of this announcement, Probe is no longer accepting new submissions.
If you have questions or concerns about this retirement, we’d love to hear from you. Please comment here or contact us at firstname.lastname@example.org.
You can now download images in both PDF and Scalable Vector Graphics (SVG) formats from our Sequence Viewer and genome browsers such as the Genome Data Viewer! SVG files are ideal for editing in image editors and provide high quality graphics for publications, posters, and presentations. Both the PDF and SVG files that you download contain vector graphics for high fidelity images.
You can download image files by choosing the “Printer-Friendly PDF/SVG” option under the Tools menu from any Graphical Sequence Viewer application (Figure 1).
Figure 1. Printer friendly download options from the graphical view in the Genome Data Viewer. You can download either PDF or SVG formats, which are easily edited in standard graphics applications.
As part of our ongoing effort to improve your search experience, we’ve made it easier for you to find the sequence of your favorite organelle genome plus all the information and data associated with it. To find organelle genomes, search for an organism name combined with an organelle description, for example human mitochondrion, tomato chloroplast or Toxoplasma gondii RH apicoplast.
A new results panel will appear with links to the organelle genome sequence, annotated genes, and related phylogenetic and population studies. The panel appears with these searches in an All Databases search or within any of NCBI’s sequence databases, including Gene, Nucleotide, Protein, Genome, and Assembly. For the human mitochondrial genome, a graphical schematic of the genome allows you to navigate to individual mitochondrially encoded genes (Figure 1).
Figure 1. The organelle genome results for a search with human mitochondrion. The panel provides access to analysis tools, downloads, and other relevant results. Clicking any of the gene objects on the genome graphic leads to the relevant Gene record, for example Gene ID: 4512 in the case of COX1.
Try it out using the following example searches and let us know what you think!
On Wednesday, September 11, 2019 at 12 PM, NCBI staff will present a webinar for people with limited experience working with gene and sequence information. You will learn about the kinds of data available for genes and sequences, how to select the most informative records, and how to find related genes and sequences using pre-computed information and the BLAST sequence search service.
Date and time: Wed, Sep 11, 2019 12:00 PM – 12:30 PM EDT
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
Primer-BLAST, NCBI’s primer-designer and specificity-checker, now offers a way to help you with irrelevant off-target matches.
Sometimes Primer-BLAST can’t design specific primers for your target sequence because of similar non-target sequences in the database. In some cases, you may know that these non-target matches are not important to your research and are safe to ignore. Examples may include tissue-specific splice variants, redundant entries, and predicted sequences. To help in these cases, you can now choose to allow certain off-target matches. This gives Primer-BLAST greater freedom in primer selection and a better chance of finding highly specific primers.
Earlier this year, we announced the release of a new and improved search feature that interprets plain language to give better results for common searches. This feature, originally developed in NCBI Labs and later released on the NCBI All Databases search, is now available across several NCBI resources: Nucleotide, Protein, Gene, Genome, and Assembly. Whether you are searching for a specific gene or for a whole genome, you will now retrieve NCBI’s best results regardless of the database you search.
The image below shows the results for a search for human INS in the Nucleotide database. Even though this is a Nucleotide search, the results include relevant information from Gene, Protein, and Taxonomy, plus links to the NCBI reference sequences (RefSeq) as well as access to BLAST and the insulin gene region in NCBI’s genome browser, the Genome Data Viewer.
Figure 1. The new natural language search result in the Nucleotide database from a search for human INS.
Try out this new search capability and let us know what you think. And keep visiting the NCBI Labs search page to try our latest experiments, which we’ll also announce here on NCBI Insights.