As announced last month, NCBI will begin assigning larger (64-bit) numeric ‘GIs’ to the remaining sequence types that still receive these identifiers. This change is expected as soon as Nov. 15th, 2021 but could occur earlier if data submission volumes are unexpectedly high. This is a reminder that all organizations and developers using our products should review software for any remaining reliance on GIs and compatibility with these larger identifiers.
How do you know if your software or organization may be impacted?
If you have built custom software to interface with NCBI data and consume a sequence database UID (i.e. GI), process the GI from an ASN1 or XML product, or process the GI from any tabular product on FTP, you should review all code to ensure that the new, longer, 64-bit GIs will be handled properly. To ensure a smooth transition and the best overall experience, please update to the latest versions of NCBI-provided programmatic and command line tools. Alternatively, you could make updates to your code to use accession.version identifiers instead of GIs.
NCBI is here to help the community as we make this change. Stay tuned here or follow NCBI Twitter where we will share updates and additional information, such as a final confirmation of the projected cutover date.
Please contact firstname.lastname@example.org with any questions about this change or to determine if any software you are using is affected.
In 2016, NCBI announced that it was curtailing its display of its numeric ‘GI’ in popular sequence data formats such as FASTA and GenBank flatfiles. Due to the continued growth of GenBank, NCBI will soon begin assigning GIs exceeding the signed 32-bit threshold of 2,147,483,647 for those remaining sequence types that still receive these identifiers.
NCBI has updated products including Entrez system, GenBank (Nucleotide), BLAST™ and the C++ Toolkit to prepare for that moment by upgrading GI-related code and APIs to accept 64-bit integers. This change over is projected for late 2021. Stay tuned for additional communications from NCBI and take note of the following information if you think you may be impacted.
As you may have read in previous posts, NCBI is in the process of changing the way we handle GI numbers for sequence records.
In short, we are moving to a time when accession.version identifiers, rather than GI numbers, will be the primary identifiers for sequence records.
As part of this transition, an obvious question for any of you currently using GI numbers is how to convert a GI number to an accession.version, so that you can make appropriate updates. The good news is that it’s pretty easy if you have no more than a few thousand GIs to convert.