On April 1, 2022, Science published the first complete sequence of a human genome, known as T2T-CHM13. This notable scientific achievement comes two decades after the first human genome release from the Human Genome Project and offers a first look at biologically important regions, such as centromeres, telomeres, and segmental duplications, that were previously unassembled. Read on to learn how you can access this assembly and related resources at NCBI, or any one of the more than 1000 human genome assemblies now in GenBank.
Finding the new assembly at NCBI is easy. From the NCBI homepage, search for “T2T-CHM13” to bring up a knowledge panel that links you to the assembly's record in Assembly. On the record page, you’ll find assembly statistics along with links to download the assembly and associated data, to the corresponding BioProject and publication, and to the corresponding FTP site and BLAST database.
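If you prefer to script your way to the FTP site rather than click through from the record page, the directory can usually be derived from the assembly accession and name. The sketch below is illustrative only. It assumes the layout commonly used under `genomes/all` on the NCBI FTP site (accession digits split into three-character folders), and it assumes the GenBank accession `GCA_009914755.4` and assembly name `T2T-CHM13v2.0`, neither of which is stated in this post, so please confirm both on the record page. The helper name `assembly_ftp_dir` is hypothetical.

```python
import re

# Assumption: root of the per-assembly directories on the NCBI FTP site.
FTP_ROOT = "https://ftp.ncbi.nlm.nih.gov/genomes/all"


def assembly_ftp_dir(accession: str, assembly_name: str) -> str:
    """Build the expected FTP directory URL for an assembly.

    Hypothetical helper: it only formats a string and does not check
    that the directory actually exists.
    """
    # Accessions look like GCA_009914755.4 (GenBank) or GCF_... (RefSeq).
    match = re.fullmatch(r"(GC[AF])_(\d{9})\.(\d+)", accession)
    if match is None:
        raise ValueError(f"Unrecognized assembly accession: {accession!r}")
    prefix, digits, _version = match.groups()

    # The nine digits are split into three folders of three digits each.
    chunks = [digits[i:i + 3] for i in range(0, 9, 3)]

    # Assumption: spaces in assembly names become underscores in paths.
    safe_name = assembly_name.replace(" ", "_")
    return "/".join([FTP_ROOT, prefix, *chunks, f"{accession}_{safe_name}"])


# Assumed accession and name for T2T-CHM13; verify on the record page.
T2T_FTP_DIR = assembly_ftp_dir("GCA_009914755.4", "T2T-CHM13v2.0")
```

The FTP link on the record page remains the authoritative location. A derived path like this one is mainly useful when you need to process many assemblies in a batch.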
RefSeq annotation for T2T-CHM13 is available and also accessible via the web page above. Stay tuned for our upcoming blog article on this new annotation data set to learn more about its features and how it differs from annotation on GRCh38.p14, the current human reference assembly.
Visualization of genomic data is a powerful tool for analysis, and NCBI also provides a genome browser for T2T-CHM13 with tracks for sequence, RefSeq gene and biological region annotation, RNAseq data, repeats, alignments to GRCh38.p14, GC content, and more. You can also upload or stream your own data or add data from public Track Hubs into this browser for visualization alongside the NCBI-provided tracks.
If you’re using T2T-CHM13 and other human genome assemblies in your analyses, you will find the NCBI Genome Remapping Service a valuable tool. You can map genome features, including genes and variation, between T2T-CHM13 and GRCh38.p14 as well as GRCh37.p13, the prior reference assembly still used in many clinical studies. Together, these resources provide a powerful suite of tools for exploring this new assembly.
If your genome interests extend beyond T2T-CHM13, or even beyond human assemblies, NCBI Datasets is a resource for exploring and downloading genome data at NCBI through web, command-line, or programmatic interfaces. It links to the genome record pages for all assemblies at NCBI, where you can access the corresponding annotations. Peruse our collection of annotated eukaryotic genomes to take a look at any one of the over 1000 supported genome assemblies. Click the feedback button on any of the corresponding genome pages to let us know what you think!
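For readers curious about the programmatic route into NCBI Datasets, here is a minimal sketch using only the Python standard library. It is not official NCBI client code. It assumes the version 2 REST endpoint for genome dataset reports (`/genome/accession/{accessions}/dataset_report`) and the accession `GCA_009914755.4` for T2T-CHM13, neither of which appears in this post, so check both against the current NCBI Datasets documentation. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL of the NCBI Datasets v2 REST API.
API_ROOT = "https://api.ncbi.nlm.nih.gov/datasets/v2"


def dataset_report_url(accessions):
    """Build the assumed dataset-report URL for one or more accessions."""
    if not accessions:
        raise ValueError("At least one assembly accession is required")
    # Assumption: multiple accessions are passed comma-separated in the path.
    joined = urllib.parse.quote(",".join(accessions), safe=",")
    return f"{API_ROOT}/genome/accession/{joined}/dataset_report"


def fetch_dataset_report(accessions, timeout=30.0):
    """Request the report and return the parsed JSON (needs network access)."""
    request = urllib.request.Request(
        dataset_report_url(accessions),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Building the URL needs no network; the accession is an assumption.
T2T_REPORT_URL = dataset_report_url(["GCA_009914755.4"])
```

Calling `fetch_dataset_report(["GCA_009914755.4"])` would return the parsed JSON for that assembly. The sketch makes no assumptions about the field names in the response, so inspect the returned structure before relying on it.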