As previously announced, NCBI’s Assembly and Genome record pages will be redirected to new NCBI Datasets pages in June 2023. The NCBI Datasets Command Line Interface (CLI) tools provide straightforward programmatic downloads of assembled genome sequence data. We invite you to check them out and let us know what you think!
Features & Benefits of NCBI Datasets
- Get assembled genome sequence, annotation, and metadata, including transcripts and proteins, in one easy step.
- Querying is easy and flexible! Retrieve data using organism name, assembly accession, or BioProject accession.
- Download data for multiple assemblies in a single request, which makes retrieving large amounts of data simpler and faster.
- Metadata is derived from multiple databases, and the metadata schemas are documented.
For example, to get sequence and metadata for the human reference genome, you could use the datasets command line tool with the following command:
datasets download genome taxon human --reference --filename human_genomes_dataset.zip
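The same pattern extends to queries by assembly accession and to multi-assembly requests. The sketch below wraps that variant in two small shell functions. It assumes the datasets CLI and unzip are installed and on your PATH. The function names fetch_assemblies and list_package are illustrative names for this example, not part of the tool, and the --include option is assumed to be available in your CLI version (check datasets download genome --help to confirm).

```shell
# Download one data package covering one or more assembly accessions.
# Usage: fetch_assemblies OUTPUT.zip ACCESSION [ACCESSION ...]
# (fetch_assemblies is a hypothetical helper name used for this sketch.)
fetch_assemblies() {
    out=$1
    shift
    # --include chooses the file types placed in the package
    # (assumed to be supported by the installed CLI version).
    datasets download genome accession "$@" \
        --include genome,gff3,protein \
        --filename "$out"
}

# List the files in a downloaded package without extracting it.
# (list_package is also a hypothetical helper name.)
list_package() {
    unzip -l "$1"
}
```

For example, fetch_assemblies two_genomes.zip GCF_000001405.40 GCF_000001635.27 would request the human (GRCh38.p14) and mouse (GRCm39) reference assemblies in a single package, and list_package two_genomes.zip would then show what the archive contains.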
E-Utilities & EDirect remain available as additional command line options
Don’t worry! You can still search using Entrez syntax from the Assembly and Genome homepages. There will be no changes to how you access the Assembly and Genome databases using E-Utilities or EDirect.
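As a reminder of what EDirect access to the Assembly database can look like, here is a minimal sketch. It assumes the EDirect tools (esearch, esummary, xtract) are installed. The function name assembly_summary is a hypothetical helper for this example, and the AssemblyAccession and AssemblyName fields are assumed to be present in the Assembly document summaries you retrieve.

```shell
# Search the Assembly database with an Entrez query and print the
# accession and name of each matching record, one record per line.
# Usage: assembly_summary 'ENTREZ QUERY'
# (assembly_summary is a hypothetical helper name used for this sketch.)
assembly_summary() {
    esearch -db assembly -query "$1" |
        esummary |
        xtract -pattern DocumentSummary -element AssemblyAccession AssemblyName
}
```

Calling assembly_summary "Homo sapiens[Organism]" would list the human assemblies matched by that Entrez query. Broad queries can return many records, so narrowing the query first is usually worthwhile.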
For more details about NCBI Datasets programmatic access to genome assembly data, see our how-to guide.
Stay up to date
NCBI Datasets is a part of the NIH Comparative Genomics Resource (CGR). CGR facilitates reliable comparative genomics analyses for all eukaryotic organisms through an NCBI Toolkit and community collaboration.
If you have questions or would like to provide feedback, please reach out to us at email@example.com.