NCBI Datasets: Easily Access and Download Sequence Data and Metadata

Effective May 2024, NCBI Datasets will replace legacy Genome and Assembly web resources 

As part of our ongoing effort to enhance your experience and modernize our services, NCBI will gradually replace the legacy Genome and Assembly resources with the newly introduced NCBI Datasets resource. NCBI Datasets is a continually evolving platform designed to provide easy and intuitive access to NCBI’s sequence data and metadata. 

  • The legacy Genome and Assembly web resources will no longer be available after May 2024
  • There will be no changes to how you access the databases using E-Utilities or EDirect 
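
Because E-Utilities access is unchanged, an existing search against the Assembly database should keep working as before. As a minimal sketch, the Python snippet below builds an ESearch URL for that database without sending any request. The helper name `build_esearch_url`, the example search term, and the `retmax` value are illustrative assumptions, not part of this announcement.

```python
from urllib.parse import urlencode

# Public base URL for NCBI E-Utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, db="assembly", retmax=20):
    """Return an ESearch URL for `term` in the given Entrez database.

    Hypothetical helper for illustration; it only builds the URL and
    does not contact NCBI.
    """
    params = {"db": db, "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Example: search the Assembly database for human assemblies.
url = build_esearch_url("Homo sapiens[Organism]")
print(url)
```

The resulting URL can be fetched with any HTTP client; the same query can also be run from the command line with EDirect.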

Why are we making this change? 
  • To provide a streamlined experience that integrates genome, organism, and gene information 
  • To help you retrieve large datasets that enable big data analyses 
  • To deliver data and metadata together and support better reuse and attribution 
  • To provide you with a single entry point to genome datasets

Features & Benefits of NCBI Datasets 
  1. Comprehensive Data: Access assembled genome sequences, annotations, and metadata, including transcripts and proteins, from a single webpage 
  2. Flexible Search Options: Easily retrieve data using organism names, assembly, WGS, or BioProject accessions 
  3. Scalable Data Retrieval: Request data for multiple genomes and file types in a single request, simplifying and expediting the download of large datasets 
  4. Well-Documented Metadata: Metadata is sourced from multiple databases, and metadata schemas are thoroughly documented
  5. Interoperable Metadata Formats: Metadata formats are machine-readable and easily converted to human-readable forms
  6. Taxonomy-Focused Portal: Access genes, genomes, and other NCBI resources through a taxonomy-focused portal
  7. Consistency: Enjoy consistent data access across web and programmatic interfaces 
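
To illustrate the scalable retrieval feature, here is a minimal Python sketch that builds one download URL covering several genome accessions and file types. It assumes the NCBI Datasets v2 REST API exposes a `genome/accession/{accessions}/download` endpoint with a repeatable `include_annotation_type` parameter; please check the endpoint path and parameter values against the current API reference before relying on them. The helper name `build_genome_download_url` is hypothetical, and the two accessions are only examples.

```python
from urllib.parse import quote, urlencode

# Assumed base URL of the NCBI Datasets v2 REST API.
DATASETS_BASE = "https://api.ncbi.nlm.nih.gov/datasets/v2"


def build_genome_download_url(accessions, file_types):
    """Build a single download URL for several genomes and file types.

    Hypothetical helper: the endpoint path and the parameter name
    `include_annotation_type` are assumptions to verify against the
    API documentation. No request is sent.
    """
    # Accessions are joined with commas in the URL path.
    joined = quote(",".join(accessions), safe=",")
    # One query parameter per requested file type.
    query = urlencode([("include_annotation_type", t) for t in file_types])
    return f"{DATASETS_BASE}/genome/accession/{joined}/download?{query}"


download_url = build_genome_download_url(
    ["GCF_000001405.40", "GCF_000001635.27"],  # example assembly accessions
    ["GENOME_FASTA", "GENOME_GFF"],            # assumed file type values
)
print(download_url)
```

Keeping every accession and file type in one URL mirrors the "single request" idea described in the list above; the response is expected to be a zip archive containing the data and its metadata.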

Stay up to date 

NCBI Datasets is part of the NIH Comparative Genomics Resource (CGR). CGR facilitates reliable comparative genomics analyses for all eukaryotic organisms through an NCBI Toolkit and community collaboration.     

Join our mailing list to keep up to date with NCBI Datasets and other CGR news. 


If you have questions or would like to provide feedback, please reach out to us at   