Vertebrate Genomes Project genome assemblies annotated by NCBI

NCBI is an active partner of the Vertebrate Genomes Project (VGP), which recently published a series of papers on the initial results of its effort to sequence all 70,000 vertebrate species. See the VGP press release for more details. To date, the project has submitted over 130 diploid chromosome-level assemblies to NCBI’s GenBank and the European Nucleotide Archive. NCBI has annotated 94 of the VGP assemblies from 85 species using the NCBI Eukaryotic Genome Annotation Pipeline.

These sequence and annotation data are available through NCBI web resources, including Gene, Assembly, Nucleotide, Protein, and Datasets, and are included in the GenBank and RefSeq releases. You can browse the assemblies in the Genome Data Viewer and download metadata, sequence, and annotation data for the latest assemblies in the VGP BioProject using the NCBI Datasets command-line tools, as shown below.

Downloading VGP data with Datasets

The following command uses the datasets tool to download a data report with detailed metadata for the latest VGP assemblies:

datasets download genome accession PRJNA489243 --dehydrated --filename

To retrieve the sequence and annotation data, unzip the downloaded archive into a directory (vgp_archive in this example) and rehydrate it:

datasets rehydrate --directory vgp_archive 
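The download, unzip, and rehydrate steps can be chained into a single shell function. This is a minimal sketch, not an official NCBI script: the archive name vgp_dehydrated.zip and the function name fetch_vgp are assumptions chosen for illustration, while the BioProject accession and the vgp_archive directory come from the commands above. It assumes the datasets and unzip tools are already installed and on your PATH.

```shell
#!/usr/bin/env bash
# Sketch: download, unpack, and rehydrate the VGP genome data package.
# ZIP_FILE and fetch_vgp are hypothetical names used only for this example.
BIOPROJECT="PRJNA489243"       # VGP BioProject accession (from the post)
ZIP_FILE="vgp_dehydrated.zip"  # assumed archive name
ARCHIVE_DIR="vgp_archive"      # directory used by the rehydrate command above

fetch_vgp() {
    # 1. Download the dehydrated package for the BioProject.
    datasets download genome accession "$BIOPROJECT" \
        --dehydrated --filename "$ZIP_FILE" || return 1

    # 2. Unpack the archive into the directory that rehydrate will read.
    unzip -o "$ZIP_FILE" -d "$ARCHIVE_DIR" || return 1

    # 3. Retrieve the sequence and annotation files for the unpacked package.
    datasets rehydrate --directory "$ARCHIVE_DIR"
}
```

After sourcing the file, run fetch_vgp from the directory where you want the data to land. Each step stops the function on failure, so an interrupted download does not lead to an attempt to rehydrate an incomplete archive.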

Contact us if you are sequencing your own assemblies and interested in NCBI producing an annotation!
