Tag: Datasets

Updated Design! NCBI Datasets Homepage

The updated NCBI Datasets homepage has a fresh new look and feel, making it easier for you to use. The species search box is now more prominent at the top of the page: enter and select the scientific or common name of the species you’re interested in to go directly to the NCBI Datasets Taxonomy page for that species.

We added a “How to use NCBI Datasets” section that gives you an overview of what’s available in NCBI Datasets. It shows example species with links to the NCBI Datasets pages relevant to each one. For example, for Ursus arctos (brown bear), we include links to the Taxonomy page, the genome table showing all available genomes, and the reference genome page for UrsArc1.0, as well as connections to BLAST and the Ursus arctos gene table.

You can still use the tab bar at the top of the homepage to easily navigate to our genome and gene tables or check out our documentation.

RefSeq Release 218

RefSeq release 218 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.
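
As a rough illustration of accessing RefSeq data through NCBI Datasets from a script, the sketch below assembles a call to the NCBI Datasets command-line tool (`datasets`) that summarizes RefSeq reference genomes for a taxon. It assumes the tool is installed separately. The helper names `build_summary_command` and `run_if_available` are hypothetical, and the brown bear is only an example taxon, not something this post prescribes.

```python
import shlex
import shutil
import subprocess


def build_summary_command(taxon, refseq_only=True, reference_only=True):
    """Build the argument list for `datasets summary genome taxon <taxon>`."""
    cmd = ["datasets", "summary", "genome", "taxon", taxon]
    if refseq_only:
        # Limit results to RefSeq (GCF_) assemblies.
        cmd += ["--assembly-source", "RefSeq"]
    if reference_only:
        # Limit results to reference genomes.
        cmd.append("--reference")
    return cmd


def run_if_available(cmd):
    """Run the command only if the CLI is on PATH; return its stdout or None."""
    if shutil.which(cmd[0]) is None:
        return None  # the datasets CLI is not installed on this machine
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    command = build_summary_command("Ursus arctos")
    print(shlex.join(command))  # show the command that would be run
    output = run_if_available(command)
    if output is None:
        print("datasets CLI not found; install it to run this query.")
    else:
        print(output)
```

Building the argument list separately from running it keeps the query easy to inspect and lets the script degrade gracefully when the tool is not installed.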

What’s included in this release?

As of May 1, 2023, this full release incorporates genomic, transcript, and protein data containing:

New annotations in RefSeq!

In February and March, the NCBI Eukaryotic Genome Annotation Pipeline released forty-two new annotations in RefSeq for the organisms listed below. Additionally, interim builds for over sixty species were run during that period to fix some issues with gene symbol assignment.

New Way to View and Download Related Genes

Effective June 2023, the HomoloGene records will redirect to the Datasets Gene Table

Do you use HomoloGene to view and download data? You can now access updated homology data from NCBI Datasets through the Datasets Gene Table with connections to NCBI Orthologs. Go directly from a HomoloGene record to the Datasets Gene Table that will give you access to up-to-date sequence data and metadata. NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases.

The Datasets Gene Table connects to the NCBI Orthologs interface (Figure 1), which provides the following data:

  • Orthology data based on an updated algorithm that identifies orthologs spanning more than 500 vertebrate species
  • Similar gene data based on protein architectures that spans all eukaryotes 

RefSeq Release 217

RefSeq release 217 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of March 8, 2023, this full release incorporates genomic, transcript, and protein data, containing:

  • 348,351,219 records
  • 254,500,694 proteins
  • 50,975,429 RNAs
  • sequences from 130,837 organisms

The release is provided in several directories as a complete dataset and divided by logical groupings.
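
To put the Release 217 record counts above in proportion, here is a small sketch that computes each category's share of the total. It assumes that proteins and RNAs are counted within the 348,351,219 records, which the post does not state explicitly. The "other" bucket is simply the remainder and is not a figure from the release.

```python
# Counts as listed for RefSeq release 217.
RELEASE_217 = {
    "records": 348_351_219,
    "proteins": 254_500_694,
    "rnas": 50_975_429,
}


def breakdown(counts):
    """Return the count and percentage share per category.

    Assumption: proteins and RNAs are subsets of the record total, so the
    remainder ("other") is whatever is left, presumably genomic records.
    """
    total = counts["records"]
    parts = {
        "proteins": counts["proteins"],
        "rnas": counts["rnas"],
        "other": total - counts["proteins"] - counts["rnas"],
    }
    return {
        name: {"count": n, "percent": 100.0 * n / total}
        for name, n in parts.items()
    }


if __name__ == "__main__":
    for name, info in breakdown(RELEASE_217).items():
        print(f"{name:>8}: {info['count']:>13,} ({info['percent']:.1f}%)")
```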

New & Improved NCBI Datasets Genome and Assembly Pages

Legacy pages will be redirected effective June 2023

In June 2023, NCBI’s Assembly and Genome record pages will be redirected to new Datasets pages as part of our ongoing effort to modernize and improve your user experience. NCBI Datasets is a new resource that makes it easier to find and download genome data.

We will update the following pages:
  • The NCBI Assembly pages will be redirected to the new Datasets Genome pages that describe assembled genomes and provide links to related NCBI tools such as Genome Data Viewer and BLAST.
  • The NCBI Genome pages will be redirected to the Datasets Taxonomy pages that provide a taxonomy-focused portal to genes, genomes, and additional NCBI resources.
  • During this transition, you will have the option to return to the legacy Genome and Assembly pages. 

New annotations in RefSeq!

In December and January, the NCBI Eukaryotic Genome Annotation Pipeline released twenty-nine new annotations in RefSeq for the following organisms:

  • Acinonyx jubatus (cheetah)
  • Anopheles cruzii (mosquito)
  • Anopheles moucheti (mosquito)
  • Bicyclus anynana (squinting bush brown)
  • Budorcas taxicolor (takin)
  • Carassius gibelio (silver crucian carp)
  • Citrus sinensis (sweet orange)
  • Crassostrea angulata (Portuguese oyster)
  • Culex pipiens pallens (northern house mosquito)
  • Drosophila gunungcola (fruit fly)
  • Galleria mellonella (greater wax moth)
  • Gossypium arboreum (tree cotton)
  • Gossypium raimondii (Peruvian cotton)
  • Harpia harpyja (harpy eagle)
  • Hemicordylus capensis (graceful crag lizard)
  • Lactuca sativa (garden lettuce)
  • Mercenaria mercenaria (northern quahog)
  • Mya arenaria (softshell)
  • Octopus bimaculoides (California two-spot octopus)
  • Oncorhynchus keta (chum salmon)
  • Pangasianodon hypophthalmus (striped catfish)
  • Panonychus citri (citrus red mite)
  • Panthera uncia (snow leopard)
  • Peromyscus californicus insignis (California mouse)
  • Podarcis raffonei (Aeolian wall lizard)
  • Populus trichocarpa (black cottonwood)
  • Scomber japonicus (chub mackerel)
  • Tympanuchus pallidicinctus (lesser prairie-chicken)
  • Vigna angularis (adzuki bean)

NIH Comparative Genomics Resource project

The potential impact of emerging model organisms on human health

Comparative genomics is a science that compares genomic data either within a species or across species to answer questions in biomedicine. Laboratory experiments can then investigate the functional impact of those genomic similarities and differences. The history of comparative genomics goes back to the mid-1990s, but comparative genomics is now accelerating. A flood of new data is emerging as DNA sequencing technology becomes cheaper and commoditized. While this growth poses many challenges to current tools and approaches, it also offers immense opportunity for scientific research and understanding. These insights continue to reveal novel model organisms that can further the impact of comparative genomics on human health.

New RefSeq Annotations!

In October and November, the NCBI Eukaryotic Genome Annotation Pipeline released thirty-one new annotations in RefSeq for the following organisms:

  • Acanthochromis polyacanthus (spiny chromis)
  • Acomys russatus (golden spiny mouse)
  • Andrographis paniculata (eudicot)
  • Antechinus flavipes (yellow-footed antechinus)
  • Apodemus sylvaticus (European woodmouse)
  • Apus apus (common swift)
  • Arachis duranensis (eudicot)

Join NCBI at PAG 30

San Diego, January 13-18, 2023 

NCBI is looking forward to seeing you in person at the International Plant and Animal Genome Conference (PAG 30), January 13-18, 2023, in San Diego, California.

We’re especially excited to share our recent efforts on the NIH Comparative Genomics Resource (CGR), a multi-year National Library of Medicine (NLM) project to maximize the impact of eukaryotic research organisms and their genomic data resources on biomedical research.  

We also want to hear from you! If you’d like to share feedback on your needs and experiences with comparative genomics tools to help inform CGR, consider joining our Feedback Session.

Check out NCBI’s schedule of activities and events:  
