Tag: Datasets
New annotations in RefSeq!
In February and March, the NCBI Eukaryotic Genome Annotation Pipeline released forty-two new annotations in RefSeq for the organisms listed below. Additionally, interim builds for over sixty species were run during that period to fix issues with gene symbol assignment.
- Agelaius phoeniceus (red-winged blackbird)
- Anastrepha ludens (Mexican fruit fly)
- Anopheles marshallii (mosquito)
- Anopheles nili (mosquito)
- Anoplopoma fimbria (sablefish)
- Artibeus jamaicensis (Jamaican fruit-eating bat)
- Bombina bombina (fire-bellied toad)
Continue reading “New annotations in RefSeq!”
New Way to View and Download Related Genes
Effective June 2023, the HomoloGene records will redirect to the Datasets Gene Table
Do you use HomoloGene to view and download data? You can now access updated homology data from NCBI Datasets through the Datasets Gene Table with connections to NCBI Orthologs. Go directly from a HomoloGene record to the Datasets Gene Table that will give you access to up-to-date sequence data and metadata. NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases.
The Datasets Gene Table links to the NCBI Ortholog interface (Figure 1), which provides the following data:
- Orthology data based on an updated algorithm that identifies orthologs spanning more than 500 vertebrate species
- Similar gene data based on protein architectures that spans all eukaryotes
Continue reading “New Way to View and Download Related Genes”
RefSeq Release 217
RefSeq release 217 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.
What’s included in this release?
As of March 8, 2023, this full release incorporates genomic, transcript, and protein data, containing:
- 348,351,219 records
- 254,500,694 proteins
- 50,975,429 RNAs
- sequences from 130,837 organisms
The release is provided in several directories as a complete dataset and divided by logical groupings.
Continue reading “RefSeq Release 217”
New & Improved NCBI Datasets Genome and Assembly Pages
Legacy pages will be redirected effective July 2023
In July 2023, NCBI’s Assembly and Genome record pages will be redirected to new Datasets pages as part of our ongoing effort to modernize and improve your user experience. NCBI Datasets is a new resource that makes it easier to find and download genome data.
We will update the following pages:
- The NCBI Assembly pages will be redirected to the new Datasets Genome pages that describe assembled genomes and provide links to related NCBI tools such as Genome Data Viewer and BLAST.
- The NCBI Genome pages will be redirected to the Datasets Taxonomy pages that provide a taxonomy-focused portal to genes, genomes and additional NCBI resources.
- During this transition, you will have the option to return to the legacy Genome and Assembly pages.
Continue reading “New & Improved NCBI Datasets Genome and Assembly Pages”
New annotations in RefSeq!
In December and January, the NCBI Eukaryotic Genome Annotation Pipeline released twenty-nine new annotations in RefSeq for the following organisms:
- Acinonyx jubatus (cheetah)
- Anopheles cruzii (mosquito)
- Anopheles moucheti (mosquito)
- Bicyclus anynana (squinting bush brown)
- Budorcas taxicolor (takin)
- Carassius gibelio (silver crucian carp)
- Citrus sinensis (sweet orange)
- Crassostrea angulata (Portuguese oyster)
- Culex pipiens pallens (northern house mosquito)
- Drosophila gunungcola (fruit fly)
- Galleria mellonella (greater wax moth)
- Gossypium arboreum (tree cotton)
- Gossypium raimondii (Peruvian cotton)
- Harpia harpyja (harpy eagle)
- Hemicordylus capensis (graceful crag lizard)
- Lactuca sativa (garden lettuce)
- Mercenaria mercenaria (northern quahog)
- Mya arenaria (softshell)
- Octopus bimaculoides (California two-spot octopus)
- Oncorhynchus keta (chum salmon)
- Pangasianodon hypophthalmus (striped catfish)
- Panonychus citri (citrus red mite)
- Panthera uncia (snow leopard) (pictured)
- Peromyscus californicus insignis (California mouse)
- Podarcis raffonei (Aeolian wall lizard)
- Populus trichocarpa (black cottonwood)
- Scomber japonicus (chub mackerel)
- Tympanuchus pallidicinctus (lesser prairie-chicken)
- Vigna angularis (adzuki bean)
NIH Comparative Genomics Resource project
The potential impact of emerging model organisms on human health
Comparative genomics is a science that compares genomic data either within a species or across species to answer questions in biomedicine. Laboratory experiments can then investigate the functional impact of those genomic similarities and differences. The history of comparative genomics goes back to the mid-1990s, but comparative genomics is now accelerating. A flood of new data is emerging as DNA sequencing technology becomes cheaper and commoditized. While this growth poses many challenges to current tools and approaches, it also offers immense opportunity for scientific research and understanding. These insights continue to reveal novel model organisms that can further the impact of comparative genomics on human health.
Continue reading “NIH Comparative Genomics Resource project”
New RefSeq Annotations!
In October and November, the NCBI Eukaryotic Genome Annotation Pipeline released thirty-one new annotations in RefSeq for the following organisms:
- Acanthochromis polyacanthus (spiny chromis)
- Acomys russatus (golden spiny mouse)
- Andrographis paniculata (eudicot)
- Antechinus flavipes (yellow-footed antechinus)
- Apodemus sylvaticus (European woodmouse)
- Apus apus (common swift)
- Arachis duranensis (eudicot)
Continue reading “New RefSeq Annotations!”
Join NCBI at PAG 30
San Diego, January 13-18, 2023
NCBI is looking forward to seeing you in person at the International Plant and Animal Genome Conference (PAG 30), January 13-18, 2023, in San Diego, California.
We’re especially excited to share our recent efforts on the NIH Comparative Genomics Resource (CGR), a multi-year National Library of Medicine (NLM) project to maximize the impact of eukaryotic research organisms and their genomic data resources on biomedical research.
We also want to hear from you! If you’re interested in sharing your feedback on your needs and experiences involving comparative genomics tools to inform CGR, consider joining our Feedback Session.
Check out NCBI’s schedule of activities and events:
New annotations in RefSeq!
In August and September, the NCBI Eukaryotic Genome Annotation Pipeline released thirty-eight new annotations in RefSeq for the following organisms:
- Adelges cooleyi (spruce gall adelgid)
- Aethina tumida (small hive beetle)
- Anopheles aquasalis (mosquito)
- Anopheles maculipalpis (mosquito)
- Anthonomus grandis grandis (boll weevil)
- Aphis gossypii (cotton aphid)
- Bactrocera neohumeralis (fly)
- Bombus affinis (bee)
- Bombus huntii (bee)
- Cataglyphis hispanica (ant)
- Cygnus atratus (black swan) (pictured)
Continue reading “New annotations in RefSeq!”
RefSeq release 218 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.
What’s included in this release?
As of May 1, 2023, this full release incorporates genomic, transcript, and protein data containing: