Tag: Eukaryotic genome annotation

New RefSeq Annotations Now Available!

In February and March, the NCBI Eukaryotic Genome Annotation Pipeline released forty-six new annotations in RefSeq!

New Annotations

Aedes albopictus (Asian tiger mosquito)
Anolis carolinensis (green anole)
Armigeres subalbatus (mosquito)
Bacillus rossius redtenbacheri (walking stick)
Bolinopsis microptera (comb jelly)
Bombyx mori (domestic silkworm)
Bubalus kerabau (carabao)
Candoia aspera (snake)
Cavia porcellus (domestic guinea pig)
Continue reading “New RefSeq Annotations Now Available!” →

Now Available: RefSeq Release 223

Check out RefSeq release 223, now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of March 4, 2024, this full release incorporates genomic, transcript, and protein data containing:

425,594,654 records
316,329,937 proteins
60,886,133 RNAs
sequences from 147,591 organisms

Continue reading “Now Available: RefSeq Release 223” →

Gene Ontology (GO) Terms for NCBI RefSeq Eukaryotic Genomes

Are you interested in more functional information about protein-coding genes? We’ve expanded NCBI RefSeq’s Eukaryotic Genome Annotation Pipeline (EGAP) to include Gene Ontology (GO) terms computed for most protein-coding genes. We are running the latest version of InterProScan, which now includes analysis based on PANTHER reference trees, on all NCBI RefSeq eukaryotic genomes. The result is comprehensive GO data, with inferred biological process, molecular function, and cellular component terms matched to high-quality RefSeq annotations across hundreds of taxa, to help drive your research. The data are available on individual records in NCBI’s Gene resource, from the NCBI Gene FTP site, or in community-standard .gaf files provided with each RefSeq genome release on our FTP site.

Continue reading “Gene Ontology (GO) Terms for NCBI RefSeq Eukaryotic Genomes” →
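For readers who want to work with the .gaf files directly, here is a minimal Python sketch that groups GO IDs by gene symbol and by aspect. It assumes the standard tab-delimited GAF 2.x layout (comment lines start with "!", the gene symbol is in column 3, the GO ID in column 5, and the aspect code P, F, or C in column 9). The function name go_terms_by_gene and the sample rows (GENE_A, GENE_B, and their IDs) are made up for illustration and are not taken from an actual RefSeq release.

```python
import io
from collections import defaultdict

# GAF aspect codes mapped to the three GO namespaces.
ASPECT_NAMES = {
    "P": "biological_process",
    "F": "molecular_function",
    "C": "cellular_component",
}


def go_terms_by_gene(handle):
    """Group GO IDs by gene symbol and aspect from an open GAF 2.x text handle."""
    terms = defaultdict(lambda: defaultdict(set))
    for line in handle:
        # Skip header/comment lines ("!...") and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete annotation row
        symbol, go_id, aspect = cols[2], cols[4], cols[8]
        terms[symbol][ASPECT_NAMES.get(aspect, aspect)].add(go_id)
    return terms


# Hypothetical sample rows (17 tab-separated columns each), for illustration only.
SAMPLE_GAF = (
    "!gaf-version: 2.2\n"
    "NCBIGene\t1\tGENE_A\tenables\tGO:0003674\tGO_REF:0000002\tIEA\t\tF\t\t\tprotein\ttaxon:1\t20240101\tRefSeq\t\t\n"
    "NCBIGene\t1\tGENE_A\tinvolved_in\tGO:0008150\tGO_REF:0000002\tIEA\t\tP\t\t\tprotein\ttaxon:1\t20240101\tRefSeq\t\t\n"
    "NCBIGene\t2\tGENE_B\tlocated_in\tGO:0005575\tGO_REF:0000002\tIEA\t\tC\t\t\tprotein\ttaxon:1\t20240101\tRefSeq\t\t\n"
)

if __name__ == "__main__":
    for symbol, by_aspect in go_terms_by_gene(io.StringIO(SAMPLE_GAF)).items():
        for aspect, go_ids in sorted(by_aspect.items()):
            print(symbol, aspect, ",".join(sorted(go_ids)))
```

If a downloaded file is gzip-compressed, opening it with gzip.open(path, "rt") from the standard library yields a text handle that can be passed to the same function; path here stands for wherever you saved the file.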

RefSeq Release 221

RefSeq release 221 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of November 6, 2023, this full release incorporates genomic, transcript, and protein data containing:

404,657,610 records
300,054,945 proteins
57,882,313 RNAs
sequences from 143,819 organisms

Continue reading “RefSeq Release 221” →

New Annotations in RefSeq!

In July, August, and September, the NCBI Eukaryotic Genome Annotation Pipeline released fifty-six new annotations in RefSeq!

New Annotations

Achroia grisella (moth)
Acipenser ruthenus (sterlet)
Ahaetulla prasina (snake)
Alligator mississippiensis (American alligator)
Ammospiza caudacuta (bird)
Ammospiza nelsoni (bird)
Anopheles bellator (mosquito)
Anopheles coustani (mosquito)
Anopheles ziemanni (mosquito)
Arachis stenosperma (eudicot)
Carassius carassius (crucian carp)
Centropristis striata (black seabass)
Cornus florida (flowering dogwood)
Corylus avellana (European hazelnut)
Corythoichthys intestinalis (scribbled pipefish)

Continue reading “New Annotations in RefSeq!” →

RefSeq Release 220

RefSeq release 220 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of September 5, 2023, this full release incorporates genomic, transcript, and protein data containing:

391,350,361 records
289,333,423 proteins
56,423,426 RNAs
sequences from 141,099 organisms

Continue reading “RefSeq Release 220” →

New Annotations in RefSeq!

In April, May, and June, the NCBI Eukaryotic Genome Annotation Pipeline released eighty-two new annotations in RefSeq!

Highlights:

Homo sapiens (human) T2T-CHM13v2.0 now includes many more alternative splice variants
Homo sapiens (human) GRCh38.p14 includes all transcripts from MANE v1.2 and over 78,000 new RefSeq Functional Element (RefSeqFE) features added since our last annotation in 2022
Mus musculus (house mouse) GRCm39 integrates curation for over 3,000 genes and 14,000 transcripts since September 2020
Rattus norvegicus (Norway rat) mRatBN7.2 includes curation of over 5,000 genes since our last annotation in 2021

New annotations:

Continue reading “New Annotations in RefSeq!” →

Revolutionize your research with the NIH Comparative Genomics Resource (CGR)

Unlock the full potential of eukaryotic research organisms and their genomic data with the National Institutes of Health (NIH) Comparative Genomics Resource (CGR). CGR facilitates reliable comparative genomics analyses through community collaboration as well as an NCBI toolkit of interconnected, interoperable data and tools.  

Comparative genomics is a field of study that uses the genomes of many different organisms to help us understand basic biological processes and human disease. NCBI is developing CGR to help researchers take full advantage of the rapidly growing number of eukaryotic organisms that, due to recent technological advances, now have sequenced genomes and associated data that can be used in these types of studies. Its NCBI toolkit offers new and modern resources for such analyses, and its emphasis on community collaboration brings new opportunities to share and connect data.

Continue reading “Revolutionize your research with the NIH Comparative Genomics Resource (CGR)” →

RefSeq Release 218

RefSeq release 218 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of May 1, 2023, this full release incorporates genomic, transcript, and protein data containing:

356,619,635 records
260,776,371 proteins
52,503,423 RNAs
sequences from 133,740 organisms

Continue reading “RefSeq Release 218” →

New annotations in RefSeq!

In February and March, the NCBI Eukaryotic Genome Annotation Pipeline released forty-two new annotations in RefSeq for the organisms listed below. Additionally, we ran interim builds for over sixty species during that period to fix issues with gene symbol assignment.

Agelaius phoeniceus (red-winged blackbird)
Anastrepha ludens (Mexican fruit fly)
Anopheles marshallii (mosquito)
Anopheles nili (mosquito)
Anoplopoma fimbria (sablefish)
Artibeus jamaicensis (Jamaican fruit-eating bat)
Bombina bombina (fire-bellied toad)

Continue reading “New annotations in RefSeq!” →