Tag: Eukaryotic genome annotation

New annotations in RefSeq!

In August and September, the NCBI Eukaryotic Genome Annotation Pipeline released thirty-eight new annotations in RefSeq for the following organisms:

  • Adelges cooleyi (spruce gall adelgid)
  • Aethina tumida (small hive beetle)
  • Anopheles aquasalis (mosquito)
  • Anopheles maculipalpis (mosquito)
  • Anthonomus grandis grandis (boll weevil)
  • Aphis gossypii (cotton aphid)
  • Bactrocera neohumeralis (fly)
  • Bombus affinis (bee)
  • Bombus huntii (bee)
  • Cataglyphis hispanica (ant)
  • Cygnus atratus (black swan) (pictured)

RefSeq release 214 is available!

RefSeq release 214 is now available online, from the FTP site, and through NCBI’s Entrez programming utilities, E-utilities.

This full release incorporates genomic, transcript, and protein data available as of September 12, 2022, and contains 328,588,569 records, including 239,609,016 proteins, 47,387,931 RNAs, and sequences from 123,394 organisms. The release is provided in several directories, both as a complete dataset and divided by logical groupings.
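
If you would rather reach RefSeq records through E-utilities than through the FTP site, a query can be sketched roughly as follows. This is a minimal illustration, not an official NCBI recipe: the helper name `build_refseq_search_url`, the choice of the `protein` database, and the example organism are assumptions made here. The snippet only builds the ESearch URL and does not send a request.

```python
from urllib.parse import urlencode

# Public base URL for NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_refseq_search_url(organism: str, db: str = "protein", retmax: int = 5) -> str:
    """Return an ESearch URL limited to RefSeq records for one organism.

    Hypothetical helper for illustration; the query fields are standard
    Entrez syntax, but the function itself is not part of any NCBI library.
    """
    # srcdb_refseq[PROP] restricts results to RefSeq records;
    # [Organism] restricts results by organism name.
    term = f"{organism}[Organism] AND srcdb_refseq[PROP]"
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Example: build (but do not fetch) a query for black swan RefSeq proteins.
url = build_refseq_search_url("Cygnus atratus")
print(url)
```

Fetching the printed URL returns a JSON document with matching record identifiers. NCBI asks that scripted access stay within its published request-rate limits.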

Foreign contamination screening
Introducing the new Foreign Contamination Screen (FCS) tool! If you produce assembled genomes, check out FCS, a tool you can run yourself to improve your genome assemblies and facilitate high-quality data submissions to GenBank. FCS is part of the NIH Comparative Genomics Resource (CGR), an NLM project to establish an ecosystem to facilitate reliable comparative genomics analyses for all eukaryotic organisms. See our previous blog post to learn how FCS enhances contaminant detection sensitivity.

New annotations in RefSeq

In June and July, the NCBI Eukaryotic Genome Annotation Pipeline released twenty-six new annotations in RefSeq for the following organisms:

  • Anopheles coluzzii (mosquito)
  • Anopheles funestus (African malaria mosquito)
  • Astyanax mexicanus (Mexican tetra)
  • Athalia rosae (coleseed sawfly)
  • Bactrocera dorsalis (oriental fruit fly)
  • Brassica napus (rape)
  • Brienomyrus brachyistius (bony fish)
  • Canis lupus dingo (dingo) (pictured)
  • Caretta caretta (loggerhead turtle)
  • Dendroctonus ponderosae (mountain pine beetle)
  • Epinephelus fuscoguttatus (brown-marbled grouper)
  • Lagopus muta (rock ptarmigan)
  • Marmota marmota marmota (Alpine marmot)
  • Nematostella vectensis (starlet sea anemone)
  • Ostrea edulis (bivalve)
  • Panthera uncia (snow leopard)
  • Plutella xylostella (diamondback moth)
  • Pyrus x bretschneideri (Chinese white pear)
  • Rhincodon typus (whale shark)
  • Rhipicephalus sanguineus (brown dog tick)
  • Solanum stenotomum (eudicot)
  • Solanum verrucosum (eudicot)
  • Sphaerodactylus townsendi (lizard)
  • Stegostoma fasciatum (shark)
  • Triticum urartu (monocot)
  • Ziziphus jujuba (common jujube)

RefSeq release 213

RefSeq release 213 is now available online, from the FTP site and through NCBI’s Entrez programming utilities, E-utilities.

This full release incorporates genomic, transcript, and protein data available as of July 11, 2022, and contains 321,282,996 records, including 234,520,053 proteins, 45,781,716 RNAs, and sequences from 121,461 organisms. The release is provided in several directories, both as a complete dataset and divided by logical groupings.

New RefSeq annotations are available!

In April and May, the NCBI Eukaryotic Genome Annotation Pipeline released twenty-eight new annotations in RefSeq for the following organisms:

Gapless Telomere to Telomere human genome (T2T-CHM13) now available

On April 1, 2022, Science published the first complete sequence of a human genome, known as T2T-CHM13. This notable scientific achievement comes two decades after the first human genome release from the Human Genome Project and offers an in-depth look at biologically important regions, such as centromeres, telomeres, and segmental duplications, that were previously unassembled. Read on to learn more about how you can access this assembly and related resources at NCBI, or to access any one of the more than 1,000 human genome assemblies now in GenBank.

RefSeq release 212 is available!

RefSeq release 212 is now available online, from the FTP site and through NCBI’s Entrez programming utilities, E-utilities.

This full release incorporates genomic, transcript, and protein data available as of May 2, 2022, and contains 314,915,153 records, including 229,417,182 proteins, 44,805,833 RNAs, and sequences from 119,373 organisms. The release is provided in several directories, both as a complete dataset and divided by logical groupings.

Human genome Annotation Release 110

Annotation Release 110 is the first new annotation of the human genome in four years. It includes all of the latest curated RefSeqs and a recalculation of models using over 80M long reads and 9B Illumina RNA-seq reads. AR 110 includes annotation of two human assemblies.

New RefSeq annotations!

In February and March, the NCBI Eukaryotic Genome Annotation Pipeline released thirty-seven new annotations in RefSeq for the following organisms:

  • Belonocnema kinseyi (wasp)
  • Daphnia pulex (common water flea)
  • Daphnia pulicaria (crustacean)
  • Dermatophagoides farinae (American house dust mite)
  • Diprion similis (hymenopteran)
  • Drosophila willistoni (fly)
  • Equus quagga burchellii (Burchell’s zebra) (pictured)
  • Gallus gallus (chicken)
  • Haliotis rubra (blacklip abalone)
  • Haliotis rufescens (red abalone)
  • Helicoverpa zea (corn earworm)
  • Homalodisca vitripennis (glassy-winged sharpshooter)
  • Hydra vulgaris (swiftwater hydra)
  • Hypomesus transpacificus (delta smelt)
  • Ictalurus punctatus (channel catfish)
  • Ischnura elegans (damselfly)
  • Lolium rigidum (monocot)
  • Lucilia cuprina (Australian sheep blowfly)
  • Lynx rufus (bobcat)
  • Marmota monax (woodchuck)
  • Meles meles (Eurasian badger)
  • Micropterus dolomieu (smallmouth bass)
  • Neodiprion fabricii (hymenopteran)
  • Neodiprion lecontei (redheaded pine sawfly)
  • Neodiprion pinetum (white pine sawfly)
  • Neodiprion virginiana (hymenopteran)
  • Oncorhynchus gorbuscha (pink salmon)
  • Osmia bicornis bicornis (red mason bee)
  • Scatophagus argus (bony fish)
  • Schistocerca americana (American grasshopper)
  • Schistocerca piceifrons (Central American locust)
  • Silurus meridionalis (bony fish)
  • Ursus americanus (American black bear)
  • Vanessa cardui (painted lady)
  • Vespa crabro (European hornet)
  • Vigna umbellata (eudicot)
  • Xenia sp. Carnegie-2017 (soft coral)

View the full list of annotated eukaryotes available in the Genome Data Viewer (GDV) browser.

New RefSeq annotations!

In December and January, the NCBI Eukaryotic Genome Annotation Pipeline released twenty-four new annotations in RefSeq for the following organisms:

  • Aegilops tauschii (monocot)
  • Camelus bactrianus (Bactrian camel)
  • Colias croceus (clouded yellow)
  • Echinops telfairi (small Madagascar hedgehog)
  • Harmonia axyridis (beetle)
  • Lemur catta (ring-tailed lemur)
  • Leopardus geoffroyi (Geoffroy’s cat)
  • Macaca fascicularis (crab-eating macaque)
  • Maniola jurtina (meadow brown)
  • Meles meles (Eurasian badger)
  • Melitaea cinxia (Glanville fritillary) (pictured)

RefSeq release 210 is available

RefSeq release 210 is now available online, from the FTP site and through NCBI’s Entrez programming utilities, E-utilities.

This full release incorporates genomic, transcript, and protein data available as of January 3, 2022, and contains 302,482,881 records, including 220,595,192 proteins, 42,453,222 transcripts, and sequences from 115,929 organisms. The release is provided in several directories, both as a complete dataset and divided by logical groupings.