Tag: Datasets

NIH Comparative Genomics Resource project

The potential impact of emerging model organisms on human health

Comparative genomics compares genomic data within a species or across species to answer questions in biomedicine. Laboratory experiments can then investigate the functional impact of those genomic similarities and differences. The history of comparative genomics goes back to the mid-1990s, but the field is now accelerating: a flood of new data is emerging as DNA sequencing technology becomes cheaper and commoditized. While this growth poses many challenges to current tools and approaches, it also offers immense opportunity for scientific research and understanding. These insights continue to reveal novel model organisms that can further the impact of comparative genomics on human health.

New RefSeq Annotations!

In October and November, the NCBI Eukaryotic Genome Annotation Pipeline released thirty-one new annotations in RefSeq for the following organisms:

  • Acanthochromis polyacanthus (spiny chromis)
  • Acomys russatus (golden spiny mouse)
  • Andrographis paniculata (eudicot)
  • Antechinus flavipes (yellow-footed antechinus)
  • Apodemus sylvaticus (European woodmouse)
  • Apus apus (common swift)
  • Arachis duranensis (eudicot)

Join NCBI at PAG 30

San Diego, January 13-18, 2023 

NCBI is looking forward to seeing you in person at the International Plant and Animal Genome Conference (PAG 30), January 13-18, 2023 in San Diego, California.  

We’re especially excited to share our recent efforts on the NIH Comparative Genomics Resource (CGR), a multi-year National Library of Medicine (NLM) project to maximize the impact of eukaryotic research organisms and their genomic data resources on biomedical research.  

We also want to hear from you! If you’d like to share your needs and experiences with comparative genomics tools to help inform CGR, consider joining our Feedback Session.

Check out NCBI’s schedule of activities and events:  

New annotations in RefSeq!

In August and September, the NCBI Eukaryotic Genome Annotation Pipeline released thirty-eight new annotations in RefSeq for the following organisms:

  • Adelges cooleyi (spruce gall adelgid)
  • Aethina tumida (small hive beetle)
  • Anopheles aquasalis (mosquito)
  • Anopheles maculipalpis (mosquito)
  • Anthonomus grandis grandis (boll weevil)
  • Aphis gossypii (cotton aphid)
  • Bactrocera neohumeralis (fly)
  • Bombus affinis (bee)
  • Bombus huntii (bee)
  • Cataglyphis hispanica (ant)
  • Cygnus atratus (black swan) (pictured)

Now Available! Updated NCBI Datasets Command-Line Tools

NLM’s NCBI Datasets announces the release of version 14 of our command-line interface (CLI) tools, datasets and dataformat. This release (CLI v14.0.0) contains many improvements inspired by your feedback. It’s now easier than ever to browse and format metadata, generate customized tables, and download data packages. We hope these updates will improve your experience!

NCBI Datasets CLI v14 includes changes to the command syntax, data package contents, and data report schemas that are not backward compatible. Commands written for CLI versions prior to version 14 may fail after the update. For more details, see our FAQs.

Connect with NCBI at ASHG 2022

Join us October 25-29 in Los Angeles, CA

We are looking forward to seeing you in person at the American Society of Human Genetics (ASHG) annual meeting, October 25-29, 2022, in Los Angeles, California.

We will present a variety of talks and posters featuring our clinical and human genetic resources, as well as genome products and tools. We are excited to introduce the NIH Comparative Genomics Resource (CGR), a multi-year National Library of Medicine (NLM) project to maximize the impact of eukaryotic research organisms and their genomic data resources on biomedical research. If you’re interested in providing feedback that will help drive CGR forward, consider joining our round table discussion.

Check out NCBI’s schedule of activities and events: 

Coming soon! Changes to NCBI Datasets command-line tool in version 14 (CLIv14.0.0)

In October 2022, NCBI Datasets will release version 14 of our datasets and dataformat command-line tools. This release will contain breaking changes to the command syntax and to the content of the data packages and data reports. Thank you for the feedback that inspired these new features. We hope they will improve your experience!

We will continue to support CLI v13.x, although new features and improvements will be exclusive to CLI v14.0.0 and later releases.

NCBI Datasets supports the NIH Comparative Genomics Resource (CGR), an NLM project to establish an ecosystem to facilitate reliable comparative genomics analyses for all eukaryotic organisms. Join our mailing list to keep up to date with NCBI Datasets and other CGR news.

More details

How is version 14 of the Datasets command-line tools (CLI v14.x) different from CLI v13.x and previous versions?

Join NCBI virtually at the Biodiversity Genomics 2022 conference

Learn about the NIH Comparative Genomics Resource (CGR) Project

The Biodiversity Genomics conference will take place virtually, October 2-7, 2022. This event is hosted by the Earth BioGenome Project and is open and free for all to attend.

NCBI staff will present a variety of recorded talks and posters highlighting elements of the NIH Comparative Genomics Resource (CGR), including NCBI Datasets and the Comparative Genome Viewer (CGV). CGR is a multi-year National Library of Medicine (NLM) project to maximize the impact of eukaryotic research organisms and their genomic data resources on biomedical research. NCBI is charged with leading CGR development and engaging genomics communities. The CGR project will facilitate reliable comparative genomics analyses for all eukaryotic organisms in collaboration with the genomics community.

Check out NCBI’s schedule of activities to learn more about CGR:

New annotations in RefSeq

In June and July, the NCBI Eukaryotic Genome Annotation Pipeline released twenty-six new annotations in RefSeq for the following organisms:

  • Anopheles coluzzii (mosquito)
  • Anopheles funestus (African malaria mosquito)
  • Astyanax mexicanus (Mexican tetra)
  • Athalia rosae (coleseed sawfly)
  • Bactrocera dorsalis (oriental fruit fly)
  • Brassica napus (rape)
  • Brienomyrus brachyistius (bony fish)
  • Canis lupus dingo (dingo) (pictured)
  • Caretta caretta (loggerhead turtle)
  • Dendroctonus ponderosae (mountain pine beetle)
  • Epinephelus fuscoguttatus (brown-marbled grouper)
  • Lagopus muta (rock ptarmigan)
  • Marmota marmota marmota (Alpine marmot)
  • Nematostella vectensis (starlet sea anemone)
  • Ostrea edulis (bivalve)
  • Panthera uncia (snow leopard)
  • Plutella xylostella (diamondback moth)
  • Pyrus x bretschneideri (Chinese white pear)
  • Rhincodon typus (whale shark)
  • Rhipicephalus sanguineus (brown dog tick)
  • Solanum stenotomum (eudicot)
  • Solanum verrucosum (eudicot)
  • Sphaerodactylus townsendi (lizard)
  • Stegostoma fasciatum (shark)
  • Triticum urartu (monocot)
  • Ziziphus jujuba (common jujube)

Announcing the NCBI Datasets SARS-CoV-2 taxonomy page

Need SARS-CoV-2 assembled genome sequences or specific SARS-CoV-2 protein sequences? You can find them on the new SARS-CoV-2 taxonomy page brought to you by NCBI Datasets.

The NCBI Datasets SARS-CoV-2 taxonomy page brings you both SARS-CoV-2 genomes and proteins, basic information about SARS-CoV-2, and connections to related NCBI pages, all in one place (see Figures 1 and 2).

Figure 1. NCBI Datasets SARS-CoV-2 taxonomy page. For command-line access, try the datasets command-line tool (top box). For customized filtering options, check out NCBI Virus (bottom box).

If you scroll down the taxonomy page you will find a table of SARS-CoV-2 proteins, each with “Actions” that provide the option to download a package of protein sequences from all annotated SARS-CoV-2 genomes (Figure 2), as well as links to NCBI Gene and the protein sequence from the reference genome.

Figure 2. NCBI Datasets SARS-CoV-2 taxonomy page (cont’d). Click the blue download button to download a package of all SARS-CoV-2 genomes (6 million and counting as of July 15, 2022), or just the SARS-CoV-2 reference genome (top box). Below that is a table of SARS-CoV-2 proteins, each with “Actions” available through the three-dot menu, which provides the option to download a package of protein sequences from all annotated SARS-CoV-2 genomes (bottom boxes).

We want to hear from you! Check out the new SARS-CoV-2 taxonomy page and let us know what you think. Contact us with questions or feedback.

Join our mailing list to keep up to date with Datasets and other NCBI news.