To enhance machine access to biomedical literature and drive impactful analyses and reuse, the National Library of Medicine (NLM) is pleased to announce the availability of the PubMed Central (PMC) Article Datasets on Amazon Web Services (AWS) Registry of Open Data as part of AWS’s Open Data Sponsorship Program (ODP). These datasets collectively span 4 million of PMC’s 7 million … Continue reading PubMed Central Article Datasets are Now Available on the Cloud
Search Results for: datasets
As part of the NIH Comparative Genomics Resource (CGR) project, NLM’s NCBI Datasets is introducing an all-new modern experience making it easier for you to browse and download genome sequence and metadata, and navigate to tools such as the Genome Data Viewer (GDV) and BLAST. To get started, search NCBI Datasets by assembly accession (e.g., … Continue reading Introducing NLM’s new NCBI Datasets genome page!
The updated NCBI Datasets Genomes page now has genome data for all domains of life, including bacterial and viral genomes. The genomes table (Figure 1) now offers the following filters. Reference genomes: switch it on to show only reference or representative genomes. Annotated: switch it on to show only annotated genomes. Assembly level: use the assembly level … Continue reading Introducing the new NCBI Datasets Genomes page
As they diverge from a common ancestor, species accumulate differences in their DNA sequences. Differences within a protein-coding region are classified in two types. Non-synonymous substitutions change the amino acid sequence of the protein, while synonymous substitutions do not. Synonymous substitutions are largely invisible to natural selection and tend to accumulate at a constant rate. On the other hand, … Continue reading An Introduction to Molecular Evolutionary Analysis with NCBI Datasets and Python
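To make the distinction between the two substitution types concrete, here is a minimal Python sketch. It is not code from the post: the function name `count_substitutions` and the example sequences are hypothetical, and it assumes two aligned, gap-free coding sequences in the same reading frame under the standard genetic code. It only tallies differing codons by whether the encoded amino acid changes, a simplification of a full dN/dS estimate.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position
# (first position varies slowest, third position varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitutions(seq_a, seq_b):
    """Count differing codons as (synonymous, non_synonymous).

    Assumes both sequences are aligned, gap-free, in frame, and contain
    only the unambiguous bases A, C, G, and T.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must have equal length divisible by 3")

    synonymous = 0
    non_synonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue  # identical codons carry no substitution
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1  # same amino acid, different codon
        else:
            non_synonymous += 1  # amino acid changes
    return synonymous, non_synonymous


# GCT -> GCC keeps alanine (synonymous); AAA -> AGA turns lysine into arginine.
print(count_substitutions("ATGGCTAAA", "ATGGCCAGA"))
```

A codon that differs at more than one position is counted once here. Methods that estimate substitution rates treat such cases, and the number of possible synonymous sites, more carefully.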
NCBI Datasets introduces species pages and a species browser! The species pages summarize taxon information and provide access to genomic data, including reference genomes. For example, see Figure 1, the Nothobranchius furzeri (turquoise killifish) species page. Figure 1: Nothobranchius furzeri species page. The browse species button will take you to the species browser.
NCBI Datasets, the new set of services for downloading genome assembly and annotation data (previous Datasets posts), has redesigned and reorganized web pages to make it easier to find and access the services and documentation you need. NCBI Datasets has a fresh new homepage (Figure 1) highlighting the types of data available through our tools. Available … Continue reading New NCBI Datasets home and documentation pages provide easier access
You can now get gene ortholog data with the NCBI Datasets command-line tool by providing a gene ID, gene symbol, or RefSeq nucleotide or protein accession. Data are available for vertebrates and insects. The vertebrate orthologs include a specialized set for fish. (See our recent post for more information on the orthologs for fish and insects.) You … Continue reading The Datasets command-line tool now provides ortholog data
You can now retrieve genome data using the NCBI Datasets command-line tool and API by simply providing a BioProject accession. You can go directly from a BioProject accession to genome data even when the BioProject accession is the parent of multiple BioProjects (Figure 1). Figure 1. Command-lines using BioProject accessions with the datasets command-line tool and sample metadata. Top … Continue reading Retrieve genome data by BioProject using the Datasets command-line tool
Missed a few videos on YouTube? Here’s the latest from our channel. Customize the MSA Viewer to Make Your Analysis Easier We’re constantly improving the Multiple Sequence Alignment (MSA) Viewer. This video demonstrates several new and popular features, including the ability to change data columns, hide selected rows, analyze polymorphisms, and more.
Join us on September 22, 2021 at 12 PM Eastern Time to learn how to use the datasets command-line tools (datasets and dataformat) to access, filter, download, and format data and metadata for genomes. Through examples from eukaryotes and the SARS-CoV-2 coronavirus, you will see how to use metadata to filter for genome sequences with desired properties such … Continue reading Sept 22 Webinar: Using NCBI Datasets command-line tools to access data and metadata for genomes