Tag: Gene

New NCBI Datasets home and documentation pages provide easier access

NCBI Datasets, the new set of services for downloading genome assembly and annotation data (see previous Datasets posts), has redesigned and reorganized its web pages so that the services and documentation you need are easier to find and access.

NCBI Datasets has a fresh new homepage (Figure 1) highlighting the types of data available through our tools. Available data include genome assemblies, genes, and SARS-CoV-2 genomic and protein data. You can easily access these from the new page or learn more with our new documentation pages.

Figure 1. Features of the new Datasets homepage with quick access to help documentation including the Quickstart and How-to guides as well as access to Genome, Gene, and Coronavirus Data, and the Datasets and Dataformat command-line tools.

The Datasets command-line tool now provides ortholog data

You can now get gene ortholog data with the NCBI Datasets command-line tool by supplying a gene ID, gene symbol, or RefSeq nucleotide or protein accession. Data are available for vertebrates and insects. The vertebrate orthologs include a specialized set for fish. (See our recent post for more information on the orthologs for fish and insects.)

You can retrieve metadata for gene orthologs in JSON format, or you can download a compressed (zip) archive containing both metadata and sequences (Figure 1).

Figure 1. Command lines that use a gene symbol (BRCA1) to retrieve mammalian ortholog metadata (top, JSON metadata shown in part in the image) and sequences (bottom).
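
Once a zip archive like the one described above has been downloaded, a quick way to see what it contains is to list its members. The following is a minimal Python sketch using only the standard library. It makes no assumption about how the real archive is laid out; the file names in the in-memory demo archive are hypothetical stand-ins, not the actual contents of a Datasets download.

```python
import io
import zipfile


def list_archive(source):
    """Return (member name, uncompressed size in bytes) for each archive entry.

    `source` may be a path to a zip file or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


# Build a small in-memory archive to stand in for a real download.
# The member names below are hypothetical placeholders.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("example/metadata.json", "{}")
    archive.writestr("example/sequences.fna", ">seq1\nACGT\n")
demo.seek(0)

for name, size in list_archive(demo):
    print(name, size)
```

To inspect a real download, pass the path of the zip file you saved to `list_archive` in place of the in-memory demo object.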

Announcing the RefSeq annotation of rat mRatBN7.2!

NCBI RefSeq has finished its initial annotation of the new rat reference assembly, mRatBN7.2, recently released by the Darwin Tree of Life Project at the Wellcome Sanger Institute. This is the first coordinate-changing update to the rat reference since the 2014 release of Rnor_6.0 from the Rat Genome Sequencing Consortium and brings the rat assembly into the modern age with a nearly 300x increase in contig N50 and 9x increase in scaffold N50 lengths. It’s a major improvement!
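
For readers less familiar with the metric: N50 is the length L such that sequences of length L or longer account for at least half of the total assembly length, so a larger contig or scaffold N50 means a more contiguous assembly. A minimal Python sketch of the calculation follows. The lengths in the example are made-up toy values, not figures from mRatBN7.2 or Rnor_6.0.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from longest to shortest until half of the total length is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths for illustration only (total 290, so half is 145).
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))  # 70, because 80 + 70 = 150 >= 145
```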

Programmatic access to Gene data using Datasets command-line and API

In March, we announced NCBI Datasets, a new resource that lets you easily retrieve and download data from across NCBI databases. Did you know you can now fetch NCBI Gene data programmatically using the NCBI Datasets API or command-line tool? Quickly retrieve both metadata and gene sequence data for multiple Gene records, including transcripts and proteins, in one shell command or API request. The API documentation is a good way to get started with programmatic access (Figure 1).

Figure 1. The Datasets API documentation showing a demonstration retrieving Gene metadata using RefSeq mRNA accessions. The API returns a readily processed JSON object.
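
As a rough illustration of what such a request might look like from a script, here is a minimal Python sketch that requests Gene metadata for a list of RefSeq mRNA accessions. The base URL and the `/gene/accession/` path are assumptions based on the v2 REST API and may differ from the API version available when this post was written, so check the API documentation before relying on them. The accessions are examples chosen for illustration.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: base URL of the Datasets REST API (v2); confirm in the API docs.
BASE_URL = "https://api.ncbi.nlm.nih.gov/datasets/v2"


def gene_metadata_url(accessions):
    """Build a request URL for Gene metadata from RefSeq accessions."""
    if not accessions:
        raise ValueError("need at least one accession")
    joined = ",".join(quote(acc, safe="") for acc in accessions)
    # Assumption: the endpoint path is /gene/accession/{accessions}.
    return f"{BASE_URL}/gene/accession/{joined}"


def fetch_gene_metadata(accessions):
    """Send the request and return the parsed JSON response as a dict."""
    with urlopen(gene_metadata_url(accessions)) as response:
        return json.load(response)


# Example accessions for illustration; this only builds the URL, no network call.
print(gene_metadata_url(["NM_007294.4", "NM_000546.6"]))
```

Calling `fetch_gene_metadata` with the same list would perform the actual request. The structure of the returned JSON is not assumed here, so inspect the response before writing code that depends on particular fields.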

NCBI Datasets now provides downloads of gene data for more than 30 thousand organisms

NCBI Datasets now offers Gene tables: customizable tables of the genes you specify, with key gene information, and the ability to easily download a dataset of genomic, transcript and protein sequences.

Drag and drop a list of Gene IDs or gene symbols, and the data table shows your genes with up to 15 columns of metadata, including genomic coordinates, RefSeq transcript and protein accessions, Ensembl IDs and UniProt accessions, and other gene information. You can browse and select items in your table on the web, or download everything to your computer for later analysis (Figure 1).

Figure 1. The Data tables web download. Top panel. Enter or upload a list of gene identifiers or symbols. Bottom panel. The resulting table display allows you to browse results, download the table or the sequence data for the genes (genomic, transcripts, proteins).

The latest in COVID-19 related human gene annotation now in NCBI RefSeq and Gene

Interested in human genes involved in COVID-19 biology? NCBI’s RefSeq group has been hard at work compiling a set of human genes with roles in coronavirus infection and disease. You can now see and search for these genes and their regulatory elements in NCBI Gene and RefSeq.

Figure 1. Top section of the human ACE2 record in the Gene database. COVID-19 information can be found in the Summary and Annotation information sections.

New interaction data, downloads and track hub available for RefSeq Functional Elements

We’ve added several new enhancements to the RefSeq Functional Elements dataset, which provides genome annotation and richly annotated RefSeq and Gene records for experimentally validated non-genic functional regions in human and mouse. Read on to see what we’ve done!

CCDS Release 23 for Mouse Now in Entrez Gene

Are you interested in high quality genomic annotations for human and mouse? Check out the Consensus Coding Sequence (CCDS) project! Release 23 of the CCDS project is now available in Entrez Gene. This release compares NCBI’s Mus musculus annotation release 108 to Ensembl’s annotation release 98. This update adds 1,570 new CCDS records and 175 genes to the mouse CCDS dataset. In total, release 23 includes 27,219 CCDS records that correspond to 20,486 genes.

NCBI on YouTube: new videos on PubMed, My Bibliography, sequence data and more

Here are the latest videos on our YouTube channel. Subscribe to get alerts for new videos.

Introducing the Genome Submission Wizard in Genome Workbench v3.0

Genome Workbench version 3 is a major upgrade, including the addition of the Genome Submission Wizard. This video guides you through the wizard, from uploading your genome data file to completion of the submitter report, which is ready to submit to GenBank using tools such as Submission Portal or BankIt. Note: An online tutorial is available under “Manuals” on the Genome Workbench home page.

September 11 Webinar: A beginner’s guide to genes and sequences at NCBI

On Wednesday, September 11, 2019 at 12 PM, NCBI staff will present a webinar for people with limited experience working with gene and sequence information. You will learn about the kinds of data available for genes and sequences, how to select the most informative records, and how to find related genes and sequences using pre-computed information and the BLAST sequence search service.

  • Date and time: Wed, Sep 11, 2019 12:00 PM – 12:30 PM EDT
  • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.