Category: What’s New

NCBI Virus Extends Dashboard Visualizations to all Virus Sequences!

Do you want to be able to quickly filter your virus search results based on important attributes? Good news: now you can! We are pleased to announce that Dashboard Visualizations now extend to any virus in the NCBI Virus collection (Figure 1). Dashboard Visualizations present data graphically based on a few highly sought-after attributes so that you can quickly prefilter your dataset.

What are Dashboard Visualizations?

Dashboard Visualizations allow you to filter your search by geographic location, collection time, and release time. Each feature on the Dashboard is interactive, so when a filter is applied, it limits the data shown in the other features. When using these filters, the top summary section updates to provide you a snapshot of the number of records in NCBI RefSeq, Nucleotide, and Protein that fit the combined conditions of your search in the NCBI Virus database.
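
To make the "combined conditions" idea concrete, here is a minimal sketch of how stacked filters and a per-database summary could behave. It is only an illustration, not how NCBI Virus is implemented: the `Record` fields, the accessions, and the helper names `apply_filters` and `summary` are all hypothetical, and the sample data is made up.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    accession: str   # made-up identifier, not a real accession
    database: str    # "RefSeq", "Nucleotide", or "Protein"
    country: str
    collected: date
    released: date


# Hypothetical sample data for illustration only.
RECORDS = [
    Record("X0001", "Nucleotide", "Kenya", date(2021, 3, 1), date(2021, 6, 1)),
    Record("X0002", "Protein", "Kenya", date(2022, 7, 15), date(2022, 9, 1)),
    Record("X0003", "RefSeq", "Brazil", date(2020, 1, 10), date(2020, 2, 1)),
    Record("X0004", "Nucleotide", "Brazil", date(2022, 5, 5), date(2023, 1, 20)),
]


def apply_filters(records, country=None, collected_from=None, released_from=None):
    """Keep only records that satisfy every filter that is set (logical AND)."""
    kept = []
    for r in records:
        if country is not None and r.country != country:
            continue
        if collected_from is not None and r.collected < collected_from:
            continue
        if released_from is not None and r.released < released_from:
            continue
        kept.append(r)
    return kept


def summary(records):
    """Count the remaining records per database, like a summary panel would."""
    counts = Counter(r.database for r in records)
    return {db: counts.get(db, 0) for db in ("RefSeq", "Nucleotide", "Protein")}


# Two filters applied together: location AND collection time.
filtered = apply_filters(RECORDS, country="Kenya", collected_from=date(2022, 1, 1))
print(summary(filtered))
```

Each extra filter narrows the same record set, so the summary counts always reflect every condition currently applied.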

New Improvements! Try out our Foreign Contamination Screen (FCS) Tool

Want to submit high-quality data quickly and easily to GenBank? Check out our Foreign Contamination Screen (FCS) tool, a quality assurance process that you can run yourself. FCS offers enhanced contaminant detection sensitivity to improve your genome assemblies and facilitate high-quality data submissions to GenBank. We recently made several improvements to make the tool even easier to use! 

What’s New?
  • Now quicker and easier to run!  
  • Decontaminate your genome with just one extra step. 
    • Save the removed sequences in a separate file, if desired.  
  • More accurate!  
  • Find more contaminants with improved coverage of prokaryotes, protists, and more. 
  • Screen your genome on the cloud in minutes. 

NCBI’s Genome Decoration Page (GDP) to Retire in September 2023

As of September 2023, NCBI’s Genome Decoration Page (GDP) will no longer be available. Due to low usage of GDP, we are focusing our development efforts on our more popular resources and tools.  

If you are using GDP to view your data mapped to genomes, we encourage you to check out our Genome Data Viewer (GDV) if you haven’t already. You can upload your data for display in GDV and export PDF or SVG images of your view. 

Stay up to date 

Follow us on Twitter @NCBI and join our mailing list to keep up to date with our visualization tools and other NCBI news.   

Questions? 

Feel free to contact our help desk at info@ncbi.nlm.nih.gov if you have any questions or concerns. 

New Way to View and Download Related Genes

Effective June 2023, the HomoloGene records will redirect to the Datasets Gene Table

Do you use HomoloGene to view and download data? You can now access updated homology data from NCBI Datasets through the Datasets Gene Table with connections to NCBI Orthologs. Go directly from a HomoloGene record to the Datasets Gene Table, which gives you access to up-to-date sequence data and metadata. NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases.

The Datasets Gene Table links to the NCBI Orthologs interface (Figure 1), which offers the following data:

  • Orthology data based on an updated algorithm that identifies orthologs spanning more than 500 vertebrate species
  • Similar gene data based on protein architectures that spans all eukaryotes 

Read About NCBI Resources in 2023 Nucleic Acids Research Database Issue

The 2023 Nucleic Acids Research Database Issue features papers from NCBI staff on GenBank, Conserved Domain Database, and more. The citations are available in PubMed with full text available in PubMed Central (PMC). To read an article, click on the PMCID number listed below.

RefSeq Release 217

RefSeq release 217 is now available online and from the FTP site. You can access RefSeq data through NCBI Datasets.

What’s included in this release?

As of March 8, 2023, this full release incorporates genomic, transcript, and protein data, containing:

  • 348,351,219 records
  • 254,500,694 proteins
  • 50,975,429 RNAs
  • sequences from 130,837 organisms

The release is provided in several directories as a complete dataset and divided by logical groupings.
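
If you want a feel for how the release counts listed above relate to each other, a quick back-of-the-envelope calculation helps. The sketch below assumes that the protein and RNA counts are subsets of the total record count; the post does not say so explicitly, and it does not state what the remaining records are, so treat the remainder as an unlabeled difference rather than an official figure.

```python
# Counts quoted for RefSeq release 217.
total_records = 348_351_219
proteins = 254_500_694
rnas = 50_975_429

# Assumption: proteins and RNAs are both counted within total_records.
remaining = total_records - proteins - rnas
protein_share = proteins / total_records
rna_share = rnas / total_records

print(f"records that are neither proteins nor RNAs: {remaining:,}")
print(f"protein share of all records: {protein_share:.1%}")
print(f"RNA share of all records: {rna_share:.1%}")
```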

Streamlining Access to SRA COVID-19 Datasets on the Cloud

To make it easier for you to find and access Sequence Read Archive (SRA) data, we are re-organizing and improving our cloud storage systems.  

Beginning April 2023, we will move the SARS-CoV-2 normalized data and source files from the COVID-19 data buckets on Amazon Web Services (AWS) and Google Cloud Platform (GCP) to the NIH NCBI SRA on AWS registry. We will also remove the SARS-CoV-2 original format data from AWS and GCP COVID-19 buckets and make them available in AWS cold storage. If you need these data, you can request them using the Cloud Data Delivery Service (CDDS). 

Where and how will I be able to access SARS-CoV-2 normalized data after this change occurs?

To ensure a smooth transition, we want you to have enough time to adjust your scripts and pipelines to minimize disruption to your analyses.

3+ Ways NCBI is Enhancing the SRA Database

Do you submit or access Sequence Read Archive (SRA) data? In an ongoing effort to enhance your experience, NCBI is making several improvements to our widely used SRA database. SRA is the largest publicly available repository of high-throughput sequencing data. The archive accepts data from all organisms as well as metagenomic and environmental surveys. SRA stores raw sequencing data and alignment information to enable reproducibility and facilitate new discoveries through data analysis.

What improvements is NCBI making?

  • More transparent: We recently launched the GenBank and SRA Data processing page to help you better understand how sequence data are submitted, processed, and made publicly available. 
  • More efficient: Faster data transfers, downloads, and analyses! We will be incrementally streamlining how you access SRA data as SRA Lite becomes the standard SRA file format. This simplified format reduces the average file size for more efficient analysis and storage of large datasets. 
  • More reliable: A trusted source! SRA is a trustworthy database, and we are continuously improving our processes to ensure system reliability.   
  • And more!  

New & Improved NCBI Datasets Genome and Assembly Pages

Legacy pages will be redirected effective June 2023

In June 2023, NCBI’s Assembly and Genome record pages will be redirected to new Datasets pages as part of our ongoing effort to modernize and improve your user experience. NCBI Datasets is a new resource that makes it easier to find and download genome data.

We will update the following pages:
  • The NCBI Assembly pages will be redirected to the new Datasets Genome pages that describe assembled genomes and provide links to related NCBI tools such as Genome Data Viewer and BLAST.
  • The NCBI Genome pages will be redirected to the Datasets Taxonomy pages that provide a taxonomy-focused portal to genes, genomes and additional NCBI resources.
  • During this transition, you will have the option to return to the legacy Genome and Assembly pages. 

GenBank Release 254.0 is Available!

GenBank release 254.0 (2/19/2023) is now available on the NCBI FTP site. This release has 22.52 trillion bases and 3.37 billion records. The current release has 241,830,635 traditional records containing 1,731,302,248,418 base pairs of sequence data. There are also 2,337,838,461 WGS records containing 20,116,642,176,263 base pairs of sequence data, 672,261,981 bulk-oriented TSA records containing 630,615,054,587 base pairs of sequence data, and 121,067,644 bulk-oriented TLS records containing 46,465,508,548 base pairs of sequence data.
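
The headline totals can be checked against the per-division figures with a few lines of arithmetic. This is just a sanity check on the numbers quoted in the post; the dictionary layout and variable names are our own choices for the sketch.

```python
# (records, base pairs) per division, as quoted for GenBank release 254.0.
divisions = {
    "traditional": (241_830_635, 1_731_302_248_418),
    "WGS": (2_337_838_461, 20_116_642_176_263),
    "TSA": (672_261_981, 630_615_054_587),
    "TLS": (121_067_644, 46_465_508_548),
}

total_records = sum(records for records, _ in divisions.values())
total_bases = sum(bases for _, bases in divisions.values())

print(f"total records: {total_records:,}")
print(f"total bases:   {total_bases:,}")
```

The four divisions add up to 3,372,998,721 records and 22,525,024,987,816 bases, which is consistent with the 3.37 billion records and 22.52 trillion bases quoted for the release (the base count appears to be truncated rather than rounded).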