May 20 webinar: Exploring SRA metadata in the cloud with BigQuery

Join us on May 20th to learn how to use Google’s BigQuery to quickly search the data from the Sequence Read Archive (SRA) in the cloud to speed up your bioinformatic research and discovery projects. BigQuery is a tool for exploring cloud-based data tables with SQL-like queries. In this webinar, we’ll introduce you to using BigQuery to mine SRA submitter-supplied metadata and the results of taxonomic analysis for SRA runs. You’ll see real-world case studies that demonstrate how to find key information about SRA runs and identify data sets for your own analysis pipelines.
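As a rough illustration of the kind of SQL-like query the webinar covers, here is a minimal Python sketch. It is not taken from the webinar: the table name `nih-sra-datastore.sra.metadata`, the column names, and the helper functions `build_sra_query` and `find_runs` are assumptions made for this example, and actually running the query requires the `google-cloud-bigquery` package and your own Google Cloud credentials.

```python
# Assumed public table name; check the SRA in the Cloud documentation before use.
SRA_METADATA_TABLE = "nih-sra-datastore.sra.metadata"


def build_sra_query(limit: int = 10) -> str:
    """Return a parameterized query for runs matching an organism and assay type."""
    if type(limit) is not int or limit < 1:
        raise ValueError("limit must be a positive integer")
    # Column names (acc, organism, assay_type, platform, mbases) are assumptions.
    return (
        "SELECT acc, organism, assay_type, platform, mbases\n"
        f"FROM `{SRA_METADATA_TABLE}`\n"
        "WHERE organism = @organism AND assay_type = @assay_type\n"
        "ORDER BY mbases DESC\n"
        f"LIMIT {limit}"
    )


def find_runs(organism: str, assay_type: str, limit: int = 10) -> list:
    """Run the query; needs google-cloud-bigquery and configured GCP credentials."""
    # Imported lazily so build_sra_query works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("organism", "STRING", organism),
            bigquery.ScalarQueryParameter("assay_type", "STRING", assay_type),
        ]
    )
    rows = client.query(build_sra_query(limit), job_config=job_config).result()
    return [dict(row.items()) for row in rows]
```

A call such as `find_runs("Homo sapiens", "RNA-Seq", limit=5)` would return up to five matching runs as dictionaries. Passing the search values as query parameters rather than pasting them into the SQL text keeps user-supplied strings out of the query itself.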

  • Date and time: Wed, May 20, 2020 12:00 PM – 12:45 PM EDT
  • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

April 8 Webinar: Accelerate genomics discovery with SRA in the cloud

On Wednesday, April 8, 2020 at 12 PM, NCBI staff will show you how to leverage the cloud to speed up your research and discovery. You’ll be introduced to new and existing tools and data including BigQuery, SRA Toolkit, and more. You’ll hear about real workflows in the cloud, featuring an example of the work NCBI was able to accomplish in the cloud using SRA data as well as a case study from an SRA cloud customer.

By the end of this webinar, you will know where to look for new cloud products from NCBI, where to find help information to get you started, and how to run your analyses efficiently in the cloud.

  • Date and time: Wed, Apr 8, 2020 12:00 PM – 12:45 PM EDT
  • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

The entire corpus of the Sequence Read Archive (SRA) now live on two cloud platforms!

The National Library of Medicine (NLM) is pleased to announce that all controlled-access and publicly available data in SRA is now available through Google Cloud Platform (GCP) and Amazon Web Services (AWS). To access the data please visit our SRA in the Cloud webpage where you will find links to our new SRA Toolkit and other access methods.

The SRA data available in the two clouds currently totals more than 14 petabytes and consists of all data in the SRA format as well as some data in its original submission format. Since May 2019, NCBI has been putting all submitted SRA data on the GCP and AWS clouds in both the submitted format and our converted SRA format. We have also been moving previously submitted original format data to the clouds and expect to complete that process in 2021.

Continue reading

NCBI on YouTube: Get the most out of NCBI resources with these videos

Check out the latest videos on YouTube to learn how to best use NCBI graphical viewers, SRA, PGAP, and other resources.

Genome Data Viewer: Analyzing Remote BAM Alignment Files and Other Tips

This video shows you how to upload remote BAM files, and succinctly demonstrates handy viewer settings, such as Pileup display options, and highlights the very helpful tooltips in the Genome Data Viewer (GDV). There’s also a brief blog post on the same topic.

Continue reading

Users of the SRA FTP site: Try the SRA Toolkit!

If you download data from the SRA (Sequence Read Archive) FTP site, we would encourage you to try the SRA Toolkit. This is particularly true if you use the SRA Fuse/FTP site at ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant, which the SRA team will decommission on December 1, 2019.

The SRA Toolkit offers several advantages for downloading SRA data, including greater flexibility in specifying the data you need as well as access to public SRA data in the cloud. If you’re new to the Toolkit, you may want to start with these instructions.
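For readers who script their downloads, the following is a minimal Python sketch of a Toolkit-based download. It assumes the SRA Toolkit commands `prefetch` and `fasterq-dump` are installed and on your PATH; the helper names `toolkit_commands` and `download_run` are hypothetical, and the linked instructions remain the authoritative guide.

```python
import re
import shutil
import subprocess

# Run accessions look like SRR, ERR, or DRR followed by digits.
ACCESSION_PATTERN = re.compile(r"[SED]RR\d+")


def toolkit_commands(accession: str, outdir: str = ".") -> list[list[str]]:
    """Build the two Toolkit commands: fetch the run, then convert it to FASTQ."""
    if not ACCESSION_PATTERN.fullmatch(accession):
        raise ValueError(f"not a run accession: {accession!r}")
    return [
        ["prefetch", accession],
        ["fasterq-dump", accession, "--outdir", outdir],
    ]


def download_run(accession: str, outdir: str = ".") -> None:
    """Execute the commands in order, stopping at the first failure."""
    for cmd in toolkit_commands(accession, outdir):
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH; install the SRA Toolkit")
        subprocess.run(cmd, check=True)
```

For example, `download_run("SRR000001", outdir="fastq")` would fetch that run and write FASTQ files into the `fastq` directory. Building each command as a list and validating the accession first avoids passing unchecked text to a shell.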

If you have any questions or concerns about downloading SRA data, please contact sra@ncbi.nlm.nih.gov. We’d love to hear from you!

Structural Variant Hackathon

NCBI is pleased to announce a Structural Variant Hackathon at the Baylor College of Medicine, Houston, Texas, immediately before ASHG on October 11-13, 2019.

We’re specifically looking for folks who have experience in working with structural variants, complex disease, precision medicine, and similar genomic analysis.  If this describes you, please apply! This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for large scale genomic analyses from high-throughput experiments (please note that the event itself will focus on open access public human data).

Potential topics include:

  • Mapping structural variants to public databases
  • Calculating the heritability of different types of structural variants
  • CNV effect on isoform expression
  • Assembly accuracy for metagenomics
  • Quality assessment in large cohorts

The hackathon runs from 9 am to 6 pm each day, with the potential to extend into the evening hours, and there will be optional social events at the end of each day. Participants with various backgrounds and expertise will be organized into five to eight teams of five to six people, each with an experienced leader. These teams will build pipelines and tools to analyze large datasets within a cloud infrastructure. Each day, we will come together to discuss progress on each of the topics, bioinformatics best practices, coding styles, etc.

There will be no registration fee associated with attending this event.

Note: Participants will need to bring their own laptop to this program. No financial support for travel, lodging, or meals is available for this event.

Continue reading

Try our new SRA data management tools!

Have you ever needed to correct or improve SRA metadata after submitting, change the release date for your data, or share your data with reviewers? You can now perform these tasks yourself using the SRA data management features that are LIVE in Submission Portal!

If you have an SRA submission and associated BioProject and BioSample, you can log into the Submission Portal, go to the Manage data tab, click into that BioProject and easily perform the following common tasks (Figure 1).

Continue reading