Users of the SRA FTP site: Try the SRA Toolkit!


If you download data from the SRA (Sequence Read Archive) FTP site, we encourage you to try the SRA Toolkit. This is particularly true if you use the SRA Fuse/FTP site at ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant, which the SRA team will decommission on December 1, 2019.

The SRA Toolkit offers several advantages for downloading SRA data, including greater flexibility in specifying the data you need as well as access to public SRA data in the cloud. If you’re new to the Toolkit, you may want to start with these instructions.
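As a rough sketch of what "specifying the data you need" can look like, the Python snippet below builds command lines for two Toolkit programs, prefetch and fastq-dump, without running them. The accession SRR000001 is only a placeholder and the helper names are hypothetical. The -X option (maximum spot ID) is used here on the assumption that it behaves the same way in your installed Toolkit version, so check the Toolkit documentation before relying on it.

```python
def prefetch_command(accession):
    """Build a command that downloads one SRA run by accession."""
    return ["prefetch", accession]


def fastq_dump_command(accession, max_spots=None):
    """Build a fastq-dump command, optionally limited to the first spots."""
    cmd = ["fastq-dump"]
    if max_spots is not None:
        # Assumption: -X sets the maximum spot ID, so only the
        # first `max_spots` spots of the run are converted.
        cmd += ["-X", str(max_spots)]
    cmd.append(accession)
    return cmd


# Print the commands rather than executing them; SRR000001 is a placeholder.
for cmd in (prefetch_command("SRR000001"),
            fastq_dump_command("SRR000001", max_spots=5)):
    print(" ".join(cmd))
```

Building the argument list separately from running it makes it easy to inspect the command first; the list could then be passed to subprocess.run on a machine where the Toolkit is installed.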

If you have any questions or concerns about downloading SRA data, please contact sra@ncbi.nlm.nih.gov. We’d love to hear from you!

Structural Variant Hackathon


NCBI is pleased to announce a Structural Variant Hackathon at the Baylor College of Medicine in Houston, Texas, on October 11-13, 2019, immediately before ASHG.

We’re specifically looking for folks who have experience working with structural variants, complex disease, precision medicine, and similar genomic analysis. If this describes you, please apply! This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for large-scale genomic analyses from high-throughput experiments. Please note that the event itself will focus on open-access public human data.

Potential topics include:

  • Mapping structural variants to public databases
  • Calculating the heritability of different types of structural variants
  • CNV effect on isoform expression
  • Assembly accuracy for metagenomics
  • Quality assessment in large cohorts

The hackathon runs from 9 am to 6 pm each day, with the potential to extend into the evening hours. There will also be optional social events at the end of each day. Participants will be organized into five to eight teams of five to six individuals with varied backgrounds and expertise, each with an experienced leader. These teams will build pipelines and tools to analyze large datasets within a cloud infrastructure. Each day, we will come together to discuss progress on each of the topics, bioinformatics best practices, coding styles, etc.

There will be no registration fee associated with attending this event.

Note: Participants will need to bring their own laptop to this program. No financial support for travel, lodging, or meals is available for this event.


Try our new SRA data management tools!


Have you ever needed to correct or improve SRA metadata after submitting, change the release date for your data, or share your data with reviewers? You can now perform these tasks yourself using the SRA data management features that are live in the Submission Portal!

If you have an SRA submission and associated BioProject and BioSample, you can log into the Submission Portal, go to the Manage data tab, click into that BioProject and easily perform the following common tasks (Figure 1).


NCBI to assist in Southern California genomics hackathon in January


From January 10-12, 2018, the NCBI will help with a bioinformatics hackathon in Southern California hosted by San Diego State University. The hackathon will focus on advanced bioinformatics analysis of next-generation sequencing data, proteomics, and metadata. This event is for researchers, including students and postdocs, who have already engaged in the use of bioinformatics data or in the development of pipelines for bioinformatics analyses from high-throughput experiments. Some projects are also open to developers from outside the sciences, mathematicians, and librarians.

The event is open to anyone selected for the hackathon and willing to travel to SDSU (see below). Applications are due Monday, December 11, 2017, by 3 pm PT (6 pm ET).


IgBLAST 1.8.0 release


A new version of IgBLAST is now available on FTP, along with a new manual on GitHub. This release has the following improvements:

  1. The igblastn executable can now multi-thread much more efficiently for large sets of queries. The default number of threads is now four, but can be changed with the -num_threads option.
  2. The igblastn executable can now take an SRA accession as the query input. The search runs on the local machine, but the queries are retrieved from the SRA repository at the NCBI. To enable this, use the -sra option instead of the -query option.
  3. The default nucleotide mismatch penalties for finding D and J genes are now lower (changed from -4 to -2 and from -3 to -2, respectively). This improves accuracy in finding the best D and J gene hits for moderately mutated sequences.
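The options mentioned in the first two items can be sketched as a small Python helper that assembles an igblastn argument list. This is an illustration only: the function name, the file name, and the accession are hypothetical, and a real run also needs the germline database and other options described in the IgBLAST manual, which are deliberately left out here.

```python
def igblastn_command(query=None, sra=None, num_threads=4):
    """Assemble an igblastn argument list from the options named above.

    Exactly one input source is allowed: a query file (-query) or an
    SRA accession (-sra). The default of four threads mirrors the new
    default described in the release notes.
    """
    if (query is None) == (sra is None):
        raise ValueError("give exactly one of query (a file) or sra (an accession)")
    cmd = ["igblastn"]
    if sra is not None:
        cmd += ["-sra", sra]      # reads are fetched from the SRA repository
    else:
        cmd += ["-query", query]  # reads come from a local file
    cmd += ["-num_threads", str(num_threads)]
    return cmd


# Placeholder inputs; the germline database options are omitted in this sketch.
print(" ".join(igblastn_command(sra="SRR000001")))
print(" ".join(igblastn_command(query="my_reads.fasta", num_threads=8)))
```

Rejecting calls that give both inputs, or neither, reflects the note above that -sra is used in place of -query.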

Our web IgBLAST page also uses the new default nucleotide mismatch penalty values (i.e., -2 for finding both D and J genes).

IgBLAST facilitates the analysis of immunoglobulin and T cell receptor variable domain sequences.

November 1 webinar: Introducing the Genome Data Viewer (GDV)


On Wednesday, November 1, 2017, we will present a webinar on GDV, NCBI’s full-featured genome browser. In this webinar, you’ll learn how to explore and analyze sequences and annotations for eukaryotic RefSeq genome assemblies. We’ll show you how to:

  • Search across the entire assembly for genes, products, and other markers, or jump to a specific position or range.
  • Display any of seven preselected track sets highlighting various aspects of the assembly, or create and load your own custom track sets from your NCBI account.
  • Load and display submitted alignment data from NCBI’s GEO or SRA.
  • Upload your own annotation and variant data.
  • Display BLAST or Primer-BLAST results on the assembly in the browser.

Date and time: Wednesday, November 1, 2017, 12:00-12:30 PM EDT

After registering, you will receive a confirmation email with information about attending the webinar. After the live presentation, the webinar will be uploaded to the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

Magic-BLAST 1.3.0 released with new features and improvements


The newest version of Magic-BLAST (v. 1.3.0) offers improved sensitivity and faster run times as well as a number of other new features and improvements. These include the ability to set the alignment cut-off score as a function of read length, a maximum edit distance option, and optional local caching for SRA files. For more information on these and other improvements, see the release notes. You can download the new executables from the NCBI FTP site.

Magic-BLAST is a tool for mapping large next-generation RNA or DNA sequencing runs against a whole genome or transcriptome. Read more here.

August 23 NCBI Minute: Using the Run Selector to Find Relevant Next-Generation Sequencing (NGS) Datasets


Do you have trouble searching the NCBI webpage for relevant datasets? Wish you could filter the search results more precisely? You can with SRA Run Selector.

In this NCBI Minute, you’ll learn how to filter the SRA database using the metadata details captured for each submitted dataset. This is easily done in a spreadsheet format that displays all recorded metadata for each SRA Run. The user-friendly interface allows you to selectively filter datasets down to the most relevant data for your research question and then export it in a spreadsheet.
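The same kind of metadata filtering can be sketched offline on an exported table. The Python example below is a minimal illustration, not a description of how Run Selector works: the column names (Run, Organism, LibraryStrategy), the run identifiers, and the comma-separated layout are all assumptions made for the example, so adjust them to match the file you actually export.

```python
import csv
import io


def filter_runs(table_text, **criteria):
    """Return the rows whose columns exactly match every given criterion."""
    reader = csv.DictReader(io.StringIO(table_text))
    return [row for row in reader
            if all(row.get(col) == value for col, value in criteria.items())]


# Made-up metadata table; real exports will have different columns and values.
SAMPLE = """Run,Organism,LibraryStrategy
RUN_A,Homo sapiens,RNA-Seq
RUN_B,Mus musculus,RNA-Seq
RUN_C,Homo sapiens,WGS
"""

selected = filter_runs(SAMPLE, Organism="Homo sapiens", LibraryStrategy="RNA-Seq")
print([row["Run"] for row in selected])  # prints ['RUN_A']
```

Each keyword argument narrows the result further, which mirrors applying one filter after another until only the relevant runs remain.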

Date and time: Wednesday, August 23, 2017 12:00 PM – 12:30 PM EDT

After registering, you will receive a confirmation email with information about attending the webinar. After the live presentation, the webinar will be uploaded to the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.

dbGaP 10th Anniversary Symposium June 9, 2017


dbGaP (the NIH database of Genotypes and Phenotypes) is celebrating its 10th Anniversary this year! We are proud to support over 850 studies and 1.6 million samples.

We invite you to join us at the dbGaP 10th Anniversary Symposium, to be held on June 9, 2017, from 1:30 to 3:00 PM in Wilson Hall, Building 1, on the NIH Bethesda campus. For information on campus access and security, the NIH Visitor Center, parking, and directions to NIH, see the NIH Visitor Information page.
