Primer-BLAST now offers help with irrelevant off-target matches

Primer-BLAST, NCBI’s primer designer and specificity checker, now offers a way to handle irrelevant off-target matches.

Sometimes Primer-BLAST can’t design specific primers for your target sequence because of similar non-target sequences in the database. In some cases, you may know that these non-target matches are not important to your research and are safe to ignore. Examples may include tissue-specific splice variants, redundant entries, and predicted sequences. To help in these cases, you can now choose to allow certain off-target matches. This gives Primer-BLAST greater freedom in primer selection and a better chance of finding highly specific primers.

Continue reading

Virus hunting in the cloud: A hackathon story at ASV 2019

Are you going to ASV 2019?

If you are, join us in a few days for a workshop on the virus hunting hackathon we helped run earlier this year.

Session: Workshop #19: Virus Discovery

Program Number: W-19-8

Time: Sunday, July 21, 7:00 PM CDT

Location: Mayo Auditorium

In this workshop, Dr. Rodney Brister will talk about how 41 scientists from 21 organizations worked to improve the usability of SRA data, identifying datasets that included known viruses and viral signals. Not only is that information now being integrated into a public search interface, but the approach used is also being refined in future hackathons so it can be applied to all SRA datasets.

We hope to see you there!

Have you tried OSIRIS, NCBI’s STR analysis tool?

More than 5 years ago, NCBI brought you OSIRIS (Open Source Independent Review and Interpretation System), a free, open-access tool for powerful and intelligent Short Tandem Repeat (STR) analysis.

Short Tandem Repeats (STRs) are repeated short stretches of DNA and are analyzed by measuring the length of the repeated region. They vary from individual to individual and are passed from parent to child. STR analysis is broadly used in medicine, research and law enforcement – for stem cell transplants, diseases like Huntington’s, verifying research cell lines and samples, determining family relationships, and in criminal cases. In this blog post, we explore how you use OSIRIS in the real world and how your feedback has helped us improve this product.

Continue reading
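To make the idea of "measuring the length of the repeated region" concrete, here is a minimal Python sketch that counts back-to-back copies of a repeat motif in a DNA string. It is an illustration only and not how OSIRIS works. The function name, the AATG motif, and both sequences are made up for the example.

```python
import re


def longest_tandem_run(sequence, motif):
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    # Match one or more consecutive copies of the motif (case-insensitive via upper()).
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


# Two hypothetical alleles that differ only in how many times AATG repeats.
allele_a = "GGC" + "AATG" * 7 + "TTC"
allele_b = "GGC" + "AATG" * 9 + "TTC"

print(longest_tandem_run(allele_a, "AATG"))  # 7 repeats
print(longest_tandem_run(allele_b, "AATG"))  # 9 repeats
```

The two alleles share the same flanking sequence and differ only in repeat count, which is the kind of length difference STR analysis relies on to tell individuals apart.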

Publication on NCBI’s web-based structure viewer iCn3D

A recent article by Wang J., et al. describes the features and applications of iCn3D, NCBI’s web-based 3D viewer (Figure 1), and shows how you can use it for interactive structural analysis.

Wang J, Youkharibache P, Zhang D, Lanczycki CJ, Geer RC, Madej T, Phan L, Ward M, Lu S, Marchler GH, Wang Y, Bryant SH, Geer LY, Marchler-Bauer A. iCn3D, a Web-based 3D Viewer for Sharing 1D/2D/3D Representations of Biomolecular Structures. Bioinformatics. 2019 June 20; pii: btz502. doi: 10.1093/bioinformatics/btz502. (PMID: 31218344)

Figure 1. NCBI’s web-based structure viewer iCn3D displaying the TP53 structure 1TUP.

We welcome your input! Please send your suggestions and feedback on the iCn3D viewer to the NCBI Help Desk.

The BLAST programs and databases are available in Docker and cloud-ready

In modern biomedical research, you often need to analyze very large datasets. This may require computing and storage capacity that exceeds what you have available locally. Working in a cloud environment where you can provision nearly limitless computing power, gain access to enormous data sets, and pay for only what you need is a great option in these cases.

To help with these tasks, NCBI is now providing a Docker version of NCBI BLAST that you can use on the cloud. This implementation will help you work with large volumes of sequence data and the set of NCBI BLAST databases. The BLAST Docker image makes using BLAST on the cloud much more convenient.

  • Installation and maintenance of the BLAST programs and databases are handled by Docker.
  • Integration with other tools in your pipelines is easier.
  • NCBI BLAST databases are pre-loaded on the Google Cloud, providing fast access.

While we have tested the Docker image on the Google Cloud, it will allow BLAST to run equally well on any Docker-enabled platform, such as another cloud platform or your local computer, and you can still use the cloud-installed BLAST databases.

See the BLAST in the Cloud and database information documentation to get started.
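As a rough illustration of what running BLAST from the Docker image can look like, the Python sketch below assembles a `docker run` command line without executing it. It assumes the image is published as `ncbi/blast` and that databases, queries, and results are mounted under `/blast/` inside the container. The host directory, file names, and the helper function itself are hypothetical, so check the BLAST in the Cloud documentation for the supported layout.

```python
import shlex
from pathlib import Path


def blast_docker_command(program, query, db, out, workdir, image="ncbi/blast"):
    """Build (but do not run) a `docker run` command for a BLAST search.

    `workdir` is an assumed host directory holding blastdb/, queries/ and
    results/ subdirectories; the /blast/... container paths are assumptions.
    """
    workdir = Path(workdir)
    return [
        "docker", "run", "--rm",
        "-v", f"{workdir / 'blastdb'}:/blast/blastdb:ro",   # databases, read-only
        "-v", f"{workdir / 'queries'}:/blast/queries:ro",   # query files, read-only
        "-v", f"{workdir / 'results'}:/blast/results:rw",   # output directory
        image,
        program,
        "-query", f"/blast/queries/{query}",
        "-db", db,
        "-out", f"/blast/results/{out}",
    ]


if __name__ == "__main__":
    cmd = blast_docker_command("blastn", "query.fsa", "nt", "hits.txt", "/home/user/blast")
    # Print the command so it can be reviewed before running it in a shell.
    print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, so it can be passed directly to `subprocess.run` once the paths have been checked, with no shell quoting to worry about.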

Genome Workbench 3.0, now with support for preparing GenBank genome submissions

Genome Workbench version 3.0 (release notes) is now available. An important new feature is the submission preparation wizard that allows you to prepare prokaryotic and eukaryotic genome sequences for submission to GenBank. This wizard is the first step toward offering a better alternative to the Sequin submission tool.

You simply load your sequences into Genome Workbench and use the submission wizard to enter information about your submission through a set of dialog boxes and then save a submission-ready data file.  The package also includes tools for editing your sequences, annotation, and metadata.

See the tutorial video on our YouTube channel or the Genome Workbench documentation for more details on how to enable the wizard and prepare a submission.

Try our new SRA data management tools!

Have you ever needed to correct or improve SRA metadata after submitting, change the release date for your data, or share your data with reviewers? You can now perform these tasks yourself using the SRA data management features that are live in the Submission Portal!

If you have an SRA submission and associated BioProject and BioSample, you can log into the Submission Portal, go to the Manage data tab, click into that BioProject and easily perform the following common tasks (Figure 1).

Continue reading

New human genome annotation release with MANE Select and other improvements!

There’s a new RefSeq annotation available for the human genome, and it’s quite an update!

About the release

Annotation release 109.20190607 is the first release on our new bimonthly annotation schedule, as announced in a previous post. The annotated sequences are the latest sequences for the GRCh38 patch 13 assembly, GRCh38.p13 (GCF_000001405.39). The chromosome backbone sequences remain the same, but we’ve added 45 patch sequences representing novel and improved sequences that the Genome Reference Consortium will incorporate into the primary assembly in the future. The new annotation places the latest curated RefSeq transcripts and functional elements on the genome but keeps the same model dataset as annotation release 109, except where models have been replaced by curated RefSeqs or as a result of other review. We are also flagging MANE and other RefSeq Select transcripts. Continue reading for more details on these improvements. You can download the updated annotation here!

Continue reading

Microbial Virulence in the Cloud hackathon, August 13–15, 2019

From August 13–15, 2019, NCBI will run a bioinformatics hackathon on the NIH campus!

We’re specifically looking for folks who have experience working with computational microbial genomics, evolutionary biology, antimicrobial resistance, and similar genomic analysis. If this describes you, please apply! This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for large-scale genomic analyses from high-throughput experiments (please note that the event itself will focus on open access public human).

Continue reading

GenBank release 232

GenBank release 232.0 (6/20/2019) is now available on the NCBI FTP site. This release has 5.47 terabases and 1.58 billion records.

The release has 213 million traditional records containing 329.8 billion base pairs of sequence data. There are also 1 billion WGS records containing 4.8 trillion base pairs of sequence data, 319.9 million bulk-oriented TSA records containing 285.3 billion base pairs of sequence data, and 25 million bulk-oriented TLS records containing 10 billion base pairs of sequence data.
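The per-division figures can be checked against the release totals with a few lines of Python. This sketch uses only the rounded numbers quoted above and takes the TSA figure as 285.3 billion base pairs, the only reading consistent with a total of 5.47 terabases. The variable names are illustrative.

```python
# Record counts (millions) and base pairs (billions) per division, as quoted above.
records_millions = {"traditional": 213, "WGS": 1000, "TSA": 319.9, "TLS": 25}
bases_billions = {"traditional": 329.8, "WGS": 4800, "TSA": 285.3, "TLS": 10}

total_records_billions = sum(records_millions.values()) / 1000
total_terabases = sum(bases_billions.values()) / 1000

print(f"records: {total_records_billions:.2f} billion")  # records: 1.56 billion
print(f"bases:   {total_terabases:.2f} terabases")       # bases:   5.43 terabases
```

The sums come out slightly below the stated 1.58 billion records and 5.47 terabases because the WGS figures are rounded down to "1 billion" and "4.8 trillion" in the text.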

Continue reading