Primer-BLAST, NCBI’s primer designer and specificity checker, now lets you tell it to ignore irrelevant off-target matches.
Sometimes Primer-BLAST can’t design specific primers for your target sequence because of similar non-target sequences in the database. In some cases, you may know that these non-target matches are not important to your research and are safe to ignore. Examples may include tissue-specific splice variants, redundant entries, and predicted sequences. To help in these cases, you can now choose to allow certain off-target matches. This gives Primer-BLAST greater freedom in primer selection and a better chance of finding highly specific primers.
Are you going to ASV 2019?
If you are, join us in a few days for a workshop on the virus hunting hackathon we helped run earlier this year.
Session: Workshop #19: Virus Discovery
Program Number: W-19-8
Time: Sunday, July 21, 7:00 PM CDT
Location: Mayo Auditorium
In this workshop, Dr. Rodney Brister will talk about how 41 scientists from 21 organizations worked to improve the usability of SRA data, identifying datasets that included known viruses and viral signals. Not only is that information now being integrated into a public search interface, but the approach used is also being refined in future hackathons so it can be applied to all SRA datasets.
We hope to see you there!
More than 5 years ago, NCBI brought you OSIRIS (Open Source Independent Review and Interpretation System), a free, open-access tool for powerful and intelligent Short Tandem Repeat (STR) analysis.
Short Tandem Repeats (STRs) are repeated short stretches of DNA and are analyzed by measuring the length of the repeated region. They vary from individual to individual and are passed from parent to child. STR analysis is broadly used in medicine, research and law enforcement – for stem cell transplants, diseases like Huntington’s, verifying research cell lines and samples, determining family relationships, and in criminal cases. In this blog post, we explore how you use OSIRIS in the real world and how your feedback has helped us improve this product.
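To make the idea of "measuring the length of the repeated region" concrete, here is a toy Python sketch that counts back-to-back copies of a motif in a DNA string. It is only an illustration: the function name, the GATA motif, and the two example alleles are made up for this post, and it says nothing about how OSIRIS itself processes data.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    # One or more consecutive copies of the motif, e.g. GATAGATAGATA.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(match.group(0)) // len(motif) for match in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical alleles: identical flanking sequence, different repeat counts.
allele_a = "TTGC" + "GATA" * 7 + "CCAT"
allele_b = "TTGC" + "GATA" * 10 + "CCAT"

for name, allele in (("allele_a", allele_a), ("allele_b", allele_b)):
    repeats = longest_repeat_run(allele, "GATA")
    print(f"{name}: {repeats} repeats, {repeats * len('GATA')} bp repeated region")
```

The two alleles differ only in how many times the motif repeats, which is the kind of length difference STR analysis relies on to tell individuals apart.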
In modern biomedical research, you often need to analyze very large datasets. This may require computing and storage capacity that exceeds what you have available locally. Working in a cloud environment where you can provision nearly limitless computing power, gain access to enormous data sets, and pay for only what you need is a great option in these cases.
To help with these tasks, NCBI is now providing a Docker version of NCBI BLAST that you can use on the cloud. This implementation will help you work with large volumes of sequence data and the set of NCBI BLAST databases. The BLAST Docker image makes using BLAST on the cloud much more convenient.
- Installation and maintenance of the BLAST programs and databases are all handled by Docker.
- Integration with other tools in your pipelines is easier.
- NCBI BLAST databases are pre-loaded on the Google Cloud, providing fast access.
While we have tested the Docker image on the Google Cloud, it will let BLAST run equally well on any Docker-enabled platform, such as another cloud platform or your local computer, and you can still use the cloud-installed BLAST databases.
See the BLAST in the Cloud and database information documentation to get started.
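As a rough idea of what a containerized BLAST run can look like, the Python sketch below assembles a `docker run` command line and prints it without executing anything. The details are assumptions rather than official instructions: the image name `ncbi/blast`, the host folder layout, the in-container mount points under `/blast`, and the database and file names are all illustrative, so check the documentation for the exact setup.

```python
import shlex
from pathlib import PurePosixPath


def build_blast_command(
    program: str,
    db_name: str,
    query_file: str,
    out_file: str,
    host_dir: PurePosixPath,
    image: str = "ncbi/blast",  # assumed image name
) -> list[str]:
    """Assemble (but do not run) a docker command for one BLAST search."""
    # Assumed layout: host_dir holds blastdb/, queries/ and results/ folders.
    mounts = [
        (host_dir / "blastdb", "/blast/blastdb", "ro"),
        (host_dir / "queries", "/blast/queries", "ro"),
        (host_dir / "results", "/blast/results", "rw"),
    ]
    command = ["docker", "run", "--rm"]
    for host_path, container_path, mode in mounts:
        command += ["-v", f"{host_path}:{container_path}:{mode}"]
    command += [
        image,
        program,
        "-query", f"/blast/queries/{query_file}",
        "-db", f"/blast/blastdb/{db_name}",
        "-out", f"/blast/results/{out_file}",
    ]
    return command


# Hypothetical inputs, for illustration only.
command = build_blast_command(
    program="blastn",
    db_name="my_nucleotide_db",
    query_file="query.fa",
    out_file="hits.txt",
    host_dir=PurePosixPath("/home/user/blast"),
)
print(shlex.join(command))
```

Building the command as a list keeps each argument separate, so it can be passed to `subprocess.run` or dropped into a pipeline step without shell-quoting surprises.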
Genome Workbench version 3.0 (release notes) is now available. An important new feature is the submission preparation wizard that allows you to prepare prokaryotic and eukaryotic genome sequences for submission to GenBank. This wizard is the first step toward offering a better alternative to the Sequin submission tool.
You simply load your sequences into Genome Workbench and use the submission wizard to enter information about your submission through a set of dialog boxes and then save a submission-ready data file. The package also includes tools for editing your sequences, annotation, and metadata.
See the tutorial video on our YouTube channel or the Genome Workbench documentation for more details on how to enable the wizard and prepare a submission.
Have you ever needed to correct or improve SRA metadata after submitting, change the release date for your data, or share your data with reviewers? You can now perform these tasks yourself using the SRA data management features that are live in the Submission Portal!
If you have an SRA submission and associated BioProject and BioSample, you can log into the Submission Portal, go to the Manage data tab, click into that BioProject and easily perform the following common tasks (Figure 1).
There’s a new RefSeq annotation available for the human genome, and it’s quite an update!
About the release
Annotation release 109.20190607 is the first release of our new bimonthly annotation schedule as announced in a previous post. The annotated sequences are the latest sequences for the GRCh38, patch 13 assembly, GRCh38.p13 (GCF_000001405.39). The chromosome backbone sequences remain the same, but we’ve added 45 patch sequences representing novel and improved sequences that the Genome Reference Consortium will incorporate into the primary assembly in the future. The new annotation places the latest curated RefSeq transcripts and functional elements on the genome but keeps the same model dataset as in annotation release 109 except when the models have been replaced by curated RefSeqs or other review. We are also flagging MANE and other RefSeq Select transcripts. Continue reading for more details on these improvements below. You can download the updated annotation here!
From August 13 to 15, 2019, NCBI will run a bioinformatics hackathon on the NIH campus!
We’re specifically looking for folks who have experience working with computational microbial genomics, evolutionary biology, antimicrobial resistance, and similar genomic analysis. If this describes you, please apply! This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for large-scale genomic analyses from high-throughput experiments (please note that the event itself will focus on open access public human).
GenBank release 232.0 (6/20/2019) is now available on the NCBI FTP site. This release has 5.47 terabases and 1.58 billion records.
The release has 213 million traditional records containing 329.8 billion base pairs of sequence data. There are also 1 billion WGS records containing 4.8 trillion base pairs of sequence data, 319.9 million bulk-oriented TSA records containing 285.3 billion base pairs of sequence data, and 25 million bulk-oriented TLS records containing 10 billion base pairs of sequence data.
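As a quick sanity check, the per-division figures can be added up and compared with the headline totals. The short Python sketch below does this using only the rounded numbers quoted in this post, and it takes the TSA figure as 285.3 billion base pairs, which is the reading consistent with a 5.47-terabase total. The variable names are just for illustration.

```python
# Rounded per-division figures quoted above: (records, base pairs).
divisions = {
    "traditional": (213_000_000, 329_800_000_000),
    "WGS": (1_000_000_000, 4_800_000_000_000),
    "TSA": (319_900_000, 285_300_000_000),  # assumed to be billions of base pairs
    "TLS": (25_000_000, 10_000_000_000),
}

total_records = sum(records for records, _ in divisions.values())
total_bases = sum(bases for _, bases in divisions.values())

print(f"records: {total_records / 1e9:.2f} billion")
print(f"bases:   {total_bases / 1e12:.2f} terabases")
```

The sums come out a little under the headline 1.58 billion records and 5.47 terabases, presumably because the per-division numbers are rounded.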