We recently updated the version 5 BLAST protein databases (DBv5) on our FTP site to be completely accession-based. As we described in a previous post, this means they now contain the gi-less proteins from the NCBI Pathogen Detection Project and other high-throughput projects. The v5 databases are also compatible with proteins from PDB structures with multi-character chain identifiers and will include these as they become available in our other protein systems. Only the latest version of BLAST+ (2.9.0, available for download) will work with the updated v5 databases and allow you to access all of the most recent protein data. At the end of September 2019, we will stop updating the version 4 BLAST databases and offer the v5 databases as the default for download.
Next Wednesday, May 15, 2019, at 11 AM, NCBI staff will show you how to use the latest version of standalone BLAST+ (2.9.0) and the new accession-based DBv5 databases with built-in taxonomy information. You will learn how to limit searches to taxonomic groups and to retrieve sequences from the database by taxonomy without having to download an identifier list. You will also learn about additional improvements in the BLAST databases and programs that make them compatible with the new PDB identifiers and gi-less proteins from the Pathogen Detection Project.
Date and time: Wed, May 15, 2019 11:00 AM – 11:30 AM EDT
After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.
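As a rough illustration of the taxonomy-limited search and retrieval that the webinar covers, the sketch below builds command lines for `blastp` and `blastdbcmd` using the `-taxids` option available in recent BLAST+ releases. The database name `nr_v5`, the file names, and the helper function names are assumptions made for this example, not names taken from the announcement; check `blastp -help` and `blastdbcmd -help` in your installed version before relying on any option.

```python
import shlex
from typing import List, Sequence


def _join_taxids(taxids: Sequence[int]) -> str:
    """Format taxonomy IDs as the comma-delimited string BLAST+ expects."""
    if not taxids:
        raise ValueError("at least one taxonomy ID is required")
    return ",".join(str(t) for t in taxids)


def blastp_taxid_command(query: str, db: str, taxids: Sequence[int], out: str) -> List[str]:
    """Build a blastp command that limits the search to the given taxonomy IDs.

    Hypothetical helper: it only assembles arguments and does not run BLAST.
    """
    return [
        "blastp",
        "-query", query,
        "-db", db,
        "-taxids", _join_taxids(taxids),
        "-out", out,
    ]


def blastdbcmd_taxid_command(db: str, taxids: Sequence[int], out: str) -> List[str]:
    """Build a blastdbcmd command that retrieves sequences by taxonomy ID.

    Hypothetical helper: it only assembles arguments and does not run BLAST.
    """
    return [
        "blastdbcmd",
        "-db", db,
        "-taxids", _join_taxids(taxids),
        "-out", out,
    ]


if __name__ == "__main__":
    # 9606 is the NCBI taxonomy ID for Homo sapiens.
    # "nr_v5", "query.fasta", and the output names are placeholders.
    search = blastp_taxid_command("query.fasta", "nr_v5", [9606], "hits.txt")
    fetch = blastdbcmd_taxid_command("nr_v5", [9606], "human_proteins.fasta")
    print(shlex.join(search))
    print(shlex.join(fetch))
```

To actually run a command, pass the returned list to `subprocess.run(cmd, check=True)` on a machine where BLAST+ and a v5 database are installed. Building the argument list separately keeps the example testable without a BLAST installation.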
Need a refresher on what NCBI offers? Or just feel you aren’t taking full advantage of NCBI resources? Check out some of the most recent NCBI Minute webinar recordings on the NCBI YouTube channel.
QuickBLASTP, an accelerated version of BLASTP, adds a new pre-processing step to the non-redundant (nr) protein database. In a matter of seconds, QuickBLASTP will find approximately 97% of the database sequences with 70% or more identity to your query and around 98% of the database sequences with 80% or more identity to your query.
BLAST (Basic Local Alignment Search Tool) is a popular tool for finding sequences in a given database that are similar to a query sequence. Traditionally, BLAST displays these results as a sorted list of matches between the query and each database sequence. While this display is useful for examining how each subject sequence matches the query, it treats all subject sequences the same, regardless of the quality of the sequence data or its annotation, and also does not allow easy comparisons between different subject sequences.
For example, the subject sequences may fall into multiple groups of similar sequences, or all of the subject sequences may be more similar to each other than to the query. A common way to obtain this information is to construct a multiple sequence alignment of the query and some or all of the subject sequences, but to this point, BLAST has not provided such alignments directly.
Enter SmartBLAST! SmartBLAST is a new and experimental NCBI tool that makes it easier to complete common sequence analysis tasks, such as finding a candidate protein name for a sequence, locating regions of high sequence conservation, or identifying regions covered by database sequences but missing from the query.