Tag: Protein BLAST (BLASTp)

New ClusteredNR database: faster searches and more informative BLAST results

Reduced redundancy. Faster searches. More diverse proteins and organisms in your BLAST results. Check out our new ClusteredNR database – derived from the default BLAST protein nr database by clustering sequences at 90% identity / 90% length (details below).  Get quicker results and access to information about the distribution of your hits across a wider range of organisms and evolutionary distances.

Searching ClusteredNR

You can choose the ClusteredNR database in the Choose Search Set section of the BLAST submission form, where you normally pick the BLAST database. Simply select the Experimental databases radio button. You can also select the checkbox to search both ClusteredNR and the standard nr at the same time, which lets you compare results (Figure 1).

Figure 1. The ‘Choose Search Set’ section of the BLAST submission form. Selecting the Experimental databases radio button chooses ClusteredNR. You can also perform simultaneous searches against the clustered and the standard nr by checking ‘Select to compare standard and experimental database.’

BLAST+ 2.13.0 now available with SRA BLAST, ARM Linux executables, and database metadata

BLAST+ 2.13.0 adds several important new features, including SRA BLAST programs, ARM Linux executables, and the ability to produce database metadata, along with some important improvements and a few bug fixes. You can download the new BLAST release from the FTP site.

New features

SRA / WGS BLAST (blastn_vdb, tblastn_vdb)

Beginning with this release, the BLAST distribution includes the SRA BLAST programs blastn_vdb and tblastn_vdb, which can directly search SRA and WGS projects without the need to build a BLAST database. See the BLAST documentation on how to use these programs with WGS projects.

ARM Linux executables

This release also includes executables compiled under ARM Linux for the first time. Please let us know if you find any issues with ARM Linux programs.

Database metadata in JSON format

Starting with BLAST+ 2.13.0, the makeblastdb program generates an additional file with the file extension .njs for nucleotide databases or .pjs for protein databases. These files contain BLAST database metadata in JSON format. See the BLAST database metadata section in the BLAST User Manual for an example. The metadata files can be easily read by many tools and make the BLAST database more compliant with FAIR principles.
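
Because the metadata is plain JSON, any standard JSON parser can read it. The Python sketch below is illustrative only: the sample content and the key names (dbname, dbtype, number-of-sequences) are assumptions made for the example, so check them against the example in the BLAST User Manual before relying on them. The file name example_db.pjs is also hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample metadata. The key names are assumptions for this
# sketch; consult the BLAST User Manual for the actual fields.
SAMPLE_METADATA = {
    "dbname": "example_db",
    "dbtype": "Protein",
    "number-of-sequences": 3,
    "number-of-letters": 1200,
}


def summarize_metadata(path):
    """Return a one-line summary of a BLAST database metadata file."""
    with open(path, encoding="utf-8") as handle:
        meta = json.load(handle)
    # Use .get() so that a missing (or differently named) key does not fail.
    name = meta.get("dbname", "unknown")
    dbtype = meta.get("dbtype", "unknown")
    nseq = meta.get("number-of-sequences", "unknown")
    return f"{name} ({dbtype}): {nseq} sequences"


if __name__ == "__main__":
    # Write the sample to a temporary .pjs file, then read it back.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "example_db.pjs"
        sample.write_text(json.dumps(SAMPLE_METADATA), encoding="utf-8")
        print(summarize_metadata(sample))
```

To try this on a real database, pass the path of the .njs or .pjs file that makeblastdb wrote next to the other database files.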

See the release notes for more details on improvements and bug fixes for the release.

Important reminder about usage reporting

As we announced previously, BLAST can report limited usage information back to NCBI. This information shows us whether the community is using BLAST+, and therefore whether it is worth maintaining and developing. It also allows us to focus our development efforts on the most used aspects of BLAST+. Please help us improve BLAST by allowing BLAST to share information about your search. The BLAST privacy statement provides details on the information collected, how it is used, and how to opt out of reporting if you don’t want to participate.

Updated prokaryotic representative genomes collection includes 685 new species!

We are happy to announce an updated bacterial and archaeal representative genomes collection. The current collection contains a total of 15,507 assemblies selected from 236,000 prokaryotic RefSeq assemblies to represent their respective species. The collection has grown by five percent since August 2021. A total of 685 species are represented for the first time. In addition, 370 species are represented by a better assembly, and 84 species were removed because of changes in NCBI Taxonomy or uncertainty in their species assignment.

We updated the database on the Microbial Nucleotide BLAST page as well as the basic nucleotide BLAST RefSeq Representative genomes database (fourth in the menu) to reflect these changes. Finally, remember that you can now run BLAST searches against the proteins annotated on representative genomes (second in the menu). Find more information here.

BLAST+ 2.12.0 now available with more efficient multithreaded searches

BLAST+ 2.12.0 programs feature better multithreaded searches and support a different threading model, threading by query, that can be more efficient in some situations. The new release is also fully compatible with the increase in the numeric range for the GI identifier, which will take effect in the nucleotide database later this year. The full post lists details of the new features and bug fixes. You can download the new BLAST release from the FTP site.
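
As a rough illustration of how threading by query might be requested from a script, the Python sketch below assembles a blastp command line. It assumes the -mt_mode option (with the value 1 selecting threading by query) and the -num_threads option as I understand them from the BLAST+ command-line help; confirm both with blastp -help for your installed version. The file and database names are hypothetical, and the function only builds the argument list without running anything.

```python
import shlex


def build_blastp_command(query, db, out, num_threads=4, thread_by_query=True):
    """Build (but do not run) a blastp command line as a list of arguments."""
    cmd = [
        "blastp",
        "-query", query,
        "-db", db,
        "-out", out,
        "-num_threads", str(num_threads),
    ]
    if thread_by_query:
        # Assumption: "-mt_mode 1" requests the threading-by-query model.
        cmd += ["-mt_mode", "1"]
    return cmd


if __name__ == "__main__":
    # Hypothetical file and database names, for illustration only.
    command = build_blastp_command("queries.fasta", "nr", "results.txt", num_threads=8)
    print(shlex.join(command))
```

The returned list can be handed to subprocess.run once the options have been checked against your BLAST+ installation.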

April 7 Webinar: Recent and upcoming enhancements to NCBI BLAST and Primer-BLAST services!

Join us on April 7, 2021 at 12 PM Eastern Time to learn about new web BLAST and Primer-BLAST enhancements that improve your BLAST experience. You’ll also see a preview of some planned improvements to the databases that make it easier to find relevant matches.

Recent changes to web BLAST include added data columns on the descriptions table, so you can quickly find and sort your matches. Primer-BLAST now offers direct links from genome assembly pages, so you can easily select the specificity database. Primer-BLAST also now accepts multiple target templates, making it easy to design primers that can amplify several similar sequences, such as all splice variants of a gene or the same target (16S, COI) from different strains or species.

  • Date and time: Wed, April 7, 2021 12:00 PM – 12:45 PM EDT
  • Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI webinars playlist on the NLM YouTube channel. You can learn about future webinars on the Webinars and Courses page.

Protein BLASTDBs are accession-based

The version 5 BLAST (dbV5) protein databases are now accession-based. You can access these databases and the nucleotide BLASTDBs on our FTP site.

As we described in a previous post, this means they now contain the GI-less proteins from the  NCBI Pathogen Project and other high-throughput projects. The v5 databases are also compatible with proteins from PDB structures with multi-character chain identifiers and will include these as they become available in our other protein systems. Only the latest version of BLAST+ (2.9.0, download) will work with the updated v5 databases and allow you to access all of the most recent protein and nucleotide data. In the winter of 2019, we will stop updating the version 4 BLAST databases and offer the v5 databases as the default for download.

In addition, makeblastdb will be updated in BLAST 2.10.0, due out in October 2019, so by default it creates dbV5 formatted databases.

For more information on the new database version and BLAST+ (2.9.0), see the previous NCBI Insights article and the recording of our recent webinar.

The new BLAST results are now the default view

As you may know, we have been offering a new BLAST results page (Figure 1) as a test page since April. In response to your positive reception, and after incorporating many improvements that you suggested, we made the new results the default today, August 1, 2019.

You will still be able to access the traditional results for several months. This will give you additional time, if you need it, to adjust your workflows or teaching materials to the new display.

New BLAST results for specialized searches now available for testing

As you probably know, since April BLAST has offered a new results page as an option for standard BLAST so that you can test it and provide feedback. See our post from earlier this spring for more details. We have just added new results pages (Figure 1) for the following four specialized BLAST services for you to test.

  1. PSI-BLAST
  2. PHI-BLAST
  3. DELTA-BLAST
  4. Align two or more sequences

Have you tried BLAST+ (2.9.0) and version 5 BLAST databases (dbV5)?

We recently updated the version 5 BLAST protein databases (dbV5) on our FTP site to be completely accession-based. As we described in a previous post, this means they now contain the GI-less proteins from the NCBI Pathogen Project and other high-throughput projects. The v5 databases are also compatible with proteins from PDB structures with multi-character chain identifiers and will include these as they become available in our other protein systems. Only the latest version of BLAST+ (2.9.0, download) will work with the updated v5 databases and allow you to access all of the most recent protein data. At the end of September 2019, we will stop updating the version 4 BLAST databases and offer the v5 databases as the default for download.

For more information on the new database version and BLAST+ (2.9.0), see the previous NCBI Insights article and the recording of our recent webinar.

May 15, 2019 Webinar: Using taxonomic information and other improvements in standalone BLAST+ (2.9.0) and the v5 databases

Next Wednesday, May 15, 2019 at 11 AM, NCBI staff will show you how to use the latest version of standalone BLAST+ (2.9.0) and the new accession-based dbV5 databases with built-in taxonomy information. You will learn how to limit searches to taxonomic groups and to retrieve sequences from the database by taxonomy without having to download an identifier list. You will also learn about additional improvements in the BLAST databases and programs that make them compatible with the new PDB identifiers and GI-less proteins from the Pathogen Detection Project.
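
For readers who script their searches, the Python sketch below shows roughly what taxonomy-limited searching and retrieval could look like on the command line. It assumes that blastp and blastdbcmd both accept a -taxids option taking a comma-separated list of NCBI taxonomy IDs when used with v5 databases; treat that as an assumption and verify it against the BLAST+ help output. The file and database names are hypothetical, 9606 (human) is used only as an example taxid, and the functions build argument lists without running anything.

```python
import shlex


def _format_taxids(taxids):
    """Join taxonomy IDs into the comma-separated form used on the command line."""
    return ",".join(str(taxid) for taxid in taxids)


def build_taxid_search(query, db, taxids, out):
    """Build a blastp command restricted to the given taxonomy IDs."""
    # Assumption: "-taxids" limits the search to sequences from these taxa.
    return [
        "blastp",
        "-query", query,
        "-db", db,
        "-taxids", _format_taxids(taxids),
        "-out", out,
    ]


def build_taxid_retrieval(db, taxids, out):
    """Build a blastdbcmd command that retrieves sequences by taxonomy ID."""
    # Assumption: blastdbcmd accepts "-taxids" with v5 databases.
    return [
        "blastdbcmd",
        "-db", db,
        "-taxids", _format_taxids(taxids),
        "-out", out,
    ]


if __name__ == "__main__":
    # Hypothetical names; 9606 is the NCBI taxonomy ID for human.
    print(shlex.join(build_taxid_search("query.fasta", "nr", [9606], "hits.txt")))
    print(shlex.join(build_taxid_retrieval("nr", [9606], "human.fasta")))
```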

Date and time: Wed, May 15, 2019 11:00 AM – 11:30 AM EDT

Register

After registering, you will receive a confirmation email with information about attending the webinar. A few days after the live presentation, you can view the recording on the NCBI YouTube channel. You can learn about future webinars on the Webinars and Courses page.