BLAST ClusteredNR Database is Now Available for Download!

You can now download the ClusteredNR protein database, which was previously available only through the BLAST web application. As we described when we recently introduced it, ClusteredNR gives you faster BLAST results and shows how your hits are distributed across a wider range of organisms and evolutionary distances. The download package includes the ClusteredNR BLAST database, an SQLite3 database, and several scripts for accessing cluster information and cluster members.

Features & Benefits
  • Reduced redundancy 
  • Faster searches 
  • More diverse proteins and organisms in your BLAST results 

Requirements
  • Linux or macOS operating system 
  • SQLite3 version 3.35.4 or higher 
  • BLAST+ 2.13 or higher
  • Minimum 128GB RAM and 343GB disk space 
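
As a rough way to check a machine against the requirements above, the following Python sketch parses version strings and reports free disk space. It is only an illustration: the helper names (parse_version, meets_minimum, free_disk_gb) are hypothetical and not part of the ClusteredNR package, the operating system and RAM requirements are not checked, and the version parsing assumes the tool prints a dotted version number somewhere in its output.

```python
import re
import shutil

# Minimums quoted in the requirements list above.
MIN_SQLITE = (3, 35, 4)
MIN_BLAST = (2, 13)
MIN_DISK_GB = 343


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(found, minimum):
    """Compare version tuples, padding the shorter one with zeros."""
    width = max(len(found), len(minimum))

    def pad(version):
        return version + (0,) * (width - len(version))

    return pad(found) >= pad(minimum)


def free_disk_gb(path="."):
    """Free space on the filesystem holding path, in decimal gigabytes."""
    return shutil.disk_usage(path).free / 1e9
```

To use it, pass the text printed by `sqlite3 --version` and `blastp -version` to parse_version, compare the results with meets_minimum, and compare free_disk_gb for the intended download directory against MIN_DISK_GB.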

Example

Figure: Screenshot of the ClusteredNR demo.

This example shows results from a search against the standalone ClusteredNR database (nr_clustered_seq) using the pig uricase protein (NP_999435) as a query. The count-clustermembers.sh script returns the number of sequences for the cluster represented by the Cavia porcellus uricase (XP_012998554.1). The get-cluster-members.sh script returns the protein accessions, the taxonomy IDs, and the titles of the member proteins in the cluster. 
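
A search like the one described above might be launched from Python roughly as follows. This is a minimal sketch under stated assumptions: the query file name (NP_999435.fasta), the output file name (hits.tsv), and the thread count are hypothetical, and it assumes blastp can locate nr_clustered_seq (for example through the BLASTDB environment variable or a full path to the database). The arguments to the cluster scripts are not covered in this post, so they are not shown here.

```python
import shutil
import subprocess


def build_blastp_command(query_fasta, db="nr_clustered_seq",
                         out="hits.tsv", threads=4):
    """Assemble a blastp call that writes tabular (outfmt 6) hits."""
    return [
        "blastp",
        "-db", db,                    # database name taken from the example above
        "-query", query_fasta,        # hypothetical FASTA file holding the query
        "-outfmt", "6",               # tab-separated, one hit per line
        "-out", out,                  # hypothetical output file name
        "-num_threads", str(threads),
    ]


def run_search(query_fasta, **kwargs):
    """Run the search; raises CalledProcessError if blastp exits with an error."""
    if shutil.which("blastp") is None:
        raise FileNotFoundError("blastp not found on PATH; install BLAST+ first")
    subprocess.run(build_blastp_command(query_fasta, **kwargs), check=True)
```

Calling run_search("NP_999435.fasta") would write the hits to hits.tsv. The subject accessions in that file are the cluster representatives that you would then look up with the scripts included in the package.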

Learn more

Get more information about ClusteredNR, including step-by-step instructions on how to use it and a summary of all the scripts included with the package.   

Stay up to date

BLAST is a part of the NIH Comparative Genomics Resource (CGR). CGR facilitates reliable comparative genomics analyses for all eukaryotic organisms through an NCBI Toolkit and community collaboration.      

Follow us on social media @NCBI and join our mailing list to stay up to date with BLAST and other CGR news.

Questions?

If you have questions or would like to provide feedback, please write to our help desk. 
