Now available! You can download the ClusteredNR protein database, which was previously available only through the BLAST web application. ClusteredNR gives you faster BLAST results and shows how your hits are distributed across a wider range of organisms and evolutionary distances. The download package includes the ClusteredNR BLAST database, an SQLite3 database, and several scripts for accessing cluster information and cluster members.
Features & Benefits
- Reduced redundancy
- Faster searches
- More diverse proteins and organisms in your BLAST results
Requirements
- Linux or macOS operating system
- SQLite3 version 3.35.4 or higher
- BLAST+ 2.13 or higher
- Minimum 128 GB RAM and 343 GB disk space
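The requirements above can be checked before downloading. The following is a minimal sketch, not part of the ClusteredNR package: the function names are hypothetical, it assumes the `sqlite3` and `blastp` executables are on your PATH, and it treats 1 GB as 10^9 bytes. RAM is not checked because the Python standard library has no portable call for it.

```python
import platform
import re
import shutil
import subprocess

# Minimums taken from the requirements list above.
MIN_SQLITE = (3, 35, 4)
MIN_BLAST = (2, 13, 0)
MIN_DISK_GB = 343


def parse_version(text):
    """Return the first dotted version in text as a 3-part tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    # A missing third component (e.g. "2.13") is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def tool_version(command):
    """Run a version command; return the parsed version, or None if unavailable."""
    if shutil.which(command[0]) is None:
        return None
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    try:
        return parse_version(result.stdout or result.stderr)
    except ValueError:
        return None


def check_requirements(db_dir="."):
    """Return a dict mapping each checked requirement to True or False."""
    sqlite = tool_version(["sqlite3", "--version"])
    blast = tool_version(["blastp", "-version"])
    free_gb = shutil.disk_usage(db_dir).free / 1e9  # assumption: 1 GB = 10**9 bytes
    return {
        "os": platform.system() in ("Linux", "Darwin"),  # Darwin is macOS
        "sqlite3": sqlite is not None and sqlite >= MIN_SQLITE,
        "blast+": blast is not None and blast >= MIN_BLAST,
        "disk": free_gb >= MIN_DISK_GB,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'NOT met'}")
```

Pass the directory where you plan to store the database to `check_requirements` so that the free-space check applies to the right volume.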
In this example, the pig uricase protein (NP_999435) is the query in a search against the standalone ClusteredNR database (nr_clustered_seq). The count-clustermembers.sh script returns the number of sequences in the cluster represented by the Cavia porcellus uricase (XP_012998554.1). The get-cluster-members.sh script returns the protein accessions, taxonomy IDs, and titles of the member proteins in that cluster.
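A search like the one described above could be launched from Python roughly as follows. This is a sketch under stated assumptions: the query file name `NP_999435.fasta`, the output file name, and the helper function names are hypothetical. Only standard BLAST+ `blastp` options are used. The package's own scripts are not called here, because their arguments are described in the ClusteredNR documentation rather than in this post.

```python
import shutil
import subprocess


def build_blastp_command(query_fasta, db="nr_clustered_seq",
                         out_path="hits.tsv", max_targets=50, threads=4):
    """Build a blastp command line that writes tabular (outfmt 6) results."""
    return [
        "blastp",
        "-query", query_fasta,
        "-db", db,
        "-outfmt", "6",
        "-max_target_seqs", str(max_targets),
        "-num_threads", str(threads),
        "-out", out_path,
    ]


def top_subject_ids(tabular_text, limit=5):
    """Return the first `limit` distinct subject IDs from outfmt 6 text.

    In the default outfmt 6 layout the subject ID is the second column.
    """
    seen = []
    for line in tabular_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        subject = fields[1]
        if subject not in seen:
            seen.append(subject)
        if len(seen) == limit:
            break
    return seen


if __name__ == "__main__":
    # Hypothetical input file holding the pig uricase sequence.
    command = build_blastp_command("NP_999435.fasta")
    if shutil.which("blastp") is None:
        print("blastp not found on PATH; command would be:", " ".join(command))
    else:
        subprocess.run(command, check=True)
        with open("hits.tsv") as handle:
            print(top_subject_ids(handle.read()))
```

The subject IDs returned by a search against ClusteredNR are the sequences that represent each cluster, so they are the natural inputs for the cluster scripts mentioned above.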
Get more information about ClusteredNR, including step-by-step instructions on how to use it and a summary of all the scripts included with the package.
Stay up to date
BLAST is a part of the NIH Comparative Genomics Resource (CGR). CGR facilitates reliable comparative genomics analyses for all eukaryotic organisms through an NCBI Toolkit and community collaboration.
If you have questions or would like to provide feedback, please write to our help desk.