Now Available! BLAST ClusteredNR database for blastx and PSI-BLAST searches

ClusteredNR, the new protein database that provides results with a better overview of protein homologs in a wider range of organisms, is now available for blastx (translated nucleotide query) and PSI-BLAST (Position Specific Iterative BLAST) searches (Figure 1). Simply select ClusteredNR in the database section of the BLAST form. You can even search standard nr at the same time to compare results.

Figure 1. Composite image from the BLAST search forms. The ClusteredNR database is now available for blastx and PSI-BLAST searches in addition to blastp. For all types of searches, you can choose to search both ClusteredNR and standard nr at the same time so you can compare results.

ClusteredNR is especially useful with blastx for finding more distant homologs when searching with queries from over-represented groups. For PSI-BLAST, the greater taxonomic scope of the ClusteredNR database allows you to work more effectively with the default number of target sequences in the first round. The two searches described below highlight these advantages of ClusteredNR.

Conserved Domain Database version 3.20 is available!

A new version of the Conserved Domain Database (CDD) is now available. Version 3.20 contains 1,614 new or updated NCBI/CDD-curated domains and now mirrors Pfam version 34 as well as new models from the NCBIfam collection. Fine-grained classifications of the [(+)ssRNA] virus RNA-dependent RNA polymerase catalytic domain, RING-finger/U-box, dimerization/docking domains of the cAMP-dependent protein kinase regulatory subunit, and Galactose/rhamnose-binding lectin domain superfamily have been added, along with many other new models.

We have significantly increased the fraction of CD-Search and interactive Batch CD-Search queries whose results show conserved domain architecture information and attributes that further characterize protein function. These attributes link to information-rich resources such as Enzyme Commission (EC) numbers, Gene Ontology (GO) terms, PubMed IDs, and identifiers from the CAZy, TCDB, and MEROPS databases. See our earlier post for additional details. You can access CDD and find updated content on the CDD FTP site at CDD version 3.20.

Database statistics for CDD version 3.20:

Models   Source
64,234   Total models from all source databases
         (organized into 4,541 multi-model superfamilies)
18,882   NCBI CDD curation effort
 1,125   NCBIfams
 1,009   SMART v6.0
19,178   Pfam v34
 4,871   COGs v1.0
10,140   NCBI Protein Clusters
 4,488   TIGRFAM v15
59,693   Total models in the default CD-Search database
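
As a quick consistency check on the statistics above, the short Python sketch below adds up the seven per-source counts and compares the sum with the two reported totals. The variable names are illustrative only. The figures are copied from the table, and the script makes no claim about how CDD computes its totals.

```python
# Per-source model counts copied from the CDD version 3.20 statistics table.
source_counts = {
    "NCBI CDD curation effort": 18_882,
    "NCBIfams": 1_125,
    "SMART v6.0": 1_009,
    "Pfam v34": 19_178,
    "COGs v1.0": 4_871,
    "NCBI Protein Clusters": 10_140,
    "TIGRFAM v15": 4_488,
}

# Totals as reported in the same table.
reported_all_sources_total = 64_234
reported_default_total = 59_693
reported_superfamilies = 4_541

# Sum of the seven listed sources.
computed_default_total = sum(source_counts.values())

# Gap between the overall total and the sum of the listed sources.
remainder = reported_all_sources_total - computed_default_total

print(f"Sum of listed sources:        {computed_default_total:,}")
print(f"Matches default CD-Search:    {computed_default_total == reported_default_total}")
print(f"All-sources total minus sum:  {remainder:,}")
print(f"Equals superfamily count:     {remainder == reported_superfamilies}")
```

The seven listed sources sum to 59,693, the figure given for the default CD-Search database. The remaining 4,541 models in the all-sources total equal the number of multi-model superfamilies reported in the table.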

CD-Search is part of the NIH Comparative Genomics Resource (CGR), an NLM project to establish an ecosystem that facilitates reliable comparative genomics analyses for all eukaryotic organisms.

