We recently showed you a new way to search for and view sets of orthologous genes from vertebrates. You can now get an additional set of search results that we are calling similar genes. These are related through protein architecture to the orthologous gene set and include genes from all metazoans and selected plant, fungal, and protist species. You can quickly find related genes within a species, compare them to those from other annotated metazoan genomes, and have access to other useful gene resources. To find a set of similar genes, enter a gene symbol or select the gene symbol + orthologs option from the selections menu.
For example, if you search for ‘AGO2 orthologs’, in addition to the link to orthologs from vertebrates, you’ll get a link to a set of similar genes (Genes with similar protein architectures) across a broad evolutionary spectrum that includes genes from invertebrates, fungi, and green plants (Figure 1).
Figure 1. Genes with similar protein architectures to AGO2. The original search was AGO2 orthologs, which brings up the suggestion box with the links to similar genes as well as the AGO2 vertebrate orthologs. The similar genes include entries from a broad taxonomic range of eukaryotic organisms.
If you search for ‘GH1’, you’ll get a link to similar genes that includes members of the growth hormone family that are not part of NCBI’s vertebrate ortholog set.
Figure 2. The human subset of genes with similar protein architectures to GH1 showing other members (paralogs) of the GH1 gene family (GH2, CSH1, CSH2, CSHL1). These are not included in the ortholog set.
Try out the following searches and follow the links to the Genes with similar protein architectures
Please let us know what you think!
NCBI is testing a new way to find and retrieve orthologous vertebrate genes. To find orthologs, enter a gene symbol (e.g., RAG1) or a gene symbol combined with a taxonomic group (e.g., primate RAG1). Select the matching entry from the suggestions menu, or select the orthologs option (e.g., Rag1 orthologs) to see all orthologs. Your search will return a results link to the set of orthologs provided by NCBI’s Gene resource. Click on the results link to see information for that ortholog group (Figure 1).
Figure 1. Search for Rag1 orthologs showing the link to the set of RAG1 genes from vertebrates.
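If you would rather script a comparable lookup than type it into the search box, one rough approximation is to build an E-utilities ESearch request against the Gene database. The sketch below is not the ortholog search feature described above: it only composes an ordinary Gene text query from a gene symbol and an optional taxonomic group, using the Gene Name and Organism search fields. The function name build_gene_search_url and its parameters are hypothetical names chosen for illustration, and the code only constructs the URL without sending it.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gene_search_url(symbol, taxon=None, retmax=20):
    """Return an ESearch URL that queries the Gene database by gene symbol.

    Hypothetical helper for illustration: `symbol` is a gene symbol such as
    "RAG1"; `taxon` is an optional taxonomic group such as "primates".
    This is a plain text query, not NCBI's curated ortholog set.
    """
    term = f"{symbol}[Gene Name]"
    if taxon:
        # Restrict the hits to one organism or taxonomic group.
        term += f" AND {taxon}[Organism]"
    params = {"db": "gene", "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Roughly analogous to typing "primate RAG1" into the search box.
    print(build_gene_search_url("RAG1", "primates"))
```

Fetching the printed URL should return a JSON list of matching Gene IDs. Because this is a symbol match rather than a curated ortholog group, the hits may not correspond exactly to the set shown by the orthologs link.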
If you’re a protein researcher, one thing you may want to do is to find homologs for a protein of interest on the basis of its sequence. This can provide insights into what the protein does and how it does it, and may identify proteins with known three-dimensional structures that can serve as models for the protein of interest. The Conserved Domains Database (CDD) groups proteins that have strong sequence similarity to protein domain fingerprints and allows you to search these groups with any protein sequence. Such searches are often more sensitive than standard BLAST searches since the scoring matrices used are tuned to locate important functional sites and sequence motifs that are highly conserved within the domain. You can then use the results to explore the evolutionary relationships of these proteins or identify these important sequence and structural features.
Here is a method to find protein sequences from many organisms that contain a particular conserved domain: