Comparative genomics is a science that compares genomic data either within a species or across species to answer questions in biomedicine. Laboratory experiments can then investigate the functional impact of those genomic similarities and differences. The history of comparative genomics goes back to the mid-1990s, but the field is now accelerating. A flood of new data is emerging as DNA sequencing technology becomes cheaper and commoditized. While this growth poses many challenges to current tools and approaches, it also offers immense opportunity for scientific research and understanding. The resulting insights continue to reveal novel model organisms that can further the impact of comparative genomics on human health.
Even before the emergence of comparative genomics as a scientific discipline, several organisms were identified as notable model organisms for specific aspects of human health. These organisms are typically easy to maintain and breed in laboratory settings and have systems or other biological characteristics similar to those of humans. For example, house mouse (Mus musculus), brown rat (Rattus norvegicus), and nematode (Caenorhabditis elegans) are significant organisms for modeling disease. Zebrafish (Danio rerio) and western clawed frog (Xenopus tropicalis) are commonly used for studies of development and cellular mechanisms because their embryos develop externally. Fruit fly (Drosophila melanogaster) was one of the first model systems identified in laboratory science and, because of its rapid reproduction cycle, has served as a staple for studies ranging from fundamental genetics to the development of tissues and organs. Yeast (Saccharomyces cerevisiae) cells share many properties with human cells, making them excellent models for biochemical mechanisms and disease at a cellular level.
As comparative genomics advances, these established model organisms continue to be well-researched and frequently relied upon for studying a wide range of applications to human health. However, without as much reliance on laboratory maintenance and reproduction, and with the continually expanding library of available, high-quality genomes, comparative genomics is now being used to explore potentially more beneficial model organisms. These emerging models may not have been well-researched in the past, but their recently characterized genomes can be leveraged in comparative genomics studies to impact far-reaching aspects of human health.
The National Institutes of Health (NIH) is helping to harness the power of comparative genomics as a tool for scientific discovery through the NIH Comparative Genomics Resource (CGR) project. CGR is a multi-year project implemented by the National Library of Medicine (NLM) to maximize the impact of eukaryotic research organisms and their genomic data resources on biomedical research. The National Center for Biotechnology Information (NCBI), as part of NLM, is charged with engaging genomics communities on their comparative genomics needs and leading development of tools, knowledge bases, and repositories to meet those needs.
CGR will establish an ecosystem to facilitate reliable comparative genomics analyses for all eukaryotic organisms through an interoperable suite of NCBI repositories and knowledge bases offering high value data, tools, and interfaces compatible with community-provided organism resources. By providing equal access to genomic data and tools for all eukaryotic research organisms—including those emerging model organisms not well-represented by organism-specific resources—and improving the connectivity of their data, NCBI will increase their potential contributions to research in support of greater scientific discovery.
This article highlights the impact of emerging model organisms on biomedical research and human health, identifies use cases for emerging model organisms researched through comparative genomics, and explores areas where the CGR project may have an impact.
Emerging model organism use cases
Pig (Sus scrofa domesticus)
Xenotransplantation is the use of organs from other species to fill the unmet need for human organs for transplantation. More than 105,000 people are currently reported as waiting for organ donors. The time from diagnosis to transplantation is important: the sooner the transplant, the less the disease has progressed, and timeliness can lead to better clinical outcomes. Kidney failure is the most acute need, with 83 percent of those on the waiting list in need of a kidney. Approximately 17 people per day die awaiting organ transplants across all organs. Unfortunately, the use of organs from other species, including pigs, is limited by organ rejection caused by the host immune response and by the potential transmission of viruses from the donor species.
Comparative genomics can improve these outcomes by identifying differences between host and donor species and targeting those regions with gene editing using CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats). This can be done by adding human genes to the germ line of the donor species as well as by removing donor genes responsible for immune rejection. CRISPR can also be used to eliminate pig viruses, identified through comparative genomics research, that could cause human disease outbreaks.
Among ongoing research in this field, CRISPR is being used to modify multiple pig genes specifically involved in tissue rejection, such as MHC genes and glycosylation sites. In one success story, University of Maryland Medical Center physicians worked with Revivicor to transplant a pig heart into a human recipient under an FDA compassionate access provision. The patient survived for over two months, which was the longest a recipient had survived with a transplanted pig heart.
Syrian Golden Hamster (Mesocricetus auratus)
Early in the COVID-19 pandemic, comparative genomics was used to identify a range of mammals potentially susceptible to SARS-CoV-2 infection based on their ACE2 (angiotensin-converting enzyme 2) proteins, the cellular receptor for viral entry into host cells. These species could represent a potential route of animal-to-human transmission and could serve as emerging model organisms for further study of SARS-CoV-2. For example, Syrian golden hamsters, which were already commonly used in research on respiratory viruses and share laboratory maintenance traits with the rodents mentioned earlier, were identified as having ACE2 proteins similar to those of humans. Therefore, the hamster is an excellent model specifically for studying the pathogenesis of SARS-CoV-2 infections at both the systems and cellular levels.
This emerging model organism has since been used in gene expression studies of cytokine and chemokine profiles in the lungs to investigate the clinical pathology of infection, in antibody studies to research the transmissibility and replicative abilities of SARS-CoV-2, in studies identifying sex-based and age-based differences in outcomes and treatment responses, and in knock-out models that impede adaptive immunity to research severe disease outcomes of COVID-19. Additionally, hamsters are now used to investigate the drivers of long COVID organ changes following SARS-CoV-2 infection.
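As a rough illustration of how the similarity between receptor proteins of two species might be scored, the minimal sketch below computes percent identity over two pre-aligned protein fragments. The function name `percent_identity`, the species labels, and the sequences are hypothetical placeholders, not real ACE2 data; an actual study would work from full-length alignments produced by an alignment tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical, made-up aligned fragments (NOT real ACE2 sequences).
toy_alignment = {
    "species_a": "MKTLLVA-GHWQ",
    "species_b": "MKTLIVAAGHWQ",
    "species_c": "MRSLIVA-GYWE",
}

reference = toy_alignment["species_a"]
for name, seq in toy_alignment.items():
    score = percent_identity(reference, seq)
    print(f"{name}: {score:.1f}% identity to species_a")
```

A single identity score is only a first-pass filter; it says nothing about which residues differ or whether those differences affect receptor binding.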
Dog (Canis familiaris)
Dog genomes have been extensively studied and characterized, with causative gene mutations known for many canine hereditary diseases. Comparative genomics has taken this research a step further by identifying many of those genetic mutations as analogous to human conditions with similar clinical and molecular presentations. Among these diseases, several cancers are shared by both species, including sarcomas. Different dog breeds also exhibit different rates of cancers. Scottish terriers, for instance, have a higher rate of bladder cancer (transitional cell carcinoma) than many other breeds. Furthermore, several cancers that are rare in humans, such as osteosarcoma and angiosarcoma, occur commonly and spontaneously in dogs, offering an opportunity for more rapid and extensive investigation and therapeutic development. One such treatment being developed for sarcoma involves using genetically modified Listeria to trigger an immune response against cancer cells. Additionally, these genetic similarities benefit both species; as discoveries are made in human oncology, similar advances may be made in the field of veterinary oncology.
Thirteen-lined ground squirrel (Ictidomys tridecemlineatus)
Therapeutic hypothermia is used to treat cardiac arrest and other inflammatory conditions in humans. A primary motivation for studying hibernation is to understand how hibernating animals avoid cell death in extreme physiological states including hypothermia, hypoxia, and hypercapnia (excess CO2). Understanding the molecular triggers that induce hibernation could allow the development of drugs to induce a hibernation-like state in patients without excessive side effects.
The thirteen-lined ground squirrel is an emerging model organism for studying metabolism, hibernation, and vision. The ability of the squirrel to survive for over six months without food or water is remarkable, as is its ability to lower its body temperature to near freezing during periods of torpor. The hibernation period consists primarily of the torpor state, in which metabolic, respiratory, and heart rates are lowered. Torpor is interrupted by brief states of arousal in which the body temperature is close to normal. During the seasonal transition to hibernation, metabolism switches from glucose-based to lipid-based, with changes in gene expression at the mRNA and protein levels.
Additionally, the ability to maintain bone macrostructure during inactivity is relevant to human spaceflight and to bone-loss diseases. The squirrel’s ability to maintain localization of nNOS (neuronal nitric oxide synthase) enzymes within muscle membranes during torpor is relevant to understanding muscular dystrophy and other human neuromuscular disorders where nNOS is mislocalized. The resistance to neurological damage during torpor may also be relevant to models of neurological disease.
The squirrel is also becoming a model organism for studying the visual system, since its color vision is similar to that of humans. Studying squirrel vasculature during torpor can aid in understanding how the visual system tolerates a body temperature decrease from 37 °C to between 2 and 10 °C and a heart rate decrease from 200-300 bpm to 2-10 bpm.
Killifish (Nothobranchius furzeri and other members of the Cyprinodontiformes order)
Killifish are emerging as a model organism for aging and lifespan studies. Other short-lived model organisms, such as yeast, worms, and flies, have helped identify pathways that play a role in human aging. The killifish has one of the shortest lifespans (four to six months in captivity) among vertebrates that can be bred in laboratory conditions. Among ongoing research, one comparative genomics study characterized the killifish genome and investigated genes that were positively selected in the short-lived killifish in comparison to longer-lived fish species. This research identified genetic components specifically related to signal transduction pathways, metabolism, development, proteostasis, and immunity. Twenty-two previously identified aging-related genes were found in the African turquoise killifish, including genes associated with the human conditions of dyskeratosis congenita and Hutchinson-Gilford Progeria Syndrome, as well as insulin receptors associated with human longevity. The killifish is also evolutionarily adapted to survive in harsh environments, with physiological variations between killifish in distinct environments. Comparative genomics is being used to explore differences in mitochondrial genetics associated with some of these physiological distinctions.
Bats (members of the Chiroptera order)
Bats have been implicated in several spillover events of viral diseases to humans, including SARS, Marburg virus disease, Ebola virus disease, COVID-19, and others. Their immune system is adapted to tolerate certain viruses that are pathogenic in humans. There is evidence that bats mount a weaker inflammatory reaction to infection than humans do, and the stronger human reaction may account for some of the human pathology. This reduced inflammatory response in bats is also likely responsible for a slowdown in aging and age-related diseases, including cancer. In humans, NLRP3 (NOD-, LRR- and pyrin domain-containing protein 3) can trigger inflammation and the release of inflammatory cytokines, but in bats the NLRP3 response is lower, potentially explaining bats’ ability to coexist with viruses and act as a reservoir for disease. Some bat species hibernate or use diurnal torpor to conserve energy when they are not in flight. Bats may have adapted to modulate their immune response to tolerate infection during hibernation, when the energy necessary to mount a response may be required to survive the hibernation period.
The low incidence of cancer in bats may also be mediated by particular microRNAs that can upregulate tumor suppression and downregulate carcinogenesis; the expression of these microRNAs has adapted so that the incidence of malignancies in bats decreases with age. Also, bat telomeres do not get shorter with age, which is notable because evidence suggests that telomere shortening causes cellular apoptosis, senescence, and oncogenic transformation. These adaptations are associated with bats living up to three and a half times longer than mammals with comparable body mass, further establishing bats as a potential model organism for aging.
Comparative genomics studies can have a profound impact on the scientific advancement of human health and biomedical research. However, emerging model organisms are often not as well-characterized in data repositories, and there are limited tools to increase their use. NIH’s CGR project could help address this issue by amplifying emerging model organism data for use in comparative genomics research.
The CGR project aims to promote reliable comparative genomics analyses, accelerate new discoveries, and offer a seamless user experience. Newly accessible and improved NCBI tools will provide the genomics community with a core foundation of uncontaminated and consistently annotated eukaryotic genomes. For example, the NCBI Foreign Contamination Screen (FCS) tool puts contamination removal in the hands of the genomics community prior to submission to NCBI. Furthermore, the NCBI Eukaryotic Genome Annotation Pipeline (EGAP) will become publicly available to enable the genomics community to create and submit consistent, high-quality annotation across species.
In addition to the FCS and EGAP tools promoting high-quality genomic data, NCBI will improve existing tools and develop new ones to improve and simplify comparative analyses. NCBI Datasets offers web and programmatic interfaces to genome-associated data. A new ClusteredNR database for BLAST provides reduced redundancy, faster searches, and more informative results. And a new Comparative Genome Viewer (CGV) allows quick comparison between different assemblies and makes it easier to visualize genomic structure changes.
By providing equal access to genomic data and tools for all eukaryotic research organisms, including those not represented by organism-specific resources, and improving the connectivity of their data, NCBI is increasing their potential contributions to research. NCBI is curating content and adding features to make it more usable for comparative analyses. CGR efforts will also enhance NCBI-held content with community-supplied content and connect NCBI resources with community-provided resources to amplify the impact of such data and resources in support of greater scientific discovery.
The CGR project will maximize the impact of eukaryotic research organisms, such as many of those highlighted in this article, offering new opportunities for scientific advancement in the field of comparative genomics.