Evolutionary processes affecting evolution of immune receptors
Genetic diversity is important for the survival and adaptability of species in changing environments. Determining which evolutionary forces shaped variation in specific genes is difficult, because populations and species have to adapt to the abiotic environment (e.g. temperature, sunlight, pollution), to other species with which they interact (e.g. prey, predators, competitors, parasites) or, more often, to a combination of both. Moreover, we should not underestimate neutral or non-adaptive evolutionary forces such as mutation, recombination and random genetic drift (Andrews 2010; Honnay 2013). Understanding the processes that drive the evolution of immune receptors is therefore both a fascinating research subject and a major challenge.
According to the Red Queen hypothesis (the famous “running as fast as you can to stay in the same place”), first proposed by Leigh Van Valen (1973), organisms are engaged in an arms race with other biological “partners” such as predators, resource competitors at the intra- or interspecific level, or parasites. According to generally accepted theories, the adaptation imposed by pathogens is among the most dynamic and requires continuous adaptive change (Lederberg 1999; Zimmer 2001). Parasites co-evolving with their hosts are viewed as a key factor modulating various life-history traits of their hosts (e.g. population genetic structure, demographic changes, mating system, sexual dimorphism) (Sheldon 1998). Microbes generally have an advantage in this arms race owing to their shorter generation times and high mutation rates, which enhance genetic novelty and evolutionary potential and so give them a better opportunity to update the invasion strategies with which they trick the immune system of their hosts (Meyer 1991). Some bacteria generate surface proteins that bind to antibodies, rendering them ineffective; examples include group C and G streptococci (protein G), Staphylococcus aureus (protein A), and Peptostreptococcus magnus (protein L). Others are able to knock out or kill phagocytes. Some microbes can even mimic host cells, block the interferon (IFN) production pathway, or construct protective capsules (Mycobacterium) (Fortune et al. 2004). Another example is the ability of Mycobacterium leprae to suppress cell-mediated immunity or to play “hide-and-seek” with the immune system inside host cells to avoid detection (Maizels et al. 2004). Extracellular pathogens often alter their antigens (surface proteins); they resemble a thief who escapes the police by constantly changing coat and wig, so that the police are always one step behind (e.g. the 84 known serotypes of Streptococcus pneumoniae, the glycoprotein of trypanosomes, the surface flagellin of Salmonella typhimurium, the pilin protein of Neisseria gonorrhoeae, or LPS from H. pylori and P. gingivalis) (Andersen-Nissen et al. 2005). The same strategy, i.e. antigenic variation and antigenic diversity, is also used by protozoan parasites (e.g. Plasmodium falciparum, which causes malaria) (Reeder and Brown 1996).
Viruses are the prominent specialists in evasion strategies. Their short generation times and relatively high mutation rates give them a huge advantage in the Red Queen race against host defence mechanisms and allow them to adapt quickly to the changing environment of their host. The best-known example is the influenza virus, with the constant replacement of its surface envelope proteins. Cytomegalovirus, another cunning invader, is able to evade the host defence system by expressing its own MHC class I homologues, thus pretending to be part of the host body (Reyburn et al. 1997). Viruses can also become invisible to the immune system when they enter a latent state during which they are inactive (i.e. do not replicate). In this state a virus particle does not cause disease, but it also does not produce any viral peptides that would normally attract the attention of immune receptors. Viruses in the latent form cannot be eliminated by the immune system and hence are sources of potential recurrent illness. The most famous for this strategy are Herpes simplex viruses (Bowie and Unterholzner 2008). Another viral strategy is inhibition of the immune response. Paramyxoviruses, for example, can inhibit the type I IFN response induced through the RIG-I signalling cascade. The most insolent viruses even use PRRs as entry ports (Yamada et al. 2005). At first glance the only fair players seem to be parasitic worms, which evolve more slowly and therefore appear to give the immune system a chance to adapt. However, they are even bigger (literally) swindlers than their fast-evolving cousins. During many millions of years of close coexistence with their hosts (parasites are often species-specific) they have had time to evolve a sophisticated arsenal designed to evade and modulate the host immune system (Wakelin 1996; Zaccone et al. 2006). Owing to their generally larger bodies, it is very difficult for the immune system to eliminate them. There is a pack of chameleons (e.g. coating with host proteins by schistosomes or filarial nematodes), a squad of chemical terrorists causing immunosuppression (e.g. hookworms producing a protein that binds the β integrin CR3 and inhibits neutrophil extravasation, or immunosuppression caused by Brugia spp. or Nocardia brasiliensis), and also a group of nomads that avoid local inflammatory reactions by migrating through the host body (hookworms, Brugia spp., Wuchereria bancrofti, or microfilariae of Onchocerca volvulus) (Pearce and Sher 1987; Wakelin 1996; MacDonald et al. 2002). On the other hand, long-term coevolution sometimes leads to a relatively stable relationship such as commensalism or mutualism. Moreover, according to the hygiene hypothesis, the complete elimination and subsequent absence of parasites can lead to autoimmune disorders (Zaccone et al. 2006).
As a consequence of co-evolution with microorganisms, immunity genes are among the fastest evolving genes (Nielsen et al. 2005; Barreiro et al. 2009; Barreiro and Quintana-Murci 2010; Quach et al. 2013; Quintana-Murci and Clark 2013). Since immune receptors are the chief gatekeepers protecting against the entry of microbes, we suggest that antagonistic host-parasite interactions are the principal force shaping their evolution.
Selection imposed by pathogens: adaptive evolution
The model that explains the evolution of immune receptors in the light of host-parasite co-evolution is called the matching alleles model (Frank 1993; Agrawal and Lively 2003). This model depicts the host-pathogen interaction as a process of reciprocal adaptive genetic change. In the context of host and pathogen, this means that receptors evolve to match specific parasite structures precisely (like a lock and key). In other words, changes in gene frequencies resulting from selection acting on one population (species) create selective pressure for changes in gene frequencies in another population (species). This type of selection, called frequency-dependent selection, means that the relative fitness of a genotype depends on its frequency (Carius 2001) (Fig. 4).
Frequency-dependent selection (FDS) can be positive or negative. Positive FDS favours the most numerous allele or genotype, which thus increases in frequency and rapidly tends to fixation (Fig. 4c and d). In essence, therefore, this mode of selection is directional and can be detected through important non-synonymous amino acid changes in different groups or lineages (Quintana-Murci and Clark 2013). Nevertheless, positive FDS is less probable in immune genes.
In contrast, co-evolution driven by negative frequency-dependent selection (NFDS) maintains high genetic diversity by favouring rare allelic variants (Takahata and Nei 1990; Stevens 2001) (Fig. 4a and b). In the context of the host-parasite interaction, the mechanism of NFDS can be described as follows. The host immune system is adapted to tackle the most common parasite genotype, and hence less common genotypes are favoured by natural selection. Rare genotypes increase in frequency and subsequently become common, at which point they lose their advantage and the cycle continues (Fig. 5). NFDS is a type of balancing selection that has already been described in MHC genes (Takahata and Nei 1990; Bernatchez and Landry 2003; Garrigan and Hedrick 2003; Aguilar et al. 2004; Bryja et al. 2006; Piertney and Oliver 2006; Smith et al. 2011). Balancing selection maintains genetic variation and leads to an excess of polymorphism and of intermediate-frequency alleles. Besides NFDS, balancing selection can act as antagonistic and cyclic selection, as selection in a variable environment, or through overdominance, i.e. heterozygote advantage, where heterozygous genotypes confer higher fitness than homozygotes, for example because they allow recognition of a wider variety of parasites (Doherty and Zinkernagel 1975).
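The contrast between positive and negative FDS can be illustrated with a minimal numerical sketch. It assumes a deliberately simplified model that is not taken from the cited literature: a single haploid population with two alleles, no drift, and a fitness that changes linearly with an allele's own frequency (w = 1 + s × frequency). The function names and the parameter s are hypothetical. The sketch does not model the coupled host-parasite cycle itself, only the rare-allele advantage that underlies it.

```python
def next_frequency(p, s):
    """One generation of haploid selection on alleles A (frequency p) and a.

    Assumed fitness model: w = 1 + s * (own frequency).
    s < 0 mimics negative FDS (rare alleles are fitter),
    s > 0 mimics positive FDS (common alleles are fitter).
    """
    w_A = 1.0 + s * p
    w_a = 1.0 + s * (1.0 - p)
    mean_w = p * w_A + (1.0 - p) * w_a
    return p * w_A / mean_w


def simulate(p0, s, generations):
    """Return the deterministic trajectory of allele A, starting from p0."""
    if s <= -1.0:
        raise ValueError("s must be greater than -1 so that fitness stays positive")
    trajectory = [p0]
    for _ in range(generations):
        trajectory.append(next_frequency(trajectory[-1], s))
    return trajectory


# A rare allele under negative FDS and a common allele under positive FDS.
nfds = simulate(p0=0.05, s=-0.5, generations=200)
pfds = simulate(p0=0.60, s=0.5, generations=200)
print(f"negative FDS: start {nfds[0]:.2f}, end {nfds[-1]:.3f}")
print(f"positive FDS: start {pfds[0]:.2f}, end {pfds[-1]:.3f}")
```

Under these assumptions the rare allele rises to an intermediate frequency and stays there when s is negative, whereas with positive s the initially more common allele goes to fixation, which matches the verbal description of the two modes.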
Polymorphism and the effect of non-synonymous substitutions on protein functions
The principal prerequisite for participating in Red Queen dynamics is genetic polymorphism. In the previous chapter I described how this variability can be maintained and noted that the evolution of living organisms is the consequence of both adaptive and neutral evolutionary processes. So how can we define genetic polymorphism (variation), and what is its origin? Nucleotide sequences are composed of four nucleotide bases (A, C, G, T). These bases form 64 triplets (codons), of which 61 encode the 20 amino acids and three are stop codons. Most amino acids are therefore encoded by more than one codon. Nucleotide substitutions, generated randomly and continuously during evolution, are among the most important sources of new genetic variants. They can be either synonymous (silent), which do not change the amino acid and hence the protein sequence, or non-synonymous, which replace an amino acid and may have an important functional impact. Although random mutagenesis produces synonymous substitutions considerably less often than non-synonymous ones (under the standard genetic code only about a quarter of the possible single-nucleotide changes in sense codons are synonymous), synonymous substitutions are almost always observed at a much higher rate, since they are rarely eliminated by selection. However, the ratio between the two types can vary significantly between genes and/or genome regions.
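How often a random point change is synonymous can be checked by enumerating every single-nucleotide change of every sense codon. The sketch below assumes the standard genetic code and treats all nine possible changes per codon as equally likely, which ignores real mutational biases such as the transition/transversion ratio; the function name is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_point_substitutions():
    """Classify all single-nucleotide changes starting from the 61 sense codons."""
    counts = {"synonymous": 0, "missense": 0, "nonsense": 0}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # stop codons are not used as starting points
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = CODON_TABLE[mutant]
                if new_aa == aa:
                    counts["synonymous"] += 1
                elif new_aa == "*":
                    counts["nonsense"] += 1
                else:
                    counts["missense"] += 1
    return counts


counts = count_point_substitutions()
total = sum(counts.values())
for kind, number in counts.items():
    print(f"{kind}: {number} of {total} ({number / total:.1%})")
```

Missense and nonsense changes together make up the non-synonymous class. The counts describe mutational opportunity only; the substitution rates observed in real genes additionally reflect selection.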
According to their effects on fitness non-synonymous substitutions can be deleterious, favourable, or selectively neutral. It was shown that approximately half of the known disease-causing mutations result from amino acid substitutions and in humans 25-30% of non-synonymous substitutions were predicted to negatively affect protein function (Ng and Henikoff 2006).
The effect of (point) substitutions on protein function depends largely on the biochemical properties of the amino acids involved (Fig. 6). Such exchanges can alter protein conformation, which can disrupt interactions with other structures (e.g. ligands or receptor partners) or influence the stability of the protein. Important factors are charge, hydrophilicity or hydrophobicity, size and functional groups (Villaseñor-Cardoso and Ortega 2011). For example, a substitution of Isoleucine (I) for Leucine (L) may not bring about significant changes in receptor function, as both amino acids have similar biochemical properties (nonpolar and neutral). By contrast, replacement of Arginine (R) by Alanine (A) at position 441 of the receptor-binding domain of the SARS coronavirus spike protein interrupts the binding activity (He et al. 2006). Alanine and Arginine are amino acids with completely different qualities (A is small, nonpolar and neutral, while R is polar and positive), and we can therefore expect a more serious impact on protein function (Fig. 6). Substitutions between amino acids with similar biochemical properties, which are less likely to affect protein function, occur much more frequently than more radical changes.
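The distinction between conservative and radical replacements can be sketched as a simple lookup. The four-class grouping used here (nonpolar, polar uncharged, positive, negative) is a coarse textbook simplification chosen as an assumption for this illustration; it is not the classification of Fig. 6 and it ignores size and functional groups. All names are hypothetical.

```python
# Assumed coarse grouping of the 20 amino acids by side-chain polarity and charge.
GROUPS = {
    "nonpolar": "GAVLIMPFW",
    "polar_uncharged": "STCYNQ",
    "positive": "KRH",
    "negative": "DE",
}
PROPERTY_CLASS = {aa: label for label, residues in GROUPS.items() for aa in residues}


def classify_substitution(original, replacement):
    """Return 'conservative' if both residues share a class, else 'radical'."""
    for residue in (original, replacement):
        if residue not in PROPERTY_CLASS:
            raise ValueError(f"unknown amino acid code: {residue!r}")
    if PROPERTY_CLASS[original] == PROPERTY_CLASS[replacement]:
        return "conservative"
    return "radical"


# The two examples discussed in the text.
print("I -> L:", classify_substitution("I", "L"))
print("R -> A:", classify_substitution("R", "A"))
```

A real analysis would more likely rely on an empirical substitution matrix or a dedicated prediction tool; the lookup only makes the reasoning behind the two examples explicit.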
Origin and function of the TLR family
TLRs are an ancient family that probably arose before the Cambrian period (Roach et al. 2005). We can therefore find TLR homologues in many organisms, from plants through invertebrates (insects, cnidarians, nematodes, and crustaceans) to vertebrates, including mammals (reviewed in Leulier and Lemaitre 2008; Vinkler and Albrecht 2009). During the last 700 million years the TLR family has experienced multiple gene duplications and sequence divergence, and today we classify the vertebrate TLRs into six major groups or families: TLR1, TLR3, TLR4, TLR5, TLR7 and TLR11 (Roach et al. 2005; Temperley et al. 2008; Huang et al. 2011) (Fig. 10). Half of these families comprise only one gene; the TLR1, TLR7 and TLR11 families, however, contain multiple paralogues. The TLR1 family contains TLR1, TLR2, TLR6 and TLR10; its avian members are known as TLR1La, TLR1Lb, TLR2a and TLR2b. The TLR7 family includes TLR7, TLR8 and TLR9, and the TLR11 family covers TLR11, TLR12, TLR13, TLR21 and TLR23. Each species also has its own spectrum of TLRs. For example, 17 distinct TLRs have been described across bony fish, although only 11 in the fugu; 10 different TLRs were described in chicken, and as many as 222 in the sea urchin (Strongylocentrotus purpuratus) (Huang et al. 2008; Hughes and Piontkivska 2008; Rebl et al. 2010). The number of TLRs also differs across mammalian species. To date, 10 functional TLRs have been described in humans and 12 in rats and mice (Dembic 2005). TLR10, which exists in humans, is not functional in the mouse because of a retroviral insertion, whereas mice additionally possess TLR11, TLR12 and TLR13, which were lost in humans (Fig. 11). TLR11 has been associated with the detection of components of uropathogenic bacteria (Zhang et al. 2004). However, the functions and ligands of TLR12 and TLR13 are still not well identified (Kawai and Akira 2010). A detailed overview of the number of TLRs in distinct groups can be found in Leulier and Lemaitre (2008).
Analysis of selection at the intraspecific or population level
At the intraspecific or population level we can use the tools of population genetics, such as estimates of nucleotide variability between populations, analyses of population expansion, and measures of population differentiation (the FST statistic). Several statistics make it possible to assess departures from a neutral model of evolution in the distribution of allele frequencies. The best known is Tajima’s D, which compares the number of segregating sites with the mean number of pairwise differences between sequences (Tajima 1989). A negative Tajima’s D signifies an excess of low-frequency polymorphisms, indicating population expansion or purifying selection, while a positive Tajima’s D signifies low levels of both low- and high-frequency polymorphisms, indicating a decrease in population size (e.g. a population after a bottleneck) and/or balancing selection (Städler et al. 2009). Other tests that evaluate departures from neutrality on the basis of allele frequencies are Fu and Li’s D and F tests and Fay and Wu’s H test. As with Tajima’s D, significantly negative values of Fu and Li’s statistics indicate an excess of low-frequency variants, which can result from population expansion or from weak negative or positive selection. Significantly positive values, on the other hand, represent an excess of intermediate-frequency alleles, which can result from population bottlenecks and/or balancing selection. Fay and Wu’s H statistic tests for an excess of high-frequency derived mutations, which is the hallmark of positive selection. However, a thorough knowledge of the demographic history of the studied populations is necessary to interpret these statistics.
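As an illustration of how Tajima’s D contrasts the two diversity estimates, the following minimal sketch computes the statistic from a small alignment using the standard constants of Tajima (1989). It assumes equal-length sequences without gaps or ambiguous bases and at least four sequences (the variance term is zero for three); the function name and the toy alignments are invented for the example and are not data from this work.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences (strings)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("at least 4 sequences are required")
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) alignment columns.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants that depend only on the sample size n.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy alignments: only singleton variants versus two equally common haplotypes.
rare_variants = ["AAA", "TAA", "ATA", "AAT"]
intermediate = ["AAA", "AAA", "TTT", "TTT"]
print("singletons only:       ", round(tajimas_d(rare_variants), 3))
print("intermediate frequency:", round(tajimas_d(intermediate), 3))
```

The first toy alignment yields a negative value (excess of rare variants) and the second a positive one (excess of intermediate-frequency variants). Assessing significance and separating selection from demography requires additional steps, such as coalescent simulations, that the sketch does not cover.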
Table of contents:
1 GENERAL INTRODUCTION
1.1 Immune system and recognition of antigens
1.1.1 Recognition receptors of innate immunity
1.1.2 Antigen recognition in adaptive immunity
1.2 Evolutionary processes affecting evolution of immune receptors
1.2.1 Selection imposed by pathogens: adaptive evolution
1.2.2 Stochastic evolutionary processes
1.2.3 Polymorphism and the effect of non-synonymous substitutions on protein functions
1.2.4 Effect of non-synonymous substitutions on the function of pattern recognition receptors
1.3 Toll-like receptors: a general overview
1.3.1 A brief historical survey: from Toll to Toll-like receptors
1.3.2 Structure of TLRs
1.3.3 Signalling of TLRs
1.3.4 Origin and function of the TLR family
1.3.5 Variability and polymorphism of TLRs
1.3.6 Evolutionary forces acting on TLRs
1.4 Thesis aims
2 MATERIAL AND METHODS
2.1 Rodents and rodent-borne emerging infectious diseases
2.1.1 Origin and radiation of the tribe Rattini in the Southeast Asia
2.1.2 Tribe Murini and evolution of house mice (Mus musculus)
2.1.3 Rodent-borne diseases, emergence risk for humans and rodents as model species
2.2 Analysis of natural selection
2.2.1 Analysis of selection at the intraspecific level
2.2.2 Analysis of selection at the intraspecific or population level
3.1 Polymorphism of bacterial-sensing TLRs in wild-derived and classical laboratory strains of Mus musculus
3.1.2 Material and Methods
3.1.3 Laboratory techniques
3.1.4 Data analysis
3.2 Analysis of variability at intraspecific level in wild populations of Mus musculus
3.2.2 Materials and methods
3.3 Analysis of variability at interspecific level of wild rodents
3.3.2 Materials and methods
4 GENERAL DISCUSSION
4.1 Selection forces acting on TLRs in free living populations: intra- vs. inter-specific level
4.2 The role of recombination: instrument of stochastic processes or selection
4.3 Selection forces acting on TLRs: bacterial sensing vs. viral sensing
4.4 Selection forces acting on TLRs: ECD vs. ICD
4.5 TLRs in speciation research – future prospects
6.2 PCR PROTOCOLS
6.3 CURRICULUM VITAE
6.4 ACCEPTED ARTICLE