FROM PHENOTYPIC VARIATION TO MOLECULAR VARIANTS AND VICE VERSA
In the last chapter, I described the extent of genetic, epigenetic and phenotypic diversity observed in A. thaliana. Part of this phenotypic variation can be explained by genetic and epigenetic modifications, but not all mutations are responsible for phenotypic diversity. A long-standing question in quantitative genetics has been how to identify, among all the variants observed between accessions, the ones responsible for phenotypic variation (figure 12). In this chapter, I will describe the methods available to identify and validate the genes (QTG) and polymorphisms (QTP) underlying QTLs. In addition, through several examples, I will present what quantitative genetics teaches us about the complexity of the genetic architecture of natural phenotypic variation.
Finally, I will discuss the importance of combining the information obtained from genomic approaches for various traits and under different environments into more global networks, in order to better understand the consequences of genetic and epigenetic variation on the plant phenotype as a whole.
Mapping and cloning the genes responsible for phenotypic variation
Overall, the mapping and cloning of quantitative trait loci rely on the association between the phenotypic and genotypic variation observed within a group of individuals, usually from the same species, that can be more or less related to each other. The goals of such associations are to detect QTLs (and, potentially, interactions among QTLs), to locate them on chromosomes with the best possible precision (ideally down to the causal polymorphisms) and to estimate their effects. If these associations are performed in different environments, the environmental specificity of QTLs can also be assessed. The power to detect significant associations depends on the genetic architecture of the trait of interest, i.e. the number of loci responsible for trait variation, the relationships between the different loci and their effects, as well as on trait heritability; the mapping difficulties associated with trait genetic architecture will be discussed in the second section. In addition, because genetic variants are reshuffled within a population by recombination events during meiosis, the type of population (experimental cross design or accessions) and the number of individuals analysed can strongly influence the power of QTL mapping. Finally, the number, type and distribution of the genetic markers that allow the identification of recombination events and the definition of QTL intervals also affect the output of the mapping analyses. Different strategies can be used to identify QTLs, and the choice of strategy depends on the trait of interest, the genetic resources available and the biological question addressed. Indeed, some researchers map QTLs in order to identify new genes involved in a specific pathway, whereas others are more interested in the evolutionary and ecological importance of the underlying QTLs, and others are essentially looking for markers to use in selection.
In the first case, the cloning of at least one gene would be sufficient, whereas in the second and third cases the number and effects of the QTLs in the whole population are the key results. In this section, I will present those strategies and start to highlight their strengths and drawbacks.
Genome-Wide Association mapping
As opposed to ‘linkage mapping’, which is based on the use of experimental populations, genome-wide association (GWA) mapping exploits the shared ancestry of accessions to directly identify the variants responsible for phenotypic variation. Because many more generations separate individual accessions than individual RILs or F2s, many more recombination events have accumulated in natural populations. In A. thaliana, linkage disequilibrium (LD) decays rapidly, within 10 kb, so the number of SNPs needed to capture indirectly (through linkage) any polymorphism in the genome has been estimated at between 140,000 and 240,000 (figure 14). As a consequence, a 250k chip was designed based on the results of the array sequencing of 19 accessions and used to genotype 1,307 worldwide accessions [18, 140]. The 1001 Genomes project will probably improve GWA analyses by adding new accessions and new variants to GWA projects and by increasing the chance of detecting the responsible variants, either directly or via predictions of mutation effects.
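The notion of LD used above can be made concrete with a minimal sketch that computes the usual r² statistic between two biallelic SNPs. It assumes homozygous inbred accessions, so that each accession carries a single allele coded 0 or 1; the function name `r_squared` and the genotype vectors are hypothetical and serve only as an illustration.

```python
import numpy as np

def r_squared(snp_a, snp_b):
    """Squared correlation (r^2) between two biallelic SNPs coded 0/1."""
    a = np.asarray(snp_a, dtype=float)
    b = np.asarray(snp_b, dtype=float)
    p_a, p_b = a.mean(), b.mean()
    # D is the covariance between the two allele indicators: P(AB) - P(A)P(B).
    d = (a * b).mean() - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom

# Hypothetical genotypes for eight inbred accessions (one allele per accession).
snp1 = [0, 0, 0, 0, 1, 1, 1, 1]
snp2 = [0, 0, 0, 0, 1, 1, 1, 1]  # same pattern as snp1: complete LD
snp3 = [0, 1, 0, 1, 0, 1, 0, 1]  # pattern independent of snp1: no LD

print(r_squared(snp1, snp2))
print(r_squared(snp1, snp3))
```

A marker in complete LD with a causal variant tags it perfectly, whereas a marker with r² close to zero carries no information about it. This is why the distance over which r² decays sets the marker density required for GWA mapping.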
Several studies have shown that GWA mapping can identify many true genotype-phenotype associations in A. thaliana. Among others, a meta-analysis performed on 76 to 194 accessions for 107 phenotypic traits related to flowering time, defense, ionomics and development identified several associations previously mapped or cloned by traditional linkage mapping and cloning [74, 141, 142], such as MOT1 [74, 143, 144], HKT1 [74, 145, 146] and ACD6 [74, 147], as well as new regions containing no a priori candidate genes. One of the great advantages of GWA over QTL mapping is its resolution: the associated interval can often be reduced to a few kb, i.e. a small number of genes (figure 14). Nevertheless, some genomic regions show clusters of significant SNPs that are likely due to extended LD [74, 148] and that can make the identification of candidate genes difficult. Clearly spurious associations located several tens of kb away from the real causal polymorphism are also starting to be reported. Concerning GWA power, another advantage, which was not expected from human GWA analyses, is the relatively low number of accessions needed to start detecting significant associations. Atwell and colleagues showed that 96 to 192 accessions could be sufficient, but this number depends on the architecture of the trait of interest and probably reflects the occurrence of common alleles segregating in the whole population [74, 127, 148].
Among the drawbacks of genome-wide association mapping, the one that has received the most attention is the confounding effect arising from the genetic background of populations. Population structure, which generally describes the remote common ancestry of large groups of individuals, can cause LD between the causal variants of a given trait and unlinked loci throughout the genome, leading to spurious genotype-phenotype associations (figure 14). These false positives are particularly frequent when phenotypic variation overlaps with the pattern of population structure and/or with environmental clines, because A. thaliana population structure is partially associated with geographic distance (at least in Eurasia). Finally, correlations between genetic variants can also be observed among ‘unrelated individuals’, although to a lesser extent. Several methods have been used to take population relatedness into account in genotype-phenotype association studies. For example, methods that use the cluster membership (Q) obtained with the STRUCTURE software, or the coordinates on the first axes of a principal-component analysis, as covariates in the association model perform well when population structure is simple. An additional improvement, based on Fisher’s 1918 observation that the more alleles individuals share, the more similar they will be, models the phenotype with a linear mixed model that accounts for the phenotypic variation linked to pairwise relatedness between accessions. Overall, these methods reduce the number of false positives but at the same time reduce sensitivity. False negatives introduced by the correction for population relatedness are especially likely when the environmental conditions that constrain phenotypic variation, and the associated genetic variants, overlap with population structure. In addition, these models can artificially inflate the association score of rare alleles, so that SNPs with a low minor allele frequency (below 5% or 10%) are often ignored.
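The pairwise relatedness that enters such mixed models can be illustrated with a minimal sketch. It assumes a simple identity-by-state measure (the fraction of SNPs at which two homozygous accessions carry the same allele), which is only one of several possible relatedness estimators and not necessarily the one used in the studies cited above. The function name `ibs_kinship` and the genotypes are hypothetical.

```python
import numpy as np

def ibs_kinship(genotypes):
    """Pairwise fraction of SNPs at which two accessions share the same allele.

    genotypes: array of shape (n_accessions, n_snps), alleles coded 0/1.
    """
    g = np.asarray(genotypes, dtype=float)
    n_snps = g.shape[1]
    # A SNP matches when both accessions carry allele 1 (g @ g.T)
    # or both carry allele 0 ((1 - g) @ (1 - g).T).
    shared = g @ g.T + (1 - g) @ (1 - g).T
    return shared / n_snps

# Hypothetical genotypes: three accessions typed at four SNPs.
geno = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
])
K = ibs_kinship(geno)
print(K)
```

In a linear mixed model, a matrix of this kind defines the covariance of the random genetic background effect, so that phenotypic similarity explained by overall relatedness is not attributed to individual SNPs.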
The second problem with rare alleles (< 5%) is that they have little influence on the population as a whole, even if they have a strong effect on the phenotype, and GWA therefore has little power to detect them. This issue might be particularly important in A. thaliana, in which an excess of rare alleles has been observed.
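The weak population-level influence of rare alleles can be sketched numerically. Assuming homozygous accessions and a single biallelic locus, the phenotypic variance contributed by the locus is p(1 - p)d², where p is the allele frequency and d the mean phenotypic difference between the two allelic classes. The function name `locus_variance` and the numerical values are hypothetical.

```python
def locus_variance(freq, effect):
    """Variance contributed by a biallelic locus in a panel of homozygous lines.

    freq:   frequency of one of the two alleles (between 0 and 1)
    effect: mean phenotypic difference between the two allelic classes
    """
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must lie between 0 and 1")
    # Variance of a 0/1 genotype indicator is p(1 - p); scaling by the
    # effect multiplies the variance by effect squared.
    return freq * (1.0 - freq) * effect ** 2

# Hypothetical comparison: a common allele with a unit effect versus
# a rare allele (2%) with an effect three times larger.
common = locus_variance(0.5, 1.0)
rare = locus_variance(0.02, 3.0)
print(common, rare)
```

Under these assumptions, the rare allele contributes less variance than the common one despite its much larger effect, which is what limits the power of GWA to detect it.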
Validation of causal genes and polymorphisms
Once a few candidate genes have been identified by mapping and cloning strategies, their specific effects on the phenotype have to be tested and validated. First, the phenotypic consequences of null alleles of the genes of interest can be investigated in the Col-0 background using the numerous T-DNA lines available in stock centers (see 1). Gene silencing by RNA interference (RNAi) or artificial microRNAs (amiRNAs) is another convenient tool because it allows the activity of genes to be tested in other genetic backgrounds. However, depending on the promoter of the construct used to introduce the RNAi or amiRNA into a given genotype, or on the position of the transgene insertion in the genome, silencing efficiency may vary among transformants, and the expression levels of the genes of interest have to be measured to confirm silencing.
If altering the function of a gene modifies the phenotype of interest, quantitative complementation and quantitative knockdown approaches can be used to check whether the QTL is allelic to this candidate gene. The first method compares the phenotypes of the two QTL alleles against the T-DNA allele in a heterozygous context (F1). If only one of the alleles complements the mutant phenotype (significant interaction between the alleles at the gene [WT/T-DNA] and at the QTL; figure 15), then the candidate gene is potentially responsible for the QTL, or at least interacts with it. Quantitative knockdown tests whether the inactivation of one allele by gene silencing affects the phenotype differently from the inactivation of the second allele (significant interaction between the genetic background and the presence of the transgene) (figure 15). If so, the candidate gene is probably responsible for the QTL.
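The interaction tested in a quantitative complementation design can be sketched as a simple contrast between the four F1 classes. The sketch only computes the size of the interaction; its significance would in practice be assessed with a two-way analysis of variance or an equivalent test. The function name `interaction_contrast` and the phenotypic values are hypothetical.

```python
import numpy as np

def interaction_contrast(a_wt, a_mut, b_wt, b_mut):
    """Interaction between the QTL allele (A or B) and the tester allele.

    Each argument holds the phenotypes of F1 plants carrying one QTL allele
    (A or B) combined with either the wild-type (wt) or the T-DNA (mut) allele.
    """
    # Effect of replacing the wild-type tester allele by the mutant one,
    # measured separately for each QTL allele.
    diff_a = np.mean(a_wt) - np.mean(a_mut)
    diff_b = np.mean(b_wt) - np.mean(b_mut)
    # A non-zero difference between these effects is the interaction term.
    return diff_a - diff_b

# Hypothetical phenotypes: allele A complements the mutant, allele B does not.
a_wt, a_mut = [10, 11, 9], [10, 9, 11]
b_wt, b_mut = [10, 10, 10], [6, 7, 5]

print(interaction_contrast(a_wt, a_mut, b_wt, b_mut))
```

A contrast close to zero would indicate that both QTL alleles behave identically against the mutant allele, whereas a large contrast, as in this hypothetical data set, is the pattern expected when only one QTL allele complements the mutation.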
Depending on the architecture of the trait, quantitative complementation and knockdown can be performed directly on the accessions, or on NILs/HIFs to avoid genetic background effects. As an alternative, transgenic complementation can help to identify the gene affecting a trait by introducing a copy of different alleles into different backgrounds and testing their effects on the phenotype (figure 15). Because the modifications associated with phenotypic differences can affect either protein functionality (non-synonymous changes in functionally important amino acids, frame-shifts, premature stop codons and splicing alterations) or transcript accumulation, through DNA sequence variation [146, 163] or epigenetic modifications, it is best to introduce both the promoter and the genomic coding region of the genes of interest (figure 15). By introducing a WT allele carrying point mutations, this technique can allow the identification and validation of the QTP, at least when it is a single SNP. However, even when multiple independent transformants are analysed, the variation observed between transgenic lines (resulting from the position of the insertions) may be larger than the QTL effect. Overall, the power of these three approaches strongly depends on the genetic architecture of the trait of interest.
The complex genetic architecture of quantitative traits
The genetic and recombination patterns characterising mapping populations and accession sets partially explain the statistical power of linkage and GWA approaches to detect genotype-phenotype associations. The other important factor is the genetic architecture of phenotypic variation, i.e. the number, identity, frequency and effect of all the polymorphisms affecting genes involved in the developmental, physiological and/or biochemical pathways associated with a trait, as well as all the two-way and higher-order epistatic interactions. The more complex the genetic architecture underlying the variation observed in a trait, the more difficult it is to study and to identify all its components. In this section, I will show that linkage and GWA mapping differ in their ability to identify some of those components (figure 16). However, because these components are inherent to quantitative traits, it is important to develop statistical methods that detect them better.
Table of contents :
Part I General Introduction
1. Arabidopsis thaliana as a model to study natural variation
1.1 A. thaliana population biology
1.1.1 A. thaliana taxonomy
1.1.2 A. thaliana biogeography
1.1.3 A. thaliana accessions
1.1.4 A. thaliana life cycle
1.2 Genetic & epigenetic diversity
1.2.1 A. thaliana genetic variation
1.2.1.1 Structure and demography
1.2.1.2 The pattern of genetic variation
1.2.1.3 Large scale genetic variation
1.2.2 A. thaliana epigenetic variation
1.3 Phenotypic diversity
1.3.1 Quantitative phenotypic variation
1.3.2 Phenotyping quantitative traits
1.3.3 Integrated phenotypic variation and plasticity
2. From phenotypic variation to molecular variants and vice versa
2.1 Mapping and cloning the genes responsible for phenotypic variation
2.1.1 Linkage mapping in segregating population
2.1.1.1 Mapping populations
2.1.1.2 Statistical methods
2.1.1.3 Conclusion on the use of linkage mapping
2.1.2 Cloning strategies
2.1.3 Genome-Wide Association mapping
2.1.4 Nested association mapping (NAM)
2.1.5 Validation of causal genes and polymorphisms
2.2 The complex genetic architecture of quantitative traits
2.2.1 The number and effects of QTLs
2.2.2 Rare variants & allelic heterogeneity
2.2.3 Pleiotropy or linkage
2.2.4 Epistatic interactions
2.3 The unexpected complexity of phenotypic variation
2.3.1 Phenotypic buffering
2.3.2 The genetic x environment interactions
2.4 Conclusion about QTLs detection using natural variation
3. Evolutionary significance of A. thaliana natural variation
3.1 Ecologically significant phenotypic variation in A. thaliana
3.2 Evolutionary significance of phenotypic variation
3.2.1 Theoretical elements
3.2.2 Methods to detect evolutionary significant variants
3.2.2.1 Tests based on within-species variation
3.2.2.2 Tests based on between-species variation
3.3 Conclusion: Major evolutionary traits in A. thaliana?
4. Scope of the thesis
Part II Allelic heterogeneity and trade-off shape natural variation for response to soil micronutrient
5. Project background and personal contribution
6. Publication in PLoS Genetics
7. General discussion and Perspectives
7.1 Why is MOT1[Sha] hypofunctional?
7.2 Is MOT1[Sha] adaptive?
7.3 What are the differences between MOT1[Sha] and MOT1[Ler] ?
7.4 Further investigations
Part III A tandem of receptor-like kinases is responsible for natural variation in shoot growth response to mannitol treatment in A. thaliana
8. Project background and personal contribution
10. General discussion and Perspectives
10.1 EGM1 and EGM2 and biotic stress
10.1.1 Innate immunity in plants
10.1.2 Mannitol signalling pathway in plant immunity
10.1.3 Which pathogens and in which plants?
10.2 What are the roles of EGM receptor-like-kinases within mannitol signalling pathway?
10.2.1 The role of plant receptor like kinases in plant innate immunity
10.2.2 What is the function of EGMs RLKs?
10.3 EGM1 and EGM2 natural variation and evolution
Part IV Natural epigenetic variation at QQS loci in A. thaliana
11. Project background and personal contribution
12. Publication in PLoS Genetics
13. General discussion and Perspectives
13.1 The phenotypic consequences of QQS epivariants
13.2 The origin and evolution of QQS epivariants
A. Additional publication