Transferable bread wheat EST-SSRs demonstrated that grass species can be a reservoir of new alleles for wheat improvement (Plant Breeding in press)

Advantages of SSRs compared to other markers

The numerous advantages of microsatellites are well documented (Morgante and Olivieri 1993; Rafalski and Tingey 1993; Powell et al. 1996). One of their main advantages over other markers is their high level of polymorphism, which results from variation in the number of repeats at a given locus (Poulsen et al. 1993; Schmidt et al. 1993; Thomas and Scott 1993; Senior and Heun 1993; Becker and Heun 1995; Rongwen et al. 1995). They nevertheless remain stable enough to avoid somatic variation (Barret 1993). Polymorphism can be evaluated through the polymorphism information content (PIC) value, a criterion first introduced by Nei et al. (1973) in human genetics. Microsatellites are also mainly co-dominant markers with Mendelian inheritance: they distinguish homozygous from heterozygous lines and can be used on a wide range of segregating populations (Morgante and Olivieri 1993). In addition, they are highly reproducible compared with other markers such as RAPDs. In polyploid plants such as rapeseed and wheat, microsatellites are also frequently genome-specific and give a single signal, in contrast to RFLPs (Poulsen et al. 1993; Röder et al. 1995). Finally, their analysis can be easily automated and run on high-throughput genotyping platforms (Nicot et al. 2004).

Abundance in genomes

The abundance and polymorphism of SSRs, especially of the dinucleotides (TG)n, (AC)n and (AT)n, were demonstrated in human and other eukaryotes (Weber and May 1989; Tautz 1989; Litt and Luty 1989). In general, SSRs account for a lower proportion of the genome in plants than in vertebrates, but a higher proportion than in invertebrates and fungi. The range for plants is between 0.37% (maize) and 0.85% (Arabidopsis) of the genome (Morgante et al. 2002), whereas estimates for the fish species Tetraodon nigroviridis and Fugu rubripes are 3.21% (Crollius et al. 2000) and 2.12% (Elgar et al. 1999) respectively, with 1.07% for human chromosome 22, 0.21% for Caenorhabditis elegans and 0.30% for Saccharomyces cerevisiae (Toth et al. 2000). Within higher plants, Morgante and Olivieri (1993) estimated the frequency of microsatellites at one every 50 kb, in contrast with human, where the estimated average density is one SSR every 6 kb (Beckman and Weber 1992). Only a few SSRs were detected within organellar genomes (Wang et al. 1994). The different types of SSRs occur at different densities throughout the genome, with slight variations among species. In primates, mononucleotides (mainly poly(A/T) tracts) are the most abundant class of SSRs (Toth et al. 2000; Wren et al. 2000). Dinucleotide (AT)n motifs were found to be rare in animal genomes, while (CA)n repeats were the most common SSRs (Moore et al. 1991). Most of the SSRs (48-67%) found in plant species are dinucleotide repeats, while mono- and tetranucleotide repeats are the least common (Wang et al. 1994; Schug et al. 1998). A mean of one SSR every 64.6 kb and 21.2 kb was detected for monocotyledonous and dicotyledonous species respectively. However, the frequency can vary from one every 3 kb in barley (Becker and Heun 1995) to one every 80 kb in rice (Panaud et al. 1995). At the centimorgan level, SSRs thus provide an important source of markers for genetic mapping in eukaryotes.
The use of microsatellites as genetic markers for a global approach to genetic mapping in eukaryotes was proposed by Beckman and Soller (1990), and they have since been used extensively to construct genetic maps in human (Weissenbach et al. 1992), pig (Rohrer et al. 1994) and other mammals and plants.

Dispersion within genomes

Microsatellites were found to be widely and randomly distributed in coding as well as non-coding regions of the genome. A high proportion of SSRs is associated with Alu and SINE elements in human and other primates (Arcot et al. 1995; Nadir et al. 1996) and thus forms a rather large portion of non-coding DNA. SSRs were also shown to be frequently associated with retrotransposons and other dispersed repetitive elements in barley (Ramsay et al. 1999) as well as in other plant species (Wang et al. 1994). Di- and tetranucleotide SSRs are more frequent than trinucleotides in non-coding regions. Dinucleotides are about 20 times more frequent in random genomic clones of Norway spruce (Picea abies; Scotti et al. 2000) than in expressed sequences, while Morgante et al. (2002) reported that all SSR types except tri- and hexanucleotides are significantly more frequent in the non-coding fraction of six plant species than in 25,762 predicted protein-coding sequences. In some cases, it was suggested that SSRs could be clustered in particular regions of the genome (Condit and Hubbel 1991; Panaud et al. 1995; Arens et al. 1995).

SSRs within cereals

During the last decade, SSR markers were developed and investigated in a large number of plants, including major cereal species such as barley (Thiel et al. 2003), maize (Chin 1996; Yu et al. 2001), oats (Li et al. 2000), rice (Gao et al. 2003), rye (Saal and Wricke 1999), sorghum (Bhattramakki et al. 2000) and wheat (Röder et al. 1998; Varshney et al. 2000; Guyomarc’h et al. 2002a, b; Gupta et al. 2002). In most of these species, the two most common SSRs whose genomic density was determined by screening genomic libraries were GA and GT. GA (38% to 59%) and GT (20% to 34%) repeats occurred in these species at densities ranging from one SSR every 212 kb to one every 704 kb (Varshney et al. 2002). The total number of SSRs at the genome level was also estimated in several crops. The frequencies per haploid genome were one SSR every 3.6 × 10⁴ bp for (GA)n and every 2.3 × 10⁴ bp for (GT)n in bread wheat (Röder et al. 1995), and were estimated at one SSR every 1.36 × 10³ bp for (GA)n and every 1.23 × 10³ bp for (GT)n in rice (Panaud et al. 1995).
In a recent survey of EST and genomic sequences, SSR densities were higher in coding regions than in non-coding regions of the genome (Morgante et al. 2002). In EST sequences of several cereal species, the estimated frequency ranged from one SSR every 3.9 kb in rice to one SSR every 7.5 kb in maize (Varshney et al. 2002). In wheat, the same authors reported one SSR every 6.2 kb, while Gupta et al. (2003) reported one SSR every 9.2 kb.
Overall, an average of one SSR every 6-7 kb seems to be a good estimate for SSR frequency in the whole genome of plants (Cardle et al. 2000).
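Frequencies such as "one SSR every 6.2 kb" are obtained by scanning sequences for tandem repeats and dividing the total sequence length by the number of repeats found. The following is a minimal Python sketch of that calculation, not the procedure of any of the cited surveys. The function names (`find_ssrs`, `kb_per_ssr`) and the minimum repeat numbers in `MIN_REPEATS` are assumptions chosen for illustration; published estimates depend strongly on the thresholds each study used.

```python
import re

# Assumed thresholds: minimum number of repeat units per motif length.
# These are illustrative only; each published survey uses its own criteria.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, n_repeats) for perfect di- to tetranucleotide SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # homopolymer run such as AAAAAA, not counted here
            if k == 4 and motif[:2] == motif[2:]:
                continue  # e.g. GAGA: already reported as the dinucleotide GA
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def kb_per_ssr(seq, hits):
    """Average spacing in kb: sequence length divided by number of SSRs."""
    return len(seq) / 1000 / len(hits) if hits else float("inf")


if __name__ == "__main__":
    toy = "TT" + "GA" * 8 + "CCTT" + "CAG" * 5 + "AT"
    found = find_ssrs(toy)
    print(found, kb_per_ssr(toy, found))
```

Only perfect repeats are detected in this sketch; compound or interrupted SSRs would need additional rules.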

Level of polymorphism of SSRs

SSR polymorphism can be defined as the difference in the number of DNA repeats at a given locus, and can be easily analyzed by electrophoresis after PCR amplification. The level of polymorphism is generally evaluated through the Polymorphism Information Content (PIC) value (Nei et al. 1973) or the diversity index (Akagi et al. 1998). It should be noted, however, that these values depend largely on the sample studied (its size and variability); they allow comparisons within a sample but not among different studies.
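As an illustration of why such values depend on the sample, the sketch below computes a diversity value from the alleles observed at one locus. It assumes the gene-diversity form 1 − Σ p_i², where p_i is the frequency of the i-th allele; the exact formula used in the thesis is not shown in this section, so this form, the function name `pic` and the allele sizes in the example are assumptions.

```python
from collections import Counter


def pic(alleles):
    """Diversity value 1 - sum(p_i^2) from the alleles observed at one locus.

    `alleles` holds one allele call per line, e.g. fragment sizes in bp.
    Assumes the gene-diversity form; other PIC formulas exist.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alleles supplied")
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


if __name__ == "__main__":
    # Hypothetical allele sizes (bp) scored on two different samples of lines.
    narrow_sample = [150] * 9 + [152]
    broad_sample = [150, 152, 154, 156] * 3
    print(pic(narrow_sample), pic(broad_sample))
```

The same locus gives a low value in a sample dominated by one allele and a high value in a sample where several alleles are equally frequent, which is why values from different studies are not directly comparable.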
The sequences flanking specific SSR loci in a genome are assumed to be highly conserved within a species, and in some cases also across species within a genus, and more rarely across related genera. These conserved sequences are therefore used to design primers for individual SSR loci. In human and various animals, SSRs have high mutation rates, ranging from 10⁻³ to 10⁻⁶, which is higher than in other regions of the genome (Bachtrog et al. 1999). In plants, the mutation rate at microsatellite loci is even higher than in animals, with a range of 10⁻² to 10⁻³ in chickpea (Cicer arietinum; Udupa et al. 2001) and 10⁻³ to 10⁻⁴ in durum wheat (Triticum turgidum; Thuillet et al. 2002). SSR mutation rates are influenced by several factors, including motif type, perfection of the repeat and number of repeats, of which repeat number is the best characterized (Schlötterer et al. 1998; Thuillet et al. 2004). In a variety of organisms, SSR mutation rates were shown to be positively correlated with repeat number, the longest SSRs being the most polymorphic (Sia et al. 1997; Schlötterer et al. 1998; Thuillet et al. 2004). Weber (1990) reported that SSRs with 12 or fewer repeats were monomorphic in human (PIC = 0) and that the PIC value increased with the average number of repeats, especially in the range of about 11 to 17 repeats.

Mutational mechanisms of SSRs

Two hypotheses are currently accepted to explain the variability of SSRs. The first involves unequal crossing over between sister chromatids, either during mitosis or after homologous pairing of chromosomes at meiosis (Wolf et al. 1989; Harding et al. 1992). However, SSR mutations are thought most likely to accumulate through DNA slippage (Wells et al. 1965; Streisinger et al. 1966; Morgan et al. 1974; Efstratiadis et al. 1980; Drake et al. 1983; Levinson and Gutman 1987) and through mispairing during replication and recombination or during extension of single-strand ends (Tautz and Rentz 1984; Dover and Tautz 1986; Jeffreys et al. 1985; Wells et al. 1965).

Replication slippage

A model for SSR mutation based on replication slippage was formulated by Levinson and Gutman (1987). Replication slippage, or slipped-strand mispairing, refers to the out-of-register realignment of the two DNA strands after they dissociate while the DNA polymerase traverses the repetitive region. This slippage implies the formation of a single-strand loop during DNA synthesis and the addition of a motif (or its deletion, depending on which strand carries the loop). If the most 3’ repeat unit of the nascent strand re-hybridizes with a complementary repeat unit downstream along the template strand, a loop is formed in the nascent strand and, upon elongation, the new sequence becomes longer than the template sequence (Fig. 1-8a). Conversely, if the incorrect alignment occurs upstream along the template strand, the new strand becomes shorter than the template sequence (Fig. 1-8b). Most of these primary mutations are corrected by the mismatch repair system, and only the small fraction that is not repaired ends up as microsatellite mutation events (Schlötterer and Tautz 1992).
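The model above can be illustrated with a small stepwise simulation: at each replication a slippage event may occur, most events are removed by mismatch repair, and the remainder add or remove one repeat unit. This is a hedged sketch only. The function names and the default values of `slip_rate` and `repair_rate` are arbitrary assumptions for illustration, not estimates from the cited studies, and the two slippage directions are assumed to be equally likely.

```python
import random


def replicate(n_repeats, slip_rate, repair_rate, rng):
    """Return the repeat number of the tract after one round of replication."""
    if rng.random() >= slip_rate:
        return n_repeats            # tract copied faithfully, no slippage
    if rng.random() < repair_rate:
        return n_repeats            # loop removed by the mismatch repair system
    # Unrepaired loop: nascent-strand loop adds a unit (+1),
    # template-strand loop removes one (-1). Equal odds are an assumption.
    step = rng.choice((+1, -1))
    return max(1, n_repeats + step)


def simulate(n_repeats, generations, slip_rate=0.01, repair_rate=0.9, seed=0):
    """Track the repeat number of one lineage over successive replications."""
    rng = random.Random(seed)
    history = [n_repeats]
    for _ in range(generations):
        history.append(replicate(history[-1], slip_rate, repair_rate, rng))
    return history


if __name__ == "__main__":
    lineage = simulate(n_repeats=10, generations=10000)
    print(min(lineage), max(lineage), lineage[-1])
```

Raising `repair_rate` towards 1 suppresses nearly all length changes, which mirrors the statement that only the unrepaired fraction of slippage events becomes a mutation.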

Unequal recombination

Recombination could potentially change SSR length by unequal crossing over or by gene conversion, which introduces mutations in the satellite sequence. In the version of the recombination model shown in Fig. 1-9A, simple repeats located on different DNA molecules pair in a misaligned configuration and a crossover occurs, resulting in arrays with reciprocal additions and deletions. Other recombination models in which the additions or deletions are not reciprocal are also possible (Fig. 1-9B). There is little evidence, however, that recombination contributes to microsatellite mutation. Genomic SSR distributions are associated with sites of recombination, most probably because repetitive sequences are involved in recombination rather than being a consequence of it (Treco and Arnheim 1986; Majewski and Ott 2000).

Fig. 1-9 (caption): The open rectangles represent individual repeat units. (A) Homologous recombination between misaligned arrays of 10 repeats results in one DNA molecule with a repeated tract one repeat unit longer than the original tract and one molecule that is one repeat unit shorter than the original tract. (B) A double-strand break occurs within the repeated tract. This break is followed by exonuclease degradation that exposes complementary single-stranded regions. Hybridization of these regions followed by DNA synthesis and religation results in a reduction in repeat units. Alternatively, the broken ends could invade an array on another DNA molecule, leading to an expansion of the tract.
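The reciprocal outcome of the misaligned crossover in Fig. 1-9A can be checked with simple bookkeeping of repeat units. The sketch below is illustrative only; the function name `unequal_crossover` and its parameters (`offset` for the misalignment in repeat units, `x` for the position of the crossover) are assumptions introduced here.

```python
def unequal_crossover(len_a, len_b, offset, x):
    """Repeat numbers of the two recombinant arrays after a misaligned crossover.

    Unit i of array A is paired with unit i + offset of array B.
    The crossover falls after the first x units of A, that is, after the
    first x + offset units of B.
    """
    if not (0 <= x <= len_a and 0 <= x + offset <= len_b):
        raise ValueError("crossover point lies outside the paired region")
    left_a, right_a = x, len_a - x
    left_b, right_b = x + offset, len_b - (x + offset)
    # Recombinant 1: left part of A joined to right part of B.
    # Recombinant 2: left part of B joined to right part of A.
    return left_a + right_b, left_b + right_a


if __name__ == "__main__":
    # Two arrays of 10 repeats misaligned by one unit, as in Fig. 1-9A.
    print(unequal_crossover(10, 10, offset=1, x=4))
```

With two arrays of 10 repeats misaligned by one unit, the products carry 9 and 11 repeats: one array gains exactly what the other loses, and the total number of repeat units is conserved.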

Interaction between replication slippage and recombination

Another possible SSR mutation model was proposed by Li et al. (2002). When studying SSR diversity in wild emmer wheat, they found a strong interaction between mean repeat length and the distance of the SSR locus from the centromere. This interaction affects the number of alleles and the variance in repeat size at SSR loci. SSR mutation might thus result from replication slippage occurring during recombination-dependent DNA repair. Strand exchange between two homologous chromosomes creates a four-stranded configuration, called a Holliday structure, which is associated with mismatched (heteroduplex) DNA regions. These regions undergo replication-dependent correction. Hence, a slippage mechanism may also operate in recombination tracts involving SSR arrays (Gendral et al. 2000; Li et al. 2002).

Polymorphism of g-SSRs

Numerous studies demonstrated that g-SSRs show a high level of polymorphism compared to other types of molecular markers. Kölliker et al. (2001) reported that SSR markers developed from white clover were highly polymorphic, with 88% detecting polymorphism across seven genotypes and an average of 4.8 alleles. In maize, PIC values ranged between 0.06 and 0.91 with a mean of 0.62 (Smith et al. 1997), which was similar to the values reported for RFLPs (Senior and Heun 1993). In barley, it was shown that even though SSRs have a lower diversity index than AFLPs (0.521 and 0.937 respectively), they reveal a high number of alleles at each locus, making them very effective for studying genetic relationships (Russell et al. 1997). In cultivated spelt wheat, Bertin et al. (2001) found a mean PIC value of 0.64, while in common wheat it was assessed to be 0.71. Even though lower PIC values (0.57) are also reported for common wheat (Stachel et al. 2000), this is twice as high as what is revealed by RFLPs (Chao et al. 1989; Cadalen et al.

Table of contents :

Chapter I: Bibliography
1. Hexaploid wheat
1.1 Economical importance of wheat
1.2 Classification of the hexaploid wheat
1.3 Origins of bread wheat
1.4 Genetic specificity of bread wheat
2. Simple sequence repeats (SSRs)
2.1 Definition and classification
2.2 Advantages of SSRs compared to other markers
2.3 SSRs within genomes
2.3.1 Abundance in genomes
2.3.2 Dispersion within genomes
2.3.3 Composition ….
2.3.4 Number of repeats
2.4 SSRs within cereals
2.5 Level of polymorphism of SSRs
2.5.1 Definition of SSR polymorphism
2.5.2 Mutational mechanisms of SSRs
2.5.2.1 Replication slippage
2.5.2.2 Unequal Recombination
2.5.2.3 Interaction between replication slippage and recombination
2.5.3 Polymorphism of g-SSRs
2.5.4 Polymorphism of EST-SSRs
2.6 Role and function of the SSRs
2.6.1 DNA conformation
2.6.2 Promoter regulation
2.6.3 Meiosis and mitosis
2.6.4 Coding function
2.7 Application of SSRs to plant genetic studies
2.7.1 Genetic mapping
2.7.2 Genetic diversity
2.7.3 Comparative mapping
2.7.4 Phylogenetic studies
2.7.5 MAS breeding
3. Organizations of genetic resources
3.1 History
3.2 Definition and classification
3.3 Collections
3.3.1 Landraces
3.3.2 Wild species
3.4 Evaluation for genetic resources
3.4.1 Agronomic and morphologic data
3.4.2 Isozymes data
3.4.3 NIRS data
3.4.4 Molecular data
3.4.4.1 RFLP
3.4.4.2 RAPD
3.4.4.3 AFLP
3.4.4.4 Microsatellites
3.4.4.5 ISSR
3.4.4.6 STS
3.4.4.7 SNPs
3.4.4.8 ESTs
3.4.4.9 Comparison between the different types of markers for diversity
3.5 Aims of the thesis
3.5.1 Scientific question
3.5.2 Methodology
3.5.3 Deliverables
Chapter II: Materials and Methods
1. Plant material and extraction
1.1 Hexaploid wheats for polymorphism evaluation
1.2 Grass species for transferability studies
1.3 Aneuploid lines
1.4 Segregating populations
1.4.1 ITMI population
1.4.2 Courtot X Chinese Spring population
1.5 DNA extractions
2. Wheat and rice EST-SSRs
2.1 Analysis of the wheat ESTs
2.2 Primer design
2.3 Rice EST-SSRs
2.4 EST-SSR detection
2.4.1 Principle of the M13-tailed primer method
2.4.2 PCR reaction and detection
3. Data analysis
3.1 Genetic and cytogenetic mapping of EST-SSRs
3.1.1 Chromosomal assignment
3.1.2 Genetic mapping
3.2 Transferability of the EST-SSRs to wheat related species
3.3 In silico analysis of EST-SSRs
3.4 Distribution of EST-SSRs on the rice genome
3.5 Determination of Polymorphism Information Content (PIC) values
3.6 Determination of the Jaccard similarity coefficient
3.7 Unweighted Pair Group Method with Arithmetic Mean (UPGMA)
Chapter III: Results and discussion
1 Analysis of the wheat ESTs
1.1 Characteristics of EST-SSRs
1.2 Amplification and polymorphism of EST-SSRs
1.3 Genetic mapping of the EST-SSRs
1.4 Discussion
2 Transferability of the wheat EST-SSRs to grass species
2.1 Publication 1: High transferability of bread wheat EST-derived SSRs to other cereals (2005, Theor Appl Genet 111:677-687)
2.2 Publication 2: Transferable bread wheat EST-SSRs demonstrated that grass species can be a reservoir of new alleles for wheat improvement (Plant Breeding in press)
2.3 Comments and perspectives
3 Transferability of EST-SSRs between wheat and rice
3.1 Transferability from bread wheat to rice…
3.2 Transferability from rice to bread wheat
3.3 Discussion, comments and perspectives
4 Phylogenetic studies of Triticeae using wheat EST-SSRs
4.1 Publication 3: Transferable bread wheat EST-SSRs can be useful for phylogenetic studies among the Triticeae species (Theor Appl Genet, on line first, DOI: 10.1007/s00122-006-0304-4)
4.2 Comparison of the classification of T. aestivum lines obtained using g-SSRs and EST-SSRs
4.3 Genetic diversity of Triticale species
4.4 Phylogenetic relationship for species with D genome
4.5 Phylogenetic relationship between all the species
4.6 Discussion
4.7 Comments and perspectives
General Conclusion
1. Analysis of the wheat ESTs
1.1 Main results
1.2 Perspectives
2. Transferability of the wheat EST-SSRs
2.1 Main results
2.2 Perspectives
3. Phylogenetic studies
3.1 Main results
3.2 Perspectives
4. Transferability of rice EST-SSRs
4.1 Main results
4.2 Perspectives
5. Conclusion
References