Genomic Context of DNA Methylation
Eukaryotic genomes consist not only of genic sequences but also of transposable elements (TEs), which are capable of moving to new locations and have the potential to increase their copy number from one generation to the next (Orgel and Crick 1980). Genome sequencing has revealed that TEs comprise a large fraction of most eukaryotic genomes, including the human genome (approx. 50%; Lander, Linton et al. 2001), and that proliferation of these elements is largely responsible for differences in genome size among eukaryotes (Kidwell 2002). These mobile DNA sequences, often considered ‘selfish’ or ‘parasitic’ elements, are highly mutagenic: active TEs can disrupt protein-coding genes and cause chromosomal breakage, illegitimate recombination or other genome rearrangements. The various classes of transposons employ different mechanisms to proliferate within a genome. For example, Class I elements, termed retrotransposons, use an RNA intermediate for their transposition; Class II elements, or DNA transposons, employ a ‘cut and paste’ strategy; and a third class of transposons, called Helitrons, is thought to use a ‘rolling circle’ mechanism. Most TEs of any class are not actively duplicating or transposing, mainly because of a mutation or deletion of part of the TE sequence (non-autonomous elements). However, full-length autonomous copies of TEs are also present, and these are typically silent (Slotkin and Martienssen 2007; Lisch 2009). Although TE activity is typically deleterious within the lifespan of an organism, over evolutionary time-scales TEs are considered a major factor in shaping the functional genome of an organism.
A Bioinformatics analysis of the A. thaliana Epigenome
Genome-wide and local mapping studies of DNA methylation have long indicated that both non-autonomous and full-length TEs are the main targets of cytosine DNA methylation in eukaryotes. Multiple lines of evidence indicate that eukaryotic DNA methylation serves primarily to keep these potentially harmful sequences in check (Suzuki and Bird 2008), and cytosine methylation is therefore generally seen as a classical silencing epigenetic mark associated with the repression of TEs and other repetitive sequences. However, gene-body methylation has been proposed as an ancient property of eukaryotic genomes, with a preference for exons in most organisms, including Arabidopsis thaliana, where it is associated with active genes (Zhang, Yazaki et al. 2006; Zilberman, Gehring et al. 2007; Cokus, Feng et al. 2008; Lister, O’Malley et al. 2008; Feng, Jacobsen et al. 2010; Zemach, McDaniel et al. 2010). Unlike in transposons, where cytosine methylation is distributed throughout the length of the TE sequence, methylation within genes occurs predominantly at CG sites, is confined to the transcribed region of the gene, and is depleted at both the 5’ and 3’ ends of the coding sequence (Figure 1.2.1). This suggests that methylation at the 5’ and 3’ ends of genes could be inhibitory to transcription, potentially interfering with initiation or termination. Indeed, methylation of promoter sequences and 5’ coding sequences is strongly negatively correlated with the expression of the downstream gene.
It has also been proposed that moderately transcribed genes are more likely to be methylated than those with low or high expression (Zhang, Yazaki et al. 2006; Zilberman, Gehring et al. 2007), suggesting a scenario in which the transcription process itself contributes to maintaining or enhancing DNA methylation levels over the transcriptional unit. CG methylation in the gene bodies of certain mammalian genes has also been observed and found to be positively correlated with levels of transcription (Jones 1999), and there is growing evidence that this may be a general phenomenon (Ball, Li et al. 2009). Gene-body methylation has been hypothesized to suppress spurious initiation of transcription within active genes in Arabidopsis, and a similar function may exist in mammals (Suzuki and Bird 2008; Feng, Jacobsen et al. 2010; Zemach, McDaniel et al. 2010).
DNA methylation can suppress transcriptional activity or lead to silent chromatin in two ways.
First, the methylation of DNA itself may physically impede the binding of transcription factors to the gene and make the gene inaccessible to the transcription machinery. Second, in a more likely scenario, methylated DNA may be bound by methyl-cytosine binding proteins. These proteins can then recruit additional proteins to the locus, such as histone deacetylases and other chromatin remodelling proteins that can modify histones or remodel the chromatin, thereby forming compact, inactive and silent chromatin.
Establishment and maintenance of DNA methylation
The DNA methyltransferase (DNA MTase) family of enzymes catalyzes the transfer of a methyl group to cytosine bases in DNA, and all known DNA MTases use S-adenosyl-L-methionine (SAM or AdoMet) as the methyl group donor. In Arabidopsis, DOMAINS REARRANGED METHYLTRANSFERASE 2 (DRM2), a homologue of the mammalian de novo DNA methyltransferase DNMT3, is primarily responsible for catalyzing de novo DNA methylation (Cao and Jacobsen 2002). DNA METHYLTRANSFERASE 1 (MET1), the homologue of mammalian DNMT1, and the plant-specific methyltransferase CHROMOMETHYLASE 3 (CMT3) are CG-specific and CHG-specific maintenance methyltransferases, respectively. Although the general perception is that distinct cytosine DNA methyltransferases are responsible for either de novo or maintenance methylation, an emerging view is that different DNA MTases frequently cooperate to catalyze both steps.
After each round of DNA replication, DNA methylation of the newly synthesized strand is either guided by the parental strand (maintenance) or established de novo through RNA-directed DNA methylation (RdDM). RdDM refers to the RNA interference (RNAi)-mediated cellular response in plants to the presence of double-stranded RNA (dsRNA), in which the dsRNA is processed by the RNase III-like endonuclease Dicer to produce small interfering RNAs (siRNAs), which are then loaded onto an Argonaute complex to guide DNA MTases to the homologous DNA sequences. This process induces de novo methylation of cytosine bases in all sequence contexts (CG, CHG and CHH, where H is A, C or T) at the region of siRNA-DNA sequence homology. A prerequisite for siRNA biogenesis is the presence of dsRNA precursors, which can form by bidirectional transcription, by transcription through inverted repeats, or by conversion of transcripts into dsRNA by RNA-dependent RNA polymerases (RDRs). Small RNAs are then incorporated into multiprotein silencing effector complexes that either direct mRNA degradation or translational repression, resulting in post-transcriptional gene silencing (PTGS), or target DNA methylation and associated repressive chromatin modifications, leading to transcriptional gene silencing (TGS), in a sequence-dependent manner. A vital component of these silencing effector complexes is an Argonaute (AGO) protein, which can bind small RNAs through its PAZ domain. Specific members of the Argonaute protein family confer functional specificity to different silencing pathways, which can be distinguished by the source of dsRNA, the size class of small RNA, or the nature of the target sequence. Thus, in Arabidopsis, AGO4 is a member of the RNA-induced transcriptional silencing (RITS) complex involved in TGS and associates with 24-nt siRNAs, whereas AGO1 is incorporated into the RNA-induced silencing complex (RISC) and mediates 21-nt siRNA- or miRNA-guided cleavage of target mRNAs (PTGS).
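The three sequence contexts mentioned above can be made concrete with a short Python sketch. It is only an illustration and not a method used in this work: the function names (`context_at`, `classify_cytosines`) are hypothetical, positions are 0-based top-strand coordinates, and cytosines whose context cannot be determined (sequence ends or ambiguous bases) are simply skipped.

```python
from typing import List, Optional, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def context_at(seq: str, i: int) -> Optional[str]:
    """Classify the cytosine at seq[i] as 'CG', 'CHG' or 'CHH'.

    Returns None if seq[i] is not a C or if the downstream bases needed
    to decide are missing or ambiguous (H means A, C or T).
    """
    if seq[i] != "C":
        return None
    window = seq[i + 1:i + 3]          # up to two downstream bases
    if window[:1] == "G":
        return "CG"
    if len(window) < 2 or window[0] not in "ACT" or window[1] not in "ACGT":
        return None
    return "CHG" if window[1] == "G" else "CHH"


def classify_cytosines(seq: str) -> List[Tuple[int, str, str]]:
    """List (top-strand position, strand, context) for every callable C."""
    seq = seq.upper()
    bottom = reverse_complement(seq)
    last = len(seq) - 1
    calls = []
    for i in range(len(seq)):
        ctx = context_at(seq, i)
        if ctx:
            calls.append((i, "+", ctx))
        ctx = context_at(bottom, i)
        if ctx:
            # index i on the bottom strand pairs with last - i on the top
            calls.append((last - i, "-", ctx))
    return sorted(calls)
```

For the toy sequence `ACGTCAGCTT`, `classify_cytosines` reports a CG pair at positions 1 (+) and 2 (-), a CHG pair at positions 4 (+) and 6 (-), and a single CHH cytosine at position 7 (+) with no cytosine partner on the opposite strand.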
Since cytosines at CG and CHG sites are in a symmetrical sequence context, methylation at these sites has largely been thought to depend on maintenance mechanisms (Figure 1.2.2 b; Teixeira and Colot 2010). This holds true for CG sites, where methylation is mainly under the control of METHYLTRANSFERASE1 (MET1) in plants (homolog of mammalian methyltransferase Dnmt1); in vitro, Dnmt1 has a higher affinity for hemi-methylated than for unmethylated CGs (Goll and Bestor 2005). However, the maintenance of methylation at CHG sites does not seem to depend solely on the palindromic symmetry of the sequence, as methylation maintenance at these sites is mainly carried out by CHROMOMETHYLASE3 (CMT3), a chromodomain-containing plant-specific methyltransferase, together with SUVH4, the main histone MTase for histone H3K9 dimethylation (Sharif, Muto et al. 2007; Ooi, O’Donnell et al. 2009). The chromodomain of CMT3 can recognize dimethylated H3K9, and the SRA domain of the H3K9 methyltransferase can bind to methylated CHG sites (Johnson, Bostick et al. 2007), which suggests that the two marks reinforce each other.
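The idea of template-guided maintenance at symmetric sites can be sketched as a toy model in Python. This is a deliberate simplification under stated assumptions, not a description of the enzymology: methylation is copied to the daughter strand purely by sequence symmetry (the histone-dependent CMT3/SUVH4 route described above is ignored), the function names `mirror_position` and `maintain` are hypothetical, and positions are 0-based top-strand coordinates.

```python
from typing import Optional, Set


def mirror_position(seq: str, i: int) -> Optional[int]:
    """Top-strand coordinate of the opposite-strand cytosine that a
    methylated C at seq[i] can template, or None if there is none.

    CG:  the partner C pairs with the G at i + 1.
    CHG: the partner C pairs with the G at i + 2 (H is A, C or T).
    CHH: no partner at a fixed offset, so nothing can be templated.
    """
    if seq[i] != "C":
        return None
    if seq[i + 1:i + 2] == "G":
        return i + 1
    if seq[i + 1:i + 2] in ("A", "C", "T") and seq[i + 2:i + 3] == "G":
        return i + 2
    return None


def maintain(seq: str, parental_methylated: Set[int]) -> Set[int]:
    """Positions methylated on the newly synthesized (bottom) strand
    after one round of purely template-based maintenance (toy model)."""
    daughter = set()
    for i in parental_methylated:
        partner = mirror_position(seq, i)
        if partner is not None:
            daughter.add(partner)
    return daughter
```

With the sequence `ACGTCAGCTT` and parental methylation at positions 1 (CG), 4 (CHG) and 7 (CHH), `maintain` returns `{2, 6}`: the CG and CHG marks are copied to the daughter strand, whereas the CHH mark has no template and, in this model, would have to be re-established de novo.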
Table of contents:
CHAPTER I INTRODUCTION TO CHROMATIN AND CHROMATIN-LEVEL GENOME ORGANISATION
Histone post-translational modifications
Genomic Context of DNA Methylation
Establishment and maintenance of DNA methylation
Regulation of DNA methylation
Methods to identify DNA methylation
The bioinformatics of epigenome data analysis
R and Bioconductor support
Analysis of microarray ChIP-on-chip data
Aims and organisation of the thesis
CHAPTER II GENOME-WIDE EVIDENCE FOR LOCAL DNA METHYLATION SPREADING FROM SMALL RNA TARGETED SEQUENCES IN ARABIDOPSIS
Materials and Methods
DNA methylation analysis
Data visualization and downstream analysis
CHAPTER III INTEGRATIVE EPIGENOMIC MAPPING DEFINES FOUR MAIN CHROMATIN STATES IN ARABIDOPSIS
Materials and methods
Review process file
Cluster Validity and clustering tendency
CHAPTER IV SPATIAL & TEMPORAL DYNAMICS IN HISTONE H2BUB CHROMATIN MARK DURING LIGHT DRIVEN DEVELOPMENTAL ADAPTATION & THE ROLE THEREIN FOR FINE-TUNING OF GENE EXPRESSION
The hub1-3 mutant shows defects in fine-tuning of gene expression
Cluster analysis of expression data for differentially expressed genes
Genome-wide dynamics of H2Bub distribution
Light-driven induction is associated with dynamic changes of H2Bub
CHAPTER V DISCUSSION