Function and information content of DNA methylation in the CpG context
Distribution of CpG sites in the mammalian genome
The mammalian genome contains millions of CpGs, 60-90% of which are methylated, suggesting that methylation is the default state of CpGs (Figure 11A). Because methylated cytosines have a high deamination rate, many of them have mutated into thymine over the course of evolution, despite the presence of two glycosylases that are thought to repair the resulting mismatch. As a consequence, CpGs occur at only 21% of the expected frequency in the human genome, and the genomic regions that are typically unmethylated show an unusually high frequency of CpGs. These regions, called “CpG islands” (CGIs), are ~1000 base pairs long and show an elevated G+C base composition. In humans, 40% of CGIs are found in annotated gene promoters [121,122], and the remaining 60% have been termed “orphan CGIs” due to the uncertainty surrounding their annotation (Figure 11B). However, although they are not annotated to any specific promoter, many orphan CGIs appear to be genuine sites of transcription initiation for nearby annotated genes or ncRNAs [123,124]. Conversely, 70% of annotated gene promoters are associated with a CGI, especially those of housekeeping genes. CGIs are interspersed with long stretches of highly methylated, CpG-poor regions that are found both within and between genes. Following the marine metaphor of the CpG island, the “CGI shore” extends from 0 to 2 kb on either side of a CGI, and the “CGI shelf” from 2 to 4 kb on either side of a CGI (Figure 11B). Any CpG site that is not located in a promoter, a gene body, a CGI, a CGI shore or a CGI shelf is said to be located in the “open sea”.
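The "expected frequency" of CpGs mentioned above is commonly estimated from the C and G content of a sequence. The sketch below uses one widely used definition of the observed/expected ratio, (number of CpG × sequence length) / (number of C × number of G); this formula and the function names `cpg_obs_exp` and `gc_fraction` are illustrative assumptions and are not taken from the studies cited here.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio of a single-strand sequence.

    Assumed definition: (#CpG * length) / (#C * #G).
    Returns 0.0 when the sequence contains no C or no G.
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the exact dinucleotide count.
    n_cpg = seq.count("CG")
    return n_cpg * len(seq) / (n_c * n_g)
```

A ratio well below 1 indicates CpG depletion, as described for the bulk of the genome, whereas CGIs show ratios much closer to 1.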
(A) Genomic distribution of methylated cytosine in a typical mammalian genome. The representative genomic region includes an example of an active and an inactive gene with proximal (promoter) and distal (enhancer) regulatory regions. The height of the bar indicates the relative proportion of DNA methylation (5-methylcytosine, 5mC) that is observed in each region. CpG islands (CGIs), which often overlap with promoter regions, generally remain unmethylated, whereas CG-poor promoters are methylated when not active. Adapted from . (B) Genomic distribution of CGIs. Many CGIs are found in gene promoters, but the majority are located in intragenic or intergenic regions and are termed “orphan CGIs” due to the uncertainty surrounding their annotation. Following the metaphor of the CpG “island”, the region located up to 2 kb away from the island is called the “CGI shore”, and the region from 2 to 4 kb away is termed the “CGI shelf”. CpG sites that are not located in a promoter, a gene body, an island, a shore or a shelf are said to be located in the “open sea”. Adapted from .
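The island, shore, shelf and open-sea categories defined above can be illustrated with a minimal sketch that labels a CpG position by its distance to the nearest CGI on the same chromosome. The function name `classify_cpg`, the half-open coordinate convention and the input format are hypothetical choices made for illustration; for simplicity, the sketch ignores promoter and gene-body annotations, so “open sea” here only means “more than 4 kb from any CGI”.

```python
from typing import List, Tuple

SHORE_BP = 2000  # shore: up to 2 kb from the island edge
SHELF_BP = 4000  # shelf: from 2 kb up to 4 kb from the island edge


def classify_cpg(pos: int, islands: List[Tuple[int, int]]) -> str:
    """Label a CpG position relative to CGIs on one chromosome.

    islands: list of half-open intervals (start, end), i.e. start <= x < end.
    Returns "island", "shore", "shelf" or "open sea".
    """
    nearest = None
    for start, end in islands:
        if start <= pos < end:
            return "island"
        # Distance in bp to the closest base that belongs to this island.
        dist = start - pos if pos < start else pos - (end - 1)
        if nearest is None or dist < nearest:
            nearest = dist
    if nearest is None:
        return "open sea"  # no island annotated on this chromosome
    if nearest <= SHORE_BP:
        return "shore"
    if nearest <= SHELF_BP:
        return "shelf"
    return "open sea"
```

For instance, with a single island annotated at `(10000, 11000)`, a CpG at position 9000 lies 1 kb upstream of the island and would be labelled as a shore site.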
DNA methylation patterns are deeply linked to the underlying chromatin state
Interestingly, the effect of DNA methylation on transcription depends on whether it occurs in promoters, enhancers or gene bodies, and on whether it occurs in CGIs, in the regions flanking CGIs, or in CpG-poor regions (Figure 12). The best-understood case is the control of gene expression by promoter CGIs. Promoter CGIs are usually unmethylated, but when they become methylated, they robustly repress transcription, as notably seen in X-chromosome inactivation and genomic imprinting. The role of DNA methylation at CpG-poor promoters is less clear. CpG-poor promoters display tissue-specific methylation patterns, and there is an inverse correlation between methylation at these promoters and transcription. In addition, in vitro removal of DNA methylation at CpG-poor promoters can result in their direct silencing. This suggests that methylation at CpG-poor promoters may play a role in the establishment and maintenance of tissue-specific expression patterns. However, it has also been shown that genes with CpG-poor promoters can still be expressed when these promoters are methylated, and that it is the binding of TFs that shapes DNA methylation profiles at CpG-poor regions. Therefore, the role of methylation at CpG-poor promoters is not well understood, and potential effects might only apply to certain binding sites, such as those with lower affinity, at which DNA methylation might further reduce the likelihood of binding. In contrast, intragenic methylation positively correlates with transcription, and it may well serve several functions at once. Like intergenic methylation, intragenic methylation was initially suggested to silence repetitive DNA elements that would otherwise cause genomic instability. However, it was later discovered that when intragenic methylation occurs at a CGI, it can suppress intragenic promoters. Intragenic methylation might also be involved in alternative splicing events, but evidence for this phenomenon is scarce.
CGI shores, which lie in close proximity to CGIs, have recently drawn attention. Differentially methylated CpG sites were found to be enriched in CGI shores in several pathologies, such as cancer [140,141] and intrauterine growth restriction, as well as in comparisons under normal conditions between men and women, between monozygotic twins, or simply between different cell types. Last but not least, CpG-poor regions span many enhancers, and methylation at enhancers is crucial for early development [146–149]. Importantly, genome-wide patterns of DNA methylation are tissue-specific: promoters of tissue-specific genes are often unmethylated in the tissues in which these genes are expressed, and methylated in the tissues where they are silenced [132,150]. Taken together, these observations show that it is not merely the presence of DNA methylation that determines its relation to transcription, but its interpretation in the context of a particular genomic region.
The function of DNA methylation is well established at CGI promoters, where it stably represses transcription. Genes with CpG-poor promoters usually show tissue-specific methylation, and methylation in their promoter may repress transcription, especially at binding sites with lower affinity. Within gene bodies, methylation of CGIs is thought to silence intragenic promoters. Intragenic methylation positively correlates with transcription and could also silence repetitive elements or participate in alternative splicing events. Intergenic methylation has a role in silencing repetitive DNA elements, and probably in regulating certain enhancers.
DNA methylation and transcription influence one another
Given that the roles of DNA methylation are so context-dependent, what do we know about the mechanistic and causal relationship between DNA methylation and gene expression? The emerging picture involves a complex interplay between DNA methylation, nucleosomes, MBPs and methylation-sensitive TFs (Figure 13). DNA methylation can impair the binding of specific TFs to DNA [152–155], recruit transcriptionally repressive MBPs [156–159], and affect nucleosome positioning. In their methylated state, promoter CGIs are bound by MBPs, which in turn prevent the binding of methyl-CpG binding TFs; in their unmethylated state, promoter CGIs are bound by CXXC domain–containing complexes that activate transcription. This scenario does not fit the transcription of genes with CpG-poor promoters well, however, as MBPs are sensitive to the local density of CpGs. At enhancers, methylation can prevent the binding of methyl-sensitive TFs, or attract methyl-CpG binding TFs, leading to either gene activation or repression. Nevertheless, while it is now acknowledged that DNA methylation influences TF binding, it also seems that TF binding itself, as well as transcription, shapes DNA methylation patterns, a situation reminiscent of the feedback loops often observed in biology (Figure 14). In cancer, for example, gene silencing precedes the establishment of DNA methylation at CGI promoters [130,162–164]. DNA methylation occurs subsequently because inactive genes are often more susceptible to de novo methylation, and it provides added stability to the silent state (Figure 15). Interestingly, while transcription of a gene harbouring a methylated CGI in its promoter is unheard of, a gene containing an unmethylated CGI in its promoter is not necessarily transcribed. In this respect, methylation at promoter CGIs could be seen as a sort of genomic “rust” that accumulates when a gene is not in use.
Therefore, methylation at promoter CGIs is not the mechanism that triggers silencing, but rather the mechanism that allows the silenced state to become “locked”. At CpG-poor regions, methylation can change as a result of TF binding [166–168], and does not necessarily impair the binding of these TFs. The current data suggest that TF binding is the central event that mediates concerted changes in the other regulatory mechanisms determining chromatin state, accessibility, and conformation. The issue of causality in CpG-poor regions is far from being solved, and it seems that the function of DNA methylation is much more complex and context-dependent than previously thought.
(a) The ‘old’ textbook model describing how DNA methylation regulates transcription. Left, methylated CpG-island promoters (P) recruit transcriptionally repressive MBD proteins and prevent transcription-factor binding. Right, nonmethylated CpG islands are bound by transcription factors. (b) New models describing regulation of transcription by DNA methylation. Left, genes with methylated CpG-island promoters are repressed by repressive MBD-containing complexes. In addition, methylation of an enhancer (E) can block binding of a transcription factor. Right, most active genes with nonmethylated CpG-island promoters are bound by CXXC domain–containing activator complexes. In addition, transcription factors bind to nonmethylated enhancers. Finally, gene bodies of active genes are highly methylated, and this serves to repress cryptic transcription. (c) Uncoupling between DNA methylation and repression of transcription initiation. In some cases, such as during early vertebrate development, some methylated promoters with low CpG density are actively transcribed. Transcriptionally repressive MBD proteins do not interact with these promoters, for yet-unknown reasons. Furthermore, some DNA sequences with low CpG density (including enhancers and promoters) can be bound by activating transcription factors. H3K4me3, trimethylated histone H3 Lys4, a promoter-associated histone mark associated with active transcription. Taken from .
(a) A methylation-insensitive transcription factor causes reduced methylation after binding. (b) A transcription factor binds specifically to the methylated state of its binding site. (c) A methylation-sensitive transcription factor is blocked by 5-methylcytosine (5mC). (d) Methyl-CpG-binding domain (MBD) proteins bind to the methylated state, leading to indirect repression, which probably requires high local density of CGs (shading). (e) A methylation-insensitive transcription factor functions as a pioneer factor and creates a site of reduced methylation that allows a methylation-sensitive factor to bind. Taken from .
Active promoters and enhancers have nucleosome-depleted regions (NDRs) that are often occupied by transcription factors and chromatin remodellers. Loss of factor binding, for example during differentiation, leads to increased nucleosome occupancy of the regulatory region, providing a substrate for de novo DNA methylation. DNA methylation subsequently provides added stability to the silent state and is likely to be a mechanism for more accurate epigenetic inheritance during cell division. The example given is for the OCT4 and NANOG genes, and its generality is not yet known, but inactive genes are often more susceptible to de novo methylation than their more active counterparts. In the figure, OCT4 binding is shown and NANOG binding is not shown, although its expression is required. Recent experiments have demonstrated that the methylation must be removed by active and/or passive processes to reactivate the gene. DNMT3A, DNA methyltransferase 3A; siRNA, small interfering RNA. Taken from .
Genetic influences on DNA methylation
It is now well documented that genetic polymorphisms located in the vicinity of TF binding sites affect TF binding [170,171]. Since TF binding and DNA methylation are so tightly linked, can genetic polymorphisms also affect DNA methylation patterns? Recent studies have shown that not only do genetic polymorphisms affect DNA methylation, but the vast majority of inter-individual variation in methylomes is caused by genetic polymorphisms [172,173]. In fact, methylome variation is so tightly linked to the underlying DNA sequence that Whitaker et al. developed a pipeline called Epigram that uses DNA motifs to predict the methylation status at tissue-specific differentially methylated regions (DMRs) from 18 human tissues. The regions associated with differential methylation levels have been termed methylation QTLs (meQTLs) and have been identified in a wide variety of human tissues [175–187]. Although meQTLs are mostly located in cis (i.e. close to the CpG site they regulate), a small number have also been uncovered in trans [188,189]. Moreover, there is substantial overlap between meQTLs detected in blood and in the brain, suggesting that some meQTLs could exert ubiquitous effects on DNA methylation.
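To make the notion of an meQTL concrete, the sketch below regresses the methylation level of one CpG site on the genotype at one polymorphism using ordinary least squares. It is only an illustration under stated assumptions: genotypes are coded as additive allele dosages (0, 1 or 2), methylation is expressed as a proportion between 0 and 1, and the function name `meqtl_fit` is hypothetical. Published meQTL analyses typically also adjust for covariates and correct for multiple testing, which this sketch does not attempt.

```python
from typing import Sequence, Tuple


def meqtl_fit(dosages: Sequence[float], methylation: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of methylation ~ intercept + slope * dosage.

    dosages: allele counts per individual (assumed additive coding 0/1/2).
    methylation: methylation proportion of one CpG site per individual.
    Returns (slope, intercept); the slope is the estimated change in
    methylation per additional copy of the counted allele.
    """
    n = len(dosages)
    if n != len(methylation) or n < 2:
        raise ValueError("need two equally sized samples of length >= 2")
    mean_x = sum(dosages) / n
    mean_y = sum(methylation) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("genotype does not vary across individuals")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, methylation))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A slope clearly different from zero for a polymorphism located near the CpG site would correspond to a candidate cis-meQTL; the same fit applied to a distant polymorphism would probe a trans effect.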
Table of contents:
1. The roots of obesity and its disastrous consequences
1.1. Obesity: a huge health and economic burden
1.2. The relative contribution of genetics and the environment to obesity
1.3. Genome-wide association studies (GWASs) and the problem of missing heritability
1.4. Reconciling genetics and the environment with epigenetic mechanisms
2. Epigenetic modifications: focus on DNA methylation
2.1. Overview of epigenetics
2.2. Function and information content of DNA methylation in the CpG context
3. Studying DNA methylation in the context of obesity
3.1. Measuring DNA methylation
3.2. What we already know about DNA methylation in the context of obesity
3.3. Challenges to face