RNA-binding protein aggregates are a hallmark of ALS and IBMPFD
Similar to other neurodegenerative diseases, the degenerating neurons of ALS patients are characterized by the accumulation of protein aggregates (Forman et al., 2004). In 2006 it was discovered that the major protein component of many of these inclusions is the 43-kD TAR-DNA–binding protein (TDP-43) (Arai et al., 2006; Neumann et al., 2006).
IBMPFD is also characterized by cytoplasmic inclusions with approximately 55% of patients having ubiquitin-positive inclusions of TDP-43 (Figure 2) and 45% having tau-positive inclusions (Leyton and Hodges, 2010). It remains unknown how these inclusions are formed and what the consequences of such inclusions are.
Proteins involved in ALS and IBMPFD
IBMPFD was attributed to mutations in the gene encoding valosin-containing protein (VCP) (Watts et al., 2004). Later, exome sequencing revealed VCP mutations as a cause of familial ALS as well (Johnson et al., 2010). Inherited forms of ALS and IBMPFD were also found to be frequently caused by missense mutations affecting heterogeneous nuclear ribonucleoproteins (hnRNPs) such as TDP-43, FUS, hnRNPA1 and hnRNPA2B1.
Valosin-containing protein (VCP), also known as p97, belongs to the ATPases Associated with diverse cellular Activities (AAA+ ATPase) protein family. This family of enzymes is associated with a wide range of cellular processes such as proteolysis, DNA replication and repair, and membrane fusion, which its members carry out in an ATP-dependent manner (Ogura and Wilkinson, 2001). VCP is ubiquitously expressed in the cytoplasm and the nucleus and has a tripartite structure consisting of an N-terminal domain and two central AAA+ domains, D1 and D2. It also contains linker regions that join the N and D1 domains and the D1 and D2 domains (Figure 3).
The N-terminal domain is necessary for substrate and co-factor association, whereas the D1 and D2 domains are needed for ATP binding and hydrolysis. The D1 domain is primarily responsible for VCP hexamerization, but this molecular assembly is not dependent on nucleotide binding (Wang et al., 2003a; Wang et al., 2003b; Zhang et al., 2000). The ATPase activity conferred by the D2 domain is essential for VCP’s function as a molecular chaperone in various cellular processes (Esaki and Ogura, 2010; Song et al., 2003). Six VCP monomers assemble into a macromolecular ring complex in which the D1/D2 domains form the inner core and the N-domains radiate outward. Structural changes can occur in the ring, allowing VCP to function as a chaperone that interacts with diverse adaptors. Notably, VCP is a ubiquitin-dependent segregase that, through its interactions with various adaptors, can remove some ubiquitinated proteins from protein complexes. Through this mechanism, VCP can mediate a wide variety of essential cellular processes such as cell cycle progression, DNA replication, protein quality control, and chromosomal decondensation. The C-terminal domain of VCP, following the D2 domain, is poorly characterized, but this region includes the major tyrosine phosphorylation site implicated in the regulation of endoplasmic reticulum (ER) assembly and cell cycle-dependent nuclear localization of VCP (Egerton and Samelson, 1994; Lavoie et al., 2000; Madeo et al., 1998).
Because of its role in many cellular processes, deregulation of or mutations in VCP cause serious diseases. For example, VCP is upregulated in cancer, while missense mutations cause a dominantly inherited degeneration of bone, brain, and muscle. Fourteen mutations in VCP have been associated with diseases including myopathy, dementia, Paget’s disease of bone, ALS, and Parkinsonism. Among these mutations, R155H is the most prevalent disease-associated mutation and A232E causes the most severe symptoms (Ju and Weihl, 2010). Interestingly, the amino acids affected by the disease-causing mutations are highly conserved across species. Even though it remains unknown how mutations in VCP cause disease, some studies have shown how they affect the structure of the protein. The majority of the mutations localize to the interface between the N and D1 domains (Figure 4).
Figure 4: Graphic representation of VCP crystal structure showing the distribution of the IBMPFD-linked mutations. The N domain is represented in green, the D1 domain in blue and the D2 domain in grey. Adapted from (Niwa et al., 2012).
Because of this localization, a conformational change occurs, leading to impaired communication between the D1 and N-domains. This change indirectly influences the D1 nucleotide binding pocket (Fernandez-Saiz and Buchberger, 2010), altering the relative affinity for ATP and ADP. The VCP mutations R155P and A232E, found in IBMPFD patients, show an increase in ATPase activity and increased sensitivity to heat-induced upregulation in ATPase activity (Halawani et al., 2009). It has been shown recently that stress granule clearance was impaired when VCP was mutated (Buchan et al., 2013). Yet, it remains unclear how the mutations lead to the pathogenesis in biological systems.
TAR DNA-binding protein 43 (TDP-43) belongs to the hnRNP (heterogeneous nuclear ribonucleoprotein) family of proteins (Chaudhury et al., 2010). TDP-43 is a 414 amino acid nuclear protein encoded by the TARDBP gene on chromosome 1. It is composed of two N-terminal RNA-recognition motifs (RRM), nuclear localization (NLS) and nuclear export (NES) sequences, and a C-terminal domain enriched in glycines (Gly-rich) that may be required for the exon skipping and splicing inhibitory activity as well as for binding to nuclear proteins (Figure 5).
Figure 5: Domain structure of TDP-43. Known mutations are associated with ALS (shown in black) and ubiquitin-positive FTD (in blue). Adapted from (Dormann and Haass, 2011).
TDP-43 is a nuclear DNA- and RNA-binding protein with multiple functions in transcriptional repression, pre-mRNA splicing, and translational regulation (Da Cruz and Cleveland, 2011; Lagier-Tourenne et al., 2010). TDP-43 is highly conserved through evolution and it associates with other members of the heterogeneous nuclear ribonucleoprotein (hnRNP) family by interactions through its C-terminal domain (Brundin et al., 2010).
TDP-43 was initially identified as the major constituent of neuronal inclusions in sporadic ALS and in the largest subset of FTD (Neumann et al., 2006). In the degenerating neurons and glial cells of the central nervous system (CNS) of ALS patients, TDP-43 is relocated from the nucleus to large cytoplasmic aggregates. Additional studies showed that TDP-43 abnormally accumulates in the sarcoplasm of IBMPFD patients (Salajegheh et al., 2009). This accumulation is accompanied by its nuclear depletion, suggesting its redistribution from the myonucleus to the sarcoplasm is similar to the nuclear to cytoplasmic redistribution seen in affected cortical regions.
Mutations in TDP-43 were identified in both sporadic ALS and familial ALS (Sreedharan et al., 2008). However, these rare mutations are associated with less than 5% of the familial cases. It is important to note that most of the disease-causing mutations are localized in the C-terminal glycine-rich region of the protein (Figure 5).
Identification of hnRNPA1 and hnRNPA2B1 as disease-causing proteins
Identification of VCP mutation-negative families
Some, but not all, cases of IBMPFD associated with ALS are caused by mutations in the VCP gene. Kim et al. (2013) (see annex) identified two families with dominantly inherited degeneration of muscle, bone, brain, and motor neurons. The clinical symptoms of the affected members of these families were identical to those of patients carrying VCP mutations. Sequencing of the affected patients revealed that VCP was not mutated. Genetic analysis of these families by exome sequencing and linkage analysis identified mutations in two genes: hnRNPA1 and hnRNPA2B1 (Figure 6).
Figure 6: Identification of previously unknown disease mutations in IBMPFD/ALS. (a) Family 1 pedigree indicating individuals affected by dementia, myopathy, PDB and ALS. The causative mutation was p.D290V/D302V in hnRNPA2B1. Roman numerals denote generation and Arabic numbers denote family member within a generation. (b) Family 2 pedigree indicating individuals affected by myopathy and PDB. The causative mutation was p.D262V/D314V in hnRNPA1. Adapted from (Kim et al., 2013).
The two mutations described correspond to the substitution of a valine residue for an aspartate residue that is highly conserved among human paralogs of the hnRNP A/B family (Figure 7). The mutations identified for hnRNPA2B1 and hnRNPA1 were D290V and D262V, respectively.
Figure 7: Sequence alignment of hnRNPA2B1 and hnRNPA1 orthologues. There is an evolutionary conservation of the mutated aspartate and surrounding residues.
hnRNPA1 and hnRNPA2B1
hnRNPA2B1 and hnRNPA1 belong to the A/B subfamily of ubiquitously expressed hnRNPs. hnRNPA2B1 is expressed as two alternatively spliced isoforms: A2 and B1. The shorter isoform, hnRNPA2, lacks 12 amino acids in the N-terminal region and is the major isoform, accounting for ~90% of the protein in most tissues. The hnRNPA2 and hnRNPA1 proteins each have two RNA-recognition motifs (RRMs) that bind RNAs. They are among the most abundant core proteins of hnRNP complexes (Krecic and Swanson, 1999). These complexes are made of predominantly nuclear RNA-binding proteins that associate with RNA polymerase II transcripts. These proteins function in multiple cellular activities, ranging from transcription and pre-mRNA processing in the nucleus to cytoplasmic mRNA translation and turnover. hnRNPA1 and hnRNPA2B1 are involved in the packaging of pre-mRNA into hnRNP particles and the transport of poly A+ mRNA from the nucleus to the cytoplasm, and may modulate splice site selection (Han et al., 2010). They are mainly present in the nucleus but can also shuttle between the nucleus and the cytoplasm (Pinol-Roma and Dreyfuss, 1992). Indeed, they are exported from the nucleus, along with other hnRNP proteins and probably bound to mRNA, and are then re-imported.
Their M9 domain acts as both a nuclear localization and nuclear export signal (Izaurralde et al., 1997). hnRNPA1 and hnRNPA2 also contain a carboxy-terminal domain enriched in glycines, in which the disease-causing mutations cluster (Figure 8).
In muscle biopsies of healthy individuals, hnRNPA2B1 and hnRNPA1 are predominantly nuclear. In patients carrying the disease-causing mutations in these two genes, muscle biopsies reveal a relocalization of those proteins to cytoplasmic inclusions; the proteins are cleared from many nuclei (Kim et al., 2013). Muscle biopsies of the same patients also reveal TDP-43 pathology, with nuclear clearance and cytoplasmic inclusions. This observation is consistent with previous observations made in patients harboring mutations within VCP. hnRNPA2B1 and hnRNPA1 pathologies were also observed in VCP-related IBM (Salajegheh et al., 2009).
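The dual numbering of the hnRNPA2B1 mutation (p.D290V/D302V) follows from the 12-residue difference between the A2 and B1 isoforms described above. The short Python sketch below illustrates this arithmetic. The function names are hypothetical and chosen for illustration only, and the sketch assumes that all additional residues of the longer isoform lie upstream of the mutated position, so that a constant offset applies.

```python
import re

# Matches labels such as "D290V" or "p.D290V":
# reference residue, position, substituted residue.
MUTATION_PATTERN = re.compile(r"^(?:p\.)?([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a missense label into (reference, position, substitution)."""
    match = MUTATION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def shift_mutation(label, offset):
    """Renumber a mutation for an isoform whose numbering is shifted by `offset`.

    Assumption: every residue that differs between the two isoforms lies
    upstream (N-terminal) of the mutated position.
    """
    ref, pos, alt = parse_mutation(label)
    new_pos = pos + offset
    if new_pos < 1:
        raise ValueError("shifted position falls outside the protein")
    return f"{ref}{new_pos}{alt}"


# hnRNPA2 lacks 12 N-terminal amino acids relative to hnRNPB1 (see text).
A2_TO_B1_OFFSET = 12

if __name__ == "__main__":
    print(shift_mutation("D290V", A2_TO_B1_OFFSET))
```

With the offset of 12 taken from the text, D290V in hnRNPA2 corresponds to D302V in hnRNPB1, which matches the notation used in Figure 6.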
Relationship between previously described disease-implicated proteins
Interaction between VCP, TDP-43, hnRNPA1 and hnRNPA2
Several pieces of evidence support the conclusion that VCP, TDP-43, hnRNPA2B1 and hnRNPA1 interact with each other. First, it has been shown that the C-terminal region of TDP-43 binds directly to hnRNPA1 and hnRNPA2/B1 (Buratti et al., 2005). This allows a cooperative regulation of RNA metabolism in cells. Second, Drosophila melanogaster has been used in a genetic screen to identify suppressors of VCP-related degeneration (Ritson et al., 2010). Three genes in particular were identified: TBPH, the Drosophila homolog of TDP-43, and Hrb87F and Hrb98DE, the homologs of hnRNPA2B1 and hnRNPA1. Finally, mutations in VCP, hnRNPA2B1, hnRNPA1 and TDP-43 are shared between different neurodegenerative diseases, such as ALS, IBM, Paget’s disease of bone and FTD (Figure 9).
Importantly, similarities exist in the pathology underlying muscle degeneration in ALS and IBMPFD caused by mutations in hnRNPA2B1, hnRNPA1 or VCP. Indeed, hnRNPA2B1, hnRNPA1 and TDP-43 are found in pathological cytoplasmic inclusions in sporadic and familial IBM (Kim et al., 2013). These proteins share a similar domain structure, and by analogy we could speculate that the C-terminal domain of the proteins, which contains the disease-causing mutations, could contribute to the initiation or propagation of disease.
hnRNPs contain a low complexity sequence domain clustering the disease-causing mutations
Heterogeneous nuclear ribonucleoproteins, or hnRNPs, are defined as the set of proteins that associate with pre-mRNAs. In addition to their nuclear functions, which include regulation of mRNA splicing, a subset of hnRNP proteins also accompany mRNAs out into the cytoplasm and participate in the localization and translational regulation of the RNAs (Dreyfuss et al., 2002). Here we will focus on TDP-43, hnRNPA1, hnRNPA2B1, and other hnRNPs.
Figure 10: Structure comparison of several hnRNPs. The different disease-causing mutations are indicated.
hnRNPs have similar structures composed of several RNA recognition motifs and a C-terminal glycine-rich domain (Figure 10). This domain has a relatively homogeneous amino acid content, with low amino acid diversity. For example, in hnRNPA1, 86% of this domain is made up of only four amino acids: glycine, serine, asparagine, and phenylalanine. Because of this low diversity, this region has been designated as a low complexity sequence domain (LCD) (Figure 10). LCDs are intrinsically disordered and are enriched in glycine as well as the uncharged polar amino acids asparagine, glutamine, tyrosine, and serine. In the literature, several names are used to describe this region: glycine-rich domain, low complexity sequence domain (LCD) and prion-like domain (PrLD). The latter designation comes from the fact that these domains have a composition similar to the ones described for yeast prions. This domain is also of particular interest because most of the disease-causing mutations cluster within it. The mutations displayed in Figure 10 cause IBMPFD and ALS.
Different hnRNPs can interact with each other through their LCDs and be recruited to a particular cellular structure: stress granules. A wealth of genetic evidence has emerged over the past 5 years implicating stress granules as a subcellular compartment that is central to the pathogenesis of the degenerative diseases described earlier (Li et al., 2013; Ramaswami et al., 2013). Indeed, it is hypothesized that the cytoplasmic inclusions found in patient cells could be the result of persistent stress granules.
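The notion of low sequence complexity can be made concrete with a simple composition calculation. The Python sketch below computes the fraction of a sequence made up of a chosen set of residues, as in the 86% figure quoted for hnRNPA1, together with the Shannon entropy of the residue composition as one possible measure of diversity. The function names and the toy sequence are hypothetical; the sequence is invented for illustration and is not the hnRNPA1 LCD.

```python
import math
from collections import Counter


def residue_fraction(sequence, residues):
    """Fraction of `sequence` made up of the one-letter codes in `residues`."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    wanted = set(residues.upper())
    return sum(1 for aa in sequence if aa in wanted) / len(sequence)


def shannon_entropy(sequence):
    """Shannon entropy (bits per residue) of the amino acid composition.

    Lower values indicate lower amino acid diversity.
    """
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    total = len(sequence)
    counts = Counter(sequence)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Invented toy sequence, NOT a real protein domain.
TOY_LCD = "GGSGNFGGGSYNGFGG"

if __name__ == "__main__":
    # Share of glycine, serine, asparagine and phenylalanine in the toy sequence.
    print(residue_fraction(TOY_LCD, "GSNF"))
    print(shannon_entropy(TOY_LCD))
```

Applied to a real LCD sequence retrieved from a protein database, the first function would reproduce the kind of percentage quoted above, while the entropy would allow a comparison with a folded domain of the same protein.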
Stress granules as crucibles of these pathogeneses
What are stress granules?
The central dogma of molecular biology deals with the detailed “residue-by-residue transfer of sequential information” (Crick, 1970). In other terms, “DNA makes RNA and RNA makes protein”. DNA is transcribed in the nucleus into messenger RNAs (mRNAs). In eukaryotic cells, the site of transcription (the nucleus) is separated from the site of translation (the cytoplasm), so the mRNA needs to be transported out of the nucleus into the cytoplasm. As mRNA exits the nucleus, the major cytoplasmic cap-binding protein, eukaryotic translation initiation factor 4E (eIF4E), recruits the translation initiation complex. This recruitment promotes protein synthesis via the assembly of polysomes (strings of translating ribosomes bound to an mRNA). In cells, polysomes can be disassembled in the presence of cellular stressors such as heat shock, oxidative stress, ischemia, or viral infection. These different stresses trigger a translational arrest that leads to a process of molecular triage in which mRNA from disassembling polysomes is sorted and the fate of individual transcripts is determined. Cytoplasmic stress granules (Figure 11) are cellular compartments that form as a direct consequence of this triage process, suggesting that these granules contain mRNAs stalled in the process of translation initiation (Buchan and Parker, 2009). Stress granules are membrane-less cytoplasmic foci ranging in size from 0.1 to 5 μm and they have been observed in yeast, protozoa, and metazoa.
Stress Granule dynamics
Assembly of Stress Granules
Stress granule (SG) assembly depends on sequential events that include stalled initiation, polysome disassembly, and messenger ribonucleoprotein (mRNP) aggregation. Upon cellular stress in eukaryotic cells, four different stress-sensing serine/threonine kinases (PKR, PERK, HRI, GCN2) are activated and in turn phosphorylate the alpha subunit of the initiation factor eIF2 (Anderson and Kedersha, 2008; Buchan and Parker, 2009) (Figure 12). Translation initiation normally requires eIF2 in its unphosphorylated state; phosphorylation of eIF2 therefore inhibits initiation of a further round of translation by depleting the eIF2/tRNAiMet/GTP ternary complex. In the absence of the active ternary complex, ribosomes finish their round on the translated transcript and then “run off”, as a further round cannot be initiated. The remaining 48S preinitiation complex stays bound to the 5´UTR of the mRNA (Anderson and Kedersha, 2008). This results in a translationally stalled, noncanonical 48S complex unable to recruit the 60S ribosomal subunit. Although the next step, SG nucleation, is not yet fully understood, it is assumed that aggregation-prone SG proteins, such as TIA-1, TIAR and G3BP, associate with the 48S preinitiation complex and form mRNP oligomers. Once this nucleation occurs, SGs assemble and contain the core SG components: eIF3, eIF4F, PABP-1 and small ribosomal subunits. The mRNA transcripts recruited to SGs are bound to multiple proteins that can interact with each other. Homotypic aggregation of the proteins assembles mRNPs into submicroscopic aggregates, which are then clustered into larger, microscopically visible SGs by heterotypic interactions between SG-nucleating proteins, and by association of some RBPs with multiple classes of transcripts (Anderson and Kedersha, 2008). Over time, the SGs fuse into larger and fewer structures.
Some proteins lacking mRNA-binding properties can also be recruited because of their interaction with the SG nucleating proteins or with other components of SGs (Anderson and Kedersha, 2008).
Disassembly of Stress Granules
Stress granule formation is reversible. During stress recovery, stress granules disassemble rapidly and polysomes are re-formed, leading to the translation of mRNAs (Figure 12). Because 48S preinitiation complexes are preserved in SGs in an assembled state, translation can be rapidly reactivated upon stress recovery. Interestingly, some drugs that inhibit translational elongation force the disassembly of pre-formed stress granules, indicating that stress granule mRNPs are in dynamic equilibrium with polysomes (Anderson and Kedersha, 2009). Moreover, fluorescence recovery after photobleaching (FRAP) measurements showed a very fast turnover of RNA-binding proteins in and out of the stress granules (on the order of seconds to minutes) (Buchan and Parker, 2009). This rapid shuttling of RBPs and mRNAs in and out of stress granules suggests that stress granules are not sites of long-term mRNP storage, unlike other types of granules such as germ granules and neuronal granules (Anderson and Kedersha, 2009).
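To illustrate how a turnover time can be read from a FRAP experiment, the Python sketch below uses a single-exponential recovery curve. This model is a common simplification and an assumption made here for illustration; it is not taken from the studies cited above, and the parameter names and example values are hypothetical.

```python
import math


def frap_recovery(t, i0, i_inf, tau):
    """Fluorescence at time t after bleaching, single-exponential model.

    i0    : intensity immediately after bleaching
    i_inf : plateau intensity reached at long times
    tau   : recovery time constant, in the same units as t
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    return i0 + (i_inf - i0) * (1.0 - math.exp(-t / tau))


def half_time(tau):
    """Time at which half of the recoverable fluorescence has returned."""
    return tau * math.log(2)


if __name__ == "__main__":
    # Hypothetical values: normalized intensities and a 10 s time constant.
    tau = 10.0
    t_half = half_time(tau)
    print(t_half, frap_recovery(t_half, i0=0.2, i_inf=0.9, tau=tau))
```

In this model, a half-time of seconds to minutes corresponds to the fast exchange described above, whereas a component that exchanged slowly or not at all would show a long half-time or a plateau well below the pre-bleach intensity.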
Table of contents:
Chapter I: Introduction
I.1 Pathogenesis of different neuronal and muscle disorders
I.1.1 Introduction of ALS and IBMPFD
I.1.1.1 Amyotrophic lateral sclerosis (ALS)
I.1.1.2 Inclusion body myopathy associated with Paget’s disease of bone and frontotemporal dementia (IBMPFD)
I.1.1.3 RNA-binding protein aggregates are a hallmark of ALS and IBMPFD
I.1.2 Proteins involved in ALS and IBMPFD
I.1.2.3 Identification of hnRNPA1 and hnRNPA2B1 as disease-causing proteins
I.1.2.3.1 Identification of VCP mutation-negative families
I.1.2.3.2 hnRNPA1 and hnRNPA2B1
I.1.2.4 Relationship between previously described disease-implicated proteins
I.1.2.4.1 Interaction between VCP, TDP-43, hnRNPA1 and hnRNPA2
I.1.2.4.2 hnRNPs contain a low complexity sequence domain clustering the disease-causing mutations
I.2 Stress granules as crucibles of these pathogeneses
I.2.1 What are stress granules?
I.2.2 Stress Granule dynamics
I.2.2.1 Assembly of Stress Granules
I.2.2.2 Disassembly of Stress Granules
I.2.3 Stress granule function
I.2.4 Core component of stress granules
I.2.5 Implication of disease-causing mutations in stress granule dynamics
I.3 Does fibrillization mediate stress granule assembly?
I.4 Recruitment of stress signaling proteins to stress granules
I.4.1 Compartmentalization of membrane-less organelles in cells
I.4.2 Liquid-like behavior of cellular structures
I.4.3 Liquid-liquid phase separation: a mechanism that allows cellular compartmentalization
I.4.3.1 Definition of liquid-liquid phase separation
I.4.3.2 Phase diagrams
I.4.3.3 Liquid-liquid phase separation: a mechanism for subcompartmentalization
I.4.3.4 Proteins harboring low complexity domain sequences can mediate LLPS
Chapter II: Phase Separation by Low Complexity Domains Promotes Stress Granule Assembly and Drives Pathological Fibrillization
Chapter III: Discussion
III.1 LLPS as a mechanism commonly used by cells in order to accomplish their biological processes
III.2 From stress granules to pathological inclusions: is there a unifying model?
III.3 Compartmentalization via LLPS: a code in low complexity sequence domains?
Chapter IV: Annexes
IV.1 Annex I: VCP is essential for mitochondrial quality control by PINK1/Parkin and this function is impaired by VCP mutations
IV.2 Annex II: Mutations in prion-like domains in hnRNPA2B1 and hnRNPA1 cause multisystem proteinopathy and ALS
IV.3 Annex III: No mutations in hnRNPA1 and hnRNPA2B1 in Dutch patients with amyotrophic lateral sclerosis, frontotemporal dementia, and inclusion body myopathy