Reduction and stability analysis of a transcription-translation model of RNA polymerase
The aim of the work presented in this chapter is to analyze the dynamical behavior of models of gene transcription and translation, in the case of RNA polymerase synthesis. This is an example of positive feedback loop, where RNA polymerase is needed to tran-scribe its own gene. We write a full model of high dimension based on mass-action laws. Using monotone system theory and time-scale arguments, we reduce it to a model with two variables (RNA polymerase and its mRNA). We show that it has either a single globally stable trivial equilibrium in (0, 0), or it has an unstable zero equilibrium and a globally stable positive one. We give generalizations of this model, in particular with a variable growth rate. The dynamical behavior can be related to biological observations on the bacterium Escherichia coli.
A first draft of this work was written by Ismail Belgacem, former PhD student of the BIOCORE team. My contributions were: performing simulations of the full and reduced models with a new set of parameters to have more realistic results from a biological point of view; comparing a classical model of RNA polymerase to the reduced model that we obtained, including a sensitivity study with respect to the number of ribosomes; designing and studying a reduced system including a variable growth rate.
A journal paper version of this chapter was submitted to Bulletin of Mathematical Bi-ology, in which I am second author (see Appendix A).
The central dogma of molecular biology argues that “DNA makes RNA and RNA makes proteins”, which are the primary components of cells, see . As we have seen in Chapter 3 gene expression starts with transcription, where the gene is copied into a messenger RNA (mRNA) by the RNA polymerase. The mRNA is then translated into proteins by ribosomes. In prokaryotic cells like bacteria, transcription and translation take place in the same compartment. As a consequence, ribosomes can translate nascent mRNAs being elongated by the RNA polymerase.
Classical models of gene expression often disregard the eﬀect of RNA polymerase and ribosome concentration on the accumulation of RNAs and proteins, which is assumed non limiting. Yet, some works emphasize the important role of the global machinery for gene expression (see  for an example). It is therefore interesting to build detailed models involving the main actors of the transcription-translation processes, such as RNA polymerase and ribosomes: some partial detailed models have been developed, see  for an example. We develop in this chapter a complete and detailed model of RNA polymerase. From the point of view of Control Theory, it is also a nice example of a positive feedback loop, where RNA polymerase is needed to transcribe its own gene.
Based on mass-actions laws, we first write a detailed mechanistic model of transcription and translation, where every event (binding, release,. . .) is accounted for. The high dimension of the resulting model makes it too diﬃcult to handle: we reduce it into a much simpler system by time-scale arguments and we study the mathematical properties of the reduced model. To investigate the stability of the fast subsystem and of the reduced system, we use monotone system theory and concavity properties.
Monotone systems form a class of dynamical systems such that the partial order in dimension n between two solutions is conserved (see ); see Appendix C.1 for more details about monotone systems. These tools are well adapted to analyze the stability of biological models . They have strong properties of convergence towards equilibria, and cannot (for example) exhibit stable periodic oscillations. The second tool is related to the concavity of functions used in diﬀerential equations . In our opinion, these tools are particularly appealing because they are qualitative (they do not depend too much on the values of parameters), and they give very strong results about the global dynamical behavior of the system . These tools have been applied to biological systems already: population dynamics , chemical networks … J.-L. Gouz´e and I. Belgacem have worked with monotone systems theory on metabolic-genetic networks  and on detailed models for gene expression, without any loop [14, 16]. Yet, to our knowledge, the theorem on concave and monotone systems has not been used in the context of detailed gene expression models, where functions can be given by rather complex algebraic expressions resulting from mass balance.
Using monotone system theory, we are able to prove the global stability of the fast sub-system with respect to the equilibrium; the global result is diﬃcult to prove by other techniques, and not proven or proven only locally in most similar works [44, Chapter 4]. We present the original model and its reduction in the next section. Using parameters built from the literature, we analyze the diﬀerent time scales in which evolve the variables of the full system. We decompose the full system into fast and slow subsystems and verify in Section 5.4 that the fast subsystem satisfies the conditions for applicability of the Tikhonov’s theorem. This allows us to put the fast subsystem at its quasi-steady state and obtain a reduced model with a similar dynamical behavior. In Section 5.6, we verify that the concavity and monotonicity assumptions hold for the reduced model. We show that the trivial equilibrium is either globally stable (in that case no other equilibrium exists) or locally unstable, and that it implies the existence and uniqueness of a positive equilibrium, which is globally stable with respect to the positive orthant. We provide the biological condition for this alternative. We then investigate a generalization of the model with a variable growth rate, compare it with simpler models, and finally give conclusions from a modeling and a biological point of view.
The coupled transcription-translation model of RNA polymerase
Description of the model
Figure 5.1 shows the transcription-translation model for the synthesis of RNA poly-merase in a single cell; for simplification, we consider it to be encoded from a single gene. This model is inspired from those given in . Transcription is initiated by the specific binding of RNA polymerase to the promoter region D onto the DNA, a process promoted by an initiating factor called σ. The RNA polymerase clears the promoter (with a constant rate kc) and moves along the DNA (with a constant rate kt). Complexes
Y and Y i describe the elongating RNA polymerase, which adds nucleotides one by one. Addition of the last nucleotide completes the full length mRNA, which is released from the RNA polymerase. The completed RNA molecule is either subject to degradation (with a constant rate km) or it is used by ribosomes as a template for the synthesis of a new RNA polymerase1. Translation starts with the ribosome R forming a complex RRN A′ with the free ribosome binding site RN A′ on the newly synthesized mRNA. After clearance of the ribosome binding site (with a constant rate kw), the elongating 1The process of translation can be initiated from every nascent mRNA as shown in . For simplicity, we suppose that proteins are synthesized from completed mRNAs only. This is consistent with recent observations on the lack of coupling between transcription and translation in E. COLI cells .
The protein and its mRNA are also subject to degradation (with a constant rate kp and km, respectively), and dilution by growth due to the augmentation of cell volume (at a rate µ). In a first step, we consider a constant growth rate µ and, for simplicity, the sum of degradation and dilution will be expressed by only one parameter: kp′ for the protein and km′ for the mRNA. In a second step (Section 5.9) we will consider the degradation rate and the dilution rate separately, where the latter varies in function of the RNA polymerase concentration.
• The nucleotides, the amino acids and the sigma factor are non limiting and their concentrations are included in the parameters;
• The free and bound forms of RNA polymerase and mRNA are considered to be degraded at the same rate;
• The degradation of the bound forms (PD and RRN A′) releases the promoter and the ribosome;
• The free mRNA corresponds to the mRNA with a free ribosome binding site RN A′.
Using classical mass action kinetics laws we obtain the system:
where p, d, c, y, yi and m are the concentrations of P , D, P D, Y , Y i and mRNA respectively, and where w, r, x and xi are the concentrations of RRN A′, R, X and Xi respectively. L and H are the lengths of the mRNA and the protein, respectively2. The promoter D remains intact during the degradation of the complex between the RNA polymerase and the promoter, which means that the following mass conservation relation holds for the total concentration of promoter:
where d0 is the initial concentration of promoter. We can reduce System (5.1) by re-placing the three first equations with the following ones:
The values of parameters in Tables 5.14, 5.2 and 5.3 have been carefully built from the literature based on classical papers such as . In the next section, we will show that System (5.1) has two diﬀerent time scales – fast and slow – and therefore it can be approximated by a reduced system by applying Tikhonov’s theorem (see Appendix C.2 and  for the statement of the theorem).
Principal process analysis and its robustness to parameter changes
In this chapter we discuss a work that has been written as a journal paper and submitted to Journal of Theoretical Biology in which I am first author (see Appendix A).
We design a method called principal process analysis (PPA) that aims at analyzing the key processes for the system behavior of dynamical networks of high dimension. The knowledge of the system trajectories allows us to decompose the system dynamics into processes that are active or inactive with respect to a certain threshold value. Process activities are graphically represented by boolean and dynamical process maps. We elim-inate from the model processes that are always inactive, and inactive in one or several time windows. This reduces the complex dynamics of the original model to the much simpler dynamics of the core processes, in a succession of sub-models that are easier to analyze. In this chapter we apply the method to a well-known model of circadian rhythms in mammals  and we use global relative errors to assess the quality of the model re-duction and apply global sensitivity analysis to test the influence of model parameters on the errors. The results obtained prove the robustness of our method. Analysis of the sub-model dynamics allows us to analyze the source of circadian oscillations. We find that the negative feedback loop involving proteins PER, CRY, CLOCK-BMAL1 is the main oscillator, in agreement with previous modeling and experimental studies. Hence, PPA is a simple-to-use method, which constitutes an additional and useful tool for analyzing the complex dynamical behavior of biological systems.
The parameter sensitivity analyses were performed with the supervision of Suzanne Touzeau, researcher at INRA and Inria of Sophia Antipolis.
In Appendix B we present our first use of PPA. We apply our method to a model that describes the circadian rhythm in Drosophila  and another one that describes the influence of RKIP on the ERK signaling pathway . The work in Appendix B has been presented at the 23rd Mediterranean Conference on Control and Automation MED, held in Torremolinos, Spain, on June 16th-19th, 2015 (with peer reviewed proceedings) and has been accepted as a conference paper in which I am first author (see Appendix A).
Mathematical modeling has been used for decades as an approach to address complex problems in several domains of biology. For example, it helps studying the large networks of metabolites, RNAs and proteins that allow cells to live and grow. Numerous models of these networks have been developed in computational biology, of increasing complexity due to advances in modeling and parameter estimation approaches (see  and  for an example). Complexity arises from the high dimension of these networks, the large number of biological processes involved and their non linearity due to complex feedback loops. This makes the analysis of network functioning rather diﬃcult, notably in terms of key processes and regulatory mechanisms for the system dynamics.
Diﬀerent techniques are classically used to reduce model complexity. The simplified models are easier to analyze, while retaining the main characteristics of the original models and their biological significance. Quasi-steady-state approximations are mostly used to reduce system dimension when diﬀerent time scales are present. Replacement of some ordinary diﬀerential equations (ODEs) by algebraic expressions results in a diﬀerential-algebraic system of smaller dimension. However the reduced models may remain diﬃcult to analyze . Other approaches simplify the mathematical functions describing the molecular processes. For instance, piece-wise aﬃne diﬀerential equations approximate by step functions the sigmoidal functions used to describe the regulation of gene expression. The dynamics of the simplified system can be easily analyzed by means of state transition graphs . However, these simplifications are generally restricted to models of gene expression and are more diﬃcult to apply to other types of networks .
Here we address the problem of high dimensional model analysis and reduction by de-veloping a mathematical and numerical approach, based on the boolean concept of ac-tivity/inactivity. The method, called principal process analysis (PPA), determines the contribution of each cellular process to the output of the dynamical system, without changing its main structure. We first identify processes that are not contributing signifi-cantly to the system dynamics and neglect the inactive ones. Some processes can switch from active, when they contribute most to the system dynamics, to inactive. In a second step, we thus define time windows in which processes are either always active or always inactive. We eventually create sub-models for each time window that only contain the active processes. This reduces the system to its core mechanisms. PPA is a general approach that can be easily applied to any biological system described by ODEs. For example, it has been recently used to reveal the correlation between C cycling and pesti-cide degradation in the detritusphere and fungal dynamics , or to reduce a dynamic metabolic model of lipid accumulation .
PPA shares some common features with a method focusing on the major model param-eters rather than processes . The exploration of parameter space leads to admissible system outputs. Parameters contributing most to the system dynamics are identified by cross sections of the admissible regions, whereas PPA uses dynamical weights and fixed thresholds to determine major processes. Parameters contributing less to the sys-tem dynamics are eventually removed. Another approach dedicated to chemical kinetics makes use of stoichiometric coeﬃcients and chemical reactions to identify and remove chemical species that contribute less to the model output . In this case, the problem is solved using optimization approaches (see also ).
The main objective in this work is that our method should neither change the structure of the model nor require additional and complicated computations as QSSA can do. The original and simplified models should remain close and the interpretation of results should be easy for the biologist.
In Appendix B, we started to develop PPA and applied it to two ODE models whose reduction preserved their dynamical behavior. Hence, the model of circadian rhythms in Drosophila  was reduced into two models, each describing the system dynam-ics during day light or darkness. The simpler models maintain a functional negative feedback loop responsible for the oscillatory behavior of clock proteins. In the second model describing the regulation of the ERK signaling pathway, the process found to be the most active influences the variable determinant for the system dynamics, as shown in . Questions remained open though, concerning the scalability of the approach and its robustness: to which extent does PPA preserve model dynamics in systems of higher dimension, with many more biological processes involved and interlocked feed-back loops? And since the approach requires an a priori knowledge of parameter values, how sensitive are process activities or inactivities to parameter values? In this study, we address these questions by studying a much more complex model of circadian rhythms in mammals, including 16 variables, 76 processes, and intertwined positive and negative feedback loops . Parameter sensitivity analysis of the global relative error between the original and reduced systems allowed us to assess the quality and robustness of our approach. To that aim, fractional factorial design was used to explore a large parameter space in a limited number of simulations.
Table of contents :
1.3 Organization of the manuscript and contributions
2 Introduction (en francais)
3 Notes on molecular cell biology
3.1 Escherichia coli
3.2 Growth of E. coli
3.3 Gene expression
3.3.3 mRNA degradation
3.4 Regulation of gene expression in E. coli
4 Modeling genetic regulatory network systems
4.1 Ordinary differential equation models
4.1.1 Modeling transcription-translation
4.1.2 Quasi-steady-state assumption of mRNA concentration
4.2 Analysis of a genetic bistable switch
4.2.1 Phase plane analysis
4.2.2 Jacobian matrix
4.2.3 Piece-wise affine linear system
4.3 Parameter sensitivity analysis
4.3.1 Local sensitivity analysis
4.3.2 Global sensitivity analysis
4.4 Parameter fitting
5 Reduction and stability analysis of a transcription-translation model of RNA polymerase 35
5.2 The coupled transcription-translation model of RNA polymerase
5.2.1 Description of the model
5.2.2 Full equation
5.3 Time-scale reduction (fast-slow behavior)
5.3.1 Parameter values for the coupled transcription-translation models of RNA polymerase
5.3.2 Separation of the full system into “fast” and “slow” variables
5.4 Verification of the applicability of the Tikhonov’s theorem for the fast subsystems
5.5 Application of the Tikhonov’s theorem
5.6 Dynamical study of the reduced system
5.6.1 Simulations of the full and the reduced system
5.6.2 Equilibria of the reduced system
5.6.3 Stability of equilibria
5.7 Applications to other models
5.8 Comparison with a classical model of RNA polymerase
5.9 System with a variable growth rate
6 Principal process analysis and its robustness to parameter changes
6.2.1 Principal process analysis (PPA)
6.2.2 Visualization of process activities
6.2.3 First model reduction
6.2.4 Creation of chains of sub-models
6.2.5 Global sensitivity analysis
6.3 Model description
6.4 Principal process analysis and first reduction
6.5 Creation of sub-models
6.6 Parameter influence
7 Principal process analysis and reduction of biological models with dif-ferent orders of magnitude
7.2.1 Principal process analysis and model reduction
7.2.2 Principal process analysis and model reduction based on initial conditions in a rectangle
7.2.3 Possible transitions between domains
7.3 The gene expression model
7.4 Model reduction from an initial condition
7.5 Model reduction in a rectangle
8 Principal process analysis applied to a model of endocrine toxicity induced by Fluopyram
8.2.1 Absolute principal process analysis
8.2.2 Visualization of the process activity
8.3 Hierarchical graph
8.4.1 Blood compartment
8.4.2 Liver compartment
8.4.3 Brain compartment
8.4.4 Thyroid compartment
8.4.5 The data
8.4.6 Different experiments in silico
8.5 Absolute principal process analysis on the experiment 1A
8.6 Absolute principal process analysis on the experiment 2B
8.7 Conclusion and future steps
9 Model and control of the gene expression machinery in E. coli
9.2 The model
9.2.1 Growth rate
9.3 The effect of IPTG on E. coli growth
9.4 Model analysis with three-level PPA
9.4.2 Different applications
220.127.116.11 Nutrient stress condition
18.104.22.168 IPTG stress condition
10 Single-cell model calibration of growth control experiments in E. coli
10.3.2 Extraction of cellular profiles
10.3.3 Calculation of average cell profiles
10.3.4 Calibration of the model
10.4.1 Cellular profiles
10.4.2 Calibration of the average cell model
10.4.3 Calibration of the single-cell models
11 Conclusion and perspectives
11.1 Classical tools for the analysis and reduction of biological models
11.2 New tools for the analysis and reduction of biological systems
11.3 Design and analysis of the gene expression machinery in E. coli
11.4 Single-cell and average cell calibration of the gene expression machinery control in E. coli
12 Conclusion et perspectives (en francais)
12.1 Outils classiques pour l’analyse et la r´eduction des mod`eles biologiques
12.2 Nouveaux outils pour l’analyse et la r´eduction des syst`emes biologiques
12.3 Mod´elisation et analyse du m´ecanisme d’expression des g`enes dans E. coli
12.4 Calibration d’un mod`ele de contrˆole de la machinerie d’expression des g`enes dans E. coli en utilisant les profils de la cellule individuelle et la moyenne