Synthetic biology tools for Y. lipolytica
Synthetic biology is an emerging discipline that aims to apply engineering principles to biological systems to render them more controllable, standardized, and predictable. To accomplish these tasks and significantly accelerate research, new and highly efficient tools are being developed, allowing systems of interest to be constructed more rapidly and at a lower cost. In recent years, different synthetic biology tools have been created and applied in Y. lipolytica, including 1) DNA parts for the construction of expression cassettes, 2) DNA assembly techniques, 3) genome-editing techniques, and 4) computational tools. The significant expansion of the dedicated genetic toolbox for this oleaginous yeast further expands its range of applications.
In terms of DNA parts for expression cassettes, promoters, terminators, selection markers and sequences for genome integration need to be considered. Selecting promoters according to their strength is the most widespread method for controlling gene expression, which is a key parameter to optimize in pathway engineering. Consequently, significant efforts have been made to develop promoters exhibiting a wide range of transcriptional activities. In Y. lipolytica, in addition to endogenous constitutive and inducible promoters, hybrid promoters have been developed, mostly by fusing multiple upstream activating sequences (UASs) to a core promoter (Madzak et al., 2000; Müller et al., 1998; Park et al., 2019; Trassaert et al., 2017). The strength of these hybrid promoters increases with the number of tandem UAS copies; some can increase expression efficiency eight-fold compared with known endogenous promoters in Y. lipolytica (Blazeck et al., 2011). Terminators, on the other hand, have been much less studied and developed, even though they are essential for completing the transcription process. Native terminators are mostly used; however, synthetic terminators were recently constructed (Curran et al., 2015; Wagner and Alper, 2016).
Regarding selection markers, both auxotrophic and dominant markers are available for Y. lipolytica. Auxotrophic markers (e.g., for leucine, uracil, lysine, or adenine), which can only be used with specific strains, remain the best choice for performing selection in Y. lipolytica (Barth and Gaillardin, 1996). Dominant markers available for Y. lipolytica include the hygromycin resistance gene from E. coli (Cordero Otero and Gaillardin, 1996), the nourseothricin resistance gene from Streptomyces noursei (Kretzschmar et al., 2013), the mycophenolic acid resistance gene from E. coli (Wagner et al., 2018), and the zeocin resistance gene from Streptoalloteichus hindustanus (Tsakraklides et al., 2018). Other dominant markers rely on the utilization of a specific carbon or nitrogen source, such as sucrose when the SUC2 gene from S. cerevisiae is expressed (Nicaud et al., 1989), erythritol when the EYK1 gene is expressed (Vandermies et al., 2017), or acetamide when the AMD1 gene is expressed (Hamilton et al., 2020). In the last two cases, the native gene must first be deleted from the host strain before it can be used as a marker, as is the case for auxotrophic markers.
Finally, integration into the genome is mostly achieved by using large (0.5–1 kb) homologous 5′ and 3′ flanking regions or zeta regions from the Ylt retrotransposon. While homologous sequences are used for specific genome targets, zeta sequences lead to random integration into the genome in strains devoid of retrotransposon elements (Juretzek et al., 2001). For the purposes of heterologous expression and genetic engineering, integration into the genome is preferred over synthetic replicative plasmids (Nicaud et al., 2002). The use of such plasmids is limited in Y. lipolytica because of their low copy numbers (~1–3 plasmids/cell) and high frequency of loss (Fournier et al., 1993; Nicaud et al., 1991).
As the cornerstone of synthetic biology, the DNA assembly process allows the construction of novel biological systems and devices using defined, standardized, and well-characterized components. The assemblies consist of DNA fragments physically linked end to end, creating a target higher-order assembly that is then joined to a vector. Traditional techniques employing restriction digestion and element-by-element cloning are time-consuming and costly (Celińska and Grajek, 2013; Matthäus et al., 2014). Therefore, significant efforts are being made to develop methods that allow multigene cassettes to be constructed more quickly and efficiently, which will in turn make it easier to construct strains with complex genetic functionalities. Among the most recent and relevant methods are Golden Gate assembly, Gibson assembly, BioBricks and Gateway cloning (Figure 2).
Golden Gate (GG) assembly is a modular cloning approach, based on type IIS restriction enzymes, for assembling multiple genes in a single-step, one-pot reaction (Engler et al., 2008). Type IIS enzymes cut outside their recognition sites, so the overhangs of the excised sequences can be arbitrarily defined. The DNA modules are designed in such a way that the enzyme recognition site is lost after digestion and the overhangs generated are compatible with each other, which imposes a predetermined sequential order on the modules. In Y. lipolytica, this technique has been successfully used to assemble heterologous pathways of three genes at once, allowing the production of β-carotene and the utilization of xylose as a carbon source (Celińska et al., 2017; Larroude et al., 2019). One of the advantages of GG is that it allows combinatorial assembly, which can be used to efficiently generate libraries; this approach was used to identify the best promoter-gene pairs for the carotenoid pathway (Larroude et al., 2017).
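The ordering logic of Golden Gate assembly can be illustrated with a minimal Python sketch. It assumes that each part has already been released from its donor vector with defined 4-nt overhangs; the part names, sequences, and overhangs below are hypothetical and are not those of any published Y. lipolytica toolkit.

```python
from typing import NamedTuple


class Part(NamedTuple):
    name: str
    left: str   # 4-nt overhang on the upstream side after type IIS digestion
    body: str   # sequence between the two overhangs
    right: str  # 4-nt overhang on the downstream side


def assemble(parts, start, end):
    """Chain parts by matching each right overhang to the next left overhang."""
    by_left = {}
    for part in parts:
        if part.left in by_left:
            raise ValueError(f"overhang {part.left} is used by two parts")
        by_left[part.left] = part
    order, seq, current = [], start, start
    while current != end:
        part = by_left.pop(current, None)
        if part is None:
            raise ValueError(f"no part starts with overhang {current}")
        order.append(part.name)
        seq += part.body + part.right
        current = part.right
    if by_left:
        raise ValueError("some parts could not be placed in the assembly")
    return order, seq


# Hypothetical parts, listed out of order on purpose (bodies in lowercase).
parts = [
    Part("terminator", "GCTT", "ttttat", "CGCT"),
    Part("promoter", "GGAG", "tataaa", "AATG"),
    Part("gene", "AATG", "gccaaa", "GCTT"),
]
order, product = assemble(parts, start="GGAG", end="CGCT")
print(order)    # ['promoter', 'gene', 'terminator']
print(product)  # GGAGtataaaAATGgccaaaGCTTttttatCGCT
```

Because the order is encoded entirely in the overhangs, the order in which the parts are listed (or mixed in the tube) does not matter, which is what makes a one-pot reaction possible.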
Gibson assembly allows multiple DNA fragments to be assembled regardless of their length or end compatibility. It exploits three different enzymes: (1) an exonuclease that creates single-stranded 3′ overhangs, which facilitate the annealing of fragments that share complementarity at one end (the overlap region); (2) a DNA polymerase that fills the gaps within each annealed fragment; and (3) a DNA ligase that seals the nicks in the assembled DNA. The end result is a fully sealed double-stranded DNA molecule (Gibson et al., 2009). In Y. lipolytica, this technique was used to study the xylose pathway (Rodriguez et al., 2016).
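The role of the overlap region can be sketched in a few lines of Python. This is only an in silico illustration of how fragments with identical terminal sequences are merged; it does not model the enzymatic steps, and the fragment sequences and the default minimum overlap of 30 nt are assumptions chosen for the example.

```python
from functools import reduce


def gibson_join(left, right, min_overlap=30):
    """Merge two fragments that share an identical terminal overlap region."""
    longest = min(len(left), len(right))
    # Try the longest possible overlap first, then progressively shorter ones.
    for n in range(longest, min_overlap - 1, -1):
        if left[-n:] == right[:n]:
            return left + right[n:]
    raise ValueError("no terminal overlap of sufficient length")


# Hypothetical fragments sharing a 32-nt overlap.
overlap = "ATGGCTAGCAAGGGCGAGGAGCTGTTCACCGG"
fragment_1 = "GGGGCCCC" + overlap
fragment_2 = overlap + "TTTTAAAA"

assembled = reduce(gibson_join, [fragment_1, fragment_2])
print(assembled == "GGGGCCCC" + overlap + "TTTTAAAA")  # True
```

The overlap appears only once in the product, which is why Gibson assembly leaves no scar at the junctions.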
A set of BioBrick-based vectors dedicated to Y. lipolytica was developed by Wong and colleagues (Wong et al., 2017). The so-called YaliBricks system relies on four restriction enzyme sites that generate compatible ends (AvrII, XbaI, SpeI, and NheI), enabling modular genetic engineering and the reuse of parts. The system was used to construct the violacein biosynthetic pathway (Wong et al., 2017). It is a fast and easy method, but it requires genes that are free of these restriction sites.
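Why these four enzymes are compatible can be checked with a short Python sketch. It relies on their standard recognition sequences (AvrII C^CTAGG, XbaI T^CTAGA, SpeI A^CTAGT, NheI G^CTAGC), which all leave the same 5′-CTAG overhang; the function name is hypothetical and the sketch is not part of the YaliBricks publication.

```python
from itertools import permutations

# Recognition sequences; each enzyme cuts after the first base (N^CTAGN).
SITES = {"AvrII": "CCTAGG", "XbaI": "TCTAGA", "SpeI": "ACTAGT", "NheI": "GCTAGC"}


def junction(upstream_enzyme, downstream_enzyme):
    """Sequence formed when an end cut by one enzyme is ligated to an end cut by another."""
    first_base = SITES[upstream_enzyme][0]
    last_base = SITES[downstream_enzyme][-1]
    return first_base + "CTAG" + last_base


print(junction("XbaI", "SpeI"))  # TCTAGT, a scar cut by neither XbaI nor SpeI
print(junction("XbaI", "XbaI"))  # TCTAGA, the XbaI site is regenerated

# Every mixed pair yields a scar that none of the four enzymes recognizes.
scars = [junction(a, b) for a, b in permutations(SITES, 2)]
print(all(scar not in SITES.values() for scar in scars))  # True
```

Because the scar cannot be re-cut, an assembled unit keeps intact outer sites and can be reused as a part in the next cloning round.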
The Entry/Gateway® method employs site-specific recombination between att sites on interacting molecules to rapidly clone single DNA sequences into multiple destination plasmids (Hartley et al., 2000). It was recently adapted for use in Y. lipolytica and used to construct a library overexpressing individual transcription factors in order to identify regulators involved in lipid metabolism (Leplat et al., 2018).
It is important to note that there is no perfect technique for universal DNA assembly. Each method presents advantages and disadvantages in different situations. Therefore, the selection of a method will depend on one’s objectives.
Golden Gate assembly exploits type IIS restriction enzymes, which cut outside their recognition sites to excise parts with arbitrarily defined overhangs. Through the careful selection of compatible overhangs, such parts can be assembled altogether in a defined order. In the figure, the letters A to M represent different compatible 4-nt overhangs; the yellow, orange, and red arrows represent promoters; the green arrows represent genes; the violet circles represent markers; the blue squares represent insertion sequences; and the Ts represent terminators. The BioBricks technique is used to clone parts via restriction digestion and subsequent ligation of the resultant compatible sticky ends. YaliBricks vectors were designed to have AvrII, XbaI, SpeI and NheI endonuclease recognition sites. The ligation of the compatible overhangs produces a scar that is no longer recognized by either enzyme. In Gibson assembly, parts are synthesized to overlap by 30+ bp. Their ends are then processed by an exonuclease that creates single-stranded 3′ overhangs, which facilitates annealing. The overhangs are fused together using a polymerase, which fills in gaps within each annealed fragment, and a ligase seals the nicks in the assembled DNA. In the Gateway method, the gene of interest, which has been cloned into the entry vector, is transferred into the destination vector via att site recombination. The expression vector obtained is digested to release the expression cassette and used to transform Y. lipolytica. Figure adapted from Larroude et al., 2018.
In Y. lipolytica, some genome-editing tools have been developed for metabolic engineering (Figure 3), including a disruption cassette system for gene knock-out (Fickers et al., 2003) and transcription activator-like effector nucleases (TALENs) (Rigouin et al., 2017). However, the low efficiency and time-consuming procedures involved in these methods make multiple modifications tedious. The emergence of the CRISPR-Cas system offers a potential solution to these problems because of its high efficiency and ease of use.
The CRISPR-Cas system basically consists of an endonuclease guided by an RNA molecule to a specific DNA sequence, which is then cleaved (Jinek et al., 2012). The modifications in the genome are driven by the natural DNA repair mechanisms of the cell after the DNA break. Most of the time, repair results in small insertions or deletions (indels) of random size, which produce frameshift mutations that lead to the production of truncated, nonfunctional proteins. If a DNA template is provided, repair can occur by homologous recombination (HR), which can be used to insert a specific DNA sequence at the break point (Figure 3-B). The CRISPR-Cas system has thus turned out to be of great help for targeted genome modification. In addition, it increases the rate of DNA integration by HR in organisms where HR is strongly impaired by a high rate of non-homologous end joining (NHEJ), as in Y. lipolytica.
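The effect of indel size on the reading frame can be made concrete with a small Python sketch. It assumes the standard genetic code and uses a made-up 18-nt open reading frame; it only illustrates that an indel whose length is not a multiple of three shifts the frame and can expose a premature stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): AMINO_ACIDS[i]
    for i, codon in enumerate(product(BASES, repeat=3))
}


def translate(orf):
    """Translate from the first base until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(orf) - 2, 3):
        amino_acid = CODON_TABLE[orf[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def is_frameshift(length_change):
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return length_change % 3 != 0


orf = "ATGGTAACCGGTTTCTAA"               # hypothetical ORF
one_base_deletion = orf[:3] + orf[4:]    # 1-nt deletion after the start codon
three_base_deletion = orf[:3] + orf[6:]  # in-frame 3-nt deletion

print(translate(orf))                  # MVTGF
print(translate(one_base_deletion))    # M (premature stop codon)
print(translate(three_base_deletion))  # MTGF (one residue lost, frame kept)
```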
The CRISPR-Cas9 system was successfully implemented in Y. lipolytica for the first time by Schwartz and colleagues (Schwartz et al., 2016). The Cas9 nuclease targets specific genome loci defined by a 20-nt region of the single guide RNA (sgRNA) and creates a double-strand break (DSB), whose repair then introduces the genome modification. The system was rapidly expanded afterwards to allow gene knock-out/knock-in and repression/activation applications, which has been very helpful for accelerating engineering cycles in this yeast (S. Gao et al., 2016; Schwartz et al., 2017b, 2017a; Zhang et al., 2018).
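Selecting a 20-nt target can be sketched as a simple sequence search. The Python example below assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG protospacer-adjacent motif (PAM) immediately 3′ of the target; the input sequence is hypothetical, and real guide design would also score off-target sites, which this sketch ignores.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_protospacers(seq, guide_len=20):
    """List (strand, start, protospacer, PAM) for every site followed by an NGG PAM.

    Start positions on the '-' strand refer to the reverse-complemented sequence.
    """
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        # The lookahead lets overlapping candidates be reported as well.
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Hypothetical locus containing a single candidate site on the '+' strand.
locus = "TTTT" + "ACGTACGTACGTACGTACGT" + "TGG" + "TTTT"
for hit in find_protospacers(locus):
    print(hit)  # ('+', 4, 'ACGTACGTACGTACGTACGT', 'TGG')
```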
Schwartz and colleagues expressed the gRNA under a synthetic hybrid RNA polymerase III promoter and used a codon-optimized Cas9. The efficiencies of single-gene disruption and HR were more than 90% and 70%, respectively. HR efficiency reached 100% when NHEJ was disrupted in the strain (Schwartz et al., 2016). These researchers also managed to integrate multiple genes at different loci and showed that gene integration efficiency depends on the integration site: of the 17 loci tested, only 5 had high CRISPR-Cas9-mediated integration frequencies (48–62%) (Schwartz et al., 2017b). In parallel, a second strategy for CRISPR-Cas9 genome editing in Y. lipolytica was developed (S. Gao et al., 2016). It involves expressing a human-codon-optimized Cas9 variant and a gRNA flanked by ribozymes under the control of an RNA polymerase II (RNAP II) promoter. Its efficiency was 86% after four days of outgrowth. Both systems allow highly effective gene targeting.
Two other CRISPR tools have been developed for use in Y. lipolytica. Holkenbrink and colleagues created a toolbox, EasyCloneYALI, containing a set of plasmids for integrating expression cassettes at defined genomic loci; users can employ different selection markers or use a marker-free mode. In this study, a ΔKu70 strain was used to make HR more efficient, and Cas9 was integrated into the genome and constitutively expressed. Eleven intergenic sites with high gene expression levels were identified, but only five had efficiencies higher than 80% for marker-free integration (Holkenbrink et al., 2018). Very recently, Gao and colleagues developed a dual CRISPR-Cas9 strategy using paired gRNAs to create complete gene knockouts via gene excision. Basically, two vectors, each containing a Cas9 gene and a single guide RNA (sgRNA) cassette, are co-transformed into Y. lipolytica. The gRNAs were designed to target areas upstream of the start codon and downstream of the stop codon, which led to complete gene excision when the breaks occurred simultaneously and the resulting genomic ends were joined. The strategy was tested on six genes, and excision efficiency reached about 20%. They also used this dual CRISPR-Cas9 strategy to integrate a marker-free DNA fragment into the excision region, reaching integration efficiencies that ranged from 15% to 37% (Gao et al., 2018).
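The outcome of the paired-gRNA strategy can be illustrated with a minimal Python sketch: two cut positions flank the open reading frame, the intervening region is lost, and an optional donor fragment takes its place. The coordinates, sequences, and function name are hypothetical and only serve to show the principle.

```python
def excise(genome, cut_upstream, cut_downstream, donor=""):
    """Join the outer fragments after two cuts, optionally inserting a donor."""
    if not 0 <= cut_upstream <= cut_downstream <= len(genome):
        raise ValueError("cut positions must be ordered and inside the sequence")
    return genome[:cut_upstream] + donor + genome[cut_downstream:]


# Hypothetical locus: flanks in lowercase, ORF in uppercase.
locus = "ccccc" + "ATGAAATAA" + "ggggg"

# Cuts placed upstream of the start codon and downstream of the stop codon.
knockout = excise(locus, 3, 16)
knock_in = excise(locus, 3, 16, donor="TTTT")

print(knockout)  # cccggg
print(knock_in)  # cccTTTTggg
```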
CRISPR techniques were also developed to control gene expression. In these systems, a catalytically inactive Cas9 (dCas9), which is able to bind to specific DNA sequences when co-expressed with a gRNA but unable to introduce DSBs, is used to repress or activate gene expression. Repression is achieved by steric interference with the initiation or elongation of transcription caused by dCas9 bound to the DNA, and it can be enhanced by fusing a repressor effector domain, such as KRAB or Mxi1, to dCas9. Conversely, transcription can be activated by fusing a transcriptional activator domain to dCas9 and directing the complex to the promoter region of the gene (Gilbert et al., 2013; Qi et al., 2013) (Figure 3c).
Schwartz and colleagues adapted CRISPR interference and activation (CRISPRi and CRISPRa) systems for use in Y. lipolytica. Nine genes were tested for gene repression, and for eight of them transcription was repressed by at least 50% using a multiplex strategy. Repression was enhanced when the Mxi1 repressor, but not the KRAB repressor, was fused with dCas9. This technique was used to repress KU70 and KU80, which led to an HR efficiency of 90% (Schwartz et al., 2017a). The same group developed a CRISPRa system to activate genes in Y. lipolytica. They screened four different activation domains and several target sites in the promoter region. By adding the VPR activation domain to dCas9 and choosing gRNA target sites upstream of the core promoter, they activated two native β-glucosidase genes, BGLI and BGLII, which allowed growth on cellobiose (Schwartz et al., 2018).
In addition, Zhang and colleagues used a CRISPRi system for gene repression in Y. lipolytica in which four different repressor constructs (dCpf1, dCas9, dCpf1-KRAB, and dCas9-KRAB) were employed. Gene repression efficiency exceeded 80% when three different sites were simultaneously targeted for the same gene using a multiplex gRNA strategy. However, no strong repression was achieved by targeting only one site in the genome. As shown by Schwartz and colleagues (2017a), the KRAB domain did not affect repression by dCas9. In contrast, compared with dCpf1 alone, the use of dCpf1-KRAB increased repression efficiency by about 30% (Zhang et al., 2018).
All the tools described above were further discussed as part of a larger recent review of synthetic biology tools for Y. lipolytica, published in Biotechnology Advances (Larroude et al., 2018). The review, which also covers other tools such as computational tools, is presented at the end of this section.
Since the publication of this review, further genome modification tools have been developed for Y. lipolytica, four of which are based on CRISPR systems; they are summarized hereafter. These developments illustrate the rapid expansion of the synthetic biology field and the growing interest in Y. lipolytica.
Morse and co-workers developed a CRISPR-Cas9 system based on an orthogonal T7 polymerase for gRNA expression, thus avoiding the drawbacks of host-dependent RNA processing when using type II CRISPR-Cas9. The system consists of a T7 polymerase carrying an SV40 nuclear localization tag, driven by a strong constitutive promoter, and a T7 phi9 promoter driving guide RNA expression. It was first designed in S. cerevisiae, where the editing frequency of the CAN1 gene was around 1%, in accordance with the results obtained for the same gene using the commonly used Pol III SNR52 system. The system was then transferred to Y. lipolytica, where CAN1 was disrupted with 60% efficiency, and to K. lactis, where NDT80 was edited with 96% efficiency (Morse et al., 2018).