Performing genotype-phenotype mappings of enzymes is limited with current high-throughput approaches
Deep mutational scanning approaches yield interesting results regarding the properties of the genotype-phenotype landscape. However, high-throughput methods that quantitatively measure absolute values of thermodynamic constants are still lacking. Some efforts have been made in this direction for binding proteins, but not for enzymes.
There is also a need for methods that perform enzymatic assays in vitro, in order to decouple catalytic efficiency from fitness. To my knowledge, Romero et al. were the first to couple an in vitro enzymatic assay with a deep mutational scanning approach to perform a systematic genotype-phenotype mapping of an enzyme. Thanks to droplet-based microfluidics, they could physically couple the genotype, the phenotype and the product of the enzymatic reaction. However, the product of the activity was not normalized by the expression level, which prevents quantitative measurements.
Thanks to its high throughput and the possibility it offers to couple the genotype to the product of the activity, droplet-based microfluidics is very well suited to the genotype-phenotype mapping of enzymes. I started developing high-throughput, quantitative droplet-based work-flows to measure the catalytic efficiency of large numbers of enzyme variants by simultaneously measuring their expression level and the product of their activity. The next section discusses the state of the art in enzymatic assays in droplet-based microfluidics.
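To illustrate why normalizing by the expression level matters, the sketch below estimates kcat/KM from a single end-point measurement. It is a minimal example and not the work-flow developed in this thesis. It assumes that the substrate concentration is far below KM, so that product accumulates as P(t) = S0 (1 - exp(-(kcat/KM) E0 t)), and that the fluorescence signals have already been converted to concentrations by calibration. The function name and all numerical values are hypothetical.

```python
import math


def catalytic_efficiency(product_conc, substrate_conc0, enzyme_conc, time_s):
    """Estimate kcat/KM (in M^-1 s^-1) from one end-point measurement.

    Assumption: [S] << KM, so the substrate is consumed with first-order
    kinetics and P(t) = S0 * (1 - exp(-(kcat/KM) * E0 * t)).
    Concentrations are in mol/L and time is in seconds.
    """
    if not 0 <= product_conc < substrate_conc0:
        raise ValueError("product must be >= 0 and below the initial substrate")
    if enzyme_conc <= 0 or time_s <= 0:
        raise ValueError("enzyme concentration and time must be positive")
    converted_fraction = product_conc / substrate_conc0
    return -math.log(1.0 - converted_fraction) / (enzyme_conc * time_s)


# Hypothetical example: two droplets show the same product concentration,
# but the second one expressed ten times more enzyme.
low_expr = catalytic_efficiency(0.2e-6, 1e-6, 1e-9, 600)
high_expr = catalytic_efficiency(0.2e-6, 1e-6, 1e-8, 600)
print(low_expr / high_expr)  # ratio of the two estimated efficiencies
```

In this example the product signal alone would rank the two variants as equal, whereas the normalized estimate attributes a ten times lower catalytic efficiency to the variant that was expressed ten times more.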
Constraints of the droplet format on performing quantitative enzymatic assays
My experiments should allow me to measure the properties of millions of enzyme variants: their catalytic efficiency (ideally the Michaelis-Menten constants kcat and KM should be measured), their stability (melting temperature) and their specificity (simultaneous detection of more than one type of activity using two or more substrates). Sorting droplets and sequencing their content should allow me to relate the phenotype to the genotype. Fluorescence-Activated Droplet Sorting (FADS) devices are operated under a microscope bearing a single objective: the experimenter can only focus on one location of the microfluidic device at a time (the sorting fork in this case) and measure the droplets' characteristics there. Droplets pass this detection point only once. As a result, high-throughput droplet-based enzymatic assays have so far been limited to a single-point measurement in the course of an enzymatic reaction. The fluorescence of each droplet at this detection point should therefore provide enough information to extract the properties of the variants at high frequency. The amount of information obtained depends on the following three quantities:
• The noise of the experimental measurement: how variable is the result of the measurement on one particular enzyme variant?
• The resolution of the assay: what is the smallest difference of activity that can be distinguished?
• The dynamic range: what is the amplitude between the lowest and the highest activities that can be measured?
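The three quantities above can be made concrete with a short sketch. The definitions used here are assumptions chosen for illustration (coefficient of variation for the noise, a multiple of the standard deviation for the resolution, and a ratio of signals for the dynamic range). Other conventions exist, and the function names and fluorescence values are hypothetical.

```python
from statistics import mean, stdev


def noise_cv(replicates):
    """Noise: coefficient of variation of repeated measurements on one variant."""
    return stdev(replicates) / mean(replicates)


def resolution(replicates, n_sigma=2.0):
    """Resolution: smallest signal difference treated as distinguishable.

    Assumption: two signals are distinguishable when they differ by at
    least n_sigma sample standard deviations.
    """
    return n_sigma * stdev(replicates)


def dynamic_range(lowest_signal, highest_signal):
    """Dynamic range: ratio between the highest and lowest measurable signals."""
    if lowest_signal <= 0:
        raise ValueError("lowest measurable signal must be positive")
    return highest_signal / lowest_signal


# Hypothetical fluorescence readings (arbitrary units) for one variant.
readings = [98.0, 100.0, 102.0]
print(noise_cv(readings), resolution(readings), dynamic_range(10.0, 1000.0))
```

With these definitions a noisier measurement directly degrades the resolution, which is one reason to consider the three quantities together when designing a single-point assay.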
Rattus norvegicus trypsin (rat trypsin). Are trypsin sectors independent functional units? What is the relation between the trypsin sectors and epistasis?
Rattus norvegicus trypsin is a member of the S1A serine protease family and has already been used as a model system to study the sector hypothesis: are trypsin sectors mutationally independent functional units? We chose this model system to address this question systematically.
The serine proteases
Serine proteases are characterized by a catalytic triad Asp-His-Ser where the serine is used as a nucleophilic center to attack carbonyl groups in the polypeptide chain. This mechanism is thought to have evolved independently several times, as it is found in different families of proteases that do not share the same fold. Among the serine proteases, trypsin is a classical biochemistry model system. It is well characterized, abundant (usually purified from bovine or porcine pancreas) and many substrates are commercially available.
Rattus norvegicus trypsin is a member of the S1A serine protease subfamily (chymotrypsin A-like serine proteases), which is the largest subfamily of serine proteases (40596 sequences according to the MEROPS database). Inside this subfamily, chymotrypsins and trypsins display different specificities in spite of similar structures. Chymotrypsins cleave polypeptides on the carbonyl side of large hydrophobic residues, whereas trypsins prefer lysine or arginine residues. Switching from trypsin to chymotrypsin specificity is not trivial: it has been shown to require mutating residues not only in the binding pocket but also in surface loops that are not directly involved in catalysis [194–196]. However, the specificity of all possible mutants separating a representative trypsin from a representative chymotrypsin has not yet been measured systematically. The mechanisms of molecular adaptation from trypsin to chymotrypsin (and vice versa) are therefore unknown.
Fabrication and preparation of disposable devices to handle aqueous phases and emulsions
Here I present the protocols I followed to fabricate the disposable devices used to inject aqueous solutions into the microfluidic chips and to collect and reinject droplets.
Fabricating and preparing 1 mL aqueous phase injectors
Rigid PDMS plugs (cylinders) are cut out of PDMS remnants using a 6 mm diameter biopsy punch (the height of the plug is approximately 5 mm). A 0.5 mm hole is punched in the PDMS plug. The hole is cleaned with isopropanol and then dried with dry compressed air. A piece of tubing (approximately 10 cm long) is inserted through the hole so that at least half a centimeter protrudes on the other side. The piston is removed from the body of a 1 mL plastic syringe (latex free, Omnifix, Braun) and the plug is inserted inside, with the long part of the tubing facing outwards. A needle (27G, Terumo) and tubing (PTFE, 0.3 mm ID, 0.76 mm OD, Adtech) are fitted to the body of the plastic syringe. The syringe body is then filled with fluorinated oil from its bottom, using another syringe full of oil. Once the free tubing of the injector is filled with oil, its tip is dipped into the aqueous solution. The aqueous phase is gently drawn into the injector by pulling the piston of the syringe that was used to fill the injector with oil.
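As a rough order of magnitude, the internal volume of the free tubing can be computed from its dimensions. The sketch below assumes that the roughly 10 cm piece of tubing has the 0.3 mm inner diameter quoted for the PTFE tubing, which the protocol does not state explicitly. The function name is hypothetical.

```python
import math


def tubing_volume_ul(inner_diameter_mm, length_mm):
    """Internal volume of a cylindrical tube in microliters (1 mm^3 = 1 µL)."""
    radius_mm = inner_diameter_mm / 2
    return math.pi * radius_mm ** 2 * length_mm


# Assumed dimensions: 0.3 mm inner diameter, 10 cm (100 mm) length.
print(round(tubing_volume_ul(0.3, 100), 2))  # volume in µL
```

Under these assumptions the tubing holds about 7 µL, which is small compared with the 1 mL capacity of the injector.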
Previous work on a cell-free microfluidic work-flow to assay SGAP in droplets
In this section, I briefly discuss the work already carried out by Alexei Godina to assay SGAP in droplets. During his PhD, Alexei Godina developed an in vitro expression microfluidic work-flow based on the PURE system in vitro transcription-translation kit (PUREfrex, Cosmobio) to screen SGAP libraries for aminopeptidase and phosphodiesterase activities simultaneously. This expression kit is composed of purified components (RNA polymerase, ribosomes, tRNA, etc.) [155, 156] and showed no detectable background aminopeptidase or phosphodiesterase activity, contrary to other commercial cell-free protein synthesis kits based on cell extracts. The work-flow included the following steps:
1. Encapsulating at most one SGAP mutant plasmid per droplet to obtain at most one genotype per droplet.
2. Adding the expression mix to the droplets containing the SGAP mutants.
3. Expressing the SGAP mutant proteins in droplets.
4. Adding a cocktail of fluorogenic substrates to each one of the droplets by picoinjection.
5. Letting the aminopeptidase and phosphodiesterase enzymatic reactions occur in droplets.
6. Measuring the fluorescence of the hydrolyzed substrates.
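Step 1 relies on diluting the plasmids so that most occupied droplets contain a single genotype. A common way to reason about this, assumed here for illustration and not taken from the work-flow itself, is that the number of plasmids per droplet follows a Poisson distribution with mean lam. The sketch below computes the resulting fractions of empty, singly occupied and multiply occupied droplets. The function names and the value of lam are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of finding exactly k plasmids in a droplet (Poisson, mean lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def droplet_occupancy(lam):
    """Fractions of empty, single and multiple occupancy for a mean of lam per droplet."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    multiple = 1.0 - empty - single
    return {
        "empty": empty,
        "single": single,
        "multiple": multiple,
        # Fraction of non-empty droplets that hold exactly one plasmid.
        "single_among_occupied": single / (1.0 - empty),
    }


# Hypothetical loading: on average 0.1 plasmid per droplet.
for name, fraction in droplet_occupancy(0.1).items():
    print(f"{name}: {fraction:.4f}")
```

With a mean of 0.1 plasmid per droplet, about 90% of the droplets are empty and about 95% of the occupied droplets contain exactly one plasmid. This illustrates the trade-off between keeping one genotype per droplet and the fraction of droplets that carry a variant at all.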
Table of contents:
1.1 The genotype-phenotype relationship
1.1.1 Genotype, phenotype and the landscape metaphor
1.1.2 The high dimensionality of genotype space or sequence space
1.1.3 Questioning the properties of the genotype-phenotype landscape
1.1.4 Knowing the genotype-phenotype landscape: opportunities and potential applications
1.1.5 Approaches to understand the genotype-phenotype landscape
Numerical approaches from the first principle of physics and protein structural information
Statistical inference on protein sequence data
Experimental approaches: the need for large scale genotype-phenotype
1.2 Current high-throughput genotype-phenotype mapping approaches: deep mutational scanning experiments
1.2.1 Generate sequence variants by mutagenesis
1.2.2 Opportunities and challenges of next generation sequencing
1.2.3 Mapping the genotype to the phenotype in deep mutational
General considerations about deep mutational scanning
Limitations of those approaches
Overcoming those limitations for a quantitative genotype-phenotype mapping
1.2.4 Properties of the genotype-phenotype landscape as seen through deep mutational scanning
Distribution of robustness and agreement with the phylogeny
Distribution of beneficial mutations
Distribution of epistatic effects
Influence of robustness on molecular evolution
Influence of epistasis on molecular evolution
The roles of promiscuity and evolvability in molecular evolution
Trade-offs and couplings between proteins’ properties
1.2.5 Performing genotype-phenotype mappings of enzymes is limited with current high-throughput approaches
1.3 Droplet-based microfluidics to perform high-throughput enzymatic assays
1.3.1 Droplet-based microfluidics: high-throughput manipulation of micro-metric reaction vessels
The opportunities offered by the picoliter droplet format in biology
The development of bio-compatible materials dedicated to microfluidics
Microfluidic devices to perform high-throughput operations
1.3.2 Constraints of the droplet format on performing quantitative enzymatic assays
General droplet-based microfluidics constraints
Constraints specific to enzymatic assays
1.3.3 Current droplet-based microfluidic work-flows
1.3.4 Conclusion: requirements for a quantitative genotype-phenotype mapping of model enzymes in droplets
1.4 Studying model enzymes to better understand the link between genotype and phenotype
1.4.1 Streptomyces griseus Aminopeptidase (SGAP) to study allostery and promiscuity
1.4.2 Rattus norvegicus trypsin (rat trypsin). Are trypsin sectors independent functional units? What is the relation between the trypsin sectors and epistasis?
2 Material and methods: droplet-based microfluidics and fluorogenic substrates
2.1 Droplet-based microfluidics
2.1.1 Microfluidic devices fabrication
2.1.2 Fabrication and preparation of disposable devices to handle aqueous phases and emulsions
2.1.3 Microfluidic devices operation
Optical and electrical setup
Preparing the microfluidic devices
Making and manipulating droplets
Measuring droplet fluorescence
2.2 Fluorogenic substrates
2.2.1 SGAP fluorogenic substrates
2.2.2 Trypsin fluorogenic substrates
3 Development of a cell-free microfluidic work-flow for the genotype-phenotype mapping of Streptomyces griseus Aminopeptidase (SGAP)
3.1 Previous work on a cell-free microfluidic work-flow to assay SGAP in droplets
3.2 Development of the in vitro workflow 1: pico-injecting the substrate in droplet containing the enzyme
3.2.1 The PCR reagents inhibit cell-free expression in bulk
3.2.2 Cell-free expression is successful in bulk and droplets
3.2.3 Cell-free expression reagents inhibit SGAP enzymatic activity in bulk: picoinjection is incompatible with activity detection
3.2.4 Cell-free expression droplets have to be diluted in the assay droplets
3.3 Development of the in vitro workflow 2: diluting the enzyme containing droplet into the substrate containing droplet
3.3.1 Fusing 2 pL expression and 20 pL assay droplets allows aminopeptidase activity detection with high contrast
3.3.2 Synthesizing a new non leaky substrate to improve SGAP enzymatic assay in droplets
3.3.3 SGAP PCR amplification is successful in 0.2 pL droplets
3.3.4 0.2-2pL droplet electro-coalescence development
3.4 Discussion, Conclusion and Perspectives
4 Development of an in vivo microfluidic workflow for the genotype-phenotype mapping of Rattus norvegicus trypsin
4.1 The in vivo workflow 1: rat trypsin periplasmic expression in E. coli
4.1.1 Osmotic shock of E. coli cells in hypotonic buffer allows trypsin activity detection in bulk
4.1.2 MUGB inhibits trypsin activity but cannot be used as a reporter for its concentration
4.1.3 Normalizing rat trypsin activity using mCherry as a reporter of its expression level
4.1.4 Difficulties with the E. coli expression system
4.2 The in vivo workflow 2: trypsin secretion by B. subtilis
4.2.1 The rat trypsin – mCherry fusion protein is secreted by WB800N as a full protein in the supernatant
4.2.2 The rat trypsin – mCherry fusion protein is fluorescent and enzymatically active in the culture medium
4.2.3 The mCherry fluorescence can be used as a reporter of the trypsin expression level
4.2.4 Optimizing incubation time for the trypsin-mCherry expression in droplets
4.2.5 Shaking the emulsion during incubation for expression reduces emulsion size polydispersity
4.2.6 Measuring the catalytic efficiency of trypsin variants in bulk
4.2.7 Measuring the catalytic efficiency of trypsin variants in droplets
4.3 Towards a library of all rat trypsin single point mutants
4.3.1 « Around the horn » site directed mutagenesis principle
4.3.2 Designing the mutagenic primers
4.3.3 Performing saturated mutagenesis on the rat trypsin protein
4.3.4 Analyzing the first rat trypsin library by deep sequencing
First deep sequencing run results
Second deep sequencing run results, with fraction of the mutagenic primers redesigned
Coverage of all single point mutants
4.4 Discussion, Conclusion and Perspectives
5.1 Conclusion and further work
A Appendix: Development of a cell-free microfluidic work-flow for the genotype-phenotype mapping of Streptomyces griseus Aminopeptidase (SGAP)
A.1 Experimental details
A.1.1 Plasmid maps and primer sequences
A.1.2 General comments about the experiments
A.1.3 SGAP PCR protocol with Phusion polymerase
A.1.4 The PCR reagents inhibit cell-free expression in bulk
A.1.5 Cell-free expression is successful in bulk and droplets
A.1.6 Cell-free expression reagents inhibit SGAP enzymatic activity in bulk: pico-injection is incompatible with activity detection 3.2.3
A.1.7 Fusing 2 pL expression and 20 pL assay droplets allows aminopeptidase activity detection with high contrast
A.1.8 SGAP PCR amplification is successful in 0.2 pL droplets
A.1.9 Random mutagenesis on the SGAP gene
A.2 Synthesis of New Hydrophilic Rhodamine Based Enzymatic Substrates Compatible with Droplet-Based Microfluidic Assays
B Appendix: Development of an in-vivo microfluidic work-flow for the genotype-phenotype mapping of Rattus norvegicus trypsin
B.1 Rattus norvegicus trypsin expression in E. coli
B.1.1 Plasmid maps
B.1.2 General comments about the experiments
B.1.3 Osmotic shock of E. coli cells in hypotonic buffer allows trypsin activity detection in bulk 4.1.1, 4.1.3
B.1.4 Lysozyme and sucrose only marginally improve trypsin activity detection 4.1.1
B.1.5 MUGB inhibits trypsin activity but cannot be used as a reporter for its concentration 4.1.2
B.1.6 Development of a non leaking substrate based on the fret pair EDANS-Dabcyl 4.1.4
B.1.7 Rat trypsin in B. subtilis: plasmid maps
B.2 Rattus norvegicus trypsin expression in B. subtilis
B.2.1 General comments about the experiments
B.2.2 B. subtilis transformation protocol
B.2.3 Induction protocol
B.2.4 Droplet induction protocol
B.2.5 Bulk assay protocol
B.2.6 Droplet assay protocol
B.2.7 The rat trypsin – mCherry fusion protein is secreted by WB800N as a full protein in the supernatant 4.2.1
B.2.8 Measuring the catalytic efficiency of trypsin variants in droplets 4.2.1
B.3 Towards a library of all rat trypsin single point mutants
B.3.1 Designing the mutagenic primers 4.3.2
B.3.2 Performing saturated mutagenesis on the rat trypsin protein 4.3.3
B.3.3 Library preparation 4.3.3
B.3.4 Redesigning part of the mutagenic primers
B.3.5 Sequencing data analysis