BACILLUS SUBTILIS: IN A NUTSHELL
Bacillus subtilis and its genome
Bacillus subtilis is the best-characterized Gram-positive bacterium. It is a rod-shaped, spore-forming, aerobic bacterium that is abundant in the upper layers of the soil and in association with plants. Under aerobic conditions, it can differentiate into heat-resistant spores. In addition, it can utilize various carbon sources, including many plant-derived molecules. It is widely used for the synthesis of a broad range of agricultural, pharmaceutical and chemical compounds, including amino acids, antibiotics and biosurfactants. B. subtilis has a simple and efficient gene transfer system, which has led to the construction of a genetic map of its genome.
In 1997, Kunst et al. published the complete genome sequence of B. subtilis: about 4 million base pairs with an average G+C content of 43%, comprising about 4,100 protein-coding genes. The sequence was updated in 2009 (Barbe et al., 2009). The B. subtilis coding sequences (CDS) can be divided into three classes. Class I mainly includes sporulation genes. Class II consists of genes expressed during exponential growth, i.e. genes related to the transcription and translation machineries, core intermediary metabolism and stress proteins, together with one-third of the genes of unknown function. Lastly, Class III includes genes of unidentified function and is enriched in A+T-rich codons (Hénaut et al., 1996). ATG is the start codon for 78% of the genes, TTG for 13% and GTG for 9%, compared with the 85%, 3% and 14% reported for E. coli (Blattner et al., 1997).
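As a rough illustration of how figures such as start codon usage and G+C content are derived from a set of coding sequences, the following minimal sketch tallies both. The sequences in TOY_CDS are invented for illustration and are not real B. subtilis genes; the function names start_codon_usage and gc_content are likewise hypothetical helpers, not part of any published pipeline.

```python
from collections import Counter

# Hypothetical toy coding sequences (invented, not real B. subtilis genes).
TOY_CDS = [
    "ATGAAACGTTAA",
    "TTGCCGGATTGA",
    "ATGGGCTTTTAG",
    "GTGAAAGCGTAA",
    "ATGCTGACCTGA",
]


def start_codon_usage(cds_list):
    """Return the percentage of sequences that begin with each start codon."""
    counts = Counter(seq[:3].upper() for seq in cds_list)
    total = sum(counts.values())
    return {codon: 100.0 * n / total for codon, n in counts.items()}


def gc_content(seq):
    """Return the G+C content of a sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


usage = start_codon_usage(TOY_CDS)
for codon in ("ATG", "TTG", "GTG"):
    print(codon, usage.get(codon, 0.0))
```

Applied to a complete annotated genome instead of the toy list, the same tally would yield percentages of the kind quoted above for ATG, TTG and GTG.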
Essential genes are genes whose inactivation under specific conditions (LB as nutrient source, 37 °C) compromises cell viability. Kobayashi et al. (2003) initially proposed 271 essential genes, but Commichau et al. (2013) later found that 31 of these 271 genes were non-essential. According to recent updates in SubtiWiki, 251 essential genes are listed. The essential genes fall into several functional categories: protein synthesis, secretion and quality control; metabolism; cell envelope and division; DNA replication, modification and chromosome maintenance; RNA synthesis and degradation; protective functions; and unknown functions (Figure 1).
The complex regulation of the lipopeptide synthesis
Two-component signal-transduction pathways comprise a sensor protein kinase and a response regulator. In B. subtilis, 34 genes encode response regulators, most of which have neighboring genes encoding histidine kinases. The ComP/ComA two-component system is required for the expression of various peptide pheromones, which are involved in various signal transduction pathways (Lazazzera et al., 1999). It plays an important role in competence development and in regulating the biosynthesis of secondary metabolites such as the lipopeptide surfactin (Hamoen et al., 2003).
B. subtilis can develop a physiological state, known as competence, in which cells take up exogenous DNA in macromolecular form under defined growth conditions. Hamoen et al. (2003) described five genetic loci involved in DNA transport: comC, comE, comF, comG and nucA. After transport into the cell, recombination proteins, i.e. RecA and the DNA helicase AddAB, help integrate the foreign DNA into the genome (Haijema et al., 1996). ComK is the key regulator of competence development, as all the above genes are regulated by this competence factor (van Sinderen et al., 1995). The crucial step in competence development is the activation of the comK promoter by ComK itself (Hamoen et al., 2003). The sporulation regulator Spo0A can also influence competence: phosphorylated Spo0A (Spo0A~P) inhibits the expression of abrB, whose product is a potential inhibitor of comK expression (Green et al., 1991). In addition to inhibition by AbrB, CodY can bind to the RNA polymerase binding site of the comK promoter, depending on GTP levels and the concentration of branched-chain amino acids, thus inhibiting its expression (Serror and Sonenshein, 1996).
Various competence-stimulating factors have a direct or indirect positive effect on comK expression. As shown in Figure 2, expression of comQ is required for the post-translational modification and secretion of ComX, the main competence-stimulating factor. ComX acts as an environmental signal for the phosphorylation of ComA by its protein kinase partner ComP (Solomon et al., 1995). Phosphorylated ComA (ComA~P) is required for the expression of the srfA operon (Hahn and Dubnau, 1991). This operon positively regulates comK through comS, which is located within the srfA operon but outside its reading frame (Hamoen et al., 1995).
There are seven phr genes, which encode regulatory peptides that might play a role in quorum sensing (Perego et al., 1996). PhrC, the pheromone that constitutes the other half of the competence-stimulating factor, can also induce competence through srfA transcription via ComP and ComA activation. The oligopeptide permease Spo0K senses this pheromone and facilitates its entry (Lazazzera et al., 1997). PhrC inhibits the activity of RapC, a protein that dephosphorylates ComA~P and thereby reduces srfA expression and ultimately hinders competence development (Solomon et al., 1996).
During the transition state, B. subtilis expresses various extracellular proteases and other degradative enzymes, together with transport and chemotaxis functions, to maximize nutrient utilization; it also excretes various antimicrobial compounds to outcompete other microbial species (Phillips and Strauch, 2002). Several transition state regulators inhibit the expression of genes that are needed only during stationary phase (Strauch et al., 1993). These regulators include AbrB, ScoC, SinR, CodY and Abh. Overexpression of AbrB, ScoC or SinR can restrain sporulation (Gaur et al., 1986), while AbrB is also involved in the production of, and resistance to, various antibiotics (Hahn et al., 1995). Both Spo0A and AbrB regulate the expression of scoC (Perego and Hoch, 1988). SinR represses the sporulation genes spoIIA, spoIIE and spoIIG and is a global regulator of degradative enzyme synthesis, motility, competence and sporulation (Gaur et al., 1991). CodY is activated in the presence of amino acids (valine and isoleucine) and GTP; it represses competence and dipeptide permease genes as well as genes involved in the utilization of alternative carbon and nitrogen sources (Serror and Sonenshein, 1996).
Figure 2. Overview of the regulatory pathways affecting competence development. Positive and negative regulation of comK and related genes is indicated by arrows and T-bars, respectively (Hamoen et al., 2003).
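The regulatory logic described in this section can be sketched as a signed graph, in which activation carries a sign of +1 and repression a sign of -1, so that the net effect of a linear pathway on comK is the product of its edge signs. The sketch below is only an illustration under simplifying assumptions: the node labels, the EDGES table and the path_sign helper are hypothetical, genes and their products are not distinguished, and kinetics and concentrations are ignored.

```python
# Signed interactions taken from the text: +1 = activation, -1 = repression.
# Node labels are illustrative; genes and proteins are not distinguished.
EDGES = {
    ("Spo0A~P", "abrB"): -1,   # Spo0A~P inhibits abrB expression
    ("abrB", "comK"): -1,      # AbrB inhibits comK expression
    ("codY", "comK"): -1,      # CodY blocks the comK promoter
    ("comQ", "comX"): +1,      # ComQ is needed for ComX maturation
    ("comX", "ComA~P"): +1,    # ComX triggers ComA phosphorylation by ComP
    ("ComA~P", "srfA"): +1,    # ComA~P activates the srfA operon
    ("srfA", "comS"): +1,      # comS lies within the srfA operon
    ("comS", "comK"): +1,      # comS positively regulates comK
    ("phrC", "rapC"): -1,      # PhrC inhibits RapC
    ("rapC", "ComA~P"): -1,    # RapC dephosphorylates ComA~P
}


def path_sign(path):
    """Multiply the edge signs along a path: +1 is net activation, -1 net repression."""
    sign = 1
    for src, dst in zip(path, path[1:]):
        sign *= EDGES[(src, dst)]
    return sign


# Two repressions in series give a net positive effect on comK.
print(path_sign(["Spo0A~P", "abrB", "comK"]))
print(path_sign(["phrC", "rapC", "ComA~P", "srfA", "comS", "comK"]))
```

Both example paths contain an even number of repressive steps, which is why Spo0A~P and PhrC act as net positive inputs to comK even though each of them works by inhibition.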
ABC transporters are the most abundant class of proteins found in B. subtilis, as Gram-positive bacteria have an envelope comprising a single membrane. There are 77 such transporters encoded in the genome (Kunst et al., 1997).
Industrial applications
Bacillus species have been the “protagonists” of applied microbiology for industrial applications for more than 20 years. In 1982, B. subtilis (natto) was first used for the solid-state fermentation of soybeans in Japan. These species have drawn industrial attention because of the following characteristics: high growth rates, the ability to secrete proteins and other important compounds into the extracellular medium, and their generally recognized as safe (GRAS) status with the US Food and Drug Administration. The availability of genomic information on various pathways has facilitated the reconstruction of these pathways through gene modification, since expression studies can be carried out for the overproduction of various industrially important compounds, aided by the organism's ability to develop a competent state for the uptake of foreign DNA and to sporulate. Extensive proteome analyses have provided a panoramic view of the B. subtilis proteome and its ability to adapt to environmental changes (Becher et al., 2011). Various genes are responsible for the production of lipid-, carbohydrate- and protein-degrading enzymes and for secretory functions, which creates industrial interest. Secreted B. subtilis proteins carry N-terminal signal peptides, which are an important feature for transport from the cytoplasm (Dalbey et al., 2012); the best-studied translocation pathways for secretory proteins are the secretory (Sec) pathway and the twin-arginine translocation pathway (Antelmann et al., 2003). To enhance biotechnological applications, large-scale genome minimization studies are being carried out: strain MBG874 lacks 874 kb (20.7%) of the B. subtilis genome, and B. subtilis Δ6 lacks 376 genes at seven different locations relative to B. subtilis 168 (Reuß et al., 2016). The industrial enzymes market is projected to be worth US$6.2 billion by 2020, and B. subtilis can be a major workhorse for this market.
Its importance comes from its ability to produce these substances at gram-per-litre concentrations. B. subtilis produces α-amylase, which cleaves internal α-1,4-linkages in an endo fashion and is used in the food, fermentation, textile and paper industries (Pandey et al., 2000). The enzyme β-glucanase is used for the enzymatic modification of barley. B. subtilis can also produce various alkaline proteases, which have pH optima in the range of 9-12, are thermostable and can be used in commercial detergents (Gupta et al., 2002). Two novel phytases from B. subtilis have been cloned and biochemically characterized (Tye et al., 2002).
Braaz et al. (2002) found that B. subtilis was an efficient host for the overexpression of poly(3-hydroxybutyrate) depolymerase from P. lemoignei. B. subtilis can be a great host for the production of heterologous proteins because it can secrete proteins into the extracellular medium, but it requires various genetic signals for effective transcription and translation of foreign genes (Braaz et al., 2002). Production of heterologous eukaryotic proteins in B. subtilis is problematic because of the large number of extracellular proteases (Wang et al., 1988). About 4% of the B. subtilis genome codes for large multifunctional enzymes, encoded by loci such as Srf, Pps and Pks. Owing to these genes, it can produce compounds with antibiotic activity. When the bacitracin gene cluster was transferred from B. licheniformis to B. subtilis, the host was able to produce the antibiotic at high levels (Eppelmann et al., 2001). Riboflavin production at a concentration of 30 g/l in a 3-day fermentation was obtained using a mutant strain of B. subtilis (Dauner et al., 2002).
B. subtilis spores can also be used as vaccine delivery agents (Duc et al., 2003). Systems and synthetic biology approaches, combined with fundamental and applied research, can turn B. subtilis into incomparable micron-scale factories.
The B. subtilis genome is thus sufficiently well studied to be used for genetic modification aimed at the synthesis of various molecules. The next section focuses on the non-ribosomal peptide synthesis (NRPS) mechanism in B. subtilis, which is essential for the production of a particular class of industrially important molecules, such as lipopeptide biosurfactants.
INSIGHTS INTO UNIQUE NON-RIBOSOMAL PEPTIDE SYNTHESIS
In 1928, Alexander Fleming returned to his laboratory after a month away and saw that his culture of Staphylococcus aureus had become contaminated with a mold. He was unaware that this incident would act as a catalyst in the world of medicine through the discovery of the first antibiotic (which he called “mold juice”, later renamed penicillin); the mold was Penicillium notatum. Penicillin is a peptide derivative from Penicillium notatum. Lipmann and co-workers (1971) reported the enzymatic synthesis of tyrocidine and gramicidin S from Bacillus sp. Their synthesis was reported to be independent of nucleic acids and to be carried out by a large multifunctional enzyme complex analogous to fatty acid synthetases (Lipmann et al., 1971).
Non-Ribosomal Peptide Synthetase (NRPS) is a multifunctional, multienzyme complex found in various bacteria, fungi and lower eukaryotes. Unlike conventional ribosomal synthesis, it requires neither mRNA nor ribosomes to synthesize peptides. Non-ribosomal peptides (NRPs) display more than 500 different monomers and can be linear, cyclic or branched in structure. NRPs can be classified into lipopeptides, chromopeptides, glycopeptides, etc. There is another class of multifunctional enzymes known as polyketide synthases (PKSs). The difference between PKSs and NRPSs is that the former use acetate and propionate as building blocks while the latter use monomers such as amino acids (Mootz et al., 2002). NRPs represent a large class of bioactive compounds with distinctive structural features such as D- and L-amino acids and heterocyclic elements, which can be methylated or glycosylated. Non-ribosomal peptides are generally produced by soil micro-organisms, e.g. Bacilli, Actinomycetes or Pseudomonas. However, marine microorganisms have also been found to be a resource for these multifunctional secondary metabolites (Faulkner, 1998). These peptides display various functions: antibiotic activity (tyrocidine A, bacitracin A, vancomycin), biosurfactant (surfactin, iturin), antifungal (iturin, fengycin), elicitor (surfactin, fengycin), anti-inflammatory (surfactin), immunosuppressive agent (cyclosporin) and siderophore (bacillibactin, enterobactin, vibriobactin) (Marahiel, 2009). They can also be divided into classes based on their structural conformation: linear products (vancomycin, etc.) and cyclic products (surfactin, cyclosporin, etc.), the latter formed by an intramolecular nucleophilic reaction.
NRPS is organized into modules, which form the template for the synthesis of various linear and cyclic peptides. Modules are further divided into domains, which are responsible for specific catalytic activities during synthesis (Felnagle et al., 2008). Each module determines the nature of a monomer and its incorporation into the growing peptide (Caboche et al., 2008), whereas the domains are responsible for substrate binding, activation, modification, elongation and release of the complete peptide (Sieber, 2005). The length of the final peptide is generally determined by the size of the NRPS. Studies have revealed the presence of linker regions (~15 amino acids) between domains in a randomly distributed fashion. Being flexible, these regions can be used to synthesize hybrid NRPS templates without disturbing enzymatic integrity (Weber et al., 2000). A database of all non-ribosomal peptides, called NORINE, has been created (Caboche et al., 2008). The biological activity of NRPs depends on their structure, which is defined by the peptide sequence and, for lipopeptides, by the length and isomerism of the fatty acid chain. Sometimes the peptide products can become harmful to the producer strain, which therefore acquires several defensive mechanisms, i.e. modification of the target region within the strain, temporary inactivation of the product, and efflux pumps to escape its toxicity (Crosa and Walsh, 2002).
The elongation module (C-A-PCP) is made of at least three essential domains: the Adenylation (A) domain, the Peptidyl Carrier Protein (PCP) domain and the Condensation (C) domain, responsible for selection and activation of the monomer, covalent binding to 4′-phosphopantetheine (PPan), and peptide bond formation, respectively. Finally, release of the final peptide is catalyzed by the thioesterase (TE) domain. These four domains constitute the primary domains of Non-Ribosomal Peptide Synthetase.
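The modular organization described above, with one module per incorporated monomer and a terminal TE domain for release, can be sketched as a simple data structure. This is a deliberately simplified model and not a description of any real synthetase: every module is treated as a C-A-PCP elongation module, secondary domains and linker regions are ignored, and the three-residue peptide and the function names assembly_line and domain_string are hypothetical.

```python
# Domains of one elongation module, in the order given in the text.
ELONGATION_MODULE = ("C", "A", "PCP")


def assembly_line(monomers):
    """Build one (monomer, domains) entry per monomer; the last module gains a TE domain."""
    modules = []
    for index, monomer in enumerate(monomers):
        domains = list(ELONGATION_MODULE)
        if index == len(monomers) - 1:
            domains.append("TE")  # release of the finished peptide
        modules.append((monomer, domains))
    return modules


def domain_string(modules):
    """Render the domain organization, e.g. 'C-A-PCP | C-A-PCP-TE'."""
    return " | ".join("-".join(domains) for _, domains in modules)


# Hypothetical tripeptide, used only to show the module-to-monomer mapping.
line = assembly_line(["Glu", "Leu", "Val"])
print(len(line), "modules:", domain_string(line))
```

In this simplified picture the number of modules equals the number of monomers, which mirrors the statement that the length of the final peptide is generally determined by the size of the NRPS.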
Table of contents:
CHAPTER 1: LITERATURE REVIEW
1. BACILLUS SUBTILIS: IN A NUTSHELL
1.1 Bacillus subtilis and its genome
1.2 The complex regulation of the lipopeptide synthesis
1.3 Industrial applications
2. INSIGHTS INTO NON-RIBOSOMAL PEPTIDE SYNTHESIS (NRPS), UNIQUE MACHINERY
2.2 Primary domains
2.3 Secondary domains
2.5 Precursors of the NRPS mechanism and their regulation mechanisms
2.6 Interest of the NRPS peptides produced by B. subtilis
3. SURFACTIN: EXPLORING FROM GENE TO PRODUCT
3.6 Structure and activity relationship
4. OVERPRODUCTION OF SURFACTIN
4.1 Present scenario
4.2 Role of various factors
4.3 Direct approach
4.4 Linkage between Direct and Indirect approach
4.5 Indirect approach
CHAPTER 2: MATERIALS AND METHODS
3 Genotype of plasmids and E. coli strains
4 Culture Media
5 DNA Manipulation
6 Formalization of reaction network
7 Mutant construction
9 Surfactin analysis
10 LC-MS-MS analysis of surfactin isomer
11 Extracellular metabolome
CHAPTER 3: RESULTS AND DISCUSSION
1. Is intracellular pool of leucine a limiting factor for surfactin biosynthesis?
2. How to enhance leucine production within the intracellular pool?
2.1 Formalization of the model of BCAA metabolic pathway
2.2 Knockout prediction
2.3 Mutant constructed
2.4 Growth and surfactin analysis
2.5 Publication and Valorization of these results
3. FACTORS INFLUENCING SURFACTIN PRODUCTION
3.1 Role of nutrient composition
3.2 Impact of oxygen transfer
3.3 Other factors
3.4 Publication and Valorization of these results
4. ROLE OF PYRUVATE FLUX
4.1 Mutant constructed
4.2 Growth and surfactin analysis
4.3 Carbon metabolism and surfactin production
5. IMPACT OF DELETION ON SURFACTIN VARIANTS
5.1 Isoform analysis
5.2 Extracellular metabolite analysis
5.3 Publication and Valorization of these results
6.1 Mutant constructed
6.2 Growth and surfactin analysis with native promoter and constitutive promoter
6.3 Publication and Valorization of these results
CHAPTER 4: CONCLUSION AND PERSPECTIVES
APPENDIX I VECTORS AND PLASMIDS
Synthesized sequence (PrepU – ermC- p srfAA)
APPENDIX II MEDIA COMPOSITIONS
TSS SUPPLEMENTED WITH 16 AMINO ACIDS