ASM News

    Vincent J. J. Martin is a postdoctoral fellow, Christina D. Smolke is an NIH postdoctoral fellow in the Department of Molecular and Cell Biology, University of California, Berkeley, and Jay D. Keasling is a professor in the Department of Chemical Engineering, University of California, Berkeley.

Redesigning Cells for Production of Complex Organic Molecules

If metabolically engineered properly, cells replenish both enzymes and cofactors while producing complex, potentially useful organic molecules

Vincent J. J. Martin, Christina D. Smolke, and Jay D. Keasling

Engineered microorganisms are becoming a significant alternative for synthesizing complicated organic molecules. While there may be more appeal in discovering new molecules, pathways, and their corresponding genes, the development of appropriate hosts and expression systems to produce these molecules must be undertaken simultaneously if we ever hope to produce them in quantities sufficient for evaluating or, eventually, using them on a commercial scale. Recent advances in gene expression systems will certainly play a part in developing optimized microbial hosts in which to produce many of these molecules.

During the last century, synthetic organic chemistry was the workhorse of the chemical and pharmaceutical industries for producing feedstock chemicals, fuels, polymers, and drugs. Though methods in organic synthesis continue to improve, some chemical compounds remain difficult to synthesize, particularly those that contain multiple stereochemical centers. The complexities involved in producing these compounds are reflected in intricate multistep chemical syntheses that often result in low yields and potentially toxic chemical waste products.

Enzymes, Metabolic Engineering Offer Routes to Complex Organic Molecules

Over the last two decades, enzymes have become increasingly popular for catalyzing some of the most complicated organic chemical transformations. Specialized enzymes allow one to produce enantiomerically pure molecules, sometimes eliminating one or more steps in a conventional organic chemical synthesis, unnecessary side reactions, and the associated use of toxic organic solvents. New and improved enzymes with broadened substrate ranges and activities, resulting from recent genome sequencing and prospecting projects, and also from directed genetic evolution techniques, increasingly enable more efficient production of diverse organic molecules.

While two or more enzymes may be used together in an in vitro synthetic train, differences in enzyme activity and stability and the costs associated with cofactor regeneration typically make these synthetic metabolic pathways uneconomical for all but the smallest product quantities. However, what may be difficult and costly in vitro is often relatively simple to accomplish in vivo. Redirecting multiple enzymatically catalyzed reactions to improve production of useful or novel chemicals, or to remediate toxic chemicals in the environment, constitutes the field of metabolic engineering. Metabolic engineering and the various mathematical and experimental methods that it encompasses have been used most widely for optimizing transformation of chemicals by cells. The principal advantage of cells over multienzyme in vitro reaction systems is that cellular metabolism replenishes both enzymes and cofactors, and may even furnish valuable precursors derived from inexpensive starting materials.

Metabolic engineering of cells often involves introducing multiple, heterologous genes encoding enzymes needed in the new pathway. The expression of these genes must be balanced so that no single enzyme is overproduced and no single enzyme severely limits the flux through the pathway, both of which could lead to an accumulation of one or more metabolic intermediates and inefficient use of cellular resources. Thus, accurate and reproducible control of the expression of individual genes in the heterologous metabolic pathway is necessary to maximize production of the desired compound. Unfortunately, most gene expression systems that have been developed for high-level production of heterologous proteins lack the finesse required in metabolic engineering. In this brief review, we discuss some of the recent developments in gene expression, with emphasis on those techniques most important for remodeling cellular metabolism.
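The balancing argument above can be illustrated with a toy calculation. The sketch below assumes a linear pathway in which every enzyme is saturated with substrate, so that each step's capacity is simply enzyme level times turnover number and the pathway flux equals the smallest capacity. The function name, enzyme names, and numbers are hypothetical and chosen only for illustration; real pathways involve kinetics and regulation that this ignores.

```python
def pathway_flux(enzyme_levels, turnover_numbers):
    """Toy model of a linear pathway with saturated enzymes (an assumption).

    enzyme_levels: dict mapping enzyme name -> amount of enzyme (arbitrary units)
    turnover_numbers: dict mapping enzyme name -> catalytic rate per unit enzyme
    Returns (flux, bottleneck, excess_capacity).
    """
    # Capacity of each step = enzyme amount x turnover number.
    capacities = {
        name: enzyme_levels[name] * turnover_numbers[name]
        for name in enzyme_levels
    }
    # The slowest step limits the flux through the whole pathway.
    bottleneck = min(capacities, key=capacities.get)
    flux = capacities[bottleneck]
    # Capacity above the pathway flux is enzyme the cell made but cannot use.
    excess_capacity = {name: cap - flux for name, cap in capacities.items()}
    return flux, bottleneck, excess_capacity


# Hypothetical three-enzyme pathway: E2 is underproduced, E1 and E3 are overproduced.
levels = {"E1": 10.0, "E2": 2.0, "E3": 5.0}
kcat = {"E1": 1.0, "E2": 3.0, "E3": 2.0}
flux, bottleneck, excess = pathway_flux(levels, kcat)
print(flux, bottleneck, excess)
```

In this picture, raising the level of the bottleneck enzyme increases flux, whereas raising any other enzyme only adds to the unused capacity, which is the sense in which both overproduction and underproduction waste cellular resources.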

In their early efforts, genetic engineers emphasized the development of multicopy expression vectors, assuming that protein production would increase with copy number of the encoding genes. While high vector copy numbers may be advantageous for producing certain sought-after proteins, in metabolic engineering, enzymes are sought for their catalytic properties—not as the end product. Thus, low-copy-number (or single-copy-number) vectors should be sufficient for producing enzymes in metabolically engineered hosts. Indeed, most genes involved in cellular metabolism are located on the chromosome, suggesting that expression of single-copy genes is sufficient for their ordinary function. Here, we will focus on two single-copy expression systems, one based on modifying the chromosome and the other on bacterial artificial chromosomes.

Chromosome Engineering

Figure 1

Four basic methods have been developed to deliver and insert DNA sequences into the chromosomes of microbial cells (Fig. 1). One method relies on transposons to insert DNA randomly into chromosomes, whereas the other three methods involve site-specific insertions at a predetermined gene or region of a chromosome. Several newer methods are also available for eliminating the selection markers and replicon sequences that are inserted into the chromosome when any of these techniques is used.

Transposons are specific DNA sequences that catalyze their own movement to alternate sites within a chromosome. Victor de Lorenzo and his collaborators at the National Biotechnology Center in Spain developed a series of mini-Tn5 transposon suicide delivery vectors (pUT system) for use in randomly inserting genes into bacterial chromosomes. Unlike the native Tn5 transposon, the transposase gene (tnp) in these vectors is located outside the mobile element, so that only the terminal insertion sequences and the DNA cloned between them are delivered to the chromosome. Since the tnp gene is not transposed, the resulting insertion is not prone to genetic instability and cells containing the mobile element are not immune to further rounds of transposition. There are two disadvantages to using transposons in metabolic engineering: (i) insertion of a transposon into the chromosome may negatively affect expression of neighboring genes and (ii) the level of expression may depend on the location of the transposon in the chromosome.

Vectors containing conditional replicons, such as the temperature-sensitive pSC101 and the λpir-dependent R6K, are particularly useful for delivering genes into the chromosome of Escherichia coli. Additional narrow-host-range suicide vectors are being used for delivering genes into bacterial hosts other than E. coli, such as Pseudomonas species. In these systems, a specific cloned DNA sequence is used to target a region of the chromosome for recombination. In a single crossover event, heterologous DNA inserts along with the replicon and a selection marker (usually antibiotic resistance). The replicon and/or the marker is eliminated by a counter-selection strategy such as by cultivating the modified cells at an elevated temperature (pSC101) or on sucrose using the sacB allele.

To eliminate the requirement for first cloning a target sequence into a gene delivery vector, alternative site-specific, homologous recombination systems were engineered. For instance, one of these systems uses phage attachment sites (attP) on the delivery vector that then target specific attachment sites on the chromosome (attB). Recombination at the attB site may rely on host function or exogenous expression of the specific phage int gene, which promotes site-specific integrations. These systems have been developed to integrate genes at alternative attB sites using attP from various phages including λ, HK022, φ80, P21, P22, φCTX (Pseudomonas aeruginosa), and φFSW (Lactobacillus casei).

Researchers in several groups are using the λ bacteriophage (λ Red) and E. coli (RecET) recombination systems to introduce linear, heterologous DNA into the E. coli chromosome. These systems exploit the induced hyper-recombinogenic state of E. coli that arises when the λ bacteriophage exo and bet or E. coli recE and recT genes are transiently expressed, thereby promoting recombination, along with the λ phage gam gene, which inhibits RecBCD activities. In E. coli, recombination with sequences as short as 40 bp occurs efficiently during this hyper-recombinogenic state. This technology offers advantages over traditional homologous recombination in that: (i) recombination is achieved in a single step (resolution of cointegrants is not necessary), (ii) prior cloning of the gene of interest is optional (PCR products can be used), and (iii) only very small regions of homology are required for recombination to take place. Although these tools were designed for single gene replacements, they can also deliver multigene expression cassettes to the chromosome. Cassettes as large as 3.1 kb have been successfully inserted, but the upper size limit remains to be determined.
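Because homology as short as 40 bp suffices, the homology can be built directly into PCR primers. The following is a minimal sketch of how such primers might be assembled, assuming each primer carries a 40-nt chromosomal homology arm at its 5' end followed by a 20-nt stretch that primes on the cassette. The function names, the 20-nt priming length, and the example sequences are illustrative assumptions, not a protocol from the work described here; a real design would also check melting temperature and secondary structure.

```python
# Translation table for complementing DNA bases (upper or lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_recombineering_primers(upstream, downstream, cassette,
                                  arm_len=40, priming_len=20):
    """Build primers that add chromosomal homology arms to a cassette by PCR.

    upstream / downstream: chromosomal sequence flanking the insertion site,
        both given 5'->3' on the same (top) strand.
    cassette: sequence to be inserted, 5'->3' on the top strand.
    arm_len: length of each homology arm (40 nt, following the text).
    priming_len: hypothetical length of the cassette-annealing region.
    """
    if len(upstream) < arm_len or len(downstream) < arm_len:
        raise ValueError("flanking sequences must be at least arm_len long")
    if len(cassette) < priming_len:
        raise ValueError("cassette must be at least priming_len long")

    # Forward primer: last arm_len nt before the insertion site,
    # then the start of the cassette.
    forward = upstream[-arm_len:] + cassette[:priming_len]
    # Reverse primer: reverse complement of the cassette end plus
    # the first arm_len nt after the insertion site.
    reverse = reverse_complement(cassette[-priming_len:] + downstream[:arm_len])
    return forward, reverse


# Made-up sequences, for illustration only.
up = "ACGT" * 15
down = "TTGCA" * 10
insert = "G" * 10 + "ATGC" * 10
fwd, rev = design_recombineering_primers(up, down, insert)
print(len(fwd), len(rev))
```

The resulting PCR product has the cassette flanked by the two arms, which is the kind of linear substrate the hyper-recombinogenic state can act on.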

Many chromosome integration systems retain antibiotic resistance markers or other functional replicon-derived DNA that was inserted into the chromosome. However, in some of the newer integration systems, these sequences can be precisely removed once the desired chromosomal modifications are constructed. Removing antibiotic selection markers and inserted replicons not only allows for subsequent rounds of modification without accumulating additional resistance markers, but also yields modified microbial strains that are virtually free of DNA sequences that might interfere with subsequent rounds of homologous recombination. Furthermore, removing replicon sequences sometimes results in recombinants with improved genetic stability and reduces the chances of genetic transfer to other organisms. Examples of such recombinase/site excision systems include Flp/FRT from Saccharomyces cerevisiae, Cre/loxP from the P1 bacteriophage, ParA/res from the RP4 plasmid, and Xis/att found in phage.

Artificial Chromosomes Are Being Used for Metabolic Engineering

Researchers in several laboratories are working with extremely low-copy-number or single-copy plasmids that are large and stable, and behave in effect as additional chromosomes. Examples include the F plasmid and P1 prophage of E. coli and the TOL and NAH plasmids of Pseudomonas.

With their ability to harbor very large sequences of DNA and their relative stability, the replication origins and associated partition elements of these plasmids make them well suited for use in manipulating metabolism. Recently, for instance, we developed a single-copy, narrow-host-range expression vector (bacterial artificial chromosome) with which to engineer E. coli. This bacterial artificial chromosome was constructed from a 9-kb region of the E. coli F plasmid that contains several necessary elements, including the oriV and oriS origins that ensure cell cycle-specific replication of the plasmid; the locus that partitions plasmids into daughter cells at division; the repD gene that recombines plasmid dimers into monomers; and the ccd genes that kill any plasmid-free segregants.

This and other artificial chromosomes are segregatively stable in the absence of selection pressure, have well-controlled gene expression, and confer low metabolic burdens on host cells. Indeed, bacterial artificial chromosomes would seem to be excellent alternatives to multiple-copy-number plasmids for metabolically engineering E. coli where extreme stability and low metabolic burden are sought. Based on their large native size, these plasmids should be capable of faithfully replicating the many genes that may be needed for synthesizing complex products.

Fine-Tuning Gene Expression in Metabolically Reengineered Cells

Stringent control over the timing and level of gene expression can profoundly affect a metabolically engineered pathway and may determine its overall success. Achieving stringent control typically depends on several factors, including choosing promoters of various strengths, using inducers to control the induction level of a particular promoter, varying the strength of the ribosome binding site, and altering the stability of the transcript or enzyme it specifies.

Useful methods of modulating promoter activity include varying the concentration of a specific inducer, such as with arabinose and the araBAD promoter, or shifting the metabolic state of the cell, such as with phosphate starvation and the phoA promoter. A desirable and important feature of promoters is low (or no) expression in the noninduced state, with gene expression levels proportional to the amount of inducer used.

Whether the arabinose-inducible araBAD promoter (PBAD) and regulator (AraC) of E. coli is used for controlling a single gene or with another inducible promoter for controlling several genes, it offers control of gene expression in response to inducer and tight control in its absence. Unfortunately, the araC-PBAD system and the associated high-capacity, low-affinity L-arabinose transporter, AraE, display autocatalytic behavior that results in all-or-none expression in E. coli.

Figure 2

Rather than varying levels of gene expression in individual cells of the culture, varying arabinose concentrations in the medium changes the fraction of cells that are fully induced and yields two subpopulations of cells. However, if PBAD is being used to control the expression of a gene or genes necessary for synthesizing a particular product, only one subpopulation of cells will produce that product. Recently, we showed that expression of araE from an arabinose-independent (e.g., IPTG-inducible or constitutive) promoter allows control of gene expression from PBAD in individual cells, meaning a single, homogeneous population of cells is found at all inducer concentrations (Fig. 2).
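The difference between all-or-none and graded induction can be made concrete with a small numerical sketch. It assumes an idealized culture in which, at a given inducer concentration, either a fraction of cells is fully induced and the rest are off (all-or-none), or every cell is induced to the same intermediate level (graded). All names and numbers are hypothetical; real populations show cell-to-cell variation that is ignored here.

```python
def all_or_none_population(fraction_induced, n_cells=1000, max_level=1.0):
    """Per-cell expression when a fraction of cells is fully on and the rest are off."""
    n_on = round(fraction_induced * n_cells)
    return [max_level] * n_on + [0.0] * (n_cells - n_on)


def graded_population(fraction_of_max, n_cells=1000, max_level=1.0):
    """Per-cell expression when every cell is induced to the same intermediate level."""
    return [fraction_of_max * max_level] * n_cells


def mean_expression(levels):
    """Culture-averaged expression, as a bulk assay would report it."""
    return sum(levels) / len(levels)


def fraction_producing(levels, threshold):
    """Fraction of cells whose expression reaches the level needed to make product."""
    return sum(1 for level in levels if level >= threshold) / len(levels)


# Same culture-averaged expression (30% of maximum), very different single-cell picture.
bimodal = all_or_none_population(0.3)
uniform = graded_population(0.3)
print(mean_expression(bimodal), mean_expression(uniform))
print(fraction_producing(bimodal, 0.25), fraction_producing(uniform, 0.25))
```

A bulk measurement cannot distinguish the two cases because their averages agree, yet in the all-or-none case only the induced subpopulation expresses the pathway at all, which is why single-cell measurements matter when judging an inducible promoter.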

To simplify gene expression, it would be valuable to design cells that regulate the timing and level of expression during fermentation runs. One approach is based on introducing a gene expression system that can sense the metabolic state of the cell, based on factors such as carbon, energy, nutrients, or stress, and using it to regulate expression of a pathway. For example, William R. Farmer and James C. Liao of the University of California, Los Angeles developed an elegant autoregulation approach, which they called "metabolic control engineering," by using the Ntr regulon of E. coli. They engineered a gene expression control loop to respond to excess glycolytic flux, during which acetyl phosphate accumulates within cells. In the absence of NRII, the nitrogen sensor of the Ntr regulon, excess acetyl phosphate phosphorylates NRI (the response regulator of this regulon), which, in turn, positively regulates expression of the recruited glnAp2 promoter in the engineered cell.

Quorum sensing enables a bacterial cell to modify its behavior in response to signals from other bacteria. This form of communication between cells is observed in processes such as bioluminescence, expression of virulence or pathogenicity, stimulation of competence, and production of antimicrobial agents. Since quorum sensing allows cells within a population to coordinate their aggregate behavior, this form of cell signalling might be used for regulating gene expression within recombinant pathways. For example, Oscar Kuipers and other microbiologists at NIZO Food Research in Ede, the Netherlands, recruited the nisin antibiotic quorum-sensing peptide (NICE) system to control the onset of gene expression in lactic acid bacteria. With it, they efficiently rerouted pyruvate metabolism to produce diacetyl, which yields a buttery aroma in dairy products, and L-alanine.

Coordinating Multigene Expression Systems Adds Complexity

Some metabolic engineering applications depend on the coordinated expression of several genes in a single host cell. These genes may encode enzymes that are introduced to divert intermediates from the usual pathway into producing an unusual end product. Under ordinary circumstances, biochemical pathways operate under control mechanisms that coordinate expression of genes within particular pathways, optimizing the flux through the pathway and minimizing its burden on the cell. In bacteria, for example, such pathways operate under the control of operons, which coordinate gene expression.

However, transferring an operon from one microorganism to another and achieving efficient expression of the enzymes within that operon in a heterologous organism is not a trivial undertaking. Constructing new operons containing one or several heterologous genes is also a major challenge. Once an operon is constructed, additional methods are needed to coordinate and optimize expression of multiple genes in a heterologous host.

One of the most straightforward, energy-efficient ways to control expression of genes is at the transcriptional level. Some operons, including the dadAX and cydAB operons of E. coli, use multiple promoters to achieve coordinated protein production. Others, such as the gapA operon of Bacillus subtilis, use additional promoters located between genes to provide differential control over gene expression.

In metabolic engineering applications, the most direct approach to constructing a coordinated multistep biochemical pathway entails cloning each separately introduced gene behind a different promoter. In this way, the choice and subsequent manipulation of inducers permits one to control expression of individual genes within the constructed pathway (Fig. 3). For instance, Christian Solem and Peter Jensen of the Technical University of Denmark in Lyngby constructed a library of synthetic constitutive promoters of different strengths by randomizing the spacer sequences flanking promoter consensus regions. Synthetic constitutive promoters such as these can be used to coordinate gene expression if independent control is not required. This approach may be advantageous in that the engineered pathway will be expressed at steady state with no need to fine-tune its expression by adding precise amounts of inducers, some of which are costly.

Alternatively, coordinated gene expression can be achieved by directing the mRNA processing within operons. In some bacteria, several or all of the genes encoding enzymes within a particular metabolic pathway are under the control of a single promoter. To compensate for differences in the specific activities of the enzymes and to prevent intermediates from accumulating, stability differences of individual coding regions in multicistronic transcripts may give rise to vastly different enzyme levels even though the genes are under the control of the same promoter. Technology based on this type of posttranscriptional control represents an efficient but less direct method than targeting transcriptional control to coordinate multiple protein production.

Figure 3

For example, we developed a technology for coordinating multiple gene expression based on controlling the longevity of mRNA species being generated through use of stability control elements. Some of the control elements tested in this system include 5' and 3' RNA hairpin structures of varying strength (ΔGfolding), RNase cleavage sites, and gene order in the operon (Fig. 3). We tested this series of mRNA stability control elements for their ability to modify the flux through several enzyme-catalyzed steps of a carotenoid pathway, whose genes were introduced into E. coli. We could alter individual enzyme levels, thereby altering the flux through this pathway, yielding different levels of carotenoid intermediates within these cells, indicating that rational design can be used to control how much of particular metabolic intermediates will accumulate.

We find that changing gene location and endonuclease cleavage sites within an operon leads to drastic differences in gene expression, whereas introducing hairpin structures fine-tunes gene expression levels. Depending on the type and location of control elements that are introduced, we can vary relative steady-state transcript and protein levels 500-fold and 1,000-fold, respectively. Controlling transcript and protein levels over such a large dynamic range undoubtedly will prove useful for anyone attempting to engineer metabolic pathways.
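The link between transcript stability and steady-state levels can be sketched with a standard first-order model. This is a textbook simplification, not the model used in the work described above. It assumes constant synthesis rates and exponential decay, so that steady-state mRNA equals the transcription rate divided by the mRNA decay constant, and steady-state protein equals the translation rate times the mRNA level divided by the protein decay constant. All parameter values are hypothetical.

```python
import math


def steady_state_levels(transcription_rate, mrna_half_life,
                        translation_rate, protein_half_life):
    """Steady-state mRNA and protein levels under first-order decay (an assumption).

    transcription_rate: transcripts made per unit time
    mrna_half_life, protein_half_life: half-lives in the same time unit
    translation_rate: proteins made per transcript per unit time
    """
    # Convert half-lives to first-order decay constants: k = ln(2) / t_half.
    k_mrna = math.log(2) / mrna_half_life
    k_protein = math.log(2) / protein_half_life
    # At steady state, synthesis balances decay.
    mrna = transcription_rate / k_mrna
    protein = translation_rate * mrna / k_protein
    return mrna, protein


# Same promoter (same transcription rate), two hypothetical mRNA half-lives in minutes.
unstable = steady_state_levels(1.0, 2.0, 5.0, 60.0)
stabilized = steady_state_levels(1.0, 4.0, 5.0, 60.0)
print(unstable, stabilized)
```

In this simple model, protein scales in proportion to mRNA, so doubling the transcript half-life doubles both. The larger dynamic range reported above for protein than for transcript suggests that the control elements also influence steps this sketch leaves out, such as translation.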

Alternative Control Elements

Work in this field initially focused on multiple promoter systems and directed mRNA processing as a way of optimizing multigene expression systems. However, as we learn more about the various controls cells use to coordinate gene expression, alternative design strategies that might be applied to these systems will be discovered. Controlling translation initiation and elongation provides one such potential means for coordinating gene expression. For example, native operons sometimes use translational coupling to coordinate the expression of genes within a multicistronic transcript. In some cases, this control depends on a secondary structure that sequesters the ribosome binding site (RBS) of the distal gene, thereby blocking its translation. However, when the proximal gene is being translated, this inhibitory structure unfolds, rendering the previously sequestered RBS accessible. This natural phenomenon might be adapted for use in producing two proteins within an engineered pathway at equivalent levels.

Protein fusion systems are often used to monitor production of one protein by tracking a reporter protein fused to it. This approach could be put to a different use—for example, by fusing two genes encoding different enzymes in a pathway to balance their production. Enzyme fusion strategies may solve other problems. For example, they might provide a means for properly folding otherwise difficult proteins or for forcing two enzymes of a metabolic pathway together, providing an efficient means for channeling products of the first to be substrates of the second.

Deliberately induced molecular breeding based on DNA shuffling is becoming a widely used technique for rapidly generating diversity in genes as a way of seeking particular traits. Extensions of this method from a focus on single genes to entire metabolic pathways and, in some cases, microbial genomes are now being developed. Applying this approach to particular biochemical pathways might yield operons with balanced expression and activity of each enzyme within those pathways.

Putting Several Sweeping Analytic Approaches To Use

Technologies for analyzing metabolism on a cell-wide basis could prove very useful in metabolic engineering. Altering or engineering cell-wide gene expression (the transcriptome), corresponding proteins (the proteome), and levels of cellular metabolites (the metabolome) could be used to improve desired metabolic properties. Several analytic approaches could play roles in these ambitious efforts, including:

  • DNA microarrays that can detect relative transcript levels in cells are being used to analyze metabolism of several microorganisms used in industry, including S. cerevisiae, E. coli, B. subtilis, and Corynebacterium glutamicum. This technology tracks global changes in mRNA levels in response to different mutational backgrounds, heterologous protein production, and growth conditions. However, transcript levels do not always correlate with protein production, enzyme activity, and metabolic flux.

  • Two-dimensional gels track changes in global cellular protein production. Coupled with recent advances in protein separation and mass spectrometry, these techniques are collectively referred to as proteomics, bringing analysis an important step closer to physiology. Thus, changes in one or more proteins, either endogenous or heterologous, are more likely to affect cellular physiology.

  • Metabolomics, a comprehensive analysis of all cellular metabolites, would allow one to monitor imbalances in intermediates in engineered and endogenous pathways. Progress toward such a sweeping undertaking is limited, and it is based on capillary electrophoresis, high-performance liquid chromatography, gas chromatography coupled with mass spectrometry, and nuclear magnetic resonance spectroscopy.

  • Phenotypic microarrays, developed by Biolog, use a simple, inexpensive colorimetric assay developed in 96-well plates to analyze thousands of different phenotypes through detection of changes in cellular respiration.

These analyses produce massive amounts of data to interpret, representing another major challenge facing those attempting to optimize cellular metabolism for specific purposes. Metabolic models for E. coli and various other industrial microorganisms are being developed to analyze cellular resources that may be diverted into heterologous metabolic pathways and to identify bottlenecks. However, such analyses are complex, and available models typically are designed to analyze specific subsets of cellular metabolism, or particular growth conditions and physiological states. Gene expression models will need to be integrated with metabolic models if we hope to obtain a clear picture of the effects of changes in metabolism on cellular physiology.

SUGGESTED READING

Datsenko, K. A., and B. L. Wanner. 2000. One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc. Natl. Acad. Sci. USA 97:6640-6645.

de Lorenzo, V., M. Herrero, J. M. Sanchez, and K. N. Timmis. 1998. Mini-transposons in microbial ecology and environmental biotechnology. FEMS Microbiol. Ecol. 27:211-224.

Farmer, W. R., and J. C. Liao. 2000. Improving lycopene production in Escherichia coli by engineering metabolic control. Nature Biotechnol. 18:533-537.

Haldimann, A., and B. L. Wanner. 2001. Conditional-replication, integration, excision, and retrieval plasmid-host systems for gene structure-function studies in bacteria. J. Bacteriol. 183:6384-6393.

Jones, K. L., and J. D. Keasling. 1998. Construction and characterization of F plasmid-based expression vectors. Biotechnol. Bioeng. 59:659-665.

Khlebnikov, A., O. Risa, T. Skaug, T. A. Carrier, and J. D. Keasling. 2000. Regulatable arabinose-inducible gene expression system with consistent control in all cells of a culture. J. Bacteriol. 182:7029-7034.

Solem, C., and P. R. Jensen. 2002. Modulation of gene expression made easy. Appl. Environ. Microbiol. 68:2397-2403.

Smolke, C. D., T. A. Carrier, and J. D. Keasling. 2000. Coordinated, differential expression of two genes through directed mRNA cleavage and stabilization by secondary structures. Appl. Environ. Microbiol. 66:5399-5405.

Smolke, C. D., V. J. J. Martin, and J. D. Keasling. 2001. Controlling the metabolic flux through the carotenoid pathway using directed mRNA processing and stabilization. Metabol. Eng. 3:313-321.

Zhang, Y.-X., K. Perry, V. A. Vinci, K. Powell, W. P. C. Stemmer, and S. B. del Cardayré. 2002. Genome shuffling leads to rapid phenotypic improvement in bacteria. Nature 415:644-646.

Last Modified: July 16, 2002
Copyright © 2002 American Society for Microbiology. All rights reserved.