Chap 5 Manipulation of Gene Expression in Procaryotes

I.  Introduction

l  A major objective of gene cloning is to express the cloned gene, either to study its biological function or to produce recombinant proteins (e.g. insulin). Cloning a gene, however, does not guarantee that it will be expressed successfully.

Factors that influence gene expression

1.  The nature of the transcriptional promoter and terminator sequences

2.  The strength of ribosome binding site

3.  Efficiency of translation (mRNA stability, mRNA secondary structure, etc.)

4.  Copy number of the cloned gene (or of the plasmid carrying it), and whether the gene is plasmid-borne or integrated into the chromosome.

5.  Nature and cellular location of the expressed protein (intra- or extracellular? secreted? toxic?)

6.  Post-translational processing: glycosylation, proteolytic processing…

7.  The intrinsic stability of the protein (misfolding of the proteins? susceptible to proteolysis?)

l  A large fraction of proteins (varying from 30% to 70% of all proteins made) is degraded immediately after synthesis, before forming functional proteins [8]. These so-called DRiPs (defective ribosomal products) are the result of defective transcription or translation, alternative reading frame usage, failed assembly into larger protein complexes, the incorporation of wrong amino acids owing to mistakes by aminoacyl-tRNA synthetases, or altered ubiquitin modifications. DRiPs are immediately degraded to prevent the formation of protein aggregates, which would affect cell viability.

l  Ubiquitination is a post-translational modification in which ubiquitin, a 76–amino acid protein, is covalently added to lysine residues. In humans, the ubiquitination reaction is catalyzed by >500 E3 ligases, each of which transfers ubiquitin to specific protein targets. There are several types of ubiquitin modification, and these may have different effects on target proteins. The best known is the polyubiquitin chain, which targets proteins for proteasomal degradation. The polyubiquitin chain begins with a ubiquitin conjugated at its C terminus to a lysine residue in a target protein.

l  26S proteasome: A giant multicatalytic protease that resides in the cytosol and the nucleus. The 20S core, which contains three distinct catalytic subunits, can be appended at either end by a 19S cap or an 11S cap. The binding of two 19S caps to the 20S core forms the 26S proteasome, which degrades polyubiquitylated proteins into peptides[1].

l  Some of the above factors can be improved by proper design (e.g. selecting a strong promoter, using multiple gene copies, etc.)

→ The choice of expression system is very important.

l  Major expression systems are classified into procaryotic and eucaryotic.

Procaryotic (e.g. E. coli):

Pros:

1.  Very well-studied and common in protein production.

2.  Grow fast (doubling time ≈ 20 min) and are easy to culture → easy fermenter operation.

3.  Normally high yield (high cell density).

4.  Minimal media (simple composition, e.g. Na⁺, K⁺, Mg²⁺, Ca²⁺, NH₄⁺, Cl⁻, SO₄²⁻, and glucose or another carbon source) → cheap.

Cons:

1.  Often fail to perform suitable post-translational modifications.

2.  Inclusion bodies (insoluble protein aggregates) often form upon overexpression → purification and recovery of the native protein conformation (protein renaturation) become more difficult.

Eucaryotic: next chapter

II.  Strong and regulatable promoters

l  Why strong promoters?

n  A strong promoter has a high affinity for RNA polymerase, so the downstream gene is transcribed frequently (i.e. at a high level).

l  Why regulatable promoters?

n  Continuous overexpression of a cloned gene is often detrimental to the host cell because it drains energy and other resources and impairs cellular functions.

→ Genes are placed under the control of promoters that are both strong and regulatable.

→ Genes are expressed only when “induced”.

Examples

1.  E. coli lac promoter:

(a) Regulated by IPTG

n  Cells are grown in the absence of lactose, so the lac repressor binds to the operator → the genes cannot be transcribed. Expression starts only when IPTG (a non-metabolizable lactose analog) is added.

n  Very common

(b) Regulated by CAP (catabolite activator protein)

l  Combining the two controls: induce protein expression at high IPTG (or lactose) and low [glucose] (→ high [cAMP], so the CAP-cAMP complex binds and activates the promoter) → highest transcription.
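The two controls on the lac promoter can be read as a small truth table. The sketch below is a qualitative model only: the function name and the three output labels ("off", "low", "high") are illustrative assumptions, not measured expression levels.

```python
def lac_transcription_level(inducer_present: bool, glucose_high: bool) -> str:
    """Qualitative lac promoter output (illustrative labels, not measurements)."""
    # Without IPTG or lactose, the lac repressor stays bound to the operator.
    if not inducer_present:
        return "off"
    # High glucose -> low cAMP -> CAP cannot activate the promoter.
    if glucose_high:
        return "low"
    # Inducer present and low glucose -> high cAMP -> CAP-cAMP activates.
    return "high"


for inducer in (False, True):
    for glucose in (False, True):
        print(f"inducer={inducer}, high glucose={glucose}: "
              f"{lac_transcription_level(inducer, glucose)}")
```

In practice "off" still allows some basal (leaky) transcription; the label only marks the repressed state.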

2.  Trp promoter: (regulates the transcription of genes responsible for Trp synthesis)

n  off (negatively regulated): the tryptophan-trp repressor complex binds to the trp operator → transcription is shut down

n  on (derepressed): removal of tryptophan releases the repressor from the operator

3.  Bacteriophage T7 promoter:

n  The T7 promoter is very strong, but it is recognized only by T7 RNA polymerase, which must therefore be supplied.

n  The two recombinant genes (the target gene under the T7 promoter and the T7 RNA polymerase gene) can be co-introduced into the cells for expression. Alternatively, the gene encoding T7 RNA polymerase can be integrated into the chromosomal DNA to form a stable strain.

4.  pL promoter (from bacteriophage λ):

n  Controlled by cI repressor protein

n  Cells carrying a temperature-sensitive cI repressor are grown at 28°C (the cI repressor is expressed from its own promoter, pCI) → the cI repressor prevents transcription → when the cell density (CD) is high enough, the temperature is raised to 42°C → the thermosensitive cI repressor is inactivated → transcription is switched on.

l  The effectiveness of deactivating a repressor depends on the ratio of the number of repressor molecules to the number of promoter copies:

ratio too large → difficult to induce

ratio too small → transcription is “leaky” (transcription occurs in the absence of inducer)

l  Strategy:

Put the repressor gene on one plasmid with a low copy number (e.g. 1-8 copies/cell)

Put the promoter-target gene on another plasmid with a high copy number (e.g. 30-300 copies/cell)

→ the ratio stays in a range where the promoter can be both effectively repressed and effectively induced.
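A minimal arithmetic sketch of why the two plasmids are given different copy numbers. The number of repressor molecules made per repressor gene copy (PER_COPY below) is a hypothetical placeholder, as are the function and parameter names; only the copy-number ranges come from the notes.

```python
def repressor_to_promoter_ratio(repressor_plasmid_copies: int,
                                target_plasmid_copies: int,
                                repressors_per_gene_copy: float) -> float:
    """Repressor molecules per promoter copy in one cell."""
    if target_plasmid_copies <= 0:
        raise ValueError("target plasmid copy number must be positive")
    repressor_molecules = repressor_plasmid_copies * repressors_per_gene_copy
    return repressor_molecules / target_plasmid_copies


# Hypothetical value: 10 repressor molecules made per repressor gene copy.
PER_COPY = 10.0

# Repressor on a low-copy plasmid (5/cell), target on a high-copy plasmid (100/cell).
two_plasmids = repressor_to_promoter_ratio(5, 100, PER_COPY)

# Repressor gene carried on the same high-copy plasmid as the target (100/cell).
same_plasmid = repressor_to_promoter_ratio(100, 100, PER_COPY)

print(two_plasmids, same_plasmid)
```

With these assumed numbers, moving the repressor gene onto the high-copy plasmid raises the ratio 20-fold, which is the "difficult to induce" direction.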

III. Expression vectors

l  A strong, regulatable promoter alone may not guarantee a high yield of gene product. Efficiency of translation, stability of the protein, etc. are also factors [2]. Expression vectors are similar to cloning vectors but contain additional elements that confer efficient expression.

e.g. The expression plasmid pKK233-2 contains:

n  tac promoter (a hybrid of the -10 region of the lac promoter and the -35 region of the trp promoter; inducible by IPTG; 3X stronger than the trp promoter and 10X stronger than the lac promoter)

n  RBS, ori. (RBS: a sequence of 6-8 nt (e.g. UAAGGAGG) in the mRNA that base-pairs with the 16S rRNA of the ribosome; in general, the stronger the binding of the mRNA to the rRNA, the more efficient the initiation of translation)

n  An ATG start codon about 8 nt downstream from the RBS (optional)

n  Multiple cloning site

n  Ampr gene as a selectable marker

Note:

n  The RNA sequence from the RBS to the first few codons of the cloned gene must not form intrastrand loops (base-paired secondary structure), which would hamper binding to the ribosome

n  DNA sequence is written as the coding strand, so ATG is often seen as the starting point.
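The RBS-to-ATG spacing can be checked on a coding-strand sequence with a short scan. This is a sketch under stated assumptions: the search uses only the AGGAGG core of the example RBS (UAAGGAGG written as DNA), the 5-10 nt window is an arbitrary bracket around "about 8 nt", and the example sequence is hypothetical.

```python
import re

SD_CORE = "AGGAGG"  # core of the example RBS UAAGGAGG, as coding-strand DNA


def find_rbs_start_pairs(dna: str, min_gap: int = 5, max_gap: int = 10):
    """Return (rbs_index, atg_index, gap) for each RBS core followed by an ATG.

    gap = number of nucleotides between the end of the RBS core and the ATG.
    """
    dna = dna.upper()
    hits = []
    for match in re.finditer(SD_CORE, dna):
        rbs_end = match.end()
        for gap in range(min_gap, max_gap + 1):
            atg = rbs_end + gap
            if dna[atg:atg + 3] == "ATG":
                hits.append((match.start(), atg, gap))
    return hits


# Hypothetical sequence: TAAGGAGG, an 8-nt spacer, then ATG.
example = "TTGACATAAGGAGGAACAGACCATGGCTAGC"
print(find_rbs_start_pairs(example))
```

The scan says nothing about secondary structure; checking for intrastrand loops needs an RNA folding tool.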

IV.  Fusion Proteins

l  Problem: the yield of foreign proteins is normally low for various reasons (e.g. degradation by host proteases)

l  Solution: covalently attach the cloned gene product to a stable (host) protein to form a fusion protein → the desired recombinant protein is protected.

l  Construct at DNA level

n  The transcribed RNA must encode one continuous protein (the stop codon of the upstream gene, which would lie in the middle of the fusion, must be eliminated)

n  The reading frame must be maintained across the junction: the base sequence of the linker must be exact, otherwise the downstream ORF will be wrong (the precise sequences of both genes must therefore be known)
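Both requirements (no stop codon in the middle, reading frame kept across the linker) can be checked before ordering a construct. A minimal sketch, assuming the standard genetic code; the function names and the three short example sequences are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOPS = {"TAA", "TAG", "TGA"}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence, codon by codon, from position 0."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def check_fusion(partner_orf: str, linker: str, target_orf: str) -> str:
    """Join partner + linker + target and verify that they form one ORF."""
    partner_orf, linker, target_orf = (
        s.upper() for s in (partner_orf, linker, target_orf))
    # The stop codon of the upstream partner must be removed before fusing.
    if partner_orf[-3:] in STOPS:
        partner_orf = partner_orf[:-3]
    # Each upstream piece must be a whole number of codons to keep the frame.
    if len(partner_orf) % 3 or len(linker) % 3:
        raise ValueError("partner or linker length shifts the reading frame")
    protein = translate(partner_orf + linker + target_orf)
    if "*" in protein[:-1]:
        raise ValueError("stop codon inside the fusion")
    return protein


# Hypothetical sequences: partner MKL*, linker IEGR (factor Xa site), target ASW*.
print(check_fusion("ATGAAACTGTAA", "ATCGAAGGTCGT", "GCTTCTTGGTAA"))
```

Dropping a single base from the linker makes `check_fusion` raise, which is the "ORF will be wrong" case.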

Cleavage of fusion proteins

l  The fusion may not be suitable as the final product because:

n  The biological function might be lost

n  Stringent regulation by government agencies (e.g. FDA)

l  The EK cleavage site enables the cleavage of the fusion by enterokinase at the specified site.

l  Another linker often used is the Xa linker (Ile-Glu-Gly-Arg), which is recognized by blood coagulation factor Xa. The factor cleaves specifically at the C-terminal side of the linker → the desired protein should therefore be the second (C-terminal) segment of the fusion.

Applications of fusion proteins (there are many; only two examples are given)
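A sketch of why the target must sit downstream of the Xa linker: cutting after the Arg of Ile-Glu-Gly-Arg leaves the linker on the upstream piece. The function name and the example fusion are hypothetical, and real factor Xa digestion is less clean than an exact string match.

```python
XA_SITE = "IEGR"  # Ile-Glu-Gly-Arg in one-letter code


def cleave_with_factor_xa(fusion: str) -> list[str]:
    """Split a protein after every IEGR motif (C-terminal side of Arg)."""
    pieces = []
    start = 0
    pos = fusion.find(XA_SITE)
    while pos != -1:
        cut = pos + len(XA_SITE)
        pieces.append(fusion[start:cut])
        start = cut
        pos = fusion.find(XA_SITE, start)
    if start < len(fusion):
        pieces.append(fusion[start:])
    return pieces


# Hypothetical fusion: partner MKL + linker IEGR + target ASW.
print(cleave_with_factor_xa("MKLIEGRASW"))
```

The downstream piece carries no linker residues. The same function can be run on the target alone to confirm it contains no internal IEGR motif.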

1. Simplifying purification

l  dual function of the fusion:

n  reduce the degradation, enable the cleavage

n  enable the product to be purified by immunoaffinity chromatography in which MAb directed against Flag is immobilized on a polypropylene support and used as a ligand to bind the fusion.

2. Stabilizing the protein (e.g. Enbrel®)

l  Enbrel® is a recombinant protein approved by the FDA to treat autoimmune diseases (e.g. rheumatoid arthritis and psoriatic arthritis). It acts as a TNF inhibitor, interfering with tumor necrosis factor (TNF; a soluble inflammatory cytokine). TNF-α is the "master regulator" of the inflammatory response in many organ systems, and excess TNF-α causes aberrant inflammation.

l  Enbrel® is a fusion protein produced by recombinant DNA technology. It fuses TNF receptor 2 to the Fc portion of the IgG1 antibody. TNF receptor 2 binds to TNF-α. The protein is highly active and unusually stable as a modality for blockade of TNF in vivo.

V.  Golden Gate Shuffling: A One-Pot DNA Shuffling Method

l  Limitations of the traditional cloning methods

n  Time consuming

n  Inefficient

l  Golden Gate Shuffling is a protocol to assemble separate DNA fragments together into a vector in one step and one tube.

l  The principle of the cloning strategy is based on the ability of type IIS restriction enzymes (e.g. BsaI) to cut outside of their recognition site.

n  Two DNA ends terminated by the same 4 nucleotides (sequence f, composed of nucleotides 1234) can be synthesized by PCR, where sequences f are flanked by a BsaI recognition sequence, B.

n  Digestion with the type IIS restriction enzyme removes the enzyme recognition sites and generates ends with complementary 4 nt overhangs.

n  These ends can be ligated seamlessly, creating a junction that lacks the original site.

l  Ex: One-pot one-step assembly of 9 fragments

n  First select a number of 4-nucleotide ‘recombination sites’ on a nucleotide sequence alignment of several homologous genes.

n  The selection of these recombination sites defines modules that consist of a core sequence (C1-C9) flanked by two 4 nt sequences.

n  These 9 modules can be amplified by PCR with primers designed to add flanking BsaI sites on each side of the modules (the BsaI cleavage sites perfectly overlapping with the recombination sites) and cloned into 9 plasmids separately.

n  The recipient expression vector, pX-LacZ contains two BsaI sites compatible with the first (C1) and last (C9) modules.

n  Mix the 9 module plasmids and 1 recipient plasmid into one tube. Add BsaI and ligase.
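The logic of the one-pot reaction can be sketched with three modules instead of nine. Assumptions are labeled: the layout GGTCTC-N-[4 nt]-core-[4 nt]-N-GAGACC reflects BsaI cutting 1 nt and 5 nt past its recognition site, all sequences are hypothetical, and ligation is reduced to matching 4-nt strings on the top strand.

```python
BSAI_FWD = "GGTCTC"  # BsaI recognition site as read on the top strand
BSAI_REV = "GAGACC"  # the same site when it lies on the bottom strand


def bsai_release(module: str) -> str:
    """Return the span kept after BsaI digestion, written as top strand.

    Assumed layout: GGTCTC-N-[4 nt]-core-[4 nt]-N-GAGACC, so the span from
    the left 4-nt site through the right 4-nt site stays with the core.
    """
    module = module.upper()
    left = module.find(BSAI_FWD)
    right = module.rfind(BSAI_REV)
    if left == -1 or right == -1:
        raise ValueError("module needs two inward-facing BsaI sites")
    start = left + len(BSAI_FWD) + 1  # first base of the left 4-nt site
    end = right - 1                   # index just past the right 4-nt site
    if end - start < 8:
        raise ValueError("no room for two 4-nt sites between the BsaI sites")
    return module[start:end]


def assemble(fragments: list[str], first_site: str, last_site: str) -> str:
    """Chain fragments by matching 4-nt ends; each junction site appears once."""
    by_left = {frag[:4]: frag for frag in fragments}
    if len(by_left) != len(fragments):
        raise ValueError("each fragment needs a unique left 4-nt site")
    assembled, current = first_site, first_site
    while by_left:
        if current not in by_left:
            raise ValueError(f"no fragment starts with {current}")
        frag = by_left.pop(current)
        assembled += frag[4:]
        current = frag[-4:]
    if current != last_site:
        raise ValueError("last fragment does not match the recipient vector")
    return assembled


# Three hypothetical modules; the mixing order in the tube does not matter.
m1 = "GGTCTC" + "A" + "AATG" + "GCTAAA" + "GGAT" + "T" + "GAGACC"
m2 = "GGTCTC" + "A" + "GGAT" + "CCGTTT" + "TCAG" + "T" + "GAGACC"
m3 = "GGTCTC" + "A" + "TCAG" + "GGCTGA" + "GCTT" + "T" + "GAGACC"
released = [bsai_release(m) for m in (m3, m1, m2)]
assembled = assemble(released, first_site="AATG", last_site="GCTT")
print(assembled)
```

The assembled sequence contains no BsaI site, which is why digestion and ligation can run in the same tube: correctly joined products are no longer substrates for the enzyme.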

VI.  Increasing protein stability

l  Normally, the half-lives of proteins range from a few minutes to hours (some exceptions exist, e.g. collagen has a half-life of years).

a.a. added at N-terminus    Half-life
Met, Ser, Ala               > 20 h
Thr, Val, Gly               > 20 h
Ile, Glu                    > 30 min
Arg                         ≈ 2 min

l  Normally, proteins with more disulfide bonds (S-S between Cys) and with certain amino acids at the N-terminus are more stable → more protein accumulates and the yield increases.

Ex: stability of β-galactosidase with certain a.a. added to the N-terminus
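To see what the half-lives in the table above mean for yield, the fraction of protein remaining can be estimated. The sketch assumes simple first-order decay, which the notes do not state; the entries marked ">" in the table are lower bounds, so the result for those residues is a lower bound too.

```python
# Half-lives in minutes, taken from the table above.
# Entries given as "> x" are stored as x, i.e. as lower bounds.
HALF_LIFE_MIN = {
    "Met": 20 * 60, "Ser": 20 * 60, "Ala": 20 * 60,
    "Thr": 20 * 60, "Val": 20 * 60, "Gly": 20 * 60,
    "Ile": 30, "Glu": 30,
    "Arg": 2,
}


def fraction_remaining(n_terminal_residue: str, minutes: float) -> float:
    """Fraction of protein left after `minutes`, assuming first-order decay."""
    t_half = HALF_LIFE_MIN[n_terminal_residue]
    return 0.5 ** (minutes / t_half)


for residue in ("Arg", "Ile", "Met"):
    print(residue, round(fraction_remaining(residue, 10), 4))
```

After 10 min (five half-lives) only about 3% of an Arg-tagged protein is left, while a Met-tagged one is essentially intact.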

Strategies:

l  Change the a.a. at the N-terminus

l  Increase the number of S-S bonds

l  Co-express chaperone proteins (e.g. GroEL, DnaJ, DnaK, etc.) to aid protein folding

VII.  Overcoming O2 Limitation

l  Oxygen is generally required for cell growth, to support respiration and maintain cellular functions and protein expression, but its solubility in water is low. At high cell density (CD), even supplying more air or pure oxygen, or increasing the stirring speed, may not be enough. When O2 is depleted, cells enter stationary phase and eventually die.

l  If engineering approaches fail, what can we do? → bacterial hemoglobin

Solution: the bacterium Vitreoscilla lives in stagnant ponds (oxygen-deficient). To obtain oxygen for growth and metabolism, it expresses a hemoglobin-like protein that captures oxygen from the environment and delivers it into the cell.

l  When this gene is cloned and expressed in E. coli, the recombinant E. coli shows higher metabolic activity and higher protein production at low levels of O2.

VIII.  DNA Integration into the Host Chromosome

Why integrate DNA into the chromosome?

l  Plasmid-borne expression drains cellular energy, because the antibiotic-resistance gene and other plasmid genes are also expressed, and plasmid replication itself consumes resources and energy.

l  Plasmid instability: plasmid-free cells outgrow plasmid-bearing cells, so after several passages the percentage of plasmid-bearing cells drops → the protein expression level drops.
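Plasmid instability can be illustrated with a toy population model. Everything numeric here is an assumption chosen for illustration (a 1% chance of plasmid loss per division and a 1.2-fold growth advantage for plasmid-free cells); the notes give no values.

```python
def plasmid_bearing_fraction(generations: int,
                             loss_per_division: float = 0.01,
                             growth_advantage: float = 1.2) -> list[float]:
    """Fraction of plasmid-bearing cells after each generation.

    loss_per_division: assumed probability that a daughter cell of a
        plasmid-bearing cell receives no plasmid.
    growth_advantage: assumed growth factor of plasmid-free cells relative
        to plasmid-bearing cells per generation (> 1 means faster growth).
    """
    bearing, free = 1.0, 0.0
    fractions = []
    for _ in range(generations):
        new_bearing = 2.0 * bearing * (1.0 - loss_per_division)
        new_free = 2.0 * growth_advantage * free + 2.0 * bearing * loss_per_division
        bearing, free = new_bearing, new_free
        fractions.append(bearing / (bearing + free))
    return fractions


fractions = plasmid_bearing_fraction(50)
for generation in (1, 10, 25, 50):
    print(generation, round(fractions[generation - 1], 3))
```

Even a small loss rate compounds once the plasmid-free cells grow faster, which is the argument for chromosomal integration.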

Integration using the plasmid

1.  Choose a suitable integration site.

2.  Clone part of the chromosomal DNA sequence at the integration site into the vector (e.g. plasmid). The chromosomal DNA sequence on the vector and at the integration site must be similar in sequence, typically >500 bp, so that homologous recombination can occur.

3.  Clone the target gene (and promoter) into the plasmid (vector) flanked by the chromosomal DNA sequence.

4.  Transfer the plasmid into a host cell (The vector does not replicate or can be removed from the host cell).

5.  Select the host cells that have the target gene integrated into the chromosome.

6.  Drawback: inefficient, long homology arm is required.

Integration using the λ Red system [1]: Recombineering

1.  Derived from bacteriophage λ, requires 3 proteins: Exo, Beta, and Gam

n  Exo has 5’→3’ exonuclease activity and degrades one entire strand of the dsDNA introduced into the cell, leaving ssDNA. The ssDNA is stabilized and protected from exonuclease attack when Beta binds to it.

n  Beta also delivers ssDNA to the target replication fork and facilitates annealing of the ssDNA to the target site (the mechanism is proposed but is not confirmed and still controversial).

n  Gam inhibits the E. coli RecBCD protein complex by binding to it (otherwise RecBCD would degrade the incoming ds or ss DNA)

2.  These (Red) proteins should be tightly regulated because continuous expression of Exo and Beta increases background recombination and long-existing Gam could be toxic to the cell.

3.  Transform a plasmid encoding Exo, Beta, and Gam under an inducible promoter (e.g. pL or others) → transiently induce expression of the Red proteins → introduce the template DNA (usually oligonucleotides or PCR products, by electroporation) carrying the homology arms (as short as ≈ 50 bp) → recombination.
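Designing the template DNA amounts to flanking the cassette with short homology arms copied from either side of the insertion point. A minimal sketch: the function name, the 0-based coordinate convention, and the toy "genome" are assumptions, and the 50-nt default follows the ≈ 50 bp figure above.

```python
def build_recombineering_template(genome: str, position: int,
                                  cassette: str, arm: int = 50) -> str:
    """Return the cassette flanked by `arm`-nt homology arms.

    position: 0-based index in `genome` at which the cassette is inserted.
    """
    if position - arm < 0 or position + arm > len(genome):
        raise ValueError("not enough flanking sequence for the homology arms")
    left_arm = genome[position - arm:position]
    right_arm = genome[position:position + arm]
    return left_arm + cassette + right_arm


# Toy 200-nt "genome" and a 6-nt placeholder cassette (both hypothetical).
genome = "ACGT" * 50
template = build_recombineering_template(genome, position=100, cassette="GGGCCC")
print(len(template), template[:10], template[-10:])
```

In practice the arms are usually added as 5' extensions on the PCR primers used to amplify the cassette, so the template is made in a single PCR.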