Supplementary Information For

Reference number: 2005-11-12577

Production of the anti-malarial drug precursor artemisinic acid in engineered yeast

Supplementary Figure 1.

Legend for Supplementary Figure 1. Phylogenetic reconstruction of CYP71AV1 with other plant P450s of known functions.

Tree reconstruction was performed with the tree-bisection-reconnection heuristic search algorithm. Selected P450s were plant-specific P450s (A-type) of known functions and three non-A-type P450s (88A3, 90A1, and 90B1)1, which served as an outgroup. An arrowhead indicates the A. annua CYP71AV1, amorphadiene oxidase. The bracket indicates the plant CYP71D subfamily, which comprises tobacco cembratrienol hydroxylase (71D16), tobacco 5-epi-aristolochene hydroxylase (71D20), mint limonene 3-hydroxylase (71D13), mint limonene 6-hydroxylase (71D18), and Madagascar periwinkle tabersonine 16-hydroxylase (71D12). Bootstrap values (percent of 1,000 replicates) for each cluster are given at the nodes. GenBank accession numbers or Arabidopsis Genome Initiative (AGI) numbers are given. 71C1 to 71C4, four P450s involved in DIMBOA (cyclic hydroxamic acids 2,4-dihydroxy-1,4-benzoxazin-3-one) biosynthesis from Z. mays (X81827, Y11404, Y11403, and Y11368); 71D20, aristolochene-1,3-dihydroxylase from N. tabacum (AF368376); 71D16, cembratrienol hydroxylase from N. tabacum (AF166332); 71D13, limonene 3-hydroxylase from Mentha x piperita (AF124816); 71D18, limonene 6-hydroxylase from M. spicata (AF124815); 71D12, tabersonine 16-hydroxylase from C. roseus (AJ238612); 73A1, cinnamate-4-hydroxylase from H. tuberosus (Z17369); 75A1, flavonoid 3’,5’-hydroxylase from P. hybrida (Z22544); 75B2, flavonoid 3'-hydroxylase from P. hybrida (AF155332); 76B6, geraniol 10-hydroxylase from C. roseus (AJ251269); 79F1, aliphatic glucosinolate biosynthesis from A. thaliana (At1g16410); 84A1, coniferaldehyde hydroxylase from A. thaliana (At4g36220); 93C1, isoflavone synthase from G. max (AF135484); 98A3, 5-O-(4-coumaroyl shikimate)-3’-hydroxylase from A. thaliana (At2g40890); 701A3, ent-kaurene oxidase from A. thaliana (AF047719); 706B1, cadinene 8-hydroxylase from G. arboreum (AF332974); 88A3, ent-kaurenoic acid oxidase from A. thaliana (AF318501); 90A1, brassinosteroid 23-hydroxylase from A. thaliana (X87367); 90B1, brassinosteroid 22-hydroxylase from A. thaliana (AF044216).

Supplementary Methods

Cloning of CYP71AV1 and CPR cDNA. A cDNA pool was prepared with the Super SMART PCR cDNA synthesis kit (BD Biosciences) using 50 ng of total RNA purified from A. annua trichome-enriched cells. Degenerate P450 primers were designed from conserved amino acid motifs of the lettuce and sunflower CYP71 subfamily: primer 1 from [Y/Q]G[E/D][H/Y]WR (forward) and primer 2 from FIPERF (reverse) (see Supplementary Table I for sequence information). Polymerase chain reaction (PCR) using these primers and A. annua cDNAs yielded a 1-kb DNA fragment. The PCR program used was 7 cycles with a 48 °C annealing temperature followed by 27 cycles with a 55 °C annealing temperature. The amino acid sequence deduced from the amplified gene fragment showed 85% and 88% identity to the sunflower (QH_CA_Contig1442) and lettuce (QG_CA_Contig7108) contigs, respectively. The Compositae EST database can be found at cgpdb.ucdavis.edu. An A. annua CPR fragment was isolated using a forward primer (primer 3) and a reverse primer (primer 4) designed from the conserved QYEHFNKI and CGDAKGMA motifs, respectively. The PCR program used was 30 cycles with a 50 °C annealing temperature. Both 5’- and 3’-end sequences for CYP71AV1 and CPR were determined using an RLM-RACE kit (Ambion), followed by full-length cDNA recovery from A. annua leaf cDNAs. The open reading frames of CYP71AV1 and CPR were amplified by PCR and ligated into the SpeI and BamHI/SalI sites of pESC-URA (Stratagene) with FLAG and cMyc tags, respectively. For PCR amplification of CYP71AV1, primers 5 and 6 were used; for PCR amplification of CPR, primers 7 and 8 were used. The PCR program used was 35 cycles with a 55 °C annealing temperature. All clones were sequenced to confirm their sequences.
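The bracket notation used for the degenerate forward motif can be unpacked mechanically. The following Python sketch is purely illustrative and not part of the original protocol: it enumerates the peptide variants covered by each motif as written above. The function names are hypothetical, and the actual primer nucleotide sequences (Supplementary Table I) are not reproduced or inferred here.

```python
import itertools
import re


def parse_motif(motif):
    """Split a motif such as '[Y/Q]G[E/D][H/Y]WR' into per-position residue choices."""
    positions = []
    # Each match is either a bracketed set of alternatives or a single residue.
    for bracketed, single in re.findall(r"\[([A-Z/]+)\]|([A-Z])", motif):
        positions.append(bracketed.split("/") if bracketed else [single])
    return positions


def expand_motif(motif):
    """Return every peptide sequence covered by the motif."""
    return ["".join(combo) for combo in itertools.product(*parse_motif(motif))]


forward_variants = expand_motif("[Y/Q]G[E/D][H/Y]WR")  # motif for primer 1
reverse_variants = expand_motif("FIPERF")              # motif for primer 2

print(len(forward_variants), forward_variants)
print(len(reverse_variants), reverse_variants)
```

The forward motif has three two-way positions and therefore covers 2 x 2 x 2 = 8 hexapeptides, whereas the reverse motif is a single fixed hexapeptide.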

Plant extract analysis. A. annua leaf (100 to 200 mg fresh weight) was shaken vigorously for 2 hours in 1 mL hexane spiked with 5.8 mM octadecane as an internal standard. The hexanolic extracts were concentrated to 200 μL, and a 1 μL sample was used for GC-MS analysis on a DB-XLB column (0.25 mm i.d. x 0.25 μm x 30 m, J & W Scientific) to determine artemisinin content from 14 plant samples as described2. The GC oven program was 100 °C to 250 °C at 5 °C min⁻¹. The plant hexanolic extracts were derivatized with TMS-diazomethane to determine artemisinic acid content by GC-FID equipped with a DB5 column (n = 8). The GC oven program was 80 °C (hold 2 min), a 20 °C min⁻¹ ramp to 140 °C, then product separation at 5 °C min⁻¹ up to 220 °C. Authentic artemisinin standards were purchased from Sigma-Aldrich (St. Louis, MO).
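As a rough check on run length, the two oven programs can be converted to total run times. This is a minimal sketch assuming that no hold times apply other than the stated 2 min hold, and that ramps are linear; the function name and segment encoding are illustrative only.

```python
def program_minutes(start_c, segments):
    """Total duration of an oven program.

    segments is a list of ("hold", minutes) or ("ramp", target_c, rate_c_per_min).
    """
    temp = start_c
    total = 0.0
    for seg in segments:
        if seg[0] == "hold":
            total += seg[1]
        else:
            _, target, rate = seg
            total += abs(target - temp) / rate  # time needed to reach the target
            temp = target
    return total


# GC-MS program: 100 °C to 250 °C at 5 °C per min
gcms_minutes = program_minutes(100, [("ramp", 250, 5)])

# GC-FID program: 80 °C (hold 2 min), 20 °C per min to 140 °C, 5 °C per min to 220 °C
gcfid_minutes = program_minutes(80, [("hold", 2), ("ramp", 140, 20), ("ramp", 220, 5)])

print(gcms_minutes, gcfid_minutes)
```

Under these assumptions the GC-MS program takes 30 min and the GC-FID program 21 min (2 + 3 + 16 min).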

Synthesis of artemisinic alcohol. Artemisinic acid (100.0 mg, 0.43 mmol) was dissolved in THF (10.0 mL) and LiAlH4 (17.0 mg, 0.45 mmol) was added. The heterogeneous mixture was held at reflux (70 °C) for 15 h. After cooling, the reaction was quenched with water (3.0 mL) and 15% aqueous NaOH (3.0 mL), stirred for 10 min, and filtered through celite. The organic phase was separated, dried over MgSO4, and concentrated using a rotary evaporator. The product was purified by column chromatography (2:1 hexanes/EtOAc) to give 61.0 mg (65% yield) of the alcohol as a colorless oil. A minor amount of artemisinic acid contaminant was then removed by column chromatography over neutral alumina (Brockmann activity 1). Characterization data were consistent with literature values3.

Synthesis of artemisinic aldehyde. Artemisinic alcohol was oxidized to artemisinic aldehyde following a procedure reported in the literature4. Acetone (4.0 mL) was added to a flame-dried 10-mL flask containing RuCl2(PPh3)3 (17.0 mg, 0.018 mmol) and N-methylmorpholine N-oxide (60.0 mg, 0.51 mmol) under an atmosphere of argon. Artemisinic alcohol (55.0 mg, 0.25 mmol) dissolved in acetone (1.0 mL) was then added to the solution via syringe. The mixture was stirred at 23 °C for 2 h and concentrated in vacuo. The crude product was purified by column chromatography (4:1 hexanes/EtOAc) to give 32.0 mg (59% yield) of artemisinic aldehyde as a colorless oil. Characterization data were consistent with the literature report3.
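The reported molar amounts and percent yields for the two syntheses can be cross-checked with a short calculation. The sketch below is an illustration only: the molecular formulas (artemisinic acid C15H22O2, alcohol C15H24O, aldehyde C15H22O) and standard atomic masses are assumptions supplied here, not values stated in the text.

```python
# Standard atomic masses (g/mol); assumed values, not given in the text.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def molar_mass(c, h, o):
    """Molar mass in g/mol for a CcHhOo compound."""
    return c * ATOMIC_MASS["C"] + h * ATOMIC_MASS["H"] + o * ATOMIC_MASS["O"]


def percent_yield(product_mg, product_mw, substrate_mg, substrate_mw):
    """Molar yield of product relative to the limiting substrate, in percent."""
    return 100.0 * (product_mg / product_mw) / (substrate_mg / substrate_mw)


acid_mw = molar_mass(15, 22, 2)      # artemisinic acid (assumed C15H22O2)
alcohol_mw = molar_mass(15, 24, 1)   # artemisinic alcohol (assumed C15H24O)
aldehyde_mw = molar_mass(15, 22, 1)  # artemisinic aldehyde (assumed C15H22O)

# Reduction: 100.0 mg acid gave 61.0 mg alcohol
reduction_yield = percent_yield(61.0, alcohol_mw, 100.0, acid_mw)

# Oxidation: 55.0 mg alcohol gave 32.0 mg aldehyde
oxidation_yield = percent_yield(32.0, aldehyde_mw, 55.0, alcohol_mw)

print(round(100.0 / acid_mw, 2), round(reduction_yield), round(oxidation_yield))
```

With these assumed formulas, the calculation reproduces the stated 0.43 mmol of starting acid and yields that round to 65% and 59%.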

EPY strain generation and characterization

Chemicals. Dodecane and caryophyllene were purchased from Sigma-Aldrich (St. Louis, MO). 5-Fluoroorotic acid (5-FOA) was purchased from Zymo Research (Orange, CA). Complete Supplement Mixtures for formulation of Synthetic Defined (SD) media were purchased from Qbiogene (Irvine, CA). All other media components were purchased from either Sigma-Aldrich or Becton, Dickinson (Franklin Lakes, NJ).

Strains and media. Escherichia coli strains DH10B and DH5α were used for bacterial transformation and plasmid amplification in the construction of the expression plasmids used in this study. The strains were cultivated at 37 °C in Luria-Bertani medium with 100 mg L⁻¹ ampicillin, with the exception of pδ-UB-based plasmids, which were cultivated with 50 mg L⁻¹ ampicillin using DH5α.

Saccharomyces cerevisiae strain BY4742 (ref. 5), a derivative of S288C, was used as the parent strain for all yeast strains. This strain was grown in rich YPD medium6. Engineered yeast strains were grown in SD medium6 with leucine, uracil, histidine, and/or methionine dropped out where appropriate. For induction of genes expressed from the GAL1 promoter, S. cerevisiae strains were grown in 2% galactose as the sole carbon source.

Plasmid construction. To create plasmid pRS425ADS for expression of ADS with the GAL1 promoter, ADS was PCR amplified from pADS (ref. 7) using primer pair 9 and 10 (Supplementary Table I). Using these primers, the nucleotide sequence 5'-AAAACA-3' was cloned immediately upstream of the start codon of ADS. This consensus sequence was used for efficient translation8,9 of ADS and the other galactose-inducible genes used in this study. The amplified product was cleaved with SpeI and HindIII and cloned into SpeI- and HindIII-digested pRS425GAL1 (ref. 10).

For integration of an expression cassette for tHMGR, plasmid pδ-HMGR was constructed. First, SacII restriction sites were introduced into pRS426GAL1 (ref. 10) at the 5' end of the GAL1 promoter and the 3' end of the CYC1 terminator. To achieve this, the promoter-multiple cloning site-terminator cassette of pRS426GAL1 was PCR amplified using primer pair 11 and 12. The amplified product was cloned directly into PvuII-digested pRS426GAL1 to construct vector pRS426-SacII. The catalytic domain of HMG1 was PCR amplified from plasmid pRH127-3 (ref. 11) with primer pair 13 and 14. The amplified product was cleaved with BamHI and SalI and cloned into BamHI- and XhoI-digested pRS426-SacII to create plasmid pRS-HMGR. pRS-HMGR was cleaved with SacII, and the expression cassette fragment was gel extracted and cloned into SacII-digested pδ-UB (ref. 12).

The upc2-1 allele of UPC2 was PCR amplified from plasmid pBD33 (provided by Jasper Rine) using primer pair 15 and 16. The amplified product was cleaved with BamHI and SalI and cloned into BamHI- and XhoI-digested pRS426-SacII to create plasmid pRS-UPC2. For the integration of upc2-1, pδ-UPC2 was created in the same manner as pδ-HMGR, by digesting pRS-UPC2 with SacII and moving the expression cassette fragment to pδ-UB.

To replace the ERG9 promoter with the MET3 promoter, plasmid pRS-ERG9 was constructed. Plasmid pRH973 (provided by Randy Hampton)13 contained a truncated 5' segment of ERG9 placed behind the MET3 promoter. pRH973 was cleaved with ApaI and ClaI, and the resulting fragment was cloned into ApaI- and ClaI-digested pRS403, which has a HIS3 selection marker14.

For expression of ERG20, plasmid pδ-ERG20 was constructed. Plasmid pRS-SacII was first digested with SalI and XhoI, which created compatible cohesive ends. The plasmid was then self-ligated, eliminating the SalI and XhoI sites, to create plasmid pRS-SacII-DX. ERG20 was PCR amplified from the genomic DNA of BY4742 using primer pair 17 and 18. The amplified product was cleaved with SpeI and SmaI and cloned into SpeI- and SmaI-digested pRS-SacII-DX to create plasmid pRS-ERG20. pRS-ERG20 was then cleaved with SacII, and the expression cassette fragment was gel extracted and cloned into SacII-digested pδ-UB.

Yeast transformation and strain construction. S. cerevisiae strain BY4742 (ref. 5), a derivative of S288C, was used as the parent strain for all S. cerevisiae strains. Transformation of all strains of S. cerevisiae was performed by the standard lithium acetate method15. Three to ten colonies from each transformation were screened to select the highest amorphadiene-producing transformant. Strain EPY201 was constructed by transformation of strain BY4742 with plasmid pRS425ADS and selection on SD-LEU plates. Plasmid pδ-HMGR was digested with XhoI before transformation of the DNA into strain EPY201. After initial selection on SD-LEU-URA plates, transformants were cultured and plated on SD-LEU plates containing 1 g L⁻¹ 5-FOA to select for loss of the URA3 marker. The resulting uracil auxotroph, EPY208, was then transformed with XhoI-digested pδ-UPC2 plasmid DNA. After initial selection on SD-LEU-URA plates, transformants were cultured and plated on SD-LEU plates containing 1 g L⁻¹ 5-FOA for the construction of EPY210. Plasmid pRS-ERG9 was cleaved with HindII for integration of the PMET3-ERG9 fusion at the ERG9 locus of EPY208 and of EPY210, giving EPY213 and EPY225, respectively. These strains were selected on SD-LEU-HIS-MET plates. EPY213 was then transformed with XhoI-digested pδ-HMGR plasmid DNA. After initial selection on SD-LEU-URA-HIS-MET plates, transformants were cultured and plated on SD-LEU-HIS-MET plates containing 1 g L⁻¹ 5-FOA for the construction of EPY219. EPY219 was transformed with XhoI-digested pδ-ERG20 plasmid DNA. After initial selection on SD-LEU-URA-HIS-MET plates, transformants were cultured and plated on SD-LEU-HIS-MET plates containing 1 g L⁻¹ 5-FOA for the construction of EPY224.

Integration of pRS-ERG9 was verified by PCR analysis using two sets of primers. Each set contained one oligonucleotide that binds the inserted DNA and one that binds the genomic DNA surrounding the insertion. All other integrations were verified for full-length insertion using one primer binding the 5' end of the GAL1 promoter and one binding the 3' end of the fused gene.
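Because the EPY strains are built stepwise from one another, the construction scheme described above can be summarized as a parent map. The Python sketch below is only an illustrative bookkeeping aid; the dictionary layout and function names are hypothetical, while the strain names, parents, and plasmids are taken from the text.

```python
# Each strain maps to (parent strain, plasmid introduced in that step).
STRAINS = {
    "EPY201": ("BY4742", "pRS425ADS"),
    "EPY208": ("EPY201", "pδ-HMGR"),
    "EPY210": ("EPY208", "pδ-UPC2"),
    "EPY213": ("EPY208", "pRS-ERG9"),
    "EPY225": ("EPY210", "pRS-ERG9"),
    "EPY219": ("EPY213", "pδ-HMGR"),
    "EPY224": ("EPY219", "pδ-ERG20"),
}


def lineage(strain):
    """Return the chain of strains from the parent BY4742 to the given strain."""
    chain = [strain]
    while chain[-1] in STRAINS:
        chain.append(STRAINS[chain[-1]][0])
    return chain[::-1]


def modifications(strain):
    """Return the plasmids introduced, in order, to build the given strain."""
    mods = []
    while strain in STRAINS:
        parent, plasmid = STRAINS[strain]
        mods.append(plasmid)
        strain = parent
    return mods[::-1]


print(lineage("EPY224"))
print(modifications("EPY224"))
print(modifications("EPY225"))
```

Tracing EPY224 this way shows that it carries two sequential integrations of the tHMGR cassette (at the EPY208 and EPY219 steps) in addition to the PMET3-ERG9 fusion and the ERG20 cassette, whereas EPY225 carries upc2-1 with a single tHMGR integration.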

Yeast cultivation. All optical density measurements at 600 nm (OD600) were taken using a Beckman DU-640 spectrophotometer. To measure amorphadiene production, culture tubes containing 5 mL of SD (2% galactose) medium (with appropriate amino acid omissions as described above) were inoculated with the strains of interest. These inocula were grown at 30 °C to an OD600 between 1 and 2. Unbaffled culture flasks (250 mL) containing 50 mL SD medium were inoculated to an OD600 of 0.05 with these seed cultures. Amorphadiene production was measured after 6 days of growth. Methionine (1 mM) was present in each culture for repression of the PMET3-ERG9 fusion at the ERG9 locus. All flasks also contained 5 mL dodecane. This dodecane layer was sampled and diluted in ethyl acetate for determination of amorphadiene production by GC-MS.
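The seed volume needed to reach the starting OD600 follows from a simple dilution relationship. The sketch below is a minimal illustration that assumes OD600 scales linearly with dilution and that the added seed volume is small relative to the 50 mL flask culture; the function name is hypothetical.

```python
def seed_volume_ml(seed_od, target_od, culture_ml):
    """Approximate seed culture volume needed to reach target_od in culture_ml of medium."""
    return target_od * culture_ml / seed_od


# Seed cultures were grown to an OD600 between 1 and 2; flasks held 50 mL at OD600 0.05.
volume_at_od1 = seed_volume_ml(1.0, 0.05, 50.0)
volume_at_od2 = seed_volume_ml(2.0, 0.05, 50.0)

# The dodecane overlay (5 mL) relative to the 50 mL of medium.
dodecane_fraction = 5.0 / 50.0

print(volume_at_od1, volume_at_od2, dodecane_fraction)
```

Under these assumptions, roughly 1.25 to 2.5 mL of seed culture would be needed per flask, and the dodecane overlay corresponds to 10% of the medium volume.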

Supplementary Discussion

Process considerations.

With the development of an industrial strain and an optimized fermentation and purification process, we project yields in excess of 25 g L⁻¹ artemisinic acid, a level that is still well below yields for other high-value commodity chemicals produced by fermentation16-18. Given production levels of artemisinic acid at 25 g L⁻¹ in fermentation, published yields for chemical transformations (conversion of artemisinic acid to artemisinin or various derivatives)19,20, and production levels in excess of 100 tons annually, we project that artemisinin or its derivatives could be produced at costs significantly below current prices, thereby substantially lowering the cost of an artemisinin combination therapy.

Supplementary References

1. Durst, F. & Nelson, D. R. Diversity and evolution of plant P450 and P450-reductases. Drug Metab. Drug Interact. 12, 189-206 (1995).

2. Woerdenbag, H. J., Pras, N., Bos, R., Visser, J. F., Hendriks, H. & Malingre, T. M. Analysis of artemisinin and related sesquiterpenoids from Artemisia annua L. by combined gas chromatography/mass spectrometry. Phytochem. Anal. 2, 215-219 (1991).

3. Bertea, C. M. et al. Identification of intermediates and enzymes involved in the early steps of artemisinin biosynthesis in Artemisia annua. Planta Med. 71, 40-47 (2005).

4. Sharpless, K. B., Akashi, K. & Oshima, K. Ruthenium catalyzed oxidation of alcohols to aldehydes and ketones by amine-N-oxides. Tetrahedron Lett. 17, 2503-2506 (1976).

5. Brachmann, C. B., Davis, A., Cost, G. J., Caputo, E., Li, J., Hieter, P. & Boeke, J. D. Designer deletion strains derived from Saccharomyces cerevisiae S288C: a useful set of strains and plasmids for PCR-mediated gene disruption and other applications. Yeast 14, 115-132 (1998).

6. Burke, D., Dawson, D. & Stearns, T. Methods in yeast genetics: a Cold Spring Harbor laboratory course manual (Cold Spring Harbor Laboratory Press, Plainview, NY, 2000).

7. Martin, V. J., Pitera, D. J., Withers, S. T., Newman, J. D. & Keasling, J. D. Engineering a mevalonate pathway in Escherichia coli for production of terpenoids. Nat. Biotechnol. 21, 796-802 (2003).

8. Looman, A. C. & Kuivenhoven, J. A. Influence of the 3 nucleotides upstream of the initiation codon on expression of the Escherichia coli lacZ gene in Saccharomyces cerevisiae. Nucleic Acids Res. 21, 4268-4271 (1993).