Supplementalmethods File

SupplementalMethods File

Zager and Lange (2018) Assessing Flux Distribution Associated with Metabolic Specialization in Glandular Trichomes. Trends in Plant Sciences.

Reconstructing a Consensus Base Model

The development of a well curated model of metabolism requires the integration of large amounts of data from many different sources. The recently developed YASMEnv toolbox [1] ( an open-source modeling environment, was adapted for the reconstruction of glandular trichome (GT) metabolism. We first created a “Consensus Base Model” consisting of 674 stoichiometrically balanced metabolic reactions, largely based on the set of metabolic reactions within the Arabidopsis Core Model, to cover the reactions pertaining to primary metabolism of a leaf mesophyll cell of Arabidopsis thaliana, a well-characterized model plant [2]. We assumed that these reactions would be common to all species. To ensure the consistency of nomenclature, names of metabolites were then exchanged, if necessary, with the corresponding name of the MetaCyc Metabolic Pathway Database ( [3]. A YASMEnv script was used to extract locus identifiers for transcripts that support the reactions in the Arabidopsis Core Model. Peptide sequences corresponding to each locus were obtained by bulk download from The Arabidopsis Information Resource (TAIR) ( [4].

Deciding about Species to Include in Glandular Trichome Metabolism Meta-Analysis

The process to develop species-specific GT models from the Consensus Base Model involved the assessment of relevant transcriptome data sets. This approach assumes that the transcriptome reflects the specialization of GTs and can therefore be employed to infer which pathways to include or exclude in any given model. The Sequence Read Archive (SRA) at the National Center for Biotechnology Information ( houses raw sequencing data and alignment information from high-throughput sequencing platforms, including next-generation transcriptome sequencing (RNA-Seq). Currently, the Illumina platform is the most widely used RNA-Seq technology for both its depth/coverage and comparatively low cost. To enable platform-independent comparisons, we therefore decided to restrict our Meta-Analysis to Illumina data sets (criterion 1). Keywords searches were then employed to findGT-specific Illumina RNA-Seq data sets within the SRA database. Data sets for metabolically inactive trichomes (based on the available literature) were not considered further (criterion 2). Finally, if RNA-Seq data sets had been deposited for closely related species with (based on the available literature) essentially identical metabolism, a representative species was selected for further consideration (criterion 3). The RNA-Seq data sets for the selected species (Camptotheca acuminata, Humulus lupulus, Hyoscyamus niger, Mentha longifolia, Mentha piperita, Salpiglossis sinuata, Salvia divinorum, Salvia pomifera, Solanum nigrum, and Xanthium strumarium) were then subjected to the Trinity pipeline forde novo transcriptome assemblies (Inchworm, Chrysalis and Butterfly modules), transcript abundance quantitation(RSEM algorithm), and functional annotation (Trinotate suite) [5] (Scheme 1).

Scheme 1. Data and resources for genome-scale model development.

Species / NCBI Short Read Archive Accession # / Reactions in Network / Biomass Objective Function (concentrations as mg/g) / References
Salpiglossis sinuata / SRR2229603
SRR2229604 / 681 / Flavonoids: 0.242 (Genkwanin), 0.242 (7,3'-O-dimethyl Luteolin), 0.242 (3,7,3'-O-Trimethyl-quercetrin), 0.242 (Luteolin), 0.242 (3,7-O-Dimethyl-quercetin), 0.242 (3-O-Methyl-quercetin) / [8-15]
Acyl sugars: 6.0 (S4:17-2,5,5,5)
Other: 3.6388 (Cellulose), 0.0425 (Nucleic acids),
7.7997 Lipid, 29.1109 (Protein; amino acid composition of isocitrate dehydrogenase)
Solanum quitoense / SRR2229596
SRR2229599 / 667 / Acyl sugars: 6.0 (S4:17-2,5,5,5)
Other: 3.6388 (Cellulose), 0.0425 (Nucleic acids),
7.7997 Lipid, 29.1109 (Protein; amino acid composition of caffeoyl-CoA O-methyltransferase) / [9-15]
Xanthium strumarium / SRR1942914
SRR1928165
SRR1942944 / 667 / Sesquiterpene lactones: 11.0 (Costunolide)
Other: 3.6388 (Cellulose), 0.0425 (Nucleic acids),
7.7997 Lipid, 29.1109 (Protein; amino acid composition of germacrene A oxidase) / [11-17]
Camptotheca acuminata / SRR173047 / 677 / Monoterpenoid indole alkaloid: 1.045 (Camptothecin) / [11-15;
18-20]
Other: 3.6388 (Cellulose), 0.0425 (Nucleic acids),
7.7997 Lipid, 29.1109 (Protein; amino acid composition of RUBISCO, small subunit)
Humulus luplulus / SRR575191
SRR575197
SRR575193
SRR575195
SRR575189 / 687 / Polyketides: 66.51 (Humulone), 15.15 (Adhumulone), 27.57 (Cohumulone), 36.33 (Lupulone), 36.68 (Colupulone)
Prenylated flavonoids: 7.53 (Xanthohumol),
1.70 (Desmethylxanthohumol)
Other: 3.6388 (Cellulose), 0.0425 (Nucleic acids),
7.7997 Lipid, 29.1109 (Protein; amino acid composition of phloroisovalerophenone synthase) / [11-15;
21,22]
Salvia pomifera / SRR2134916
SRR2134915
SRR2136651 / 667 / Terpenoids: 6.0 (Carnosic acid), 20.0 (monoterpenes),
3.2 (sesquiterpenes)
Other: 3.6388 (Cellulose), 0.0425 (Nucleic acids),
7.7997 Lipid, 29.1109 (Protein; amino acid composition of stearoyl-ACP-9-desaturase) / [11-15;
23,24]
Salvia divinorum / SRR3716680 / 663 / Diterpenes: 0.6 (Clerodienyl diphosphate) / [11-15;
25,26]
Other: 3.6388 (Cellulose), 0.0425 (Nucleic acids),
7.7997 Lipid, 29.1109 (Protein; amino acid composition of 1-deoxy-D-xylulose-5-phosphate synthase)
Mentha x piperita / SRR1271557
SRR1271558 / 676 / Monoterpenes and Sesquiterpenes: 10.9 (Pinene),
11.3 (Sabinene), 13.8 (Limonene), 16.3 (Menthofuran),
20.5 (Sabinene hydrate), 21.7 (Isomenthone),
232.8 (Menthone), 35.8 (Menthol), 8.0 (Pulegone),
93.0 (1,8-Cineole), 20.0 (Sesquiterpenes) / [11-15;
27]
Polymethoxylated Flavones: 4.0
Other: 3.6388 (Cellulose), 0.0425 (Nucleic acids),
7.7997 Lipid, 29.1109 (Protein; amino acid composition of CYP1B32)
Arabidopsis thaliana / SRR5865480 SRR5865479 / 681 / Glucosinolates: 0.163 Glucoiberin, 0.0213 Glucobrassicin
Other: 3.6388 (Cellulose), 0.0425 (Nucleic acids), 7.7997 Lipid, 29.1109 (Protein; amino acid composition of RUBISCO, small subunit), 2.9486 (Starch) / [11-15;
28]

Associating Reactions Represented in Species-Specific Models withGenes Expressed in GTs

As part of the functional annotation process with Trinotate, E.C. numbers are assigned to gene products with enzymatic functions. Since reactions in both the Arabidopsis Core Model and MetaCyc are associated with E.C. numbers, we used a YASMEnv script to map transcript identifier(s) to each of the metabolic reactions. It is important to note that each species selected for model development accumulates a unique set of specialized metabolites in GTs. To capture the reactions leading to such metabolites, the model for each species was manually amended based on information available in the literature and the MetaCyc database. Transcripts were associated withthese reactions as described above. We then mapped the cumulative sum of expression values (Transcripts Per kilobase Million, TPM) to each of the reactions within each model. In instances where a pathway has not yet been fully elucidated (e.g., salvinorin A in S. divinorum, diterpenoids in S. pomifera, and xanthosides in X. strumarium), we assumed that the pathway terminates with thelast known reaction (e.g., kolavenyl diphosphatesynthase for the salvinorin A pathway in S. divinorum, ferruginol synthase for the diterpenoid pathway inS. pomifera, and costunolide synthase for the xanthoside pathway in X. strumarium). The experimentally determined concentrations of end products were added up and assumed to be represented by the intermediate that is formed by the last known enzyme. For example, for the Salvia divinorum model, the concentration of salvinorin A (only one major end product in this case) was transferred to kolavenyl diphosphate. We chose this approach to avoid making assumptions about reaction stoichiometries that have not been confirmed experimentally, as such assumptions alter the metabolic demand of the objective function. Reactions of the Arabidopsis Core Model (and therefore those of our Consensus Base Model) are also associated with subcellular localization information for each reaction. For reactions imported from MetaCyc, which does not feature information on compartmentation, a series of online tools was employed to predict subcellular localization based on sequence properties [6]. In summary, the model for each species is based uponstoichiometrically balanced metabolic and transport reactions(including subcellular metabolite exchange) that are active in GT cells.

Generating Species-Specific Biomass Objective Functions

The solution space in underdetermined constraint-based model with hundreds of reactions is exceedingly large. It is therefore necessary to define an objective function to compute an optimal network state and the resulting flux distribution. In models for microbial cells, this is generally satisfied with a biomass objective function (the assumption is that metabolism supports optimal growth under defined conditions). In the case of GTs, we assume that the production of specific specialized metabolites is the ‘objective’ of the metabolic network. The objective function of each species was formulated based on published measurements of such metabolites. Scheme 1 provides an overview of the parameters used for the species-specific objective functions.

Predicting Flux Distribution Based on the Flux Minimization Principle

To predict how flux is distributed across reactions and pathways,while integrating transcript abundance and meeting the biomass objective function, we employed the ‘expression-guided flux minimization’ (E-Fmin) approach [7]. The optimization focuses on minimizing the weighted sum of flux magnitudes:

Subject to

Where ri is the ithflux, its weight wi is a function of gene expression level, S is the stoichiometric matrix, r is the vector of n fluxes, rL and rUare vectors of upper and lower limits of r. RMF stands for Required Metabolic Functionalities, and  is a minimum input threshold (in our case, this corresponds to an imported carbon source, usually an oligosaccharide such as sucrose or raffinose).

Allowed import metabolites are: oligosaccharide (carbon source), ammonia, nitrate, oxygen, hydrogen sulfide, sulfate, inorganic phosphate, water, and carbon dioxide.

Allowed biomass outputs are: metabolite concentrations within the species-specific biomass objective functions.

To simulate the various heterotrophic/autotrophic scenarios, we first set photon import equal to zero to obtain the maximum value of imported oligosaccharide required to meet the biomass objective (S100). We then used the inverse strategy to determine the maximum value of photon import required to meet the biomass objective (P100). We then used S100 and P100 values as reference values so that S75 = 0.75 * S100, S50 = 0.50 * S100, etc. The same was done for P75, P50 etc.

Supplemental References

[1]Johnson, S.R. et al. (2017) Bioenergetics of monoterpenoid essential oil biosynthesis in non-photosynthetic glandular trichomes. Plant Physiol. 175, 681-695

[2]Arnold, A. and Nikoloski, Z. (2014) Bottom-up metabolic reconstruction of Arabidopsis and its application to determining the metabolic costs of enzyme production. Plant Physiol. 165, 1380–1391

[3]Caspi, R. et al. (2014) The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res. 42: D459–D471

[4]Haas, B.J. et al. (2013) De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512

[5]Lamesch, P. et al. (2012). The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res. 40, D1202–D1210

[6]Emanuelsson, O. et al. (2000). Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J. Mol. Biol. 300, 1005–1016

[7]Song,H.S. et al. (2014) Prediction of metabolic flux distribution from gene expression data based on the flux minimization principle. PLoS ONE9, e112524

[8]Wollenweber, E. et al. (2005) Chemodiversity of surface flavonoids in Solanaceae. Z. Naturforsch. 60c, 661–670

[9]Moghe, G.D. et al. (2017) Evolutionary routes to biochemical innovation revealed by integrative analysis of a plant-defense related specialized metabolic pathway. eLife 6, e28468

[10]Kang, J.H. et al. (2014) The flavonoid biosynthetic enzyme chalcone isomerase modulates terpenoid production in glandular trcihomes of tomato. Plant Physiol. 164, 1161-1174

[11]DeBolt, S. (2009) Mutations in UDP-glucose:sterol glucosyltransferase in Arabidopsis cause transparent testa phenotype and suberization defect in seeds. Plant Physiol. 151, 78–87

[12]Sharrock, R.A. and Clack, T. (2002) Patterns of expression and normalized levels of the five Arabidopsis phytochromes. Plant Physiol. 130, 442–456

[13]Dörmann P, et al. (1995) Isolation and characterization of an Arabidopsis mutant deficient in the thylakoid lipid digalactosyl diacylglycerol. Plant Cell 7, 1801–1810

[14]Mooney, B.P. (2006) Using quantitative proteomics of Arabidopsis roots and leaves to predict metabolic activity. Physiol. Plant. 128, 237–250

[15]Sulpice R, et al. (2013) Impact of the carbon and nitrogen supply on relationships and connectivity between metabolism and biomass in a broad panel of Arabidopsis accessions. Plant Physiol. 162, 347–363

[16]Chen, F. et al. (2013) Identifying three ecological chemotypes of Xanthium strumarium glandular trichomes using a combined NMR and LC-MS method. PLoS ONE 8, e76621

[17]Li, Y. et al. (2016) Comparative transcriptome analysis identifies putative genes involved in the biosynthesis of xanthanolides in Xanthium strumarium L. Front. Plant Sci. 7, 1317

[18]Valetta, A. et al. (2013) Trichomes in Camptotheca acuminata Decaisne (Nyssaceae): Morphology, distribution, structure, and secretion. Plant Biosys. 147, 548–556

[19]Góngora-Castillo, E. et al. (2012) Development of transcriptomic resources for interrogating the biosynthesis of monoterpene indole alkaloids in medicinal plant species. PLoS One 7, e52506

[20]Sadre, R. et al. (2016) Metabolite diversity in alkaloid biosynthesis: A multilane (diastereomer) highway for camptothecin synthesis in Camptotheca acuminata. Plant Cell 28, 1926–1944

[21]Kavalier, A.R., et al. (2011) Phytochemical and morphological characterization of hop (Humulus lupulus L.) cones over five developmental stages using high performance liquid chromatography coupled to time-of-flight mass spectrometry, ultrahigh performance liquid chromatography photodiode. J. Agric. Food Chem. 59, 4783–4793

[22]Clark, S.M. et al. (2013) Transcriptome analysis of bitter acid biosynthesis and precursor pathways in hop (Humulus lupulus). BMC Plant Biol. 13, 12

[23]Pitarokili, D. et al. (1999) Composition and antifungal activity of the essential oil of Salvia pomifera subsp. calycina growing wild in Greece. J. Essential Oil Res. 11, 655–659

[24]Trikka, F.A. et al. (2015) Combined metabolome and transcriptome profiling provides new insights into diterpene biosynthesis in S. pomifera glandular trichomes. BMC Genomics 16, 935

[25]Siebert, D.J. (2004) Localization of salvinorin A and related compounds in glandular trichomes of the psychoactive sage, Salvia divinorum. Ann. Bot. 93, 763–771

[26]Pelot, K.A. et al. (2017) Biosynthesis of the psychotropic plant diterpene salvinorin A: Discovery and characterization of the Salvia divinorum clerodienyl diphosphate synthase. Plant J. 89, 885–897

[27]Ahkami, A. et al. (2015) Multiple levels of regulation determine monoterpenoid essential oil compositional variation in the mint family. Mol. Plant 8, 188–191

[28]Brown, P.D. et al. (2003) Variation of glucosinolate accumulation among different organs and developmental stages of Arabidopsis thaliana. Phytochemistry 62, 471–481