TITLE:

Integration of metabolomic and transcriptomic networks in pregnant women reveals biological pathways and predictive signatures associated with preeclampsia

AUTHORS:

Rachel S. Kelly,1 Damien C. Croteau-Chonka,1 Amber Dahlin,1 Hooman Mirzakhani,1 Ann C. Wu,1-3 Emily S. Wan,1 Michael J. McGeachie,1 Weiliang Qiu,1 Joanne E. Sordillo,1 Amal Al-Garawi,1 Kathryn J. Gray,4 Thomas F. McElrath,4 Vincent J. Carey,1 Clary B. Clish,5 Augusto A. Litonjua,1 Scott T. Weiss,1 and Jessica A. Lasky-Su,1

AFFILIATIONS:

1.  Channing Department of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA

2.  Children’s Hospital and Harvard Medical School, Boston, MA 02115, USA

3.  Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, MA 02215, USA

4.  Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Brigham and Women’s Hospital, Boston, MA 02115, USA

5.  Metabolomics Platform, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA 02141, USA

Corresponding Author

Rachel S. Kelly PhD.

Channing Division of Network Medicine, Department of Medicine

Brigham and Women’s Hospital and Harvard Medical School

Boston, MA 02115

Tel: 617-525-0065

Email:

Abbreviations

AUC – Area Under the Curve

FDR – False Discovery Rate

HILIC – Hydrophilic Interaction Liquid Chromatography

LCMS- Liquid Chromatography Mass Spectroscopy

UHPLC – Ultra High Performance Liquid Chromatography

m/z - mass-to-charge ratio; RT – retention time; OR – odds ratio; CI – confidence interval

Online Resource 1: Supplemental Methods

Metabolite Profiling

Plasma samples were shipped from the sample repository to the metabolomics lab on dry ice. Samples were thawed on ice for sub-aliquoting for each of the metabolomics methods and then re-frozen on dry ice and stored at -80C until work up for LC-MS analyses. Metabolite profiles were measured using four distinct liquid chromatography tandem mass spectrometry (LC-MS) methods designed to measure complementary sets of metabolites: polar metabolites measured in the positive ion mode (HILIC-positive), polar metabolites measured in the negative ion mode (HILIC-negative), metabolites of intermediate polarity (e.g. free fatty acids and bile acids; C18-negative), and lipids (C8-positive). Briefly, hydrophilic interaction liquid chromatography (HILIC) analyses of water soluble metabolites in the positive ionization mode were conducted using an LC-MS system comprised of a Shimadzu Nexera X2 U-HPLC (Shimadzu Corp.; Marlborough, MA) coupled to a Q Exactive hybrid quadrupole orbitrap mass spectrometer (Thermo Fisher Scientific; Waltham, MA). Plasma samples (10 µL) were extracted using 90 µL of 74.9:24.9:0.2 v/v/v acetonitrile/methanol/formic acid containing stable isotope-labeled internal standards (valine-d8, Sigma-Aldrich; St. Louis, MO; and phenylalanine-d8, Cambridge Isotope Laboratories; Andover, MA). The samples were centrifuged (10 min, 9,000 x g, 4°C), and the supernatants were injected directly onto a 150 x 2 mm, 3 µm Atlantis HILIC column (Waters; Milford, MA). The column was eluted isocratically at a flow rate of 250 µL/min with 5% mobile phase A (10 mM ammonium formate and 0.1% formic acid in water) for 0.5 minute followed by a linear gradient to 40% mobile phase B (acetonitrile with 0.1% formic acid) over 10 minutes. MS analyses were carried out using electrospray ionization in the positive ion mode using full scan analysis over 70-800 m/z at 70,000 resolution and 3 Hz data acquisition rate. Other MS settings were: spray voltage 3.5 kV, capillary temperature 350°C, and heater temperature 300°C. HILIC analyses of water soluble metabolites in the negative ionization mode were conducted using an LC-MS system comprised of an AQUITY UPLC system (Waters; Milford, MA and a 5500 QTRAP mass spectrometer (SCIEX; Framingham, MA). Plasma samples (30 µL) were extracted using 120 µL of 80% methanol containing inosine-15N4, thymine-d4 and glycocholate-d4 internal standards (Cambridge Isotope Laboratories; Andover, MA). The samples were centrifuged (10 min, 9,000 x g, 4°C), and the supernatants were injected directly onto a 150 x 2.0 mm Luna NH2 column (Phenomenex; Torrance, CA). The column was eluted at a flow rate of 400 µL/min with initial conditions of 10% mobile phase A (20 mM ammonium acetate and 20 mM ammonium hydroxide in water) and 90% mobile phase B (10 mM ammonium hydroxide in 75:25 v/v acetonitrile/methanol) followed by a 10 min linear gradient to 100% mobile phase A. MS analyses were carried out using electrospray ionization and selective multiple reaction monitoring scans in the negative ion mode. To create the method, declustering potentials and collision energies were optimized for each metabolite by infusion of reference standards. The ion spray voltage was -4.5 kV and the source temperature was 500°C. Reversed-phase C18 chromatography/negative ion mode MS analyses of free fatty acids and bile acids were conducted using an LC-MS system comprised of a Shimadzu Nexera X2 U-HPLC (Shimadzu Corp.; Marlborough, MA) coupled to a Q Exactive hybrid quadrupole orbitrap mass spectrometer (Thermo Fisher Scientific; Waltham, MA). Plasma samples (30 µL) were extracted using 90 uL of methanol containing PGE2-d4 (Cayman Chemical Co.; Ann Arbor, MI) and centrifuged (10 min, 9,000 x g, 4°C). The samples were injected onto a 150 x 2 mm ACQUITY T3 column (Waters; Milford, MA). The column was eluted isocratically at a flow rate of 400 µL/min with 60% mobile phase A (0.1% formic acid in water) for 4 minutes followed by a linear gradient to 100% mobile phase B (acetonitrile with 0.1% formic acid) over 8 minutes. MS analyses were carried out in the negative ion mode using electrospray ionization, full scan MS acquisition over 200-550 m/z, and a resolution setting of 70,000. Metabolite identities were confirmed using authentic reference standards. Other MS settings were: spray voltage -3.5 kV, capillary temperature 320°C, and heater temperature 300°C. C8 chromatography analyses of polar and non-polar plasma lipids were conducted using an LC-MS system conducted using an LC-MS system comprised of a Shimadzu Nexera X2 U-HPLC (Shimadzu Corp.; Marlborough, MA) coupled to a Exactive Plus orbitrap mass spectrometer (Thermo Fisher Scientific; Waltham, MA). Plasma samples (10 µL) were extracted using 190 µL of isopropanol containing 1,2-didodecanoyl-sn-glycero-3-phosphocholine (Avanti Polar Lipids; Alabaster, AL). After centrifugation, supernatants were injected directly onto a 100 x 2.1 mm, 1.7 µm ACQUITY BEH C8 column (Waters; Milford, MA). The column was eluted isocratically with 80% mobile phase A (95:5:0.1 vol/vol/vol 10mM ammonium acetate/methanol/formic acid) for 1 minute followed by a linear gradient to 80% mobile-phase B (99.9:0.1 vol/vol methanol/formic acid) over 2 minutes, a linear gradient to 100% mobile phase B over 7 minutes, then 3 minutes at 100% mobile-phase B. MS analyses were carried out using electrospray ionization in the positive ion mode using full scan analysis over 200–1000 m/z at 70,000 resolution and 3 Hz data acquisition rate. Other MS settings were: spray voltage 3 kV, capillary temperature 300°C, and heater temperature 300°C. Lipid identities were determined based on comparison to reference plasma extracts and were denoted by total number of carbons in the lipid acyl chain(s) and total number of double bonds in the lipid acyl chain(s).

To account for potential batch effect a cassette of two pooled plasma samples was run at intervals of 20 study samples. Each pooled plasma sample cassette included a pooled sample prepared from the study samples for data standardization using a "nearest neighbor" approach as well as a pooled plasma sample for determination of data quality and precision both before and after data standardization.

Raw data from Q Exactive/Exactive Plus MS systems were processed using TraceFinder software (Thermo Fisher Scientific; Waltham, MA) and Progenesis QI (Nonlinear Dynamics; Newcastle upon Tyne, UK). Raw data collected using the HILIC-negative method were processed using MultiQuant 2.1 software (SCIEX; Framingham, MA). Compounds were identified by their exact mass and by matching their retention times to authentic reference standards. In many cases, isomeric compounds were analyzed and in cases where the compound could not be resolved by chromatography, a general name for the compound is reported (e.g. pentose phosphate for ribulose 5-phosphate/ribose 5-phosphate)

Identification of Transcriptomic Signature

Tests for differential expression adjusted for expression heterogeneity and were performed using the Bioconductor package "RankProd". Differentially expressed genes were identified through permutation analysis, after setting the percentage of false prediction threshold for which the FDR is less than 0.05 (Mirzakhani et al., 2016)

Weighted correlation network analysis

A network based approach, weighted correlation network analysis (WGCNA) (Langfelder and Horvath, 2008), was applied to identify modules of co-regulated genes and metabolites that can inform on the biological processes involved in preeclampsia development at both the genomic and metabolomic level.

A dynamic clustering approach was utilized, and a minimum module size of 30 was specified. As the dynamic tree-cut approach may identify a number of highly correlated or nested modules, such modules were then merged using a deep split of 0.8. Initial WGCNA detected five gene-metabolite modules, and module membership ranged from 53 to 566. Genes and metabolite features that could not be assigned to any of these co-regulated modules (n=7) were excluded (grey module); after merging there were five modules. As the genes and features of these modules are highly co-expressed/co-regulated, merging provides a more parsimonious and biologically meaningful analysis and reduces the multiple testing burden. Merging resulted in five modules. The cluster dendrogram before and after the merge is shown in Supplemental Figure S3.

Each module was summarized by an eigenvalue based on its first principal component. Modules significantly associated with preeclampsia were identified and the genes and metabolites within these modules investigated for biological relevance.

Online Resource 2: Supplemental Tables

Supplemental Table S1: Summary of the metabolomic profiling analysis of the early pregnancy gestation samples from four platforms

C8-positive platform / C18-negative platform / HILIC-negative platform / HILIC-positive platform
(n=2821) / (n=2716) / (n=61) / (n=2501)
Logistic Regression modela
n (%) significant features
p<0.05 / 215 (7.6%) / 140 (5.2%) / 1 (1.6%) / 128 (5.1%)
p<0.01 / 40 (1.4%) / 17 (0.6%) / 0 / 15 (0.6%)
p<0.001 / 0 / 1 (0.04%) / 0 / 1 (0.04%)
ROC Analysisb
AUC Model 1 (95% CI) / 0.573 (0.463, 0.682) / 0.573 (0.463, 0.682) / - / 0.573 (0.463, 0.682)
AUC Model 2 (95% CI) / 0.731 (0.638, 0.838) ¥ / 0.731 (0.638, 0.838) ¥ / - / 0.731 (0.638, 0.838) ¥
AUC Model 3 (95% CI) / 0.736 (0.638, 0.834) ¥ / 0.798 (0.713, 0.882) ¥ / - / 0.750 (0.659,0.842) ¥
AUC Model 4 (95% CI) / 0.786 (0.691, 0.881) ¥ / 0.800 (0.710, 0.891) ¥ / - / 0.803 (0.711-0.895) ¥

a Logistic regression model adjusting for mother’s age, race, site and gestational age (wks)

b Model1– mother’s age, race, site, and gestational age

Model2 – model1+ serum vitamin D at baseline (<30ng/ml or ≥30ng/ml)+ pre-pregnancy BMI

Model3 – model1+ Principal component1

Model4 – model1+ serum vitamin D at baseline (<30ng/ml or ≥30ng/ml) + pre-pregnancy BMI+ Principal component1

¥ Significantly (p<0.05) different to model 1

∞ Significantly (p<0.05) different to model 2

Ω Significantly (p<0.05) different to model 3

ROC; Receiver Operator Characteristic; AUC – Area Under the Curve; CI – Confidence Interval

Supplemental Table S2: Top 72 metabolite features associated with preeclampsia in the first trimester blood samples (p<0.01)

Platform / Metabolite / m/z / rt / ORa (95% CI) / p-value / ORb (95% CI) / p-value
HILIC-pos / Cohibin A / 549.49 / 5.03 / 4.52 (1.97,11.33) / 6.7E-04 / 3.95 (1.52,11.46) / 7.2E-03
C18-neg / 16-alpha-Hydroxypregnenolone / 376.22 / 9.39 / 11.89 (2.98,57.54) / 9.5E-04 / 6.42 (1.4,35.75) / 2.3E-02
C8-pos / 835.1001m/z_8.99 / 835.10 / 8.99 / 3.53 (1.73,7.93) / 1.1E-03 / 3.13 (1.37,8.18) / 1.1E-02
C8-pos / 824.1093m/z_8.99 / 824.11 / 8.99 / 3.55 (1.73,8.04) / 1.1E-03 / 3.46 (1.48,9.26) / 7.4E-03
C8-pos / 425.7859m/z_8.99 / 425.79 / 8.99 / 7.67 (2.38,28.93) / 1.2E-03 / 5.98 (1.55,27.14) / 1.3E-02
C8-pos / 964.5572m/z_8.99 / 964.56 / 8.99 / 5.65 (2.1,17.64) / 1.3E-03 / 4.51 (1.44,16.76) / 1.5E-02
C8-pos / C38:3 PC / 850.57 / 8.99 / 4.36 (1.87,11.47) / 1.3E-03 / 3.8 (1.44,11.49) / 1.1E-02
C8-pos / 970.5742m/z_8.99 / 970.57 / 8.99 / 5.38 (2.02,16.12) / 1.4E-03 / 4.32 (1.35,15.82) / 1.8E-02
C8-pos / 385.7541m/z_8.22 / 385.75 / 8.22 / 3.57 (1.72,8.4) / 1.5E-03 / 4.37 (1.79,12.67) / 2.9E-03
C8-pos / Ins-1-P-Cer(d18:1/22:0) / 902.59 / 8.99 / 7.76 (2.31,30.44) / 1.7E-03 / 6.01 (1.44,29.89) / 1.9E-02
C8-pos / C38:3 PC / 812.62 / 8.99 / 6.25 (2.11,21.22) / 1.7E-03 / 4.82 (1.33,20.22) / 2.2E-02
C18-neg / 181.0490m/z_1.00 / 181.05 / 1 / 7.15 (2.24,26.76) / 1.8E-03 / 5.7 (1.58,25.02) / 1.3E-02
C8-pos / 896.5696m/z_8.99 / 896.57 / 8.99 / 7.12 (2.23,26.62) / 1.8E-03 / 5.5 (1.42,25.52) / 1.9E-02
C18-neg / 467.3127m/z_14.53 / 467.31 / 14.53 / 6.24 (2.04,21.63) / 2.2E-03 / 5.01 (1.38,20.72) / 1.9E-02
C18-neg / 13'-Carboxy-gamma-tocopherol / 445.33 / 14.53 / 5.75 (1.98,18.87) / 2.2E-03 / 5.18 (1.49,20.8) / 1.3E-02
C8-pos / C40:0 PC plasmalogen / 870.67 / 8.99 / 4.21 (1.75,11.26) / 2.3E-03 / 3.34 (1.19,10.51) / 2.8E-02
C8-pos / 1038.5621m/z_8.99 / 1038.56 / 8.99 / 4.73 (1.83,14.05) / 2.7E-03 / 3.77 (1.26,13.18) / 2.5E-02
C8-pos / Dimethamine / 426.29 / 8.99 / 5.35 (1.87,17.64) / 3.1E-03 / 4.85 (1.39,20.13) / 1.9E-02
HILIC-pos / Quinoline / 152.05 / 11.07 / 70.67 (5.2,1496.23) / 3.1E-03 / 21 (1.41,566.62) / 4.8E-02
HILIC-pos / Montecristin / 575.50 / 5.01 / 7.12 (2.05,28.68) / 3.3E-03 / 5.87 (1.35,30.61) / 2.5E-02
C18-neg / 163.1110m/z_4.13 / 163.11 / 4.13 / 0.07 (0.01,0.37) / 3.4E-03 / 0.04 (0,0.34) / 5.4E-03
HILIC-pos / Terpinolene oxide / 170.15 / 2.3 / 2.51 (1.41,4.92) / 3.5E-03 / 2.15 (1.16,4.41) / 2.3E-02
C18-neg / 712.3335m/z_11.73 / 712.33 / 11.73 / 0.03 (0,0.27) / 3.5E-03 / 0.02 (0,0.27) / 5.7E-03
C8-pos / C34:2 PI / 835.53 / 8.06 / 5.36 (1.83,17.94) / 3.7E-03 / 4.51 (1.33,18.03) / 2.3E-02
C8-pos / C50:1 TAG / 871.72 / 11.66 / 4.43 (1.68,12.9) / 4.0E-03 / 3.84 (1.19,13.88) / 3.0E-02
C8-pos / 887.5493m/z_8.25 / 887.55 / 8.25 / 4.52 (1.69,13.39) / 4.0E-03 / ns
C8-pos / C40:6 PC / 834.60 / 8.99 / 5.34 (1.78,17.87) / 4.1E-03 / 4.02 (1.11,16.54) / 4.1E-02
C18-neg / 709.0492m/z_0.68 / 709.05 / 0.68 / 21.99 (2.93,212.81) / 4.4E-03 / 24.42 (2.38,364.05) / 1.2E-02
C8-pos / C50:1 TAG / 855.74 / 11.67 / 4.66 (1.68,14.35) / 4.6E-03 / 4.01 (1.13,15.95) / 3.8E-02

Supplemental Table S2: continued

Platform / Metabolite / m/z / rt / ORa (95% CI) / p-value / ORb (95% CI) / p-value
C18-neg / 467.3129m/z_14.70 / 467.31 / 14.7 / 5.79 (1.81,21.05) / 4.7E-03 / 4.73 (1.23,20.85) / 2.9E-02