Supplementary Material to paper “Between metabolite relationships: an essential aspect of metabolic change”

Analytical methods:

The UPLC-MS Lipidome Platform was developed, validated and applied in this study at the Demonstration and Competition Laboratory (DCL) of the Netherlands Metabolomics Centre, situated at Leiden University in The Netherlands.

For each serum sample, one 30 µl aliquot was taken. For the quality control (QC) samples, a pooled sample consisting of equal amounts of plasma from all study samples was prepared; 30 µl of HPLC-MS grade water was used for the blank samples. Study samples, QC samples and blank samples underwent the same sample preparation procedure. Sample preparation consisted of the addition of a mixture of internal standards (for the blank samples, a mixture of dichloromethane and methanol was used instead), followed by liquid-liquid extraction (LLE) with dichloromethane and methanol according to the procedure described before [46]. The lipid-containing organic phase was taken for further analysis.

The Acquity UltraPerformance Liquid Chromatography (Waters, Milford, MA, USA) and the 6530 Accurate-Mass Quadrupole-Time of Flight LC/Mass Spectrometry (Q-TOF LC/MS) (Agilent Technologies, Waldbronn, Germany) platforms were used in this study. An Ascentis Express C8 2.1 x 150 mm (2.7 µm particle size) column was used for chromatographic separation. All conditions described by Hu et al. [46] for the sample separation and analysis were applied, aside from slight modifications in gradient time and profile. A binary gradient system of acetonitrile-water (60:40, v:v) (eluent A) and isopropanol-acetonitrile (90:10, v:v) (eluent B), both containing 10 mM ammonium formate, was run for 20 minutes with the following gradient: 0-1 min isocratic elution with 32% B, 1-3 min increase to 50% B, 3-9 min increase to 65% B, 9-14 min increase to 80% B, 14-14.1 min increase to 97% B, 14.1-17 min maintenance of 97% B, 17-17.1 min decrease to 32% B and 17.1-20 min isocratic elution with 32% B. The flow rate was 0.5 ml/min throughout.
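The gradient program above can be written as a small lookup table, which makes it easy to check the eluent composition at any time in the run. The following is a minimal sketch only: the breakpoints are taken from the text, but the linear interpolation between them is an assumption (the text only states "increase to"), and the names GRADIENT and percent_b are hypothetical.

```python
# Gradient breakpoints as (time in min, % eluent B), copied from the text.
GRADIENT = [
    (0.0, 32.0), (1.0, 32.0), (3.0, 50.0), (9.0, 65.0), (14.0, 80.0),
    (14.1, 97.0), (17.0, 97.0), (17.1, 32.0), (20.0, 32.0),
]


def percent_b(t_min):
    """Return % eluent B at time t_min.

    Assumption: the composition changes linearly between breakpoints.
    """
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time lies outside the 0-20 min run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


# Halfway through the 1-3 min ramp from 32% B to 50% B.
print(percent_b(2.0))  # 41.0
```

The percentage of eluent A at the same time point is simply 100 minus the returned value.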

The detected metabolites were quantified with the MassHunter Workstation software (Agilent Technologies, Waldbronn, Germany): the peak areas of the studied lipids were normalized by an internal standard for all subclasses. This approach yielded 136 different lipid metabolites that could be identified based on retention time and mass-to-charge ratio (m/z). They consist of cholesterol esters, di- and triacylglycerols, (lyso)phosphatidylethanolamines, (lyso)phosphatidylcholines and sphingomyelins. A complete list of the detected metabolites is given in Supplementary Table 2.

Data Analysis methods

Univariate testing

Shapiro-Wilk tests (α=0.05) were used to evaluate whether the levels of the 136 metabolites were normally distributed. Depending on the outcome, unpaired Mann-Whitney U tests (α=0.05) were employed to assess whether differences in lipid levels between the last time point (time point 12, T12) and baseline (time point 0, T0) were statistically significant. These tests were performed separately for the treatment and the placebo group. All obtained p-values are presented in Supplementary Table 3 and were controlled for the False Discovery Rate [47] at q=0.05.
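As an illustration of the False Discovery Rate control step, the sketch below implements the Benjamini-Hochberg step-up procedure in plain Python. It is an assumption that reference [47] refers to this procedure; the function name benjamini_hochberg and the example p-values are invented for illustration and are not study results.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per p-value: True if rejected at FDR level q.

    Assumption: the Benjamini-Hochberg step-up rule, i.e. reject the k
    smallest p-values, where k is the largest rank with p_(k) <= k/m * q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Invented p-values, for illustration only.
example = [0.001, 0.008, 0.039, 0.041, 0.6]
print(benjamini_hochberg(example))  # [True, True, False, False, False]
```

With 136 tests, as in this study, the threshold for the smallest p-value under this rule is 0.05/136, roughly 0.00037.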

Multivariate methods

Principal component analyses (PCA) were performed separately on the lipid profiles of each time point (time points 0, 4, 8 and 12; T0, T4, T8 and T12) and are presented in Supplementary Figure 1.

In the partial least squares-discriminant analysis (PLS-DA) models, the differences in levels between baseline and the last time point (time point 12) were used to discriminate the GTE group from the placebo group. The PLS-DA method included double cross validation, and the diagnostic outcomes of the models, namely the number of misclassified samples, the area under the receiver operating characteristic curve (AUROC) and Q2, were validated with 2000 permutation tests. Diagnostic outcomes and their p-values (lower than 0.05) are presented in Supplementary Table 4.
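The idea behind validating a diagnostic outcome by permutation can be sketched as follows. This is a deliberately simplified illustration, not the procedure used in the study: here the class labels are shuffled against a fixed vector of predicted scores, whereas a full permutation test would refit the PLS-DA model, including the double cross validation, for every permuted label set. The function names and the example data are hypothetical.

```python
import random


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative sample count as half a pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(scores, labels, n_perm=2000, seed=0):
    """Return (observed AUROC, permutation p-value).

    Simplification: only the labels are shuffled; no model is refitted.
    """
    rng = random.Random(seed)
    observed = auroc(scores, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if auroc(scores, shuffled) >= observed:
            hits += 1
    # Adding 1 to both counts keeps the estimated p-value above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Invented scores and labels, for illustration only.
obs, p = permutation_p_value([0.9, 0.8, 0.7, 0.3, 0.2, 0.1], [1, 1, 0, 1, 0, 0])
print(obs, p)
```

The same scheme applies to the number of misclassifications and Q2, with the comparison direction reversed for outcomes where smaller values are better.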

In multi-way PLS-DA (N-PLS-DA), a data array of 186 samples x 136 metabolites x 4 time points was used without any pre-processing steps. The N-PLS-DA procedure included double cross validation, and the diagnostic outcomes of the model were evaluated with 2000 permutation tests. The results of N-PLS-DA are summarized in Supplementary Table 4.

Resampling approach for INDSCAL models

The INDSCAL scores obtained from equation (3) can be interpreted to quantify the significance of differences between ‘groups’. However, interpreting only the model fitted on all data does not suffice for this: knowledge of the score variability is then lacking, so resampling techniques are required.

We assess the score variability and the associated significance of between-group differences here by removing the data of one biological replicate per treatment group for each time-point from the data set: a model is fitted only on the remaining data. This procedure is repeated many times for different left-out replicates to reveal the variability of the estimated scores. The procedure thereby explicitly uses the experimental design, unlike the conventional jack-knife analysis.
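The leave-out scheme described above can be sketched as a generator of resampling runs. This is a minimal illustration under stated assumptions: the left-out replicate is drawn at random in each run (the text does not state how replicates are chosen), the model fitting itself is omitted, and the function name resampling_runs and the replicate identifiers are hypothetical.

```python
import random


def resampling_runs(groups, n_runs, seed=0):
    """Yield (left_out, kept) for each resampling run.

    groups maps a group label to the list of replicate identifiers in
    that group. In every run, exactly one replicate per group is left
    out; a model would then be fitted on the kept replicates only.
    """
    rng = random.Random(seed)
    for _ in range(n_runs):
        left_out = {g: rng.choice(members) for g, members in groups.items()}
        kept = {
            g: [m for m in members if m != left_out[g]]
            for g, members in groups.items()
        }
        yield left_out, kept


# Hypothetical replicate identifiers for two treatment groups.
design = {"GTE": ["g1", "g2", "g3", "g4"], "placebo": ["p1", "p2", "p3", "p4"]}
for left_out, kept in resampling_runs(design, n_runs=3):
    print(left_out, kept)
```

Because one replicate is removed from every group in each run, the group structure of the experimental design is preserved in every resampled data set.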

Each resampling run (leaving out two individuals) gives matrices of size containing the remaining biological replicates. These lead to matrices through equation (1) of the same size. Matrices (calculated by ) have size , which is the same as . INDSCAL analysis of matrices (equation (3)) leads to a series of score matrices of dimensions .

A resampling-based comparison between PCA models would require a posteriori rotations of the model results in order to distinguish rotational freedom from true model variability [30,48]. The INDSCAL score comparison is more parsimonious, owing to the uniqueness of the model solutions. Because INDSCAL has two trivial breaches of uniqueness, an automated score comparison needs to control the model solutions for two aspects. First, the INDSCAL components are not arranged according to their relative importance (which is impossible due to their non-orthogonality), so that their order is arbitrary: the components of a resampling model need to be permuted to match the solution of a common reference. Secondly, there may be a freedom of component scaling in PARAFAC, which is irrelevant under the model constraints imposed here. Finally, the INDSCAL algorithm is iterative and may therefore not converge to a stable solution; the scores of such non-converged models need to be discarded from the comparison.
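The component permutation step can be illustrated with a brute-force search over all column orders. This is a sketch under assumptions: the matching criterion (smallest sum of squared differences to the reference scores) is chosen here for illustration and is not stated in the text, and the function name match_components and the example numbers are hypothetical.

```python
from itertools import permutations


def match_components(reference, scores):
    """Reorder the columns (components) of scores to match reference.

    Both arguments are lists of rows with one column per component.
    Assumption: the best order minimises the sum of squared differences.
    Returns the reordered scores and the chosen column order.
    """
    n_comp = len(reference[0])

    def cost(perm):
        return sum(
            (row_s[perm[c]] - row_r[c]) ** 2
            for row_r, row_s in zip(reference, scores)
            for c in range(n_comp)
        )

    best = min(permutations(range(n_comp)), key=cost)
    return [[row[j] for j in best] for row in scores], best


# Invented scores: the resampling model has its two components swapped.
reference = [[1.0, 0.1], [0.9, 0.2]]
resampled = [[0.12, 1.01], [0.19, 0.88]]
aligned, order = match_components(reference, resampled)
print(order)    # (1, 0)
print(aligned)  # [[1.01, 0.12], [0.88, 0.19]]
```

An exhaustive search is practical only for a small number of components, since the number of permutations grows factorially.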

The variability of the scores in the remaining resampling runs can be calculated by and the deviation of each model from this average can be calculated in a value by equation (3).

(3) /

We calculate the deviation of each run from the average based on the scores of all groups: these are all necessarily related, which needs to be reflected in the model results. The value is then used to select the 95% of resampling runs with the smallest deviation. The corresponding scores are subsequently enclosed by alpha-bags to depict the variability for each group. This approach does not assume any distribution of the model results, unlike an earlier proposed INDSCAL jack-knife analysis, and works on biological replicates rather than entire experimental groups, like earlier work on jack-knife analysis of PARAFAC models.
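The selection of the 95% of resampling runs closest to the average can be sketched as below. Note the assumptions: the deviation measure used here is the Euclidean distance between a run's flattened scores (all groups together) and the mean over runs, which stands in for the deviation value defined in the text; the alpha-bag construction is not shown; and the function name select_central_runs and the example data are hypothetical.

```python
import math


def select_central_runs(run_scores, keep_fraction=0.95):
    """Return the indices of the runs with the smallest deviation.

    run_scores holds one flat score vector per resampling run, with the
    scores of all groups concatenated.
    Assumption: deviation = Euclidean distance to the mean over runs.
    """
    n_runs = len(run_scores)
    dim = len(run_scores[0])
    mean = [sum(run[d] for run in run_scores) / n_runs for d in range(dim)]
    deviation = [
        math.sqrt(sum((run[d] - mean[d]) ** 2 for d in range(dim)))
        for run in run_scores
    ]
    n_keep = max(1, round(keep_fraction * n_runs))
    order = sorted(range(n_runs), key=lambda i: deviation[i])
    return sorted(order[:n_keep])


# Invented scores: 19 similar runs and one clearly deviating run (index 19).
runs = [[float(i % 3), 0.0] for i in range(19)] + [[100.0, 100.0]]
print(select_central_runs(runs))  # indices 0 to 18; run 19 is dropped
```

Because the deviation is computed on the scores of all groups at once, a run is either kept or discarded for every group simultaneously.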

Tables:

Supplementary Table 1. Symbol list

Vector/matrix / Size / Description
/ / vector of ones
/ / INDSCAL and PARAFAC chemical loading matrix
/ / INDSCAL loading vector of component
/ / INDSCAL loading matrix of component
/ / PARAFAC score matrix of the ‘time’ mode
/ / Model Residual matrix
/ / INDSCAL score matrix of group
/ / centroid vector of all samples,
/ / centroid for group as deviation from
/ / PARAFAC score matrix of individual
/ / covariance matrix of group
/ / deviation of each biological replicate/individual from
/ / Data for group
/ / Data for biological replicate/individual , specific to PARAFAC
Scalar/Index / Description
/ diagonal elements of
/ Index for biological replicates/individuals
/ Index for metabolic descriptor/metabolite
/ Index for groups
/ Index for INDSCAL component

Supplementary Table 2. List and identity of lipid metabolites included in the study.

no / symbol / identity / no / symbol / identity / no / symbol / identity
1 / CholE02 / CholE 18:2 / 47 / PC27 / 1-acyl 38:7 / 93 / TG14 / 1-acyl 48:1
2 / CholE04 / CholE 20:5 / 48 / PC28 / 1-acyl 38:6 / 94 / TG15 / 1-acyl 48:0
3 / CholE05 / CholE 20:4 / 49 / PC30 / 1-acyl 38:4 / 95 / TG17 / 1-acyl 40:7
4 / CholE06 / CholE 22:6 / 50 / PC31 / 1-acyl 38:3 / 96 / TG18 / 1-acyl 40:6
5 / DG02 / 1-acyl 36:3 / 51 / PC32 / 1-acyl 38:2 / 97 / TG20 / 1-acyl 40:5
6 / LPC01 / 1-acyl 14:0 / 52 / PC35 / 1-acyl 40:8 / 98 / TG22 / 1-acyl 40:4
7 / LPC02 / 1-alk 16:1 / 53 / PC37 / 1-acyl 40:6 / 99 / TG24 / 1-acyl 50:5
8 / LPC04 / 1-acyl 16:1 / 54 / PE01 / 1-acyl 34:2 / 100 / TG25 / 1-acyl 50:4
9 / LPC05 / 1-acyl 16:0 / 55 / PE02 / 1-alk 36:5 / 101 / TG26 / 1-acyl 50:3
10 / LPC06 / 1-alk 18:1 / 56 / PE03 / 1-acyl 36:4 / 102 / TG28 / 1-acyl 50:2
11 / LPC07 / 1-acyl 18:3 / 57 / PE05 / 1-alk 38:7 / 103 / TG29 / 1-acyl 50:1
12 / LPC08 / 1-acyl 18:2 / 58 / PE06 / 1-alk 38:5 / 104 / TG30 / 1-acyl 50:0
13 / LPC09 / 1-acyl 18:1 / 59 / PE07 / 1-acyl 38:6 / 105 / TG32 / 1-acyl 51:3
14 / LPC10 / 1-acyl 18:0 / 60 / PE09 / 1-acyl 38:2 / 106 / TG35 / 1-acyl 51:1
15 / LPC11 / 1-acyl 20:5 / 61 / SM01 / 2-amido 14:0 / 107 / TG40 / 1-acyl 52:4
16 / LPC12 / 1-acyl 20:4 / 62 / SM02 / 2-amido 15:0 / 108 / TG41 / 1-acyl 52:3
17 / LPC13 / 1-acyl 20:3 / 63 / SM03 / 2-amido 16:1 / 109 / TG42 / 1-acyl 52:2
18 / LPC14 / 1-acyl 22:6 / 64 / SM04 / 2-amido 16:0 / 110 / TG44 / 1-acyl 52:1
19 / LPC16 / 1-acyl 20:1 / 65 / SM05 / 2-amido 17:0 / 111 / TG47 / 1-acyl 53:2
20 / LPC17 / 1-acyl 20:0 / 66 / SM06 / 2-amido 18:2 / 112 / TG50 / 1-acyl 54:7
21 / LPE02 / 1-acyl 18:0 / 67 / SM07 / 2-amido 18:1 / 113 / TG51 / 1-acyl 54:6
22 / LPE04 / 1-acyl 22:6 / 68 / SM08 / 2-amido 18:0 / 114 / TG52 / 1-acyl 54:5
23 / PC01 / 1-acyl 32:2 / 69 / SM09 / 2-amido 20:1 / 115 / TG53 / 1-acyl 54:4
24 / PC02 / 1-acyl 32:1 / 70 / SM10 / 2-amido 20:0 / 116 / TG54 / 1-acyl 54:3
25 / PC03 / 1-acyl 32:0 / 71 / SM11 / 2-amido 21:0 / 117 / TG55 / 1-acyl 54:2
26 / PC04 / 1-alk 34:3 / 72 / SM12 / 2-amido 22:1 / 118 / TG56 / 1-acyl 54:1
27 / PC05 / 1-alk 34:2 / 73 / SM13 / 2-amido 22:0 / 119 / TG58 / 1-acyl 55:3
28 / PC06 / 1-alk 34:1 / 74 / SM14 / 2-amido 23:1 / 120 / TG60 / 1-acyl 55:2
29 / PC07 / 1-acyl 34:4 / 75 / SM15 / 2-amido 23:0 / 121 / TG63 / 1-acyl 55:1
30 / PC09 / 1-acyl 34:2 / 76 / SM16 / 2-amido 24:2 / 122 / TG64 / 1-acyl 56:7
31 / PC10 / 1-acyl 34:1 / 77 / SM17 / 2-amido 24:1 / 123 / TG65 / 1-acyl 56:6
32 / PC11 / 1-acyl 34:0 / 78 / SM18 / 2-amido 24:0 / 124 / TG66 / 1-acyl 56:5
33 / PC12 / 1-alk 36:6 / 79 / SM19 / 2-amido 25:1 / 125 / TG68 / 1-acyl 56:3
34 / PC13 / 1-alk 36:5 / 80 / SM20 / 2-amido 25:0 / 126 / TG69 / 1-acyl 56:2
35 / PC14 / 1-alk 36:4 / 81 / TG01 / 1-acyl 42:2 / 127 / TG70 / 1-acyl 56:1
36 / PC15 / 1-alk 36:3 / 82 / TG03 / 1-acyl 42:0 / 128 / TG71 / 1-acyl 56:0
37 / PC16 / 1-alk 36:2 / 83 / TG04 / 1-acyl 44:2 / 129 / TG74 / 1-acyl 58:9
38 / PC17 / 1-acyl 36:6 / 84 / TG05 / 1-acyl 44:1 / 130 / TG77 / 1-acyl 58:8
39 / PC18 / 1-acyl 36:5 / 85 / TG06 / 1-acyl 44:0 / 131 / TG79 / 1-acyl 58:6
40 / PC19 / 1-acyl 36:4 / 86 / TG07 / 1-acyl 46:3 / 132 / TG83 / 1-acyl 58:3
41 / PC20 / 1-acyl 36:3 / 87 / TG08 / 1-acyl 46:2 / 133 / TG84 / 1-acyl 58:2
42 / PC21 / 1-acyl 36:2 / 88 / TG09 / 1-acyl 46:1 / 134 / TG85 / 1-acyl 58:1
43 / PC22 / 1-acyl 36:1 / 89 / TG10 / 1-acyl 46:0 / 135 / TG87 / 1-acyl 59:2
44 / PC23 / 1-alk 38:7 / 90 / TG11 / 1-acyl 48:4 / 136 / TG90 / 1-acyl 60:3
45 / PC25 / 1-alk 38:5 / 91 / TG12 / 1-acyl 48:3
46 / PC26 / 1-alk 38:4 / 92 / TG13 / 1-acyl 48:2

Supplementary Table 3. Results (p-values) of univariate testing (Mann-Whitney U tests) comparing levels of metabolites at time point 0 (T0) and time point 12 (T12) within the GTE group (GTE-T0 and GTE-T12) and within the placebo group (placebo-T0 and placebo-T12). Metabolites with p-values smaller than 0.05 are shown in bold.

metabolite / GTE-T0 vs GTE-T12 / placebo-T0 vs placebo-T12 / metabolite / GTE-T0 vs GTE-T12 / placebo-T0 vs placebo-T12 / metabolite / GTE-T0 vs GTE-T12 / placebo-T0 vs placebo-T12
CholE02 / 0.594 / 0.873 / PC27 / 0.791 / 0.244 / TG13 / 0.535 / 0.443
CholE04 / 0.400 / 0.291 / PC28 / 0.627 / 0.741 / TG14 / 0.734 / 0.407
CholE05 / 0.933 / 0.569 / PC30 / 0.425 / 0.798 / TG15 / 0.933 / 0.495
CholE06 / 0.793 / 0.536 / PC31 / 0.866 / 0.586 / TG17 / 0.784 / 0.389
DG02 / 0.030 / 0.297 / PC32 / 0.418 / 0.541 / TG18 / 0.767 / 0.742
LPC01 / 0.243 / 0.368 / PC35 / 0.729 / 0.171 / TG20 / 0.753 / 0.664
LPC02 / 0.397 / 0.859 / PC37 / 0.651 / 0.778 / TG22 / 0.930 / 0.336
LPC04 / 0.509 / 0.772 / PE01 / 0.679 / 0.879 / TG24 / 0.741 / 0.818
LPC05 / 0.497 / 0.851 / PE02 / 0.165 / 0.094 / TG25 / 0.869 / 0.700
LPC06 / 0.935 / 0.257 / PE03 / 0.732 / 0.774 / TG26 / 0.815 / 0.612
LPC07 / 0.556 / 0.792 / PE05 / 0.062 / 0.289 / TG28 / 0.928 / 0.529
LPC08 / 0.400 / 0.350 / PE06 / 0.202 / 0.396 / TG29 / 0.751 / 0.476
LPC09 / 0.681 / 0.725 / PE07 / 0.320 / 0.966 / TG30 / 0.649 / 0.352
LPC10 / 0.501 / 0.719 / PE09 / 0.926 / 0.762 / TG32 / 0.718 / 0.198
LPC11 / 0.360 / 0.522 / SM01 / 0.901 / 0.867 / TG35 / 0.755 / 0.143
LPC12 / 0.611 / 0.520 / SM02 / 0.906 / 0.924 / TG40 / 0.622 / 0.937
LPC13 / 0.633 / 0.898 / SM03 / 0.607 / 0.772 / TG41 / 0.493 / 0.881
LPC14 / 0.658 / 0.879 / SM04 / 0.311 / 0.754 / TG42 / 0.322 / 0.570
LPC16 / 0.968 / 0.096 / SM05 / 0.748 / 0.863 / TG44 / 0.501 / 0.346
LPC17 / 0.980 / 0.711 / SM06 / 0.911 / 0.853 / TG47 / 0.438 / 0.295
LPE02 / 0.368 / 0.945 / SM07 / 0.535 / 0.572 / TG50 / 0.552 / 0.444
LPE04 / 0.188 / 0.962 / SM08 / 0.447 / 0.702 / TG51 / 0.640 / 0.354
PC01 / 0.131 / 0.760 / SM09 / 0.366 / 0.482 / TG52 / 0.358 / 0.555
PC02 / 0.660 / 0.711 / SM10 / 0.554 / 0.800 / TG53 / 0.402 / 0.495
PC03 / 0.470 / 0.608 / SM11 / 0.676 / 0.816 / TG54 / 0.246 / 0.744
PC04 / 0.447 / 0.266 / SM12 / 0.447 / 0.549 / TG55 / 0.219 / 0.466
PC05 / 0.255 / 0.297 / SM13 / 0.466 / 0.849 / TG56 / 0.267 / 0.594
PC06 / 0.815 / 0.303 / SM14 / 0.556 / 0.539 / TG58 / 0.481 / 0.861
PC07 / 0.361 / 0.865 / SM15 / 0.818 / 0.792 / TG60 / 0.430 / 0.608
PC09 / 0.398 / 0.398 / SM16 / 0.305 / 0.299 / TG63 / 0.645 / 0.313
PC10 / 0.339 / 0.692 / SM17 / 0.402 / 0.289 / TG64 / 0.575 / 0.707
PC11 / 0.487 / 0.700 / SM18 / 0.983 / 0.802 / TG65 / 0.289 / 0.776
PC12 / 0.654 / 0.382 / SM19 / 0.842 / 0.353 / TG66 / 0.165 / 0.824
PC13 / 0.743 / 0.285 / SM20 / 0.598 / 0.885 / TG68 / 0.329 / 0.790
PC14 / 0.146 / 0.221 / TG01 / 0.201 / 0.558 / TG69 / 0.533 / 0.451
PC15 / 0.470 / 0.549 / TG03 / 0.391 / 0.960 / TG70 / 0.440 / 0.391
PC16 / 0.968 / 0.331 / TG04 / 0.607 / 0.984 / TG71 / 0.654 / 0.590
PC17 / 0.114 / 0.859 / TG05 / 0.491 / 0.599 / TG74 / 0.566 / 0.241
PC18 / 0.558 / 0.352 / TG06 / 0.395 / 0.332 / TG77 / 0.472 / 0.473
PC19 / 0.627 / 0.454 / TG07 / 0.918 / 0.304 / TG79 / 0.304 / 0.883
PC20 / 0.676 / 0.824 / TG08 / 0.497 / 0.474 / TG83 / 0.825 / 0.279
PC21 / 0.157 / 0.481 / TG09 / 0.513 / 0.405 / TG84 / 0.945 / 0.800
PC22 / 0.383 / 0.849 / TG10 / 0.791 / 0.349 / TG85 / 0.388 / 0.658
PC23 / 0.762 / 0.943 / TG11 / 0.965 / 0.430 / TG87 / 0.663 / 0.898
PC25 / 0.146 / 0.151 / TG12 / 0.852 / 0.486 / TG90 / 0.495 / 0.800
PC26 / 0.552 / 0.832

Supplementary Table 4. Diagnostic statistics of PLS-DA and n-PLS-DA models. Diagnostic statistics: AUROC – area under the receiver operating characteristic curve, NMC(%) – number of misclassified samples expressed as a percentage of all samples in the data set, Q2 – cross-validated measure of the predictive ability of the PLS-DA models. Groups included are GTE – green tea extract and placebo. Time points included are T0 – start of the intervention and T4, T8 and T12 – after 4, 8 and 12 weeks of intervention.

model / Variables included: / Time points included: / AUROC / NMC(%) / Q2
PLS-DA
Differences GTE vs differences placebo / all / T0 and T12 / 0.537 / 49.7 / -0.003
n-PLS-DA
trajectories GTE vs trajectories placebo / all / T0, T4, T8, T12 / 0.454 / 47.1 / -0.056

Supplementary Table 5. Pearson correlation coefficients between TG28 and TG29 and between TG41 and TG42 in the GTE and placebo groups at different time points (T4, T8 and T12).

TG28-29 / TG41-42
Time point / GTE / placebo / GTE / placebo
T4 / 0.913 / 0.935 / 0.927 / 0.927
T8 / 0.948 / 0.917 / 0.921 / 0.898
T12 / 0.948 / 0.932 / 0.907 / 0.901
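For reference, the Pearson correlation coefficient reported in this table can be computed as in the following minimal sketch. The function name pearson_r and the example values are hypothetical; they are not the study's metabolite levels.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented levels of two metabolites across five individuals.
levels_a = [1.0, 2.0, 3.0, 4.0, 5.0]
levels_b = [1.1, 1.9, 3.2, 3.9, 5.1]
print(pearson_r(levels_a, levels_b))
```

In the table, this coefficient is computed per group and per time point, across the individuals in that group.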

Supplementary Table 6. Isomer composition of lipid metabolites selected as important for INDSCAL model.

no / symbol / identity / Isomer composition
102 / TG28 / 1-acyl 50:2 / TG(18:1/18:1/14:0) + TG(18:1/16:0/16:1) + TG(18:2/16:0/16:0)
103 / TG29 / 1-acyl 50:1 / TG(16:0/16:0/18:1)
108 / TG41 / 1-acyl 52:3 / TG(18:1/18:1/16:1) + TG(18:1/18:2/16:0)
109 / TG42 / 1-acyl 52:2 / TG(18:1/18:1/16:0)

Simulation example

Depending on the characteristics of the individual phenotype (system 1 or system 2), a perturbation of the level of one metabolite/enzyme (A) by the tested treatment (e.g. GTE) can induce an increase (system 1) or a decrease (system 2) in the level of another metabolite/enzyme (B). This situation can be simulated with the following parameters:

System 1:

The steady state of this system is A=3 and B=3.

If the steady state is perturbed, say to A=3.1, B also goes up, to 3.751.

System 2:

The steady state of this system is A=3 and B=3.

If the steady state is perturbed, say to A=3.1, B goes down to 2.511.

Figures:

Supplementary Figure 1. BMR modelling procedure

Supplementary Figure 2. PCA scores for the plant data set, describing all variation in the glucosinolate composition except for the dynamic variation present in all plants. Components 1-3 are given from top to bottom. The loadings of each component are given on the left and the scores on the right. The average scores per time per treatment have been given: CON plants by crosses, RJA plants by filled squares and SJA plants by open triangles. The error bars indicate the 95 % confidence for the spread of these scores. The loadings on the left have been normalized and the vertical axes are equal for all three components.

Supplementary Figure 3. PCA results of the human nutrition dataset. Loadings (left) show the metabolite profiles of the main sources of variability. The loadings are normalized and the vertical axes are equal for both components. The corresponding scores are shown on the right with a group average (blue full circles – GTE group and red empty circles – placebo group) and a 95% confidence interval surrounding it.

Supplementary Figure 4. Mean levels ± standard deviation, variance and covariance of metabolites selected by the INDSCAL approach as significant for the GTE effect. A) mean levels ± SD of TG28, B) mean levels ± SD of TG29, C) variance of TG28, D) variance of TG29, E) covariance of TG28 and TG29, F) covariance of TG41 and TG42. Legend: BL – baseline group, GTE – catechin-enriched green tea extract group, placebo – placebo group, significantly different * P<0.1, **P<0.05, ***P<0.01.
