Electronic Supporting Material:

State-of-the-art non-targeted metabolomics in the study of chronic kidney disease

J. Boelaert*, R. t’Kindt*, E. Schepers, L. Jorge, G. Glorieux, N. Neirynck, F. Lynen, P. Sandra, R. Vanholder, K. Sandra

* Both authors contributed equally (alphabetical order)

Institutional affiliations

R. t’Kindt, L. Jorge, P. Sandra, K. Sandra

Metablys, Research Institute for Chromatography, President Kennedypark 26, Kortrijk, Belgium

J. Boelaert, F. Lynen

Separation Science Group, Department of Organic Chemistry, Ghent University, Krijgslaan 281, S4-bis, Gent, Belgium

G. Glorieux, E. Schepers, N. Neirynck, R. Vanholder

Nephrology Section, University Hospital Ghent, De Pintelaan 185, Ghent, Belgium

Corresponding author:

Koen Sandra

Metablys, Research Institute for Chromatography, President Kennedypark 26, Kortrijk, Belgium

Email:

Tel.: +32 (0)56 20 40 31

Fax: +32 (0)56 20 48 59

Figure S1. Schematic representation of the applied data analysis approach for LC-MS and GC-MS. (MH Qual – MassHunter Qualitative Analysis; MPP – MassProfiler Professional; MSC – Molecular Structure Correlator; DB – database; AMDIS – Automated Mass Spectral Deconvolution and Identification System; MSDChem – MSD ChemStation; NIST – National Institute of Standards and Technology).

LC-MS raw data files were first subjected to “naïve” untargeted data processing by the Molecular Feature Extraction (MFE) algorithm. This algorithm locates and combines all related ions (molecular ion, isotopes and adducts) into a single feature characterized by a mass (median), a retention time (peak apex) and an abundance (sum of all ions). After MFE, MassProfiler Professional was used to align and filter the extracted feature lists. Subsequently, a targeted feature extraction based on the Find by Ion algorithm searched the raw data files de novo for the listed features by matching their composite spectrum (mass, isotopes, adducts) and their specific retention time within user-defined mass and retention time windows. By re-evaluating the missing values in the data matrix, this targeted approach greatly reduces the number of false negatives and false positives generated by the MFE algorithm, thereby increasing the quality of the dataset. The targeted feature lists of each sample were again aligned and filtered in MassProfiler Professional, yielding a final dataset ready for statistical analysis. Statistically significant metabolites were identified using the capabilities of the Q-TOF mass spectrometer and its accompanying software tools.

GC-MS data processing started with XCMS-based untargeted feature extraction and alignment. The XCMS algorithm does not group spectral ions originating from the same compound; hence, the resulting feature list contains multiple ions for each detected metabolite. An in-house built feature reduction tool selected and retained the most abundant feature per defined retention time, so that only the most abundant spectral ion of each detected metabolite remained in the reduced feature list. The aligned XCMS data matrix was subsequently imported into MassProfiler Professional for statistical analysis. Differential metabolites were investigated de novo in a targeted data processing step, which incorporates deconvolution in AMDIS and manual integration in MSD ChemStation. After statistical confirmation of these metabolites, identification was performed by EI (RTL) library searching.
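The in-house feature reduction tool itself is not described in further detail. As a rough illustration only, the idea of keeping the most abundant ion per retention time can be sketched as follows. The function name, the tuple layout and the 0.02 min grouping window are assumptions made for this sketch and are not taken from the original tool.

```python
def reduce_features(features, rt_window=0.02):
    """Keep only the most abundant ion per retention-time group.

    features  : list of (mz, rt_minutes, abundance) tuples, e.g. from an
                aligned feature table (layout assumed for this sketch).
    rt_window : hypothetical tolerance in minutes within which ions are
                treated as originating from the same compound.
    """
    # Sort by retention time so that co-eluting ions become adjacent.
    ordered = sorted(features, key=lambda f: f[1])

    groups = []
    for feat in ordered:
        # Compare against the first ion of the current group; open a new
        # group once the retention-time gap exceeds the window.
        if groups and feat[1] - groups[-1][0][1] <= rt_window:
            groups[-1].append(feat)
        else:
            groups.append([feat])

    # Retain the single most abundant ion of each group.
    return [max(group, key=lambda f: f[2]) for group in groups]


# Illustrative input: three ions co-eluting near 16.59 min, two near 16.86 min.
example = [
    (73, 16.59, 500.0),
    (147, 16.59, 200.0),
    (273, 16.60, 900.0),
    (105, 16.86, 300.0),
    (77, 16.86, 100.0),
]
reduced = reduce_features(example)
```

Grouping on retention time alone is a simplification: two different metabolites that co-elute within the window would be merged into one feature.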

Figure S2. Details of the data processing pipeline applied for LC-MS data: (a) visual representation of the workflow suggested by the instrument vendor; (b) untargeted feature extraction: features are detected using the Molecular Feature Extraction (MFE) algorithm in MassHunter Qualitative Analysis 5.0 (Agilent Technologies), and the resulting feature list is imported into MassProfiler Professional 12.0 (Agilent Technologies), aligned and filtered; (c) targeted feature extraction: compounds listed in a .CEF file are extracted from the raw data files based on accurate mass and retention time by Find by Ion (FBI) in MassHunter Qualitative Analysis 5.0. The Find by Ion algorithm enhances the quality of the dataset by reducing the number of false negatives and false positives.

Figure S3. RSD distribution plot displaying the technical repeatability of (a) the LC-MS analysis in both positive and negative ESI mode and (b) the GC-MS analysis. The stability of the feature signals is expressed as relative standard deviation (RSD) values, calculated for each feature as the standard deviation of the peak area in all QC samples divided by the average of the peak area in all QC samples. RSD is calculated for features with 100% frequency after all data processing steps.
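The RSD calculation described in the caption can be sketched as below. This is a minimal illustration, assuming that missing detections are encoded as None and that the sample standard deviation (n - 1 denominator) is used; the source does not state which estimator was applied.

```python
from statistics import mean, stdev


def feature_rsd(qc_areas):
    """Return the RSD (%) of one feature across all QC injections.

    qc_areas : peak areas of the feature in each QC sample; None marks a
               missing detection (encoding assumed for this sketch).
    Returns None when the feature is not present in 100% of the QC samples,
    mirroring the frequency requirement stated in the caption.
    """
    if any(area is None for area in qc_areas):
        return None
    # Standard deviation divided by the average, expressed as a percentage.
    return 100.0 * stdev(qc_areas) / mean(qc_areas)


# Illustrative values only, not taken from the dataset.
rsd = feature_rsd([100.0, 110.0, 90.0])
```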

Table S1. Clinical characteristics of the included patients (p < 0.05 in Tukey’s test vs. Healthy*, vs. CKD3°; NS: not significant; CTN: creatinine; CRP: C reactive protein).

Characteristic / Healthy / CKD3 / CKD5HD / p-value
Number / 20 / 20 / 19 / -
Age / 33.8 ± 13.6 / 60.3 ± 14.0* / 67.9 ± 12.6* / 1.00E-04
Male/Female / 9/11 / 12/8 / 13/6 / NS
BMI (kg/m²) / 22.7 ± 3.9 / 26.0 ± 4.0 / 27.8 ± 6.3* / 5.00E-03
Syst BP (mm Hg) / 125 ± 16 / 134 ± 24 / 146 ± 26* / 1.50E-02
Diast BP (mm Hg) / 77 ± 10 / 82 ± 11 / 67 ± 15*° / 1.00E-03
Pulse / 69 ± 10 / 68 ± 8 / 70 ± 14 / NS
CTN / 0.92 ± 0.19 / 1.49 ± 0.30 / 7.55 ± 2.52*° / 1.00E-04
CRP (mg/L) / 0.16 ± 0.17 / 0.21 ± 0.15 / 1.08 ± 1.47*° / 1.00E-03

Table S2. Relative standard deviation (RSD) of peak area and retention time, and average mass accuracy of randomly selected metabolites measured in QC samples (n=12), detected in LC-MS. Creatinine, 3-carboxy-4-methyl-5-propyl-2-furanpropanoic acid (CMPF), sucrose, tryptophan and warfarin were detected in positive ESI mode, while guanosine, pantothenic acid, phenyl sulphate, tyrosine and uric acid were detected in negative ionization mode.

Metabolite / Mass / Average mass accuracy (ppm) / tR / RSD tR / RSD AUC
Creatinine / 113.0589 / 1.91 / 0.84 / 0.28% / 3.19%
CMPF / 240.0998 / 0.99 / 10.48 / 0.26% / 6.16%
Sucrose [M+Na]+ / 364.0982 / 0.61 / 1.01 / 0.21% / 7.22%
Tryptophan / 204.0899 / 3.27 / 5.60 / 0.20% / 4.18%
Warfarin / 308.1049 / 2.67 / 12.24 / 0.84% / 7.34%
Guanosine / 283.0917 / 1.87 / 4.25 / 0.33% / 4.24%
Pantothenic acid / 219.1107 / 2.67 / 5.11 / 0.14% / 5.62%
Phenyl sulphate / 173.9987 / 4.96 / 5.60 / 0.21% / 5.26%
Tyrosine / 181.0739 / 2.64 / 2.35 / 0.41% / 4.19%
Uric acid / 168.0283 / 3.73 / 1.75 / 0.57% / 4.62%
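For readers unfamiliar with the ppm scale, the mass accuracy of a single measurement is the deviation from the reference mass relative to that reference mass, multiplied by 10^6. The sketch below assumes that the tabulated average is the mean of the absolute ppm errors over the QC injections and that the "Mass" column is the reference value; neither is stated explicitly in the source, and the measured values used are purely illustrative.

```python
def ppm_error(measured_mass, reference_mass):
    """Signed mass error in parts per million."""
    return (measured_mass - reference_mass) / reference_mass * 1e6


def average_mass_accuracy(measured_masses, reference_mass):
    """Mean absolute ppm error over a series of measurements (assumed definition)."""
    errors = [abs(ppm_error(m, reference_mass)) for m in measured_masses]
    return sum(errors) / len(errors)


# Hypothetical measurements of creatinine (reference mass 113.0589 from Table S2).
creatinine_ppm = average_mass_accuracy([113.0591, 113.0587, 113.0592], 113.0589)
```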

Table S3. Relative standard deviation (RSD) of retention time (locked) and peak area of the target ion of randomly selected metabolites measured in QC samples (n=4 within-day, n=20 between-day), detected in GC-MS. The between-day RSD values are calculated as the averages of the within-day RSD of five days. The between-day RSD of the peak area has been subjected to normalization to correct for the between-day variability of the GC-MS analysis.

Metabolite / Target ion mass / tR / RSD tR within-batch / RSD tR between-batch / RSD AUC within-batch / RSD AUC between-batch
L-(+) lactic acid / 147 / 7.12 / 0.06% / 0.09% / 10.02% / 10.33%
L-valine / 144 / 9.20 / 0.01% / 0.05% / 8.10% / 14.32%
Citric acid / 73 / 16.59 / 0.01% / 0.04% / 8.43% / 14.38%
Hippuric acid / 105 / 16.86 / 0.01% / 0.04% / 9.29% / 13.56%
D-sorbitol / 205 / 17.83 / 0.01% / 0.02% / 4.07% / 9.20%
Palmitic acid / 117 / 18.88 / 0.00% / 0.04% / 3.39% / 8.76%
L-tryptophan / 202 / 20.44 / 0.01% / 0.03% / 9.47% / 11.68%
Oleic acid / 117 / 20.46 / 0.01% / 0.02% / 3.17% / 8.68%
Pseudouridine / 357 / 21.47 / 0.01% / 0.02% / 7.77% / 12.43%
Cholesterol / 129 / 27.54 / 0.00% / 0.02% / 3.52% / 6.58%

Table S4. Feature reduction throughout the LC-MS data processing.

Data processing step / Positive ESI / Negative ESI
Molecular feature extraction (MFE) / 24,735 / 11,312
Find by Ion; Frequency filter (75%; used for univariate statistics) / 10,355 / 4,240
p < 0.05ᶲ / 592 / 685

ᶲ Only features that showed a consistent increase or decrease across the different stages of CKD were retained: Healthy > CKD3 > CKD5HD or Healthy < CKD3 < CKD5HD.
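The frequency filter and the trend criterion in the footnote can be illustrated with the sketch below. Two points are assumptions rather than documented behaviour: the 75% frequency is evaluated per group (a feature passes if it is detected in at least 75% of the samples of at least one group), and the trend is judged on group means. The statistical test that produces the p-values is not shown.

```python
from statistics import mean

GROUP_ORDER = ("Healthy", "CKD3", "CKD5HD")


def passes_frequency(values_by_group, min_fraction=0.75):
    """True if the feature is detected in >= min_fraction of at least one group.

    values_by_group : dict mapping group name to a list of abundances,
                      with None marking a missing detection (assumed encoding).
    """
    return any(
        sum(v is not None for v in values) / len(values) >= min_fraction
        for values in values_by_group.values()
    )


def is_monotonic(values_by_group):
    """True if group means strictly increase or strictly decrease with CKD stage."""
    means = []
    for group in GROUP_ORDER:
        detected = [v for v in values_by_group[group] if v is not None]
        if not detected:
            return False  # no trend can be assessed without any detection
        means.append(mean(detected))
    increasing = means[0] < means[1] < means[2]
    decreasing = means[0] > means[1] > means[2]
    return increasing or decreasing


# Illustrative feature that rises with disease stage.
feature = {
    "Healthy": [1.0, 2.0, None, 1.5],
    "CKD3": [3.0, 3.5, 4.0, 3.0],
    "CKD5HD": [8.0, 9.0, 7.5, 8.5],
}
keep = passes_frequency(feature) and is_monotonic(feature)
```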

Table S5. Feature reduction throughout the GC-MS data processing.

Data processing step / Feature number
XCMS feature detection / 2,657
Feature reduction tool / 229
Removal of siloxane contaminant peaks / 206
p < 0.05ᶲ / 23

ᶲ Only features that showed a consistent increase or decrease across the different stages of CKD were retained: Healthy > CKD3 > CKD5HD or Healthy < CKD3 < CKD5HD.

Supplementary Method Material.

Mass Spectrometer Settings

For GC-MS, electron ionization (EI) was used and MS was performed in scan mode (m/z 50-600) with the MS quadrupole at 150°C and MS ion source at 250°C. The system was tuned using perfluorotributylamine (PFTBA).

For LC-MS, needle voltage was optimized to ±3.5 kV, the drying and sheath gas temperatures were set to 300°C and the drying and sheath gas flow rates were set to 6 and 8 L/min, respectively. Data were collected in centroid mode from m/z 50–1,700 at an acquisition rate of 1 spectrum/sec in the extended dynamic range mode (2 GHz), offering an in-spectrum dynamic range of 10^5 and a resolution of approximately 10,000 full width at half maximum (FWHM) in the metabolite m/z range. To maintain mass accuracy during the analysis sequence, a reference mass solution was used containing reference ions (121.0508 and 922.0097 for positive ESI mode, and 112.9856 and 1033.9881 for negative ESI mode). The Q-TOF instrument was tuned using the ESI-L low concentration tuning mix (Agilent Technologies) prior to the analysis sequence. For auto MS/MS mode, a survey MS scan was alternated with three DDA MS/MS scans, resulting in a cycle time of 2 s (acquisition rate of 2 spectra/sec). Singly charged precursor ions were selected based on abundance. After being fragmented twice, a particular m/z value was excluded for 30 s, allowing the MS/MS fragmentation of chromatographically resolved isomers. The quadrupole was operated at narrow resolution and the collision energy was fixed at 10, 20 or 40 eV. Targeted MS/MS mode fragmented listed precursor ions in a defined retention time window (± 0.15 min) at fixed collision energies of 10 and 20 eV.
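The exclusion rule used in auto MS/MS mode (two fragmentation events, then a 30 s exclusion) can be modelled in a simplified way as shown below. This is only a conceptual sketch and does not represent the instrument firmware; the class name, the m/z binning tolerance of 0.01 and the reset behaviour after the exclusion period are assumptions.

```python
class DynamicExclusion:
    """Simplified model of precursor exclusion after repeated fragmentation."""

    def __init__(self, max_repeats=2, exclusion_s=30.0, mz_tol=0.01):
        self.max_repeats = max_repeats  # fragmentation events before exclusion
        self.exclusion_s = exclusion_s  # exclusion duration in seconds
        self.mz_tol = mz_tol            # hypothetical m/z bin width
        self._counts = {}               # m/z bin -> fragmentation count
        self._excluded_until = {}       # m/z bin -> time at which exclusion ends

    def _key(self, mz):
        # Crude binning; values close to a bin edge may land in adjacent bins.
        return round(mz / self.mz_tol)

    def try_select(self, mz, time_s):
        """Return True if the precursor may be fragmented at time_s (seconds)."""
        key = self._key(mz)
        until = self._excluded_until.get(key)
        if until is not None:
            if time_s < until:
                return False  # still inside the exclusion period
            # Exclusion has expired: release the precursor and reset its count.
            del self._excluded_until[key]
            self._counts[key] = 0
        self._counts[key] = self._counts.get(key, 0) + 1
        if self._counts[key] >= self.max_repeats:
            self._excluded_until[key] = time_s + self.exclusion_s
        return True


# With a 2 s cycle time, the same precursor is offered once per cycle.
tracker = DynamicExclusion()
decisions = [tracker.try_select(204.0899, t) for t in (0.0, 2.0, 4.0, 34.0)]
```

In this model, a later-eluting isomer with the same m/z becomes selectable again once the 30 s period has passed, which is the behaviour the text describes.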
