Supporting Information

Informative Metabolites Identification by Variable Importance Analysis Based on Random Variable Combination

Yong-Huan Yun1, Fu Liang1, Bai-Chuan Deng1, Guang-Bi Lai2, Carlos M. Vicente Gonçalves3, Hong-Mei Lu1, Jun Yan1, Xin Huang1, Lun-Zhao Yi4,*, Yi-Zeng Liang1,**

Author affiliations:

1College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, P.R. China

2Heilongjiang University of Chinese Medicine, Heilongjiang, Ha'erbin 150040, P.R. China

3Department of Chemistry, Faculty of Mathematics and Natural Sciences, University of Bergen, Bergen 5020, Norway

4Yunnan Food Safety Research Institute, Kunming University of Science and Technology, Kunming, 650500, China

Correspondence information

*Corresponding author. Tel.: +86-0731-8830824; fax: +86-0731-8830831.

E-mail address: (Y-Z Liang);

Corresponding author. Tel.: +86-0871-65920302.

E-mail address: (L-Z Yi)

SI-Table 1. Metabolites corresponding to the variable IDs in the childhood overweight data.

ID / Metabolites
1 / Lactate
2 / 2-Ketoisocaproic acid
3 / Alanine
4 / α-Hydroxybutyric acid
5 / β-Hydroxybutyric acid
6 / Urea
7 / Phosphate
8 / Leucine
9 / Glycerol
10 / Proline
11 / Isoleucine
12 / Glyceric acid
13 / Serine
14 / Threonine
15 / Pyroglutamic acid
16 / 2,3,4-Trihydroxybutyric acid
17 / Phenylalanine
18 / Lauric acid
19 / Dodecylacrylate
20 / Glycerophosphoric acid
21 / Myristic acid
22 / Tyrosine
23 / Palmitic acid
24 / Linoleic acid
25 / Oleic acid
26 / Stearic acid
27 / Arachidonic acid
28 / Monopalmitin
29 / Monostearin
30 / Cholesterol

SI-Table 2. Variable ranking by different methods on the childhood overweight data.

Ranking / RC / VIP / SPA / VIAVC / P value / Adjusted P value (a)
1 / 12 / 22 / 23 / 5 / 8.808e-157 / 1.145e-155
2 / 5 / 27 / 5 / 12 / 8.977e-119 / 1.077e-117
3 / 19 / 19 / 12 / 23 / 2.127e-57 / 2.340e-56
4 / 22 / 12 / 10 / 27 / 1.176e-38 / 1.176e-37
5 / 2 / 2 / 26 / 13 / 2.733e-35 / 2.460e-34
6 / 27 / 5 / 2 / 15 / 4.257e-21 / 3.406e-20
7 / 23 / 23 / 16 / 22 / 3.720e-15 / 2.605e-14
8 / 13 / 13 / 19 / 10 / 6.172e-12 / 3.703e-11
9 / 4 / 10 / 30 / 17 / 1.146e-05 / 5.731e-05
10 / 15 / 26 / 9 / 1 / 1.744e-04 / 6.977e-04
11 / 10 / 7 / 13 / 26 / 1.552e-02 / 4.657e-02
12 / 26 / 4 / 27 / 4 / 2.631e-02 / 5.261e-02
13 / 29 / 8 / 6 / 24 / 0.9224 / 0.9224
14 / 9 / 11 / 7
15 / 24 / 17 / 28
16 / 3 / 20 / 14
17 / 28 / 15 / 18
18 / 25 / 3 / 29
19 / 18 / 24 / 21
20 / 14 / 29 / 24
21 / 7 / 28 / 11
22 / 6 / 25 / 22
23 / 20 / 9 / 15
24 / 17 / 1 / 17
25 / 11 / 30 / 20
26 / 1 / 18 / 3
27 / 16 / 6 / 4
28 / 8 / 16 / 1
29 / 30 / 14 / 8
30 / 21 / 21 / 25

(a) Adjusted P value: corrected for multiple testing with the Bonferroni-Holm method.

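The Bonferroni-Holm adjustment named in the footnote can be illustrated with a short sketch. The function name `holm_adjust` is hypothetical, and this is a generic step-down implementation rather than the authors' own code. Applied to the raw P values printed in SI-Table 2, it gives values that agree with the adjusted column to about three significant figures (the printed raw values are themselves rounded).

```python
def holm_adjust(p_values):
    """Return Bonferroni-Holm (step-down) adjusted p-values in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Raw P values of the 13 VIAVC-selected variables, copied from SI-Table 2.
raw = [8.808e-157, 8.977e-119, 2.127e-57, 1.176e-38, 2.733e-35,
       4.257e-21, 3.720e-15, 6.172e-12, 1.146e-05, 1.744e-04,
       1.552e-02, 2.631e-02, 0.9224]

adjusted = holm_adjust(raw)
for p, q in zip(raw, adjusted):
    print(f"{p:.3e} -> {q:.3e}")
```

With 13 tests, the smallest P value is multiplied by 13, the next by 12, and so on down to 1 for the largest, which is why the last row of the table is unchanged by the adjustment.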
SI-Table 3. Variable ranking on the simulated CHOD data (a).

Ranking / VIAVC on original dataset / Adjusted P value (b) / VIAVC on simulated dataset / Adjusted P value of t test
1 / 5 / 1.145e-155 / 42 / 1.582e-109
2 / 12 / 1.077e-117 / 12 / 4.207e-104
3 / 23 / 2.340e-56 / 23 / 5.697e-84
4 / 27 / 1.176e-37 / 53 / 2.479e-67
5 / 13 / 2.460e-34 / 35 / 1.062e-51
6 / 15 / 3.406e-20 / 5 / 6.545e-50
7 / 22 / 2.605e-14 / 22 / 7.519e-49
8 / 10 / 3.703e-11 / 52 / 2.009e-48
9 / 17 / 5.731e-05 / 27 / 2.637e-36
10 / 1 / 6.977e-04 / 57 / 4.207e-33
11 / 26 / 4.657e-02 / 9 / 5.816e-27
12 / 4 / 5.261e-02 / 15 / 2.508e-24
13 / 24 / 0.9224 / 45 / 5.926e-20
14 / - / - / 17 / 1.984e-05
15 / - / - / 43 / 0.3128
16 / - / - / 13 / 0.3628

(a) The simulated CHOD data (29×60) consist of the original CHOD data (29×30) plus a copy of the original CHOD data with a small amount of added noise (29×30). For instance, the 12th and 42nd variables are highly correlated.

(b) Adjusted P value: corrected for multiple testing with the Bonferroni-Holm method.
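The construction of the simulated data described in footnote (a) can be sketched as follows. The CHOD measurements are not reproduced here, so a random matrix stands in for them, and the noise level (`noise_scale`) is an assumption; the source only says "a small amount of noise". The sketch shows why variable j and variable j + 30 (for example the 12th and the 42nd) end up highly correlated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder for the original CHOD data (29 samples x 30 variables).
# The real measurements are not available here; random values stand in.
n_samples, n_vars = 29, 30
original = rng.normal(size=(n_samples, n_vars))

# Assumed noise level: 5% of each variable's standard deviation.
noise_scale = 0.05
noise = noise_scale * original.std(axis=0) * rng.normal(size=original.shape)
noisy_copy = original + noise

# Simulated data: original block followed by its noisy copy (29 x 60).
simulated = np.hstack([original, noisy_copy])

# Variable 12 and variable 42 (1-based) are columns 11 and 41 (0-based).
r = np.corrcoef(simulated[:, 11], simulated[:, 41])[0, 1]
print(simulated.shape, round(float(r), 4))
```

Because each added column is a lightly perturbed duplicate, a selection method applied to the simulated matrix has to choose among near-identical candidates, which is the situation SI-Table 3 examines.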

SI-Table 4. Metabolites corresponding to the variable IDs in the nasopharyngeal carcinoma dataset.

ID / Metabolites
1 / Lactate
2 / Alanine
3 / Glycine
4 / α-Hydroxy butyrate
5 / β-Hydroxy butyrate
6 / Norvaline
7 / Isoleucine
8 / Proline
9 / Serine
10 / Threonine
11 / Pyroglutamate
12 / Phenylalanine
13 / Fructose
14 / Glucose
15 / Hexadecanoic acid
16 / Pyranoid type glucose
17 / Inositol
18 / Linoleic acid
19 / Oleic acid
20 / Stearic acid
21 / Arachidonic acid
22 / Glycerol 1-hexadecanoate
23 / Glycerol 1-octadecanoate
24 / Cholesterol

SI-Table 5. Variable ranking by different methods on the nasopharyngeal carcinoma dataset.

Ranking / RC / VIP / SPA / VIAVC / P value / Adjusted P value
1 / 14 / 14 / 14 / 20 / 2.591e-209 / 5.701e-208
2 / 20 / 18 / 18 / 14 / 3.903e-160 / 8.197e-159
3 / 18 / 20 / 21 / 22 / 5.598e-112 / 1.120e-110
4 / 21 / 21 / 8 / 18 / 1.380e-105 / 2.621e-104
5 / 22 / 8 / 20 / 21 / 2.343e-96 / 4.217e-95
6 / 5 / 5 / 7 / 7 / 2.838e-75 / 4.825e-74
7 / 7 / 7 / 4 / 24 / 8.596e-57 / 1.375e-55
8 / 11 / 11 / 10 / 11 / 1.999e-55 / 2.998e-54
9 / 6 / 22 / 22 / 15 / 2.003e-55 / 2.999e-54
10 / 24 / 4 / 11 / 6 / 1.088e-51 / 1.414e-50
11 / 10 / 10 / 5 / 23 / 1.413e-45 / 1.695e-44
12 / 16 / 2 / 2 / 4 / 1.225e-37 / 1.347e-36
13 / 19 / 12 / 19 / 8 / 3.409e-31 / 3.409e-30
14 / 13 / 1 / 6 / 5 / 2.238e-27 / 2.014e-26
15 / 12 / 23 / 24 / 9 / 2.093e-16 / 1.674e-15
16 / 3 / 9 / 12 / 19 / 2.831e-12 / 1.981e-11
17 / 4 / 6 / 23 / 10 / 8.655e-08 / 5.193e-07
18 / 2 / 13 / 15 / 3 / 1.731e-04 / 8.658e-04
19 / 1 / 19 / 9 / 13 / 4.572e-04 / 1.829e-03
20 / 8 / 15 / 17 / 2 / 7.633e-04 / 2.290e-03
21 / 15 / 17 / 3 / 1 / 3.129e-03 / 6.258e-03
22 / 17 / 3 / 1 / 12 / 3.468e-03 / 6.258e-03
23 / 9 / 24 / 13
24 / 23 / 16 / 16

SI-Table 6. Metabolites corresponding to the variable IDs in the cancer-associated skeletal muscle wasting data.

ID / Metabolites / ID / Metabolites
1 / 1,6-Anhydro-beta-D-glucose / 40 / N,N-Dimethylglycine
2 / 1-Methylnicotinamide / 41 / O-Acetylcarnitine
3 / 2-Aminobutyrate / 42 / Pantothenate
4 / 2-Hydroxyisobutyrate / 43 / Pyroglutamate
5 / 2-Oxoglutarate / 44 / Pyruvate
6 / 3-Aminoisobutyrate / 45 / Quinolinate
7 / 3-Hydroxybutyrate / 46 / Serine
8 / 3-Hydroxyisovalerate / 47 / Succinate
9 / 3-Indoxylsulfate / 48 / Sucrose
10 / 4-Hydroxyphenylacetate / 49 / Tartrate
11 / Acetate / 50 / Taurine
12 / Acetone / 51 / Threonine
13 / Adipate / 52 / Trigonelline
14 / Alanine / 53 / Trimethylamine N-oxide
15 / Asparagine / 54 / Tryptophan
16 / Betaine / 55 / Tyrosine
17 / Carnitine / 56 / Uracil
18 / Citrate / 57 / Valine
19 / Creatine / 58 / Xylose
20 / Creatinine / 59 / cis-Aconitate
21 / Dimethylamine / 60 / myo-Inositol
22 / Ethanolamine / 61 / trans-Aconitate
23 / Formate / 62 / pi-Methylhistidine
24 / Fucose / 63 / tau-Methylhistidine
25 / Fumarate
26 / Glucose
27 / Glutamine
28 / Glycine
29 / Glycolate
30 / Guanidoacetate
31 / Hippurate
32 / Histidine
33 / Hypoxanthine
34 / Isoleucine
35 / Lactate
36 / Leucine
37 / Lysine
38 / Methylamine
39 / Methylguanidine

SI-Table 7. Variable ranking by different methods on the cancer-associated skeletal muscle wasting data.

Ranking / RC / VIP / SPA / VIAVC / P value / Adjusted P value
1 / 42 / 60 / 40 / 42 / 1.825e-164 / 3.833e-163
2 / 60 / 42 / 57 / 40 / 2.449e-154 / 4.898e-153
3 / 45 / 33 / 43 / 33 / 9.397e-151 / 1.785e-149
4 / 33 / 45 / 60 / 45 / 3.589e-147 / 6.460e-146
5 / 56 / 56 / 36 / 60 / 2.277e-105 / 3.871e-104
6 / 40 / 40 / 16 / 19 / 2.149e-103 / 3.438e-102
7 / 10 / 10 / 45 / 56 / 1.893e-77 / 2.839e-76
8 / 2 / 2 / 54 / 11 / 1.974e-74 / 2.763e-73
9 / 19 / 19 / 20 / 9 / 4.945e-44 / 6.429e-43
10 / 26 / 26 / 7 / 36 / 4.156e-38 / 4.987e-37
11 / 49 / 36 / 27 / 10 / 1.129e-34 / 1.242e-33
12 / 31 / 57 / 59 / 34 / 3.578e-29 / 3.578e-28
13 / 16 / 49 / 14 / 59 / 8.637e-26 / 7.774e-25
14 / 36 / 22 / 38 / 2 / 3.036e-21 / 2.429e-20
15 / 34 / 16 / 21 / 57 / 2.211e-15 / 1.548e-14
16 / 39 / 34 / 18 / 26 / 5.700e-15 / 3.420e-14
17 / 57 / 31 / 31 / 20 / 2.892e-13 / 1.446e-12
18 / 11 / 39 / 19 / 7 / 2.000e-08 / 8.000e-08
19 / 48 / 21 / 28 / 22 / 1.376e-05 / 4.128e-05
20 / 22 / 43 / 8 / 18 / 9.626e-04 / 1.925e-03
21 / 38 / 38 / 17 / 21 / 0.2119 / 0.2119
22 / 21 / 11 / 46
23 / 43 / 7 / 4
24 / 54 / 54 / 53
25 / 12 / 20 / 11
26 / 1 / 14 / 1
27 / 7 / 48 / 44
28 / 9 / 27 / 51
29 / 53 / 15 / 9
30 / 13 / 8 / 26
31 / 8 / 55 / 34
32 / 35 / 51 / 63
33 / 17 / 9 / 52
34 / 58 / 46 / 24
35 / 3 / 12 / 15
36 / 20 / 59 / 32
37 / 52 / 24 / 13
38 / 14 / 32 / 47
39 / 27 / 28 / 61
40 / 62 / 1 / 55
41 / 37 / 18 / 29
42 / 50 / 4 / 56
43 / 28 / 13 / 35
44 / 4 / 52 / 62
45 / 5 / 47 / 39
46 / 55 / 53 / 25
47 / 63 / 3 / 23
48 / 44 / 23 / 41
49 / 41 / 44 / 33
50 / 25 / 17 / 48
51 / 59 / 61 / 30
52 / 6 / 41 / 10
53 / 18 / 25 / 42
54 / 46 / 50 / 3
55 / 15 / 63 / 58
56 / 61 / 35 / 49
57 / 23 / 29 / 37
58 / 32 / 37 / 22
59 / 30 / 58 / 50
60 / 51 / 62 / 12
61 / 47 / 5 / 2
62 / 24 / 6 / 5
63 / 29 / 30 / 6