Supplement I.

Cohort description

To explore the potential of GWAS with traits of different origin for nutrition-related studies, a group of 265 subjects was recruited from the general population of the São Paulo metropolitan area of Brazil. This area of the country was selected because of its high degree of ethnic and environmental diversity. Subjects were aged 18-47 years (mean 32.8), and the cohort was balanced by gender (49% of subjects were male and 51% female). Recruitment and sampling procedures were approved by the Institutional Review Board of the Sírio Libanês Hospital and by the National Committee of Research Ethics at the Brazilian Ministry of Health (HSL 2007/25 Process no. 25000.114841/2007-17).

Genotyping services were outsourced to Expression Analysis Inc. (Durham, NC, USA). Briefly, genomic DNA was extracted from whole blood and genotyping was performed on the Illumina Human Omni-Quad1 platform following standard protocols. Genotype calling was performed with BeadStudio software (Illumina). Calls with a genotyping score below 0.2 were excluded from further analysis. Single nucleotide polymorphisms (SNPs) with a call rate below 90% and individuals with a call rate below 95% were also excluded.
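The three exclusion criteria above can be illustrated with a minimal Python sketch. The function name, the data layout (one row per individual, one column per SNP, `None` for a missing call) and the order of the SNP and individual filters are assumptions made for illustration; the text does not state the order in which the call-rate filters were applied, and this is not the pipeline used by the genotyping provider.

```python
def filter_genotypes(calls, scores, min_score=0.2,
                     min_snp_call_rate=0.90, min_sample_call_rate=0.95):
    """Apply score and call-rate filters to a genotype matrix.

    calls  : list of rows (individuals), each a list of genotype codes
             per SNP; None marks a missing call (hypothetical layout).
    scores : same shape, per-call genotyping scores.
    Returns (filtered_calls, kept_sample_indices, kept_snp_indices).
    """
    n_samples = len(calls)
    n_snps = len(calls[0]) if n_samples else 0

    # Step 1: treat calls with a score below the threshold as missing.
    masked = [
        [g if (g is not None and s >= min_score) else None
         for g, s in zip(row_calls, row_scores)]
        for row_calls, row_scores in zip(calls, scores)
    ]

    # Step 2: keep SNPs whose call rate across individuals is high enough.
    keep_snps = []
    for j in range(n_snps):
        called = sum(1 for i in range(n_samples) if masked[i][j] is not None)
        if called / n_samples >= min_snp_call_rate:
            keep_snps.append(j)

    # Step 3 (assumed order): keep individuals whose call rate over the
    # retained SNPs is high enough.
    keep_samples = []
    if keep_snps:
        for i in range(n_samples):
            called = sum(1 for j in keep_snps if masked[i][j] is not None)
            if called / len(keep_snps) >= min_sample_call_rate:
                keep_samples.append(i)

    filtered = [[masked[i][j] for j in keep_snps] for i in keep_samples]
    return filtered, keep_samples, keep_snps
```

With the default arguments the thresholds match those quoted above (score 0.2, SNP call rate 90%, individual call rate 95%). Filtering individuals before SNPs would be an equally plausible reading and can give slightly different results.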

Metabolomics

Urine samples from each individual were collected over three different visits and frozen at -80°C prior to metabolomic analysis. Each sample was then used to obtain an untargeted metabolic profile by 1H NMR. Each urine sample (400 µL) was adjusted to pH 6.8 using 200 µL of a deuterated phosphate buffer solution (KD2PO4, final concentration of 0.2 M) containing 1 mM of sodium 3-(trimethylsilyl)-[2,2,3,3-2H4]-1-propionate (TSP) and transferred into a 5 mm NMR tube. Metabolic profiles were recorded at 300 K on a Bruker Avance III 600 MHz spectrometer equipped with a 5 mm inverse probe (Bruker Biospin, Rheinstetten, Germany).

For each sample, a 1H NMR spectrum was acquired using the first increment of the NOESY pulse sequence (D1-90°-t1-90°-tm-90°-free induction decay [FID]) with water suppression. Spectra were acquired using a relaxation delay of 4 s and a mixing time of 100 ms. Sixteen scans were collected into 98,000 data points using a spectral width of 18315.02 Hz and an acquisition time of 2.7 s.
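The quoted acquisition parameters are mutually consistent, which the short check below illustrates. It assumes that the number of data points counts real and imaginary points together, as in the Bruker time-domain size convention, so that the acquisition time equals the number of points divided by twice the spectral width. The function name is illustrative.

```python
def acquisition_time(n_points, spectral_width_hz):
    """Acquisition time in seconds, assuming n_points counts real plus
    imaginary points (Bruker TD convention): AQ = TD / (2 * SW)."""
    return n_points / (2.0 * spectral_width_hz)


# Values taken from the text: 98,000 points, 18315.02 Hz spectral width.
aq = acquisition_time(98_000, 18315.02)
print(f"acquisition time: {aq:.1f} s")  # prints "acquisition time: 2.7 s"
```

Under this assumption the computed value rounds to the 2.7 s reported above.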

Spectra were processed using the software package TOPSPIN (version 2.1, Bruker, Germany). The FIDs were multiplied by an exponential weighting function corresponding to a line broadening of 1 Hz. The acquired spectra were manually phase and baseline corrected, and referenced to the chemical shift of TSP at δ 0.00. The NMR spectra were converted into 12K data points over the range of δ 0.4-10.0, excluding the residual water signal between δ 4.70 and 5.00. Chemical shift intensities were normalized to the sum of all intensities within the specified range prior to binning to 0.004 ppm. Metabolite identification was achieved using an in-house database of reference spectra of pure compounds and literature data (Oostendorp, Engelke, Willemsen, & Wevers, 2006; Siddiqui, Sim, Silwood, Toms, Iles, & Grootveld, 2003), and confirmed by 2D 1H NMR spectroscopy experiments performed on selected samples.
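The region selection, total-intensity normalization and binning steps can be sketched as follows. This is a minimal illustration, not the processing code used for the study: the function name and input layout are hypothetical, and summing the normalized intensities within each 0.004 ppm bin is an assumption, since the text does not say whether bins were summed or averaged.

```python
def normalize_and_bin(ppm, intensity, lo=0.4, hi=10.0,
                      water=(4.70, 5.00), bin_width=0.004):
    """Return a sorted list of (bin_index, summed_normalized_intensity).

    ppm, intensity : parallel sequences describing one processed spectrum.
    Bin index k covers [lo + k * bin_width, lo + (k + 1) * bin_width).
    """
    # Keep points inside the analysed range and outside the water region.
    kept = [(p, y) for p, y in zip(ppm, intensity)
            if lo <= p <= hi and not (water[0] <= p <= water[1])]

    # Normalize to the sum of all retained intensities.
    total = sum(y for _, y in kept)
    if total == 0:
        raise ValueError("total intensity is zero; cannot normalize")

    # Bin the normalized intensities (summation within bins is assumed).
    bins = {}
    for p, y in kept:
        k = int((p - lo) / bin_width)
        bins[k] = bins.get(k, 0.0) + y / total
    return sorted(bins.items())
```

After this step the binned values of each spectrum sum to 1, so spectra from urine samples of different dilution can be compared on a common scale.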
