Growth and culture conditions

The 4 bacteria were grown in an anaerobic chamber (except for E. coli) at 37°C in LYHBHI medium [brain–heart infusion medium supplemented with 0.5% yeast extract (Difco) and 5 mg/L hemin] supplemented with cellobiose (1 mg/mL; Sigma–Aldrich), maltose (1 mg/mL; Sigma), and cysteine (0.5 mg/mL; Sigma). The 2 yeasts were grown at 37°C in Sabouraud medium (bioMérieux).

The HT-29 cell line was maintained in RPMI 1640 supplemented with 2 mM L-glutamine, 50 IU/mL penicillin, 50 μg/mL streptomycin, and 10% heat-inactivated fetal calf serum (Sigma-Aldrich) at 37°C and 5% CO2. The HUVEC cell line was cultured in EGM™-2 medium (Lonza) at 37°C and 10% CO2.

Histology

Four-micrometer sections were stained with hematoxylin/eosin or periodic acid-Schiff (PAS) and then blinded for histological scoring. The samples were also processed with a Starr Trek kit (Biocare Medical) or a Novolink Polymer Detection System (Leica Biosystems), according to the manufacturers’ instructions, to stain several mouse cell markers by immunohistochemistry: a mouse monoclonal anti-Ki67 antibody (Leica Biosystems) for proliferative cells, a rat monoclonal anti-F4/80 antibody (AbD Serotec) for macrophages, a rabbit polyclonal anti-CD3 antibody (Dako) for T cells, a rabbit polyclonal anti-FoxP3 antibody (Abcam) for Treg cells, a rat monoclonal anti-CD45R antibody (eBioscience) for B cells, and a rabbit polyclonal anti-CD31 antibody (Abcam) for endothelial cells. For each sample, 5 to 10 areas or crypts were processed for quantification. Immunohistochemistry staining for F4/80, CD3 and CD45R was quantified with ImageJ software (Schneider et al., 2012) using the color deconvolution plugin.

Gene expression

Transcriptional profiling was performed on mouse colon samples using the SurePrint G3 Mouse GE 8x60K Microarray (Design ID: 028005, Agilent Technologies). A total of 46 samples (5 to 7 per group) were processed. Cyanine-3 (Cy3)-labeled cRNAs were prepared from 100 ng of total RNA using the One-Color Low Input Quick Amp Labeling kit (Agilent Technologies), following the recommended protocol. Specific activities and cRNA yields were determined using a NanoDrop ND-1000 (Thermo Fisher Scientific). For each sample, 600 ng of Cy3-labeled cRNA (specific activity > 11.0 pmol Cy3/µg of cRNA) was fragmented at 60°C for 30 minutes and hybridized to the microarrays for 17 hours at 65°C in a rotating hybridization oven (Agilent Technologies). After hybridization, the microarrays were washed and dried immediately. The slides were then scanned using a G2565CA Scanner System (Agilent Technologies) at a resolution of 3 µm and a dynamic range of 20 bits. The resulting TIFF images were analyzed with the Feature Extraction Software v10.7.3.1 (Agilent Technologies) using the GE1_107_Sep09 protocol.

In addition, the expression of selected mouse genes in the colon was quantified by RT-qPCR using SuperScript® II Reverse Transcriptase followed by TaqMan® Gene Expression Assays (Life Technologies), according to the manufacturer’s instructions: Tbx21 (T-bet, assay ID: Mm00450960_m1), Gata3 (assay ID: Mm00484683_m1), Foxp3 (assay ID: Mm00475162_m1), Rorc (RORγt, assay ID: Mm01261022_m1), Ifng (assay ID: Mm01168134_m1), Tlr4 (assay ID: Mm00445273_m1), Il1b (assay ID: Mm00434228_m1), Ccl2 (assay ID: Mm00441242_m1), Pdcd1 (PD-1, assay ID: Mm01285676_m1), Il10 (assay ID: Mm00439614_m1), Tgfb1 (assay ID: Mm01178820_m1), Ido1 (assay ID: Mm00492586_m1), Esm1 (assay ID: Mm00469953_m1) and Egfl7 (assay ID: Mm00618004_m1). We used the ΔΔCt quantification method, with the mouse GAPDH gene (assay ID: Mm99999915_g1) as the endogenous control and the germ-free group as the calibrator.
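For illustration, the ΔΔCt calculation can be sketched in Python as follows. This is a minimal sketch, not the analysis code used in this study: the function name and all Ct values are hypothetical, the calibrator ΔCt is assumed to be the mean over the germ-free animals, and an amplification efficiency close to 100% is assumed (so that one cycle corresponds to a two-fold change).

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, calib_ct_target, calib_ct_reference):
    """Return 2^(-ddCt) for one sample relative to a calibrator group.

    Assumes ~100% amplification efficiency for both assays.
    """
    # dCt of the sample: target gene Ct minus endogenous control Ct
    delta_ct_sample = ct_target - ct_reference
    # dCt of the calibrator group: mean of the per-animal dCt values
    delta_ct_calibrator = mean(
        t - r for t, r in zip(calib_ct_target, calib_ct_reference)
    )
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only
germ_free_target = [30.0, 30.4, 29.6]   # target gene, calibrator animals
germ_free_gapdh = [18.0, 18.2, 17.8]    # endogenous control, same animals
fold = relative_expression(28.0, 18.0, germ_free_target, germ_free_gapdh)
print(f"fold change vs calibrator: {fold:.2f}")
```

With these made-up values, the sample ΔCt is 2 cycles lower than the calibrator ΔCt, which corresponds to a four-fold higher relative expression.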

The expression of selected human endothelial genes in the HUVEC cell line was quantified by RT-qPCR using SuperScript® II Reverse Transcriptase followed by SYBR Green technology (Life Technologies), with the human GAPDH gene as the endogenous control. These genes and their primers, obtained from the PrimerBank website (http://pga.mgh.harvard.edu/primerbank), are presented in Supplementary Table 1.

In vitro experiments

A 1.6-kb section of the human IDO-1 promoter was cloned into the pGL4.14 (Promega) luciferase plasmid (referred to as pIDO-luc) and used to establish the HT-29-IDO reporter cell line. To monitor IDO-1 transcriptional activity, a stable HT-29-IDO reporter cell line was selected with hygromycin (600 µg/mL, InvivoGen) and validated with IFNγ (100 U/mL, Peprotech) and IL1β (10 ng/mL, Peprotech). For each experiment, HT-29-IDO reporter cells were seeded at 2.5 × 10⁴ cells per well in 96-well plates 24 h prior to infection or co-culture with bacterial fractions. Bacterial cultures were centrifuged at 5000 ×g for 10 min to separate bacterial supernatants and pellets. The pellets were resuspended in PBS and the supernatants were filtered through 0.2-μm PES filters. Non-inoculated bacterial culture medium served as the control. The cells were stimulated for 24 hours with 10 μL of bacterial fractions (pellet or supernatant) in a total culture volume of 100 μL per well (i.e., 10% vol/vol) prior to the luciferase assay. Luciferase activity was quantified as relative luminescence units with a microplate reader (Infinite 200, Tecan) and the Neolite Luminescence Reporter Assay System (Perkin-Elmer) according to the manufacturer’s instructions. IDO-1 activity was normalized to the controls, i.e., the unstimulated cells. Experiments were performed in triplicate in two independent assays.
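The normalization to unstimulated cells can be illustrated with a short Python sketch. The function name and the luminescence values are hypothetical, and dividing each stimulated well by the mean of the control wells is an assumption about how the normalization was carried out.

```python
from statistics import mean


def fold_induction(stimulated_rlu, control_rlu):
    """Express each stimulated well as a ratio to the mean control signal."""
    control_mean = mean(control_rlu)
    if control_mean <= 0:
        raise ValueError("control luminescence must be positive")
    return [rlu / control_mean for rlu in stimulated_rlu]


# Hypothetical relative luminescence units from triplicate wells
control_wells = [1000.0, 1100.0, 900.0]       # unstimulated cells
stimulated_wells = [2500.0, 3000.0, 3500.0]   # cells exposed to a bacterial fraction
ratios = fold_induction(stimulated_wells, control_wells)
print(ratios)
```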

The vascular effects of S. boulardii were tested on the HUVEC cell line. Lyophilized yeast was grown overnight in HUVEC culture medium (EGM-2; 100 mg/mL) at 37°C; the suspension was then centrifuged and the supernatant was passed through a 0.22-μm filter (VWR). Culture medium or filtered supernatant (final dilution: ¼) was added to the HUVEC culture. After 6 hours, the cells were lysed and gene expression was determined as described above.

Microarray analysis

Data analysis consisted of pre-processing the raw data and statistical analysis to identify genes or gene sets differentially expressed between mono-associated models. The statistical analysis followed two approaches: gene-by-gene and gene-set. All statistical analyses were performed in R (R Core Team, 2014).

Pre-processing data

The Agilent Feature Extraction Software v10.7.3.1 was used to convert the scanned signal into tab-delimited text files that can be analyzed by third-party software. The R package agilp (Chain, 2012) was used to pre-process the raw data. This package provides functions to extract the signal, normalize the data and filter genes, but these did not fully meet our needs, so additional functions were developed. Boxplots and Principal Component Analysis (PCA) were used to obtain a general overview of the data in terms of the within-array signal distribution and the between-sample variability.

The Agilent Feature Extraction Software computes a p-value for each probe in each array to test whether the scanned signal is significantly higher than the background signal. In this case, the null hypothesis is “the measured signal is equal to the background signal”. A probe is considered detected in a sample if its p-value is lower than 0.05. To be retained for analysis, a probe must be detected in at least 60% of the samples of a group, in at least one condition.
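The detection filter can be sketched in Python as follows. This is an illustration under stated assumptions rather than the functions developed for this study: the function name, the group labels and the p-values are hypothetical, and the rule is read as "detected in at least 60% of the samples of at least one group".

```python
def keep_probe(pvalues_by_group, alpha=0.05, min_fraction=0.60):
    """Return True if the probe is detected in enough samples of any one group.

    pvalues_by_group maps a group name to the detection p-values of its samples.
    """
    for pvalues in pvalues_by_group.values():
        if not pvalues:
            continue
        # A sample counts as "detected" when its p-value is below alpha
        detected = sum(1 for p in pvalues if p < alpha)
        if detected / len(pvalues) >= min_fraction:
            return True
    return False


# Hypothetical detection p-values for two probes in two groups of five samples
probe_a = {
    "germ_free": [0.20, 0.30, 0.01, 0.50, 0.60],        # 1/5 detected
    "mono_associated": [0.01, 0.02, 0.03, 0.04, 0.40],  # 4/5 detected
}
probe_b = {
    "germ_free": [0.20, 0.30, 0.01, 0.50, 0.02],        # 2/5 detected
    "mono_associated": [0.01, 0.20, 0.30, 0.04, 0.40],  # 2/5 detected
}
print(keep_probe(probe_a), keep_probe(probe_b))
```

Probe A is retained because one group passes the threshold, whereas probe B is discarded because no group does.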

To compare data from multiple arrays, the data were normalized to minimize the effect of non-biological differences. We used quantile normalization (Bolstad et al., 2003), a method that quickly normalizes a set of samples without requiring a reference array. After this step, spike-in probes and positive and negative control probes were removed.
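The principle of quantile normalization can be illustrated with a small pure-Python sketch. It is not the implementation used in this study: the function name and the intensities are hypothetical, and ties are broken by position instead of being averaged, which is a simplification.

```python
def quantile_normalize(arrays):
    """Give every array the same intensity distribution.

    arrays is a list of equal-length lists, one list of probe intensities per array.
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must contain the same number of probes")
    # Sort each array, then average across arrays at each rank
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(a[rank] for a in sorted_arrays) / len(arrays)
        for rank in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        # Probe indices ordered by increasing intensity within this array
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical intensities: two arrays, three probes
normalized = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
print(normalized)
```

After normalization, both arrays contain the same set of values (the rank-wise means), while each probe keeps its rank within its own array.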

Differential expression analysis

For differential expression analysis, we used limma's eBayes test (Smyth, 2004), which finds a compromise between the variance estimate for the gene under consideration and the average variance across all genes. This gives more reliable results than classical tests when sample sizes are small (Jeanmougin et al., 2010). The Benjamini-Hochberg correction (Benjamini and Hochberg, 1995) was used to control the false discovery rate (FDR).
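The Benjamini-Hochberg adjustment applied to the resulting p-values can be sketched in a few lines of Python. This sketch covers only the multiple-testing step, not the moderated t-test itself; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted in increasing order
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the
    # adjusted values never decrease as the raw p-values increase
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values for four genes
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print([round(q, 4) for q in adjusted])
```

A gene is then declared differentially expressed when its adjusted p-value falls below the chosen FDR threshold.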

Signature approach

The signature (or gene set) approach is based on a previously described strategy (Pham et al., 2014). Briefly, potential molecular signatures were generated by independent component analysis (ICA) (Comon, 1994), a statistical model that separates mixed signals into independent source signals and is widely used for blind source separation problems. The use of ICA to analyze microarray data is motivated by the hypothesis that a scanned microarray is a mix of signals from underlying pathways. Each source is therefore assumed to correspond to a pathway or a function. The generated signatures were added to a database, and each signature was then tested for enrichment between two biological conditions by Gene Set Enrichment Analysis (GSEA) (Subramanian et al., 2005). GSEA is a statistical tool implemented at the Broad Institute that tests for the enrichment of one or several gene sets between two biological conditions (mono-associated versus germ-free or conventionalized groups in our case). For each gene set, in addition to a graphical representation of the result, GSEA gives a normalized enrichment score (NES), a nominal p-value and an adjusted FDR q-value to control the false discovery rate. In addition to the signature database generated from our datasets, we also performed GSEA on a public signature database downloaded from Bader’s lab website (http://www.baderlab.org/). GSEA was run using the ‘pre-rank’ option. Gene or probe lists were ranked according to the modified t-statistic from limma’s eBayes test.
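To make the pre-ranked analysis more concrete, the weighted running-sum enrichment score described by Subramanian et al. (2005) can be sketched in Python. This is a simplified illustration and not the GSEA software: the function name, gene names and statistics are hypothetical, and the permutation step that yields the NES, nominal p-value and FDR q-value is not shown.

```python
def enrichment_score(ranked_genes, gene_set, weight=1.0):
    """Return the signed maximum deviation of the running sum from zero.

    ranked_genes is a list of (gene, statistic) pairs sorted by decreasing statistic.
    """
    members = set(gene_set)
    # Hits are weighted by |statistic|^weight; misses all count equally
    hit_weights = [
        abs(stat) ** weight if gene in members else 0.0
        for gene, stat in ranked_genes
    ]
    n_hits = sum(1 for gene, _ in ranked_genes if gene in members)
    n_misses = len(ranked_genes) - n_hits
    total_hit_weight = sum(hit_weights)
    if n_hits == 0 or n_misses == 0 or total_hit_weight == 0:
        raise ValueError("gene set must partly overlap the ranked list")
    running = 0.0
    best = 0.0
    for (gene, _), w in zip(ranked_genes, hit_weights):
        if gene in members:
            running += w / total_hit_weight
        else:
            running -= 1.0 / n_misses
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical list ranked by a moderated t-statistic
ranked = [("A", 3.0), ("B", 2.0), ("C", 1.0), ("D", -1.0), ("E", -2.0), ("F", -3.0)]
es_top = enrichment_score(ranked, {"A", "B"})
es_bottom = enrichment_score(ranked, {"E", "F"})
print(es_top, es_bottom)
```

A set concentrated at the top of the ranking gives a positive score (up-regulated signature), and a set concentrated at the bottom gives a negative score (down-regulated signature).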

Signature annotation and representation

All the significant signatures (up-regulated or down-regulated in the monoxenic group) were annotated for enriched biological functions and pathways using Ingenuity® Pathway Analysis (IPA). Canonical pathways with a p-value below 0.05 were considered significant. These signatures were also annotated with the DAVID platform (Huang et al., 2009a; Huang et al., 2009b) for Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) terms.

We used Cytoscape (Shannon et al., 2003) to visualize globally the connections between the top significantly enriched signatures in the mono-associated/germ-free/conventionalized comparisons. First, for each significant signature, we retained the significantly enriched IPA pathways. These IPA pathways were then labelled +1 if the signature was up-regulated and -1 if the signature was down-regulated in mono-associated groups compared to germ-free or conventionalized mice. The process was repeated for all the IPA pathways in all the significant signatures. This produced a matrix (0/+1/-1) with pathways in rows and comparisons in columns, where 0 indicates no enrichment, -1 a down-regulated pathway and +1 an up-regulated pathway in the mono-associated group concerned. The absolute value of this matrix was used to generate an adjacency matrix, which was then transformed into a graph using the RCytoscape package (Shannon et al., 2013) and visualized in Cytoscape.
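The construction of the signed matrix and of the corresponding graph edges can be sketched in Python. This is one possible reading of the procedure, not the code used in this study: the function names, pathway names and comparison labels are hypothetical, the graph is assumed to link each pathway to the comparisons in which it is enriched, and the case of a pathway enriched in both directions within one comparison is not handled.

```python
def signed_matrix(enriched):
    """Build the 0/+1/-1 matrix with pathways in rows and comparisons in columns.

    enriched maps a comparison name to {pathway: +1 or -1}.
    """
    comparisons = sorted(enriched)
    pathways = sorted({p for hits in enriched.values() for p in hits})
    matrix = [
        [enriched[c].get(p, 0) for c in comparisons]
        for p in pathways
    ]
    return pathways, comparisons, matrix


def edges_from_matrix(pathways, comparisons, matrix):
    """List (pathway, comparison, sign) edges wherever |value| equals 1."""
    edges = []
    for pathway, row in zip(pathways, matrix):
        for comparison, value in zip(comparisons, row):
            if abs(value) == 1:
                # The absolute value defines the edge; the sign is kept as an attribute
                edges.append((pathway, comparison, value))
    return edges


# Hypothetical enrichment results
enriched = {
    "mono_vs_germ_free": {"pathway_A": 1, "pathway_B": -1},
    "mono_vs_conventionalized": {"pathway_A": 1},
}
pathways, comparisons, matrix = signed_matrix(enriched)
edges = edges_from_matrix(pathways, comparisons, matrix)
print(matrix)
print(edges)
```

Keeping the sign as an edge attribute would allow up- and down-regulated pathways to be colored differently in the network view.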

References

Benjamini Y, Hochberg Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B 57: 289–300.

Bolstad BM, Irizarry RA, Astrand M, Speed TP. (2003). A Comparison of Normalization Methods for High Density Oligonucleotide Array Data Based on Bias and Variance. Bioinformatics 19: 185-193.

Chain B. (2012). agilp: Agilent expression array processing package. R package version 3.6.0. (Bioconductor.org)

Comon P. (1994). Independent component analysis: a new concept? Signal Processing 36: 287–314.

Huang DW, Sherman BT, Lempicki RA. (2009a). Systematic and integrative analysis of large gene lists using DAVID Bioinformatics Resources. Nature Protocols 4: 44-57.

Huang DW, Sherman BT, Lempicki RA. (2009b). Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Research 37: 1-13.

Jeanmougin M, de Reynies A, Marisa L, Paccard C, Nuel G, et al. (2010). Should We Abandon the t-Test in the Analysis of Gene Expression Microarray Data: A Comparison of Variance Modeling Strategies. PLoS ONE 5: e12336. doi:10.1371/journal.pone.0012336.

Pham HP, Dérian N, Chaara W, Bellier B, Klatzmann D, Six A. (2014). A novel strategy for molecular signature discovery based on independent component analysis. International Journal of Data Mining and Bioinformatics 9: 277-304.

R Core Team (2014). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL: http://www.R-project.org/.

Schneider CA, Rasband WS, Eliceiri KW. (2012). NIH Image to ImageJ: 25 years of image analysis. Nature Methods 9: 671-675.

Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. (2003). Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Research 13: 2498-2504.

Shannon PT, Grimes M, Kutlu B, Bot JJ, Galas DJ. (2013). RCytoscape: tools for exploratory network analysis. BMC Bioinformatics 14: 217.

Smyth GK. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology 3: Article 3.

Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. (2005). Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences of the United States of America 102: 15545-15550.