Supplemental Figure S1. Selection of candidate plasma miRNAs for study. A. Expression heat map of 135 miRNAs that were reproducibly detected in asthma. Two-way unsupervised hierarchical clustering with the average linkage method was used. B. Volcano plot of the differentially expressed miRNAs in our screening.

Supplemental Figure S2. A. Representative raw data showing real-time PCR curves for miR-145, miR-148a, and let-7a run in triplicate from a random subject. B. Gel electrophoresis of the miRNA real-time PCR reaction shows one band at a size between 50 and 100 nucleotides, consistent with the expected size of the miRNA products.

Supplemental Figure S3. Boxplots of the 30 differentially expressed miRNAs. A. Group 1 miRNA expression pattern shows intermediate dysregulation in allergic patients and more exaggerated dysregulation in asthmatic individuals. B. Group 2 miRNA expression pattern shows no statistically significant difference between the healthy and allergic rhinitis cohorts. C. Group 3 miRNA expression pattern shows no difference in median miRNA expression levels between the allergic and asthmatic populations. D. Group 4 miRNA expression pattern shows no difference between healthy and asthmatic groups. E. Group 5 miRNA expression pattern shows intermediate dysregulation in asthmatic individuals and more exaggerated miRNA dysregulation in allergic subjects.

Supplemental Figure S4. MicroRNA-KEGG pathway network. A. 115 unique KEGG pathways are predicted to be regulated by the 30 differentially expressed miRNAs. The top 20 most regulated pathways and the miRNAs that target them are depicted in this network diagram. In general, there are extensive overlaps in the pathways targeted by each miRNA group. B. Bar graph of the top 10 KEGG pathways based on number of genes regulated by the differentially expressed miRNAs.

Supplemental Figure S5. MicroRNAs are differentially expressed in eosinophilic asthmatic patients compared to non-eosinophilic asthmatic subjects. A. The volcano plot illustrates which miRNAs are differentially expressed between Cluster 1 and Cluster 2. Log2 fold change is calculated from the ratio of median miRNA copies/µL in Cluster 1 to that in Cluster 2. B. PCA mapping of the 35 asthmatic individuals using 39 candidate miRNAs confirms the formation of two major clusters. PC1 and PC2 account for 25.1% and 17.2% of the variance in the data, respectively.
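
As an illustration of the fold-change calculation described in panel A, the following is a minimal sketch, assuming each cluster is represented by a plain list of copies/µL values for one miRNA. The function name and the example values are hypothetical and are not taken from the study.

```python
import math
from statistics import median


def log2_fold_change(cluster1, cluster2):
    """Log2 of the ratio of median copies/uL in Cluster 1 to Cluster 2.

    Both arguments are sequences of positive expression values for a
    single miRNA (hypothetical input layout, assumed for illustration).
    """
    return math.log2(median(cluster1) / median(cluster2))


# Made-up values: medians are 8 and 2, so the ratio is 4.
example = log2_fold_change([4, 8, 16], [1, 2, 4])
```

A positive value indicates a higher median in Cluster 1, and a negative value indicates a higher median in Cluster 2.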

Supplemental Figure S6. Establishment of the six-miRNA prediction model. A. Example of a decision tree classification model. The tree is constructed with all 79 samples and 44 features as implemented by scikit-learn [1]. The split at each decision node is based on minimizing the Gini index [2], which describes the probability of assigning an incorrect class label to a sample at that particular node. The random forest classification we utilized takes the consensus of 10 to 100,000 decision trees to predict a subject’s disease status. B. A total of 44 prediction models were generated by successively adding features in order of their predictive importance (Supplemental Tables S4 and S5). Our analyses reveal that a predictive model containing the top six most important miRNAs (highlighted in orange) achieves the highest accuracy (91.1%). Blue points represent outliers from 10 runs. C. The ideal number of trees for our prediction model was confirmed by performing random forest classification analyses with our top six miRNAs using 10, 100, 1,000, 10,000, and 100,000 decision trees. A random forest algorithm with 100 trees (highlighted in orange) is sufficient to attain the highest predictive accuracy. Blue points represent outliers from 10 runs.
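
To make the splitting criterion in panel A concrete, the following is a minimal sketch of the Gini index in its standard form (one minus the sum of squared class proportions at a node), with a candidate split scored by the size-weighted average of its two child nodes. It is a plain-Python illustration, not the scikit-learn implementation used in the study; the function names and the class labels are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini index of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def split_impurity(left, right):
    """Size-weighted Gini index of the two child nodes of a split."""
    n = len(left) + len(right)
    return (len(left) * gini_impurity(left)
            + len(right) * gini_impurity(right)) / n


# Made-up labels: an evenly mixed two-class node versus a perfect split.
mixed_node = gini_impurity(["asthma", "asthma", "healthy", "healthy"])
perfect_split = split_impurity(["asthma", "asthma"], ["healthy", "healthy"])
```

A node containing a single class has an index of 0, so a decision tree prefers the candidate split with the lowest weighted value.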

1. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 2011; 12:2825-30.

2. Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification and Regression Trees. Belmont, CA: Wadsworth; 1984.