Supplementary Note 1. Differences Between the Datasets

Hematopoietic stem and progenitor cells (HSPCs), granulocytes (GN), monocytes (MO), dendritic cells (DCs), B-cells, natural killer (NK) cells, and CD4 and CD8 T-cells were measured in both species. The human compendium includes megakaryocytes, eosinophils, basophils, and several erythroid progenitors that are absent from the mouse compendium. Conversely, the mouse compendium includes several committed progenitors missing from the human HSPC group, and many more distinct cell types in every lineage, especially in the T and B lineages and their progenitors. All human cells were collected from blood, but most mouse cells were isolated from other compartments (thymus, spleen, lymph nodes, etc.). Different markers were used to sort the populations in each organism. In addition, different samples of the same cell type in human are typically from unrelated healthy donors (newborns or adults), whereas ImmGen mouse data were almost exclusively derived from inbred C57BL/6 (B6) mice raised in the Jackson Laboratory barrier facility.

Supplementary Note 2. Controlling for the effect of expression level on COE

We considered the possibility that the higher COE of genes with higher maximal expression might result from decreased measurement noise at higher expression values. We found this possibility to be unlikely for two reasons. First, we note that noise levels are reduced in the calculation of COE measures, since the values used to calculate the correlation are the gene’s mean expression over several arrays (3-31 arrays, depending on species and lineage). This likely reduces, if not completely eliminates, COE’s sensitivity to the higher noise in lowly expressed genes. Second, we performed statistical analyses to control (1) for the maximal expression (by decile), and (2) for the degree of noise in a gene’s expression (estimated from the replicate samples in each cell type). Both analyses excluded a strong relation between a gene’s noise level and COE.

To control for maximal expression level, we defined 10 deciles of maximal expression per species, and in each decile, permuted the ortholog pairs and computed a null, decile-specific COE distribution. Within each decile, the COE is significantly higher than in the permuted null (Fig. SN1), whereas the null COE distributions of all of the deciles are similar, with no consistent change across deciles (Fig. SN2). For mouse, the null COE distributions of the 1st and 10th deciles are the lowest (Fig. SN2 right, bold solid and dashed lines, respectively). For human, the null COE distributions of all expression deciles are similar (Fig. SN2 left). Thus, a high level of expression does not by itself induce a different distribution of COE. Nevertheless, this analysis cannot completely rule out the possibility of some residual effect of expression level on COE.
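The decile-stratified permutation described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes COE is the Pearson correlation between the lineage-averaged expression profiles of an ortholog pair, and the names `rowwise_pearson`, `decile_labels`, and `decile_null_coe` are hypothetical.

```python
import numpy as np

def rowwise_pearson(a, b):
    """Pearson correlation between matching rows of a and b (pairs x lineages)."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    denom = np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))
    return (a * b).sum(axis=1) / denom

def decile_labels(values, n_bins=10):
    """Assign each value to a rank-based bin 0..n_bins-1 (0 = lowest)."""
    ranks = np.argsort(np.argsort(values))
    return (ranks * n_bins) // len(values)

def decile_null_coe(human, mouse, max_expr, n_perm=100, seed=0):
    """Observed COE per ortholog pair, plus a null COE sample per decile.

    human, mouse: arrays (pairs x lineages); row i of each is one ortholog pair.
    max_expr: maximal expression of each gene in the species being controlled.
    The null is built by shuffling ortholog partners within a decile only.
    """
    rng = np.random.default_rng(seed)
    labels = decile_labels(max_expr)
    observed = rowwise_pearson(human, mouse)
    null = {}
    for d in range(10):
        idx = np.flatnonzero(labels == d)
        draws = [rowwise_pearson(human[idx], mouse[rng.permutation(idx)])
                 for _ in range(n_perm)]
        null[d] = np.concatenate(draws)
    return observed, labels, null
```

Comparing `observed[labels == d]` with `null[d]` for each decile `d` corresponds to the comparison in Fig. SN1, and comparing the `null[d]` arrays with one another corresponds to Fig. SN2. Running the function once per species, with that species' maximal expression as `max_expr`, matches the "per species" wording of the text.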

Figure SN1. Distribution of COE per quantile of maximal expression (blue) and the null distribution for that quantile (black), in human (top) and mouse (bottom).

Figure SN2. Background distribution of COE per quantile of maximal expression. These are the same distributions shown in black in Fig. SN1.

To control for noise level, we defined 10 deciles of noise per species (with noise defined as the mean of the coefficient of variation among replicates) and found that the correlation between COE and maximal expression is still positive in all but the highest noise decile in each species (Table SN1). The correlation is significant (FDR = 10%) and positive in 5 of the 20 human or mouse noise decile bins (Table SN1, yellow). Surprisingly, those include both the lowest two noise deciles and the 9th-highest decile in each species. That is, for genes with similar noise levels, the COE is still higher, on average, for highly expressed genes.
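The noise stratification can be sketched along the same lines. This is a minimal sketch under stated assumptions: noise is taken as the coefficient of variation (standard deviation over mean) of the replicates in each cell type, averaged over cell types, and deciles are rank-based. The function names are hypothetical and not taken from the original analysis.

```python
import numpy as np

def noise_per_gene(replicates_by_cell_type):
    """Mean coefficient of variation per gene.

    replicates_by_cell_type: list of arrays (genes x replicates), one per
    cell type, holding positive expression values.
    """
    cvs = [r.std(axis=1, ddof=1) / r.mean(axis=1) for r in replicates_by_cell_type]
    return np.mean(cvs, axis=0)

def correlation_by_noise_decile(noise, max_expr, coe, n_bins=10):
    """Pearson r between maximal expression and COE within each noise decile."""
    ranks = np.argsort(np.argsort(noise))
    labels = (ranks * n_bins) // len(noise)  # 0 = least noisy bin
    return [float(np.corrcoef(max_expr[labels == d], coe[labels == d])[0, 1])
            for d in range(n_bins)]
```

A positive r inside a bin means that, among genes with similar noise, the more highly expressed genes still tend to have higher COE, which is the pattern the text reports for all but the noisiest decile.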

Table SN1. Pearson correlation and p-value between maximal expression level and COE in noise deciles.

Noise decile       1       2       3       4       5       6       7       8       9       10
Pearson r mouse    0.10    0.03    0.06    0.07    0.08    0.03    0.03    0.12    0.14    -0.02
Pearson r human    0.12    0.12    0.02    0.06    0.03    0.06    0.05    0.08    0.04    -0.10
Pearson p mouse    0.017   0.399   0.134   0.071   0.066   0.540   0.525   0.003   0.001   0.608
Pearson p human    0.004   0.002   0.566   0.150   0.494   0.173   0.232   0.060   0.306   0.013
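The FDR filter applied to Table SN1 can be illustrated with the table's own values. The sketch below assumes the Benjamini-Hochberg procedure applied to all 20 bins pooled across species. The text states only "FDR = 10%", so both the procedure and the pooling are assumptions, and `benjamini_hochberg` is a hypothetical helper.

```python
import numpy as np

def benjamini_hochberg(p_values, fdr=0.10):
    """Boolean mask of hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    passed = p[order] <= fdr * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.flatnonzero(passed))  # largest rank meeting its threshold
        reject[order[:k + 1]] = True
    return reject

# Values copied from Table SN1 (noise deciles 1 to 10).
r_mouse = [0.10, 0.03, 0.06, 0.07, 0.08, 0.03, 0.03, 0.12, 0.14, -0.02]
r_human = [0.12, 0.12, 0.02, 0.06, 0.03, 0.06, 0.05, 0.08, 0.04, -0.10]
p_mouse = [0.017, 0.399, 0.134, 0.071, 0.066, 0.540, 0.525, 0.003, 0.001, 0.608]
p_human = [0.004, 0.002, 0.566, 0.150, 0.494, 0.173, 0.232, 0.060, 0.306, 0.013]

r = np.array(r_mouse + r_human)
p = np.array(p_mouse + p_human)
significant = benjamini_hochberg(p, fdr=0.10)
n_sig_pos = int(np.sum(significant & (r > 0)))
```

Under these assumptions, six bins pass the 10% threshold and five of them have a positive r, in line with the count of 5 of 20 bins given in the text. The remaining significant bin is the highest human noise decile, where r is negative.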