Supplementary material

Statistical analyses

(a) Random resampling

In addition to the generalized linear models, ad hoc randomization tests were performed to compare the observed frequencies of infected individuals carrying each allele with the frequency distribution expected under a null model. This method allowed us to confirm that (i) the observed site-specific associations between some alleles and infection status did not arise by chance, and (ii) the effects of some alleles on infection status differed, and were even antagonistic, among sites. The procedure used for this analysis is described below.

i) Site-specific associations between MHC alleles and infection status

As a first step, we tested, separately for each site and each allele, whether the presence/absence of the allele and the infection status were independent. For each site, 50,000 randomizations of the infection status and the presence/absence of alleles were performed to generate the distribution expected under the null hypothesis of independence. Two-tailed p-values (adjusted for multiple testing by Bonferroni correction) were computed by comparing the observed frequencies of infected individuals carrying each allele to the distributions obtained under the null hypothesis. These results were then compared to the results from the generalized linear models (results in Table 2).
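The randomization test for a single site and a single allele can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function names, the choice to shuffle only the infection labels (which is sufficient to break the association), and the definition of the two-tailed p-value as the proportion of randomized counts at least as far from the null mean as the observed count are all assumptions.

```python
import random

def permutation_p_value(infected, carrier, n_perm=50_000, seed=0):
    """Two-tailed permutation p-value for the number of infected carriers.

    infected, carrier: equal-length sequences of 0/1 flags, one per bird.
    Assumption: the two-tailed p-value counts randomized values lying at
    least as far from the null mean as the observed value.
    """
    rng = random.Random(seed)
    observed = sum(1 for i, c in zip(infected, carrier) if i and c)
    shuffled = list(infected)
    null_counts = []
    for _ in range(n_perm):
        # Shuffling infection status breaks any allele-infection association.
        rng.shuffle(shuffled)
        null_counts.append(sum(1 for i, c in zip(shuffled, carrier) if i and c))
    null_mean = sum(null_counts) / n_perm
    observed_gap = abs(observed - null_mean)
    n_extreme = sum(
        1 for x in null_counts if abs(x - null_mean) >= observed_gap - 1e-12
    )
    # The +1 terms count the observed arrangement as one of the randomizations.
    return (n_extreme + 1) / (n_perm + 1)

def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)
```

In use, the function would be called once per site and allele, and each p-value adjusted with `bonferroni` for the number of tests performed.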

ii) Interaction between site and allele effects

The results from the GLM described above revealed a significant site × allele interaction term for three alleles (pado83, pado109 and pado133; see results), suggesting that the associations between presence/absence of these alleles and infection are site-specific. The resampling procedure described below confirmed this result for all three alleles.

a) As a first step, we built the expected frequency distribution of infected individuals among the carriers of allele x under the assumption that the association between presence/absence of the allele and infection is the same in all sites. For this purpose, 15 infected and 15 non-infected individuals were drawn at random (without replacement) from each site (nx sites considered, see below). Among these individuals, one carrier of allele x was drawn at random for each site. Among the resulting nx individuals, the number of infected individuals, Xall, was counted. The operation was repeated 50,000 times to provide an expected frequency distribution of infected individuals among the carriers of allele x after removing differences due to unequal proportions of allele carriers, unequal proportions of infected individuals, and unequal sample sizes among sites.

b) The same operation was repeated by drawing all individuals from the same site: 15 infected and 15 non-infected individuals were drawn at random (without replacement) from site i. Among these individuals, nx carriers of allele x were drawn at random. Among these nx carriers, the number of infected individuals, Xi, was counted. This operation was performed for each of the nx sites considered and was repeated 50,000 times in each case.

The number of sites used in this analysis (nx) varied according to the allele x considered, because the analysis of the interaction required that 1) both infected and non-infected individuals occur in the analyzed sites, and 2) both carriers and non-carriers of the allele occur. Therefore, sites in which fewer than five carriers of allele x occurred, and fewer than 15 individuals occurred, were removed from the analysis (since this restriction is independent of the association between infection and presence of the allele, it is not expected to bias the analysis with respect to the infection × allele interaction). Values of nx were 5 (N = 277), 5 (N = 377) and 10 (N = 550) for pado83, pado109 and pado133, respectively.
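Steps a) and b) can be sketched as follows. This is a hedged illustration only: the data layout (each site as a list of `(infected, carrier)` boolean pairs), the function names, and in particular the rule of redrawing a subsample when it contains too few carriers are assumptions, since the text does not state how such draws were handled.

```python
import random

def draw_carriers(site, n_needed, rng, k=15, max_tries=1000):
    """Draw k infected and k non-infected birds from one site without
    replacement, then draw n_needed carriers from that subsample.

    Assumption: a subsample with too few carriers is simply redrawn.
    """
    infected = [bird for bird in site if bird[0]]
    healthy = [bird for bird in site if not bird[0]]
    for _ in range(max_tries):
        subsample = rng.sample(infected, k) + rng.sample(healthy, k)
        carriers = [bird for bird in subsample if bird[1]]
        if len(carriers) >= n_needed:
            return rng.sample(carriers, n_needed)
    raise ValueError("could not draw enough carriers from this site")

def pooled_distribution(sites, n_rep, rng):
    """Step a): one carrier per site; count infected among them (Xall)."""
    return [
        sum(draw_carriers(site, 1, rng)[0][0] for site in sites)
        for _ in range(n_rep)
    ]

def site_distribution(site, n_x, n_rep, rng):
    """Step b): n_x carriers from a single site; count infected (Xi)."""
    return [
        sum(bird[0] for bird in draw_carriers(site, n_x, rng))
        for _ in range(n_rep)
    ]
```

With `n_rep = 50_000`, the distribution of Xi for each site can then be compared with the distribution of Xall.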

Results

In addition to the overall model, we ran a series of supplementary analyses in order to test the robustness of the site × allele interactions with respect to two potential confounding factors: the time of the year when the samples were collected (winter vs. spring) and the parasite lineage (SGS1 vs. GRW11).

Seasonal effect

To take into account the potential bias arising from unequal sampling in winter and spring across the 13 populations, we ran three supplementary models. In the first one, we discarded all the populations where birds had not been sampled in both seasons. This substantially reduced the sample size from 13 sites and 658 individuals to 6 sites and 328 individuals. In spite of this decreased statistical power we found that i) overall, the model including the site × allele interaction terms had a better fit than a model without the interaction terms (χ² = 115.2, df = 70, p = 0.0005); ii) the same three alleles were involved in site × allele interactions as for the overall model (pado83, LR χ² = 17.8, df = 5, p = 0.003; pado109, LR χ² = 6.9, df = 2, p = 0.032; and pado133, LR χ² = 14.5, df = 5, p = 0.013). Moreover, a fourth allele (pado126) was also involved in a statistically significant site × allele interaction (LR χ² = 16.8, df = 5, p = 0.0049). These results therefore suggest that the site × allele interactions are robust with respect to seasonal variation in malaria prevalence.

To go further in the analysis of the potential effect of season on the observed site × allele interactions, we also analysed winter and spring data separately. Prevalence is higher in winter (perhaps because of higher mortality of primary infected birds and/or relapses in spring); statistical power is therefore higher in the winter sample. When restricting the analysis to the winter data we found a global site × allele interaction (χ² = 102.9, df = 75, p = 0.018). The alleles involved in site × allele interactions in winter were pado83 (χ² = 24.2, df = 7, p = 0.001), pado133 (χ² = 24.1, df = 7, p = 0.001) and pado123 (χ² = 21.1, df = 7, p = 0.004). When restricting the analysis to the spring data we also found a global site × allele interaction (χ² = 120.0, df = 93, p = 0.031). The only allele involved in a site × allele interaction in spring was pado109 (χ² = 8.4, df = 3, p = 0.039). These results again suggest that the finding of significant site × allele interactions is robust to splitting of the data, although a new interaction emerged in the winter sample.

Overdispersion of binomial variances

In the main analysis, we compared two models. The first model (Mod1) contained site, season, allele and all the two-way interactions between site and allele. The second model (Mod2) included only site, season and allele, all the two-way interactions between site and allele having been removed. We found that Mod1 had a significantly better fit than Mod2 (χ² = 230.5, df = 162, p = 0.0003). In order to test the robustness of this conclusion to any overdispersion of the binomial data, we included the empirical overdispersion factors in Mod1 and Mod2 (see Crawley 2007). The residual deviances and residual degrees of freedom were 516.1 and 466 for Mod1 and 746.6 and 628 for Mod2, respectively. These yielded overdispersion factors close to one for both models (ĉ = 1.11 for Mod1 and ĉ = 1.19 for Mod2). Mod1 and Mod2 were then refitted with a quasi-binomial error structure. An analysis of deviance indicated that Mod1 still had a significantly better fit than Mod2 (F = 1.41, df = 162, p = 0.0027).
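The overdispersion factors and the deviance difference reported above follow directly from the residual deviances and degrees of freedom. The short check below assumes the usual empirical estimate, residual deviance divided by residual degrees of freedom; the variable names are illustrative. The F test itself requires the fitted quasi-binomial models and is not reproduced here.

```python
def overdispersion(residual_deviance, residual_df):
    """Empirical overdispersion factor: residual deviance / residual df."""
    return residual_deviance / residual_df

# Residual deviance and residual degrees of freedom reported in the text.
mod1_deviance, mod1_df = 516.1, 466   # with site x allele interactions
mod2_deviance, mod2_df = 746.6, 628   # without the interactions

c_hat_mod1 = overdispersion(mod1_deviance, mod1_df)
c_hat_mod2 = overdispersion(mod2_deviance, mod2_df)

# Difference between the nested models, as used in the chi-square comparison.
delta_deviance = mod2_deviance - mod1_deviance
delta_df = mod2_df - mod1_df

print(f"c-hat Mod1 = {c_hat_mod1:.2f}")   # 1.11
print(f"c-hat Mod2 = {c_hat_mod2:.2f}")   # 1.19
print(f"deviance difference = {delta_deviance:.1f} on {delta_df} df")  # 230.5 on 162 df
```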

Parasite lineages

We considered the two lineages SGS1 and GRW11 as belonging to the same species (P. relictum), because both molecular and morphological data indicate very minor differences between the two lineages. However, SGS1 and GRW11 might differ in their virulence, and we therefore checked whether the site × allele interactions were robust to splitting the data according to the parasite lineage involved. The prevalence of GRW11 was too low to allow us to analyse GRW11-infected individuals alone, so we ran a model in which we considered only SGS1-infected sparrows. We found that i) the model including the site × allele interaction terms had a better fit than a model without the interaction terms (χ² = 220.1, df = 162, p = 0.002); and ii) two site × allele interactions were significant, involving two of the alleles already highlighted in the overall model (pado83, χ² = 19.2, df = 8, p = 0.014, and pado109, χ² = 18.0, df = 8, p = 0.021). As for the seasonal effect, the finding of site × allele interactions does not seem to depend on the parasite lineage involved.

References

Crawley, M.J. 2007 The R Book. John Wiley & Sons Ltd, Chichester, England. pp. 569-590.