Correlated Interaction and Group Selection

Bruce Glymour

Forthcoming, BJPS

Abstract: Okasha ([2005]) argues that correlated interactions are necessary for group selection. His argument turns on a particular procedure for measuring the strength of selection, and employs a restricted conception of correlated interaction. It is here shown that the procedure in question is unreliable, and that while related procedures are reliable in special contexts, they do not require correlated interactions for group selection to occur. It is also shown that none of these procedures, all of which employ partial regression methods, are reliable when correlated interactions of a specific kind arise, and it is argued that such correlated interactions will likely be ubiquitous in natural populations.

1. Introduction

2. Process and Product

3. Fitness, Mean Fitness and Phenotypic Change

4. Correlated Interactions

5. Causation

6. Implications

1. Introduction

A correlated interaction occurs when the phenotype of an arbitrary individual differentially affects the fitness of other individuals, depending on their phenotypes. That is, if we add to an existing population an extra member, Extra, with phenotype T, this will change the fitnesses of other individuals in the population, but the net change in absolute fitness w for any one of these, the arbitrarily chosen Target, depends on the phenotype of Target: $w_{\mathrm{Target}}$ given (Extra has T & Target has T) is not equal to $w_{\mathrm{Target}}$ given (Extra has T & Target lacks T). In typical models, such correlated interactions arise when the phenotype of a given individual has effects on the fitness of others in its local group, but not on others outside its local group, and group formation is non-random with respect to phenotype. On standard definitions of ‘altruism’, correlated interactions are necessary for the evolution of altruistic traits under group selection. Okasha ([2005]) has recently argued, tentatively, that correlated interactions are necessary not just for the evolution of altruism, but for group selection to occur at all. His discussion is illuminating in a number of respects. Unfortunately, several of his conclusions, including the motivating result, are in error. Correlated interactions are not necessary for group selection.

In the relevant part of his discussion, Okasha employs a conception of group selection made explicit by a particular multi-level selection theory (Heisler and Damuth [1987]), and I will assume that conception throughout this paper. On this conception, group selection occurs in a population just in case, in that population, properties of the groups to which individuals belong causally influence individual fitness. Given this conception, the influence of group selection on frequency change may be estimated by either of two linear regression methods, contextual analysis or neighborhood analysis. Both methods are applied to variables that vary in value over individuals, and each method represents group properties by the values of surrogate variables measured on individuals. The methods differ in the surrogate employed.

Let P be an arbitrary population of organisms $i_1, \ldots, i_n$, partitioned into exclusive subsets, i.e. groups, of individuals, where the function g(i) maps individuals to their groups: $g(i_j) = g_k$ iff $i_j \in g_k$. Let $n(i_j)$ map individuals to neighborhoods, $n(i_j) = g(i_j) \setminus \{i_j\}$, so that the neighborhood of an individual is the set of all members of that individual’s group, excluding that individual itself. Let T(i) be a trait variable measured on individuals, the values for which record mutually exclusive genotypic or phenotypic properties of individual organisms, and let MT(g) be some moment of the distribution of T in group g. Define Bg(i) as a variable measured on individuals, whose value for a given individual is equal to MT(g) for the group to which the individual belongs: Bg(i) = MT(g(i)). Let MT(n) be the same moment of the distribution of T in the neighborhood n, and define Bn(i) = MT(n(i)) as a variable measured on individuals, such that for arbitrary individual i, Bn(i) is equal in value to MT(n) for the neighborhood n(i) = n. Thus, values of Bg(i) and Bn(i) denote properties of individual organisms, which in turn correspond to properties of the group or neighborhood, respectively, of the individual. Bg and Bn are, in effect, surrogates for distinct but definitionally related ‘group level’ variables; because these surrogates are defined for individuals, they can be used in a causal or statistical analysis in which the individual is the unit of analysis. Call such surrogate variables, whether defined with respect to moments over groups or neighborhoods, ‘belonging to’ variables.1 For ease, we will henceforth take the moment of concern to be the mean. So by definition Bg(i) = $\bar{T}_g$, the mean of T in g, where g(i) = g is the group to which i belongs, and Bn(i) = $\bar{T}_n$, the mean of T in n, where n(i) = n is i’s neighborhood.
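The two ‘belonging to’ variables can be made concrete with a short computation. The following is a minimal sketch, assuming the moment of concern is the mean; the function name `belonging_variables` and the six trait values are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def belonging_variables(groups):
    """Return one (T, Bg, Bn) triple per individual.

    groups: list of lists of integer trait values T, one inner list per group.
    Bg is the mean of T over the individual's whole group;
    Bn is the mean of T over the group with that individual removed.
    """
    rows = []
    for members in groups:
        size, total = len(members), sum(members)
        for t in members:
            b_g = Fraction(total, size)
            b_n = Fraction(total - t, size - 1)  # needs at least 2 members
            rows.append((t, b_g, b_n))
    return rows

# Hypothetical population: two groups of three individuals each.
rows = belonging_variables([[1, 5, 10], [5, 5, 10]])
for t, b_g, b_n in rows:
    print(t, float(b_g), float(b_n))
```

Within a group every member has the same Bg, whereas Bn varies with the individual's own trait value, since the individual is excluded from its own neighborhood.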

Contextual analysis employs Bg as a surrogate for group properties in an individual-level regression model of fitness; neighborhood analysis employs Bn as a surrogate for neighborhood properties in an otherwise identical individual-level regression model of fitness. The regression models are given respectively by a pair of regression equations. Equation 1 is the model for contextual analysis:

(1)  $w_i = \beta_1 T_i + \beta_2 B_g(i) + e_i$

where $w_i$ is the absolute fitness of an individual, $\beta_1$ is the partial regression of w on T controlling for Bg, $\beta_2$ is the partial regression of w on Bg controlling for T, and $e_i$ is the residual. Equation 2 is the model for neighborhood analysis:

(2)  $w_i = \beta_1' T_i + \beta_2' B_n(i) + e_i'$

where w is again absolute fitness, $\beta_1'$ is the partial regression of w on T controlling for Bn, $\beta_2'$ is the partial regression of w on Bn controlling for T, and $e_i'$ is the residual.
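To illustrate how the two regression models relate, here is a minimal sketch that fits both to the same data. Everything in it is an assumption made for illustration: the population (nine groups of two, one for each ordered pair of the trait values 1, 5 and 10), the noise-free fitness rule w = 3T + 2Bn, and the helper names. The coefficients are computed exactly from the standard two-predictor least-squares formulas.

```python
from fractions import Fraction

def mean(xs):
    return sum(xs, Fraction(0)) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

def partial_regression(y, x1, x2):
    """Least-squares partial coefficients of y on x1 and on x2 (with intercept)."""
    v1, v2, c12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    det = v1 * v2 - c12 ** 2  # zero only if x1 and x2 are perfectly collinear
    b1 = (cov(y, x1) * v2 - cov(y, x2) * c12) / det
    b2 = (cov(y, x2) * v1 - cov(y, x1) * c12) / det
    return b1, b2

# Hypothetical population: 9 groups of 2, one per ordered pair of trait values.
values = [1, 5, 10]
groups = [(a, b) for a in values for b in values]
T, Bg, Bn = [], [], []
for a, b in groups:
    for own, other in ((a, b), (b, a)):
        T.append(Fraction(own))
        Bn.append(Fraction(other))           # neighborhood mean = partner's trait
        Bg.append(Fraction(own + other, 2))  # group mean includes the individual

# Assumed noise-free fitness rule: w = 3T + 2Bn.
w = [3 * t + 2 * bn for t, bn in zip(T, Bn)]

neighborhood = partial_regression(w, T, Bn)  # coefficients of equation 2
contextual = partial_regression(w, T, Bg)    # coefficients of equation 1
print([float(b) for b in neighborhood])  # [3.0, 2.0]
print([float(b) for b in contextual])    # [1.0, 4.0]
```

In groups of two, Bg = (T + Bn)/2, so the assumed rule can equally be written w = T + 4Bg. That definitional dependence is why a non-zero coefficient on one surrogate goes with a non-zero coefficient on the other in this example.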

The regression coefficients are useful because they can be used to assess the strength of the selection processes acting on the trait variables. Since Bg and Bn are surrogates for group-level traits, the selection process acting on them must be a group-selection process. Provided, then, that the strength of selection on the belonging to variables is non-zero, group selection acts on the population. There are, however, two different ways in which one may estimate the strength of a selection process. One could, following Lande and Arnold ([1983]), take the coefficients in the resulting regression equations as selection gradients for the respective characters, and simply estimate the strength of a selection process in terms of the associated selection gradient. Many who employ contextual analysis do just this (e.g. Tsuji [1995]).2 Others (e.g. Wolf et al. [1999]) prefer to estimate the strength of a selection process by its contribution to the change in the mean population phenotype between successive generations, i.e. the strength of a selection process is estimated by the population’s response to that process. Those who prefer this understanding of the strength of a selection process use the coefficients in the regression model to decompose the change in mean phenotype between generations, $\Delta\bar{T}$. The decomposition is derived from one of two different modifications of Price’s equation:

(3)  $\bar{w}\,\Delta\bar{T} = \mathrm{Cov}(w, T)$

where w is individual fitness, $\bar{w}$ is mean fitness in the population, $\Delta\bar{T}$ is the change in mean phenotype in the population between subsequent generations and the covariance is calculated over individuals in the population (Price [1972]).3

Which modification one uses depends on the regression model one has. The details by which the modifications are developed need not concern us (interested readers may consult Okasha ([2005]), in which the details are clearly specified and motivated). The equation employed in contextual analysis is:

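(4)  $\bar{w}\,\Delta\bar{T} = \beta_1\,\mathrm{Var}(T) + \beta_2\,\mathrm{Var}(\bar{T}_g)$

where $\bar{w}$ is mean fitness in the population, Var(T) is the variance in T among individuals in the population, Var($\bar{T}_g$) is the variance in the group mean $\bar{T}_g$ among groups, and $\beta_1$ and $\beta_2$ are the partial regression coefficients on T and Bg, respectively, taken from equation 1. The first term on the right hand side of equation 4 represents the contribution of individual selection to change in the population mean of T, the second the contribution of group selection.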

Okasha suggests equation 5 when the regression model is a neighborhood analysis:

(5)  $\bar{w}\,\Delta\bar{T} = \beta_1'\,\mathrm{Var}(T) + \beta_2'\,\mathrm{Cov}(T, B_n)$.

The variance and covariance, here, are both taken over all individuals in the population, and $\beta_1'$ and $\beta_2'$ are the partial regression coefficients on T and Bn, respectively, taken from equation 2. As with equation 4, the first term on the right hand side of equation 5 is taken to estimate the contribution of individual selection to $\Delta\bar{T}$, the change in mean phenotype, while the second estimates the contribution of group selection. Whether equation 4 or 5 is used, the relative importance of the two selection processes is then estimated by some metric on the terms in the right hand side of the respective equations, say the ratio of their absolute values.
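For readers who want the step from the regression model to the decomposition, the following is a sketch of one way equation 5 can be obtained. It rests on several assumptions: Price’s equation is taken in the form $\bar{w}\,\Delta\bar{T} = \mathrm{Cov}(w, T)$, the notation $\beta_1'$ and $\beta_2'$ stands for the partial regression coefficients on T and Bn in the neighborhood model, $e'$ is that model’s residual, and the residual is uncorrelated with T, as it is for a least-squares fit. Okasha ([2005]) gives the full treatment.

```latex
\begin{align*}
\bar{w}\,\Delta\bar{T}
  &= \mathrm{Cov}(w, T) \\
  &= \mathrm{Cov}(\beta_1' T + \beta_2' B_n + e',\; T) \\
  &= \beta_1'\,\mathrm{Var}(T) + \beta_2'\,\mathrm{Cov}(T, B_n) + \mathrm{Cov}(e', T) \\
  &= \beta_1'\,\mathrm{Var}(T) + \beta_2'\,\mathrm{Cov}(T, B_n)
\end{align*}
```

The same steps with Bg in place of Bn give a term in Cov(T, Bg), which is what the contextual decomposition expresses as a variance among groups.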

It will become important to distinguish between the two ways of using contextual and neighborhood analysis to assess the strength of group selection. Recollect that on the first procedure, the regression coefficient resulting from equation 1 or 2 is taken as a selection gradient, and that gradient provides a direct estimate of the strength of the corresponding selection process. I will call this procedure ‘contextual analysis proper’ if equation 1 is used, and ‘neighborhood analysis proper’ if equation 2 is used. On the second procedure, the selection gradients are not direct estimates of the strength of selection, but are used to estimate the phenotypic response to group and individual selection using equations 4 or 5, and then the relative strength of group and individual selection is estimated as the ratio of the contributions each makes to phenotypic change. I will call this procedure ‘contextual analysis writ large’ if equations 1 and 4 are used, and ‘neighborhood analysis writ large’ if equations 2 and 5 are used. And again, whatever method is used to estimate the strength of group selection, group selection is taken to occur, to act on the population, just in case the strength of group selection is not zero. In his discussion, Okasha is concerned with contextual and neighborhood analysis writ large.

Okasha’s case for the necessity of correlated interactions proceeds as follows. He points out that when group formation is random, Var($\bar{T}_g$), the variance in group means, need not be zero, but $\beta_2'$Cov(T,Bn) must be, where $\beta_2'$ is the partial regression of w on Bn, since barring sampling error Cov(T,Bn) will be zero in such circumstances. Suppose that random group formation forbids the presence of a correlated interaction, as Okasha does. Then when group formation is random and the partial regression of w on Bg is non-zero, $\beta_2$, the coefficient on Bg in equation 1, is non-zero by definition, and because of the definitional dependencies between group and neighborhood, $\beta_2'$ will also be non-zero. However, while contextual analysis writ large will yield the judgment that group selection is occurring in the population, neighborhood analysis writ large will yield the contrary judgment that it is not: while $\beta_2'$ is not zero, $\beta_2'$Cov(T,Bn) will be. Hence, if we endorse neighborhood analysis over contextual analysis we have at least a provisional reason for thinking that correlated interactions are necessary for group selection.
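The divergence between the two ‘writ large’ verdicts can be checked numerically. The sketch below is illustrative only: it assumes an idealized population of nine groups of two (one per ordered pair of the trait values 1, 5 and 10), so that Cov(T,Bn) is exactly zero as it would be under random group formation without sampling error, and it assumes the noise-free fitness rule w = 3T + 2Bn, which for groups of two is the same as w = T + 4Bg. The coefficient values are therefore stipulated, not estimated.

```python
from fractions import Fraction

def mean(xs):
    return sum(xs, Fraction(0)) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

# Hypothetical population: every ordered pair of trait values forms one group,
# so an individual's own trait is uncorrelated with its partner's trait.
values = [1, 5, 10]
T, Bg, Bn = [], [], []
for a in values:
    for b in values:
        for own, other in ((a, b), (b, a)):
            T.append(Fraction(own))
            Bn.append(Fraction(other))
            Bg.append(Fraction(own + other, 2))

# Assumed fitness rule and the coefficients it implies.
w = [3 * t + 2 * bn for t, bn in zip(T, Bn)]
b1_ctx, b2_ctx = 1, 4  # contextual model:   w = 1*T + 4*Bg
b1_nbh, b2_nbh = 3, 2  # neighborhood model: w = 3*T + 2*Bn

# Group-selection terms of the two decompositions of Cov(w, T).
# With equal group sizes, the variance of Bg over individuals equals the
# variance of the group means over groups.
group_term_contextual = b2_ctx * cov(Bg, Bg)
group_term_neighborhood = b2_nbh * cov(T, Bn)

print(float(group_term_contextual), float(group_term_neighborhood))
```

Both decompositions sum to the same Cov(w, T), yet only the contextual one assigns a non-zero share to the group-level surrogate.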

Okasha does endorse neighborhood analysis writ large over contextual analysis writ large, at least provisionally. Okasha prefers neighborhood analysis writ large because he prefers neighborhood analysis to contextual analysis proper. He does so on the grounds that arguably ‘[…] any causal link between individual fitness and group character must necessarily be indirect, mediated by a direct causal link between individual fitness and neighborhood character.’ because ‘[…] an individual organism interacts directly with its neighbors not with its neighbors-plus-itself.’ (Okasha [2005], p. 721). Okasha is explicit in regarding the reasoning here as inconclusive, but he must think it at least relevant and plausible. To the extent that one accepts it, one will by the previous result regard correlated interactions as necessary for group selection.

The conclusion is demonstrably incorrect. The problem is not so much with either contextual or neighborhood analysis. Neither is reliable in general, but both are reliable under certain conditions. Almost exactly the same conditions, in fact: when one is reliable, so is the other, except in very special cases. The problem lies rather with contextual and neighborhood analysis writ large. Both systematically misrepresent both the causal dependencies and their relative importance. I’ll develop the problems only for neighborhood analysis writ large; with suitable minor changes in the arguments, the reader can see why parallel results hold for contextual analysis writ large.

2. Process and Product

Neighborhood analysis writ large is an unreliable method for determining whether or not group selection acts on a population. By assumption, group selection is operating just in case group properties causally influence individual fitness. Neighborhood analysis writ large assesses such influence by determining the degree to which neighborhood belonging to variables influence change in the population mean of relevant traits. But it is one thing to determine whether or not one variable is a cause of a second variable, and quite a different thing to estimate the total effect of the first variable on yet a third variable which is influenced by the second. Neighborhood analysis writ large conflates the two tasks. It is perfectly possible for group selection to occur, i.e. for Bn to cause w, and yet for $\Delta\bar{T}$, the change in mean phenotype, not to be influenced by group selection by the lights of neighborhood analysis writ large. Under such circumstances neighborhood analysis writ large will incorrectly judge that group selection is not acting.

Neighborhood analysis assesses the causal influence of neighborhood properties on fitness by the partial regression of w on Bn controlling for T. Under certain conditions, this is a reliable method for detecting the presence of a causal relation between the two variables. The relevant conditions are that 1) the dependence between Bn and w is linear, and 2) there are no common causes of Bn and w.4 Supposing the conditions hold, $\beta_2'$, the partial regression coefficient on Bn in equation 2, can be non-zero only if w depends causally or definitionally on Bn, or the converse. We can rule out the converse in virtue of temporal constraints, and the dependence is clearly not definitional. Assuming the conditions hold, and absent sampling error, a non-zero value of $\beta_2'$ implies that Bn causes w, and hence that group selection occurs.

That selection pressure may yet explain none of the difference in reproductive success between classes defined by values of T, simply because the mean value of Bn does not differ among classes. In such cases, neighborhood analysis writ large will imply that Bn explains none of the change in mean phenotype because Bn explains none of the difference in fitness between phenotypic classes, and will therefore incorrectly imply that group selection is not occurring. This is easiest to see when T is discrete, say with values 1, 5 and 10. In Table 1 below I list the members of a population with 6 groups of 6 members each. Consider what happens if individuals reproduce according to r1=3T+2Bn (which I will call Model 1).5

[INSERT TABLE 1 ABOUT HERE]

The mean fitness in the population is 26.25, and the mean fitnesses of the phenotypes T=1, 5, and 10 are 13.17, 25.6 and 40, respectively. The mean population phenotype is 5.33, the mean population Bn is the same, 5.33, and the mean Bn for types T=1, 5 and 10 is respectively 5.28, 5.45 and 5.27. The covariance between T and Bn in this population is ~ -.05, so the two variables are only very weakly associated. As a consequence, Bn accounts for nearly none of the difference in absolute fitness between the phenotypes: Bn accounts for 10.56 offspring, per capita, for the phenotype T=1, 10.9 offspring per capita for the phenotype T=5 and 10.54 offspring per capita for the phenotype T=10. Essentially all of the difference in fitness between phenotypic classes is due to the difference in phenotype used to partition the population into classes. Hence, Bn causes reproductive success, group selection acts, and yet Bn accounts for little or none of the difference in fitness between phenotypic classes and so, by the lights of neighborhood analysis writ large, explains little of the change in mean phenotype. Consequently, neighborhood analysis writ large will misdiagnose the situation as a case of pure individual selection.
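The structure of this argument can be replayed on a small made-up population. The sketch below does not use the Table 1 data; it assumes a hypothetical population of nine groups of two (one per ordered pair of the trait values 1, 5 and 10) in which the mean of Bn is exactly the same in every phenotypic class, and applies the rule r1 = 3T + 2Bn of Model 1. The name `summary` is likewise hypothetical.

```python
from fractions import Fraction
from collections import defaultdict

# Hypothetical population of (T, Bn) pairs: in a group of two, Bn is simply
# the partner's trait value.
values = [1, 5, 10]
individuals = []
for a in values:
    for b in values:
        individuals += [(a, b), (b, a)]

# For each phenotypic class, collect fitness under Model 1 and the part of
# that fitness contributed by Bn.
by_class = defaultdict(list)
for t, bn in individuals:
    by_class[t].append((3 * t + 2 * bn, 2 * bn))

summary = {}
for t in values:
    fits = [f for f, _ in by_class[t]]
    parts = [p for _, p in by_class[t]]
    summary[t] = (Fraction(sum(fits), len(fits)),    # mean class fitness
                  Fraction(sum(parts), len(parts)))  # mean contribution of Bn

for t, (mean_fit, mean_part) in summary.items():
    print(t, round(float(mean_fit), 2), round(float(mean_part), 2))
```

In this idealized case Bn adds the same amount to every class, so it raises each class fitness without explaining any of the differences between them, which is the situation the paragraph above describes.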

3. Fitnesses, Mean Fitnesses and Phenotypic Change

In fact, neighborhood analysis writ large is an unreliable method even for estimating the contribution of group selection to $\Delta\bar{T}$, the change in mean phenotype. Equation 5 partitions the change in the population between successive generations, and neighborhood analysis takes the second term in equation 5 as an estimate of the effect of group selection. The estimate is biased. Rewriting 5, we have:

(6)  $\Delta\bar{T} = \dfrac{\beta_1'\,\mathrm{Var}(T)}{\bar{w}} + \dfrac{\beta_2'\,\mathrm{Cov}(T, B_n)}{\bar{w}}$

where $\beta_1'$ and $\beta_2'$ are the partial regression coefficients on T and Bn from equation 2 and $\bar{w}$ is mean fitness in the population. As equation 6 shows, $\Delta\bar{T}$ depends on $\bar{w}$. But Bn influences $\bar{w}$ even when Cov(T,Bn) is zero. It is true that Bn does not influence differences in the mean fitness of classes defined by values of T, because the mean value of Bn in each class does not differ across these classes. However, Bn does influence the magnitude of the mean fitness for each class; it is just that it makes roughly the same contribution for every class and so explains none of the difference between these magnitudes. Since $\bar{w}$ is the mean fitness for the total population, it depends on the magnitudes of the class fitnesses (as well as their relative frequencies). Hence, the second term in equation 5 is a biased estimate of the influence of group selection on frequency change, and more importantly, a biased estimate of the relative influence of group and individual selection when Cov(T,Bn) goes to zero.

One might use equation 6 instead, where the first term on the right hand side of 6 is taken to estimate the effect of individual selection on $\Delta\bar{T}$. Since the covariance in the second term goes to zero when migration is random, we judge that individual selection alone accounts for the change. But this is a confounded estimate: the mean fitness in the population is determined by both group and individual selection, and mean fitness occurs in the first term on the right hand side of equation 6. Individual selection alone does not account for the total change in the population mean.

The effect of Bn on mean fitness, and hence on change in the mean phenotype, can again be illustrated by comparing the fate of our population under Model 1 (in which r1=3T+2Bn) with its fate under a model in which Bn has no effect on reproductive success: r2=3T (Model 2). Recollect that under Model 1 the mean fitness in the population was 26.25. Under Model 2, the mean fitness is 16. Similarly, the relative fitness of each phenotypic class changes. Recollect that under Model 1, the absolute fitnesses for the classes were 13.17, 25.6 and 40 for T=1, 5 and 10 respectively; this yields relative fitnesses of .33, .64 and 1, respectively. Under Model 2, although the absolute fitnesses are considerably lower, the difference in relative fitness between classes is considerably greater. Under Model 2 the absolute class fitnesses are respectively 3, 15 and 30, with relative fitnesses .1, .5 and 1.
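A rough numerical check of this comparison can be run on a made-up population. The sketch below is not a reconstruction of Table 1: it assumes a hypothetical population of nine groups of two (one per ordered pair of the trait values 1, 5 and 10) in which Cov(T,Bn) is exactly zero, and it computes the change in mean phenotype from Price’s equation in the form (mean fitness) x (change in mean T) = Cov(w, T), ignoring transmission bias. The function name `response` is an assumption.

```python
from fractions import Fraction

def mean(xs):
    return sum(xs, Fraction(0)) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

# Hypothetical population: in a group of two, Bn is the partner's trait value.
values = [1, 5, 10]
T, Bn = [], []
for a in values:
    for b in values:
        for own, other in ((a, b), (b, a)):
            T.append(Fraction(own))
            Bn.append(Fraction(other))

def response(fitness):
    """Return (mean fitness, change in mean T) for a given fitness rule."""
    w = [fitness(t, bn) for t, bn in zip(T, Bn)]
    w_bar = mean(w)
    return w_bar, cov(w, T) / w_bar

mean_w1, delta1 = response(lambda t, bn: 3 * t + 2 * bn)  # Model 1
mean_w2, delta2 = response(lambda t, bn: 3 * t)           # Model 2

print(float(mean_w1), float(delta1))
print(float(mean_w2), float(delta2))
```

Cov(w, T) is identical under the two rules here, because Bn is uncorrelated with T, but mean fitness is higher under Model 1, so the change in mean phenotype is smaller. The effect of Bn thus shows up in the response even though the covariance term attributed to group selection is zero.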