Supplementary material for:
Turbidity affects social dynamics in Trinidadian guppies
Behavioral Ecology and Sociobiology
Karoline K. Borner, Stefan Krause, Thomas Mehner, Silva Uusi-Heikkilä, Indar W. Ramnarine and Jens Krause
Corresponding author: Karoline K. Borner
Email:
Address: Leibniz-Institute of Freshwater Ecology and Inland Fisheries, Müggelseedamm 310, 12587 Berlin, Germany
Modelling approach and parameter estimation
Our observations consist of sequences of ‘behavioural states’. At each time point a focal fish can either be social (with a conspecific), denoted by i, or alone (no conspecific within four body lengths), denoted by x. Our task is to find a model that captures the dynamics of these sequences. The simplest possible approach would be to use a single parameter p that specifies the probability of the next state being i. This parameter can be estimated by computing the relative frequency of the state i. However, this very simple approach rests on the unlikely assumption that the next state is independent of the preceding states. Wilson et al. (2014) have shown that this assumption cannot be made for guppies in the wild (see also the section on the BIC analysis below). Following their approach, we introduced conditional probabilities into our model such that each state depends on the immediately preceding state. More specifically, we used the two probabilities
p2 = Prob(state_{n+1} = x | state_n = i), and
p3 = Prob(state_{n+1} = i | state_n = x).
Here, p2 (p3) denotes the probability that the focal fish switches to state x (i) when its current state is i (x). Both probabilities can be estimated based on relative frequencies (Fink 2008). Figure 1a in the main text shows a graphical representation of the resulting model. Its probabilities define the transition probabilities of a (first-order) Markov chain with states i and x without having to introduce any further assumptions or parameters. This allows predictions that will be used in the following sections to analyse the goodness of fit of our model.
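The relative-frequency estimation of p2 and p3 can be illustrated with a minimal Python sketch. The function name estimate_transitions and the example sequences are hypothetical and are not taken from the study; each focal observation is assumed to be a string over the states i and x.

```python
from collections import Counter

def estimate_transitions(sequences):
    """Estimate p2 = Prob(x | i) and p3 = Prob(i | x) from relative frequencies.

    Assumes each state occurs at least once as a 'current' state,
    otherwise a division by zero occurs.
    """
    counts = Counter()
    for seq in sequences:
        # Count pairs of consecutive states within one focal observation;
        # transitions are never counted across different observations.
        for current, nxt in zip(seq, seq[1:]):
            counts[(current, nxt)] += 1
    from_i = counts[("i", "i")] + counts[("i", "x")]
    from_x = counts[("x", "i")] + counts[("x", "x")]
    p2 = counts[("i", "x")] / from_i
    p3 = counts[("x", "i")] / from_x
    return p2, p3

# Made-up example sequences (not data from the study).
example = ["iiixiixx", "xiiiixii"]
p2, p3 = estimate_transitions(example)
print(p2, p3)
```

In this toy example, 3 of the 10 transitions that start in state i lead to x, and 3 of the 4 transitions that start in state x lead to i.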
If a focal individual has more than one potential neighbour, it can switch between them while staying social. The more detailed model (Figure 1b in the main text) extends the simple model by additionally describing the internal dynamics of the state i. This means that both models are identical if the switching behaviour is ignored. The more detailed model adds one parameter to the simple model, the probability p1 of leaving the current neighbour. Like p2 and p3, p1 is a conditional probability,
p1 = Prob(state_{n+1} ≠ i_k | state_n = i_k), where i_k denotes being social with neighbour k.
It can be estimated in the same way as p2 and p3, i.e. based on relative frequencies.
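Leaving the current neighbour means either becoming alone (probability p2) or switching to a different neighbour, so that p1 = p2 + ps. The probability ps of switching neighbours while staying social can therefore be computed as follows:
ps = Prob(state_{n+1} = i_l | state_n = i_k) = p1 - p2, where i_l ≠ i_k.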
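As an illustration of how p1, p2 and ps relate, the following Python sketch estimates them from sequences that record the identity of the current neighbour. The data layout (a neighbour label per time point, None for being alone), the function name and the example sequence are assumptions made for this sketch only.

```python
def estimate_detailed(sequences):
    """Estimate p1, p2 and ps = p1 - p2 from neighbour-labelled sequences.

    Each sequence is a list with one entry per time point: a neighbour
    label (state i_k) or None (state x, alone).
    """
    social_steps = 0    # transitions that start in a social state i_k
    left_neighbour = 0  # ... and end in any state other than i_k
    became_alone = 0    # ... and end in state x
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            if current is None:
                continue  # transitions starting in x carry no information on p1
            social_steps += 1
            if nxt != current:
                left_neighbour += 1
            if nxt is None:
                became_alone += 1
    p1 = left_neighbour / social_steps
    p2 = became_alone / social_steps
    return p1, p2, p1 - p2

# Made-up example (not data from the study).
seq = ["A", "A", "B", None, "B", "B", "C", "C", None, None]
p1, p2, ps = estimate_detailed([seq])
print(p1, p2, ps)
```

Here 7 transitions start in a social state; 4 of them leave the current neighbour and 2 of those lead to being alone, so ps is estimated as 4/7 - 2/7 = 2/7.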
Goodness of fit of the Markov chain model
The time spent in a state of a Markov chain follows a geometric distribution. In our study system this means that the frequencies of phase lengths of being social or alone should decrease exponentially with increasing phase length. To compare the model predictions with the observed data, we simulated observations of the model’s behaviour, taking into account the 3 min observation time per focal individual. This is necessary because incompletely observed phases (those that started or ended outside the observation period) lead to higher numbers of short phases than theoretically expected. We repeated the simulation 10^4 times and computed the mean frequencies and the 2.5% and 97.5% percentiles for each phase length. The simulation was based on the estimated probabilities and did not take into account their confidence intervals. Therefore, the predicted percentile ranges are conservative. Our results show that the observed data are well approximated by the model predictions (Figure S1). A few deviations exist for long contacts, but these have to be seen in the context of the very low frequency of such unexplained contact lengths and the relatively conservative model predictions.
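The simulation of phase-length frequencies under a limited observation window could be sketched as follows. This is not the code used in the study: the number of time points per observation (n_points), the number of focal individuals (n_focals), the choice of drawing the initial state from the stationary distribution, and the simple rank-based percentiles are all assumptions of this sketch.

```python
import random
from itertools import groupby

def simulate_sequence(p2, p3, n_points, rng):
    """Simulate one observation of n_points states of the two-state chain."""
    # Assumption: the first state is drawn from the stationary distribution.
    pi_i = p3 / (p2 + p3)
    state = "i" if rng.random() < pi_i else "x"
    seq = [state]
    for _ in range(n_points - 1):
        if state == "i":
            state = "x" if rng.random() < p2 else "i"
        else:
            state = "i" if rng.random() < p3 else "x"
        seq.append(state)
    return seq

def phase_length_counts(sequences, state, max_len):
    """Count phases (runs) of `state` by length, including truncated phases."""
    counts = [0] * (max_len + 1)
    for seq in sequences:
        for s, run in groupby(seq):
            if s == state:
                counts[len(list(run))] += 1
    return counts

def predicted_band(p2, p3, n_focals, n_points, state, n_sim, seed=0):
    """Mean and approximate 2.5% / 97.5% percentiles per phase length."""
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sim):
        seqs = [simulate_sequence(p2, p3, n_points, rng) for _ in range(n_focals)]
        sims.append(phase_length_counts(seqs, state, n_points))
    band = {}
    for length in range(1, n_points + 1):
        values = sorted(sim[length] for sim in sims)
        mean = sum(values) / n_sim
        low = values[int(0.025 * (n_sim - 1))]
        high = values[int(0.975 * (n_sim - 1))]
        band[length] = (mean, low, high)
    return band

# Placeholder parameter values (not estimates from the study).
band = predicted_band(p2=0.3, p3=0.5, n_focals=20, n_points=18, state="i", n_sim=200)
```

Because phases are cut off at the start and end of each simulated observation, the simulated frequencies of short phases include truncated phases, which is the effect the text describes.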
Another prediction of a Markov chain model concerns the so-called mixing time (see, for example, Levin et al. 2008). For our study system this means that after a small number of time steps the probability of being in state i or state x should be (almost) independent of the initial state. In other words, regardless of whether an individual was in state i or state x at the beginning of a sequence, after some time steps it should be in state i with a certain probability π(i) and in state x with probability 1 - π(i). We used this prediction for a test of the goodness of fit in the following way. We performed 10^5 simulations of our model and determined the frequency distribution of the states i and x after the n-th time step for each of the initial states i and x separately. Theoretically, these distributions should be approximately equal after the 5th time step. We measured their difference using the total variation distance (which is the maximum difference between the probabilities assigned to a single event by the two distributions, Levin et al. 2008). To make the numbers comparable with our observed numbers, we started each simulation such that the initial states were identical to those in our observations. Figure S2 shows the goodness of fit for our simple model. In accordance with theory, independence of the initial state is reached after about 5 steps, and most of the observations are well within the boundaries of the 95% percentiles (as predicted by the Markov model). To demonstrate that this is not self-evident, the figure additionally shows the ‘goodness of fit’ for a fictitious scenario in which the ‘true’ behaviour did not follow a Markov chain but the lengths of phases were uniformly distributed in the range 4-6 time points for being social and 1-2 time points for being alone. In this case, deviations from the model predictions are clearly visible, with many values exceeding the boundaries of the 95% percentiles (as predicted by the Markov model) (Figure S2).
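The theoretical decay of the total variation distance can be illustrated with a short Python sketch. It computes the exact n-step distributions of the two-state chain rather than the simulation-based percentile bands described above, and the parameter values are placeholders, not estimates from the study. For a two-state chain the distance after n steps equals |1 - p2 - p3|^n, and the stationary probability is π(i) = p3 / (p2 + p3).

```python
def step(dist, p2, p3):
    """Advance a distribution (Prob(i), Prob(x)) by one time step."""
    prob_i, prob_x = dist
    return (prob_i * (1 - p2) + prob_x * p3,
            prob_i * p2 + prob_x * (1 - p3))

def tv_distance(d1, d2):
    """Total variation distance between two distributions on a finite set.

    Equals half the sum of absolute differences, which is the same as the
    maximum difference in probability assigned to any single event.
    """
    return 0.5 * sum(abs(a - b) for a, b in zip(d1, d2))

def tv_after_steps(p2, p3, n_steps):
    """Distance between the chains started in i and in x after 1..n_steps."""
    from_i, from_x = (1.0, 0.0), (0.0, 1.0)
    distances = []
    for _ in range(n_steps):
        from_i = step(from_i, p2, p3)
        from_x = step(from_x, p2, p3)
        distances.append(tv_distance(from_i, from_x))
    return distances

# Placeholder values (not estimates from the study).
p2, p3 = 0.3, 0.5
for n, d in enumerate(tv_after_steps(p2, p3, 5), start=1):
    print(n, d)
print("stationary probability of i:", p3 / (p2 + p3))
```

With these placeholder values the distance shrinks by a factor of 0.2 per step, so the initial state has almost no influence after five steps.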
We performed the mixing time analysis only for the simple model with two states. For the more detailed model the number of states depends on the number of potential neighbours, and more data would be required to detect deviations from the predicted mixing time. Instead, we tested whether the mean lengths of contact with a particular neighbour differed between the individuals. The more detailed model assumes that there are no such differences. To test this, we applied a randomization test in which the identities of the focal individuals and their neighbours were randomized. As a test statistic we used the variance of the mean length of contact across all pairs of individuals. This test statistic yields large values if the mean lengths differ between the individuals. The results showed that this was not the case in our observations (10^4 randomisations, p = 0.64 in clear water and p = 0.73 in turbid water).
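One possible form of such a randomization test is sketched below in Python. The exact randomization scheme of the study is not specified here, so this sketch assumes that pair identities are shuffled across the observed contact lengths; the function names and the data layout (one pair label and one contact length per contact) are likewise hypothetical.

```python
import random
from statistics import mean, pvariance

def variance_of_pair_means(labels, lengths):
    """Test statistic: variance of the mean contact length across pairs."""
    by_pair = {}
    for label, length in zip(labels, lengths):
        by_pair.setdefault(label, []).append(length)
    return pvariance([mean(values) for values in by_pair.values()])

def randomization_test(labels, lengths, n_rand=10_000, seed=0):
    """Return the observed statistic and a one-sided p-value.

    Assumption: under the null hypothesis contact lengths do not depend
    on pair identity, so pair labels can be shuffled across contacts.
    """
    rng = random.Random(seed)
    observed = variance_of_pair_means(labels, lengths)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_rand):
        rng.shuffle(shuffled)
        if variance_of_pair_means(shuffled, lengths) >= observed:
            hits += 1
    return observed, hits / n_rand

# Made-up contacts (not data from the study): pair label and contact length.
labels = ["AB", "AB", "AC", "AC", "BC", "BC"]
lengths = [1, 3, 2, 2, 4, 1]
observed, p_value = randomization_test(labels, lengths, n_rand=1000)
```

A large p-value indicates that the observed variance of pair means is no larger than expected when pair identity is irrelevant, which is the pattern reported above.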
Bayesian Information Criterion (BIC)
Although our simple Markov chain model contains only two parameters, the question arises whether the very simple unconditional model with just one parameter (mentioned in the first section of this supplement) can explain the data as well. To answer this question, we applied the Bayesian Information Criterion (BIC) to our data and found that it clearly favours the Markov chain model. The difference between the BIC for the two models was 172.1 in clear water and 531.2 in turbid water. A difference of 9.2 or more is commonly regarded as sufficient for the conclusion that the favoured model is substantially better (Guttorp 1995).
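The comparison can be sketched in Python using the standard definition BIC = -2 ln L + k ln n, where L is the maximised likelihood, k the number of parameters and n the number of observations. The sketch assumes that both models are scored on the same transitions, i.e. conditional on the first state of each sequence; whether the study handled the first state in this way is not stated. The example sequence is made up.

```python
import math

def weighted_log(count, prob):
    """Return count * ln(prob), treating 0 * ln(0) as 0."""
    return count * math.log(prob) if count else 0.0

def bic_comparison(sequences):
    """Return (BIC of the one-parameter model, BIC of the Markov chain model).

    Assumes both states occur as a current state at least once.
    """
    n_ii = n_ix = n_xi = n_xx = 0
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            if current == "i":
                if nxt == "i":
                    n_ii += 1
                else:
                    n_ix += 1
            else:
                if nxt == "i":
                    n_xi += 1
                else:
                    n_xx += 1
    n = n_ii + n_ix + n_xi + n_xx

    # One-parameter model: the next state is i with probability p.
    p = (n_ii + n_xi) / n
    ll_indep = weighted_log(n_ii + n_xi, p) + weighted_log(n_ix + n_xx, 1 - p)

    # Markov chain model with p2 = Prob(x | i) and p3 = Prob(i | x).
    p2 = n_ix / (n_ii + n_ix)
    p3 = n_xi / (n_xi + n_xx)
    ll_markov = (weighted_log(n_ii, 1 - p2) + weighted_log(n_ix, p2)
                 + weighted_log(n_xi, p3) + weighted_log(n_xx, 1 - p3))

    bic_indep = -2 * ll_indep + 1 * math.log(n)
    bic_markov = -2 * ll_markov + 2 * math.log(n)
    return bic_indep, bic_markov

# Made-up sequence with long phases (not data from the study).
seq = ("i" * 10 + "x" * 10) * 5
bic_indep, bic_markov = bic_comparison([seq])
print(bic_indep - bic_markov)
```

A positive difference favours the Markov chain model; in this toy sequence the long phases make the state strongly dependent on its predecessor, so the difference is large.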
References
Fink GA (2008) Markov Models for Pattern Recognition. Springer-Verlag
Guttorp P (1995) Stochastic Modeling of Scientific Data. Chapman & Hall/CRC
Levin DA, Peres Y, Wilmer EL (2008) Markov Chains and Mixing Times. American Mathematical Society
Wilson ADM, Krause S, James R, Croft DP, Ramnarine IW, Borner KK, Clement RJG, Krause J (2014) Dynamic social networks in guppies (Poecilia reticulata). Behav Ecol Sociobiol
Figures
Fig. S1. Frequency distributions of (a) the lengths of contact with a particular nearest neighbour, (b) the lengths of social contact, i.e. the numbers of successive times a focal individual retained state i, and (c) the lengths of being alone in the observed data (circles) for the treatment in clear water. Also shown are the means (x’s) and the 2.5% and 97.5% percentiles as predicted by our Markov chain models. The graphs (d), (e), and (f) show the corresponding data for the treatment in turbid water.
(Note that 0 values cannot be displayed in a logarithmic plot and are omitted.)
Fig. S2. Distance of the frequency distribution of states i and x for the initial state i from the frequency distribution of these states for the initial state x as a function of elapsed time in the observed sequences (circles) for (a) clear water, (b) turbid water, and (c) a fictitious observation where the behaviour did not follow a Markov chain. The horizontal bars mark the 95% percentiles of this distance as predicted by the Markov chain model.
[Figure S1: panels (a) to (f); image not reproduced]
[Figure S2: panels (a) to (c); image not reproduced]