Supplementary Material for: Dynamic social networks in guppies (Poecilia reticulata).

Alexander D.M. Wilson1,7, Stefan Krause2,&, Richard James3, Darren P. Croft4, Indar W. Ramnarine5, Karoline K. Borner1, Romain J.G. Clement1 & Jens Krause1,6

1Leibniz Institute of Freshwater Ecology and Inland Fisheries, Department of the Biology and Ecology of Fishes, 12587 Berlin, Germany;

2Department of Electrical Engineering and Computer Science, Lübeck University of Applied Sciences, Lübeck, Germany;

3Department of Physics, University of Bath, Bath BA2 7AY, UK;

4Centre for Research in Animal Behaviour, College of Life and Environmental Sciences, Washington Singer Labs, University of Exeter, Perry Road, Exeter, EX4 4QG, UK.

5Department of Life Sciences, University of the West Indies, St Augustine, Trinidad & Tobago;

6Humboldt University, LGF, Germany

7Department of Biology, Carleton University, Ottawa, Canada K1S 5B6

ADM Wilson and S Krause are shared first authors

Corresponding author: A.D.M. Wilson. E-mail:

Maximum likelihood estimation of the models’ probabilities

In principle, the transition probabilities of the Markov chain models could be estimated from the observed lengths of contact with a particular nearest neighbour (q1), of social contact (q2), and of being alone (q3). However, because our observation periods per focal individual were limited to 1.5 min (10 data points), this estimation would be biased, and the number of shorter contact phases would be overestimated. As an alternative, the transition probabilities can be estimated by computing the relative frequencies of these transitions in the observed data, which constitutes a maximum likelihood estimation (Fink 2008). This is explained in more detail in the next paragraph.

In our Markov chain model, the next state depends only on the current state. For our simple model, which distinguishes only the states i (“focal fish is social”) and x (“focal fish is alone”), we need to estimate the conditional probabilities p2 = P(state_{n+1} = x | state_n = i) and p3 = P(state_{n+1} = i | state_n = x). Here, p2 (p3) denotes the probability that the focal fish switches to state x (i) when its current state is i (x). The probability p2 can be estimated by looking at the observed pairs of consecutive states and dividing the number of (i,x) pairs by the number of (i,s) pairs, where s is any state (i or x). In the same way, p3 can be estimated from the relative frequency of state x being followed by state i.

As an example let us assume that for some focal individual the following state sequence is observed:

x, x, x, i, i, x, i, x, x, i, i, i, x, i.

This sequence consists of the pairs (x,x), (x,x), (x,i), (i,i), (i,x), (x,i), (i,x), (x,x), (x,i), (i,i), (i,i), (i,x), and (x,i). The estimate for p2 is 3/6 because 3 out of 6 pairs beginning with i end with x, and the estimate for p3 is 4/7 because 4 out of 7 pairs beginning with x end with i. In an analogous way, the probability q1 of the detailed model can be estimated.
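The worked example above can be checked with a short script. This is a minimal sketch rather than the code used in the study; the function name `estimate_transition_probabilities` is hypothetical, and exact fractions are used only to make the comparison with 3/6 and 4/7 transparent.

```python
from collections import Counter
from fractions import Fraction


def estimate_transition_probabilities(sequence):
    """Return {(a, b): relative frequency of state a being followed by b}."""
    # Count every pair of consecutive states.
    pair_counts = Counter(zip(sequence, sequence[1:]))
    # Count how often each state occurs as the first element of a pair.
    from_counts = Counter(sequence[:-1])
    return {
        (a, b): Fraction(n, from_counts[a])
        for (a, b), n in pair_counts.items()
    }


# State sequence from the example in the text.
observed = ["x", "x", "x", "i", "i", "x", "i", "x", "x", "i", "i", "i", "x", "i"]

probs = estimate_transition_probabilities(observed)
p2 = probs[("i", "x")]  # estimate of P(next = x | current = i)
p3 = probs[("x", "i")]  # estimate of P(next = i | current = x)
print(p2, p3)  # 1/2 4/7  (3/6 is printed in lowest terms)
```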

When estimating the probabilities of the next state, it is possible to take into account not only the single preceding state but also the preceding pair, triple or n-tuple of states (and thus to construct higher-order Markov chains). However, the more preceding states are taken into account, the more data are needed for a robust estimation of the probabilities. In our case, the behaviour did not seem to depend on more than the current state (Fig. 2). Therefore, we did not use more than the current state for the estimation of our model probabilities.

Tests

We used the following types of Monte Carlo tests.

Markov chain Monte Carlo test

Based on a list of groups this test randomises the group compositions while keeping constant the group sizes and the numbers of occurrences of the group members. Krause et al. (2009) describe this test in more detail. In our study we used it to analyse the composition of the groups formed by individuals that were present at the hotspot during the same session.
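The constraint described above (constant group sizes and constant numbers of occurrences per individual) can be illustrated with a swap-based randomisation. This is only a sketch under stated assumptions: the exact procedure is the one given in Krause et al. (2009) and is not reproduced here. The function name `randomise_groups`, the example groups, and the choice to count a rejected swap as a step in which the configuration stays unchanged are all assumptions of this sketch.

```python
import random
from collections import Counter


def randomise_groups(groups, n_steps, rng):
    """Swap members between groups; sizes and occurrence counts stay constant.

    Assumes at least two groups and no empty group.
    """
    groups = [list(g) for g in groups]  # work on a copy
    for _ in range(n_steps):
        a, b = rng.sample(range(len(groups)), 2)  # two distinct groups
        i = rng.randrange(len(groups[a]))
        j = rng.randrange(len(groups[b]))
        u, v = groups[a][i], groups[b][j]
        # Reject swaps that would put an individual into a group twice.
        # The rejected attempt still counts as a step (assumption of this sketch).
        if u in groups[b] or v in groups[a]:
            continue
        groups[a][i], groups[b][j] = v, u
    return groups


# Hypothetical groups of individuals present during the same session.
original = [["a", "b", "c"], ["a", "d"], ["b", "e", "f"], ["c", "d"]]
shuffled = randomise_groups(original, n_steps=1000, rng=random.Random(0))
print(shuffled)
```

Swapping one member from each of two groups leaves both group sizes unchanged, and each individual still appears in the same number of groups as before.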

Model-based Monte Carlo test

In this test a “randomisation” step consists of running a model to simulate an observation that has the same number of sessions, the same group compositions per session, and the same focal individuals as our original observation. In contrast to a pure randomisation that, e.g., simply permutes individual identities, in the model-based test each data point of a sequence for some focal individual is generated by the model.

Example:

If the more detailed model is used and in a session the individuals i1, i2, i3, and i4 were present, then for a focal individual i0 the following sequence might be generated, regardless of what the originally observed sequence looked like:

x, i2, i1, i2, i2, x, i3, x, x, i1.

For each simulated observation the value of the test statistic was computed. Finally, the rank position of the originally observed value of the test statistic was determined and the p-value was computed following the usual definition of Monte Carlo tests (Manly 2007).
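The rank-based p-value can be sketched as follows. This is a minimal illustration that assumes a one-sided test in which large values of the test statistic count as extreme, and that counts the observed value as part of the reference set, which is a common convention for Monte Carlo tests. The text does not state which tail was used, and the function name and the numbers are hypothetical.

```python
def monte_carlo_p_value(observed, simulated):
    """One-sided Monte Carlo p-value (large values count as extreme)."""
    # Number of simulated values at least as large as the observed one.
    n_extreme = sum(1 for value in simulated if value >= observed)
    # The observed value itself is included in the reference set.
    return (n_extreme + 1) / (len(simulated) + 1)


# Hypothetical test statistic values from four simulated observations.
simulated_values = [0.1, 0.5, 0.95, 0.3]
p_value = monte_carlo_p_value(0.9, simulated_values)
print(p_value)  # 0.4
```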

Because the simulation does not keep the number of contacts and contact phases constant, we used percentages rather than absolute values to compute a test statistic that measured the association strength of pairs of individuals.

For our tests we used the models described in the methods of the main text. In particular, we used the more detailed model as a null model for the analysis of differences in individual-specific behaviour, and the individual-specific models to demonstrate the goodness of fit of these models regarding certain network measures.

Simulation of disease transmission

The more detailed model and the individual-specific models can be used to generate a behavioural sequence of arbitrary length for some focal individual in the presence of k potential nearest neighbours, e.g.

x, i2, i1, i2, i2, i2, x, x, x, i3, x, x, i1, i1, i1, i1, x, ...

Under the assumption that it takes m consecutive time steps to transfer a piece of information or transmit a disease from one individual to another, it is straightforward to determine how many time steps it takes from the beginning of such a sequence until the focal individual has been involved in contact phases of length at least m with all individuals of interest (e.g. infected individuals). By repeating this for multiple sequences, the distribution of this number of time steps can be approximated. In our study we used N = 10000 repetitions.
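The procedure can be sketched as follows. The function `steps_until_all_contacted` implements the counting rule described above. The generator `simulate_sequence` is only a stand-in for the models of the main text: its parameters `p_join`, `p_leave` and `p_switch`, the uniform choice of neighbour, and the start in state x are assumptions of this sketch, as are the parameter values used.

```python
import random


def simulate_sequence(neighbours, n_steps, p_switch, p_leave, p_join, rng):
    """Generate a state sequence; a simplified stand-in for the detailed model."""
    state = "x"  # assumption: the focal individual starts alone
    sequence = []
    for _ in range(n_steps):
        sequence.append(state)
        if state == "x":
            if rng.random() < p_join:
                state = rng.choice(neighbours)
        elif rng.random() < p_leave:
            state = "x"
        elif len(neighbours) > 1 and rng.random() < p_switch:
            state = rng.choice([n for n in neighbours if n != state])
    return sequence


def steps_until_all_contacted(sequence, targets, m):
    """First time step (1-based) by which every target has had a contact
    phase of at least m consecutive steps with the focal individual;
    None if this does not happen within the sequence."""
    remaining = set(targets)
    run_state, run_length = None, 0
    for t, state in enumerate(sequence, start=1):
        if state == run_state:
            run_length += 1
        else:
            run_state, run_length = state, 1
        if run_length >= m:
            remaining.discard(state)
        if not remaining:
            return t
    return None


# Example sequence from the text.
example = ["x", "i2", "i1", "i2", "i2", "i2", "x", "x", "x", "i3",
           "x", "x", "i1", "i1", "i1", "i1", "x"]
print(steps_until_all_contacted(example, {"i1", "i2"}, m=2))  # 14

# Approximate the distribution with repeated simulations
# (1000 repetitions here to keep the sketch fast; the study used 10000).
rng = random.Random(1)
times = []
for _ in range(1000):
    seq = simulate_sequence(["i1", "i2", "i3"], 500, 0.2, 0.3, 0.4, rng)
    t = steps_until_all_contacted(seq, {"i1"}, m=2)
    if t is not None:
        times.append(t)
```

Sequences in which the required contact phases never occur return None and are left out of `times` here. With sequences that are long relative to the typical waiting time this should be rare, but it truncates the distribution and is worth checking.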

References

Fink GA (2008) Markov Models for Pattern Recognition. Springer-Verlag

Krause S, Mattner L, James R, Guttridge T, Corcoran MJ, Gruber SH, Krause J (2009) Social network analysis and valid Markov chain Monte Carlo tests of null models. Behav Ecol Sociobiol 63:1089–1096

Manly BFJ (2007) Randomization, bootstrap, and Monte Carlo methods in biology, 3rd edn. Chapman and Hall, Boca Raton