Advanced level (August 2003)

Alice is having a party and has 20 guests, one of whom is her friend Bob. Bob starts a rumor about Alice. A person hearing this rumor for the first time will then tell the rumor to another person chosen uniformly and at random with the exception that no one will tell the rumor to Alice or to the person from whom they heard it. If a person who already knows the rumor (including Bob) hears it again, they will not tell it to anyone.

  • What is the probability that everyone, except Alice, will hear the rumor before it stops propagating?
  • What is the most likely number of people to hear the rumor (i.e the mode)?
  • What is the expected number of people to hear the rumor (i.e. the median)? A numerical approximation is fine.
  • If 20 is replaced by n, what can be said about the asymptotic behavior of the mode?
  • If 20 is replaced by n, what can be said about the asymptotic behavior of the median?

Solution

In the following A1 is Bob; A2, …, A20 are the other guests in order of reception of the rumor; pk means the probability that the chain will go on after step k, qk the probability that it will stop after step k. Assuming that each step takes place if no one that already knows the rumor has been told again in any of the previous steps, the following probability tree can be built:

step 1: A1 tells the rumor to A2 (p1 = 1)

step 2: A2 tells the rumor to A3 (p2 = 1)
step 3: A3 tells the rumor to A1 (q3 = 1/18) or to A4 (p3 = 17/18)

step 4: A4 tells the rumor to A1 or A2 (q4 = 2/18) or to A5 (p4 = 16/18)

step k: Ak tells the rumor to Aj (j = 1 to k-2) (qk = (k-2)/18) or to Ak+1 (pk = (20-k)/18)

step 19: A19 tells the rumor to Aj(j = 1 to 17) (q19 = 17/18) or to A20 (p19 = 1/18)

step 20: A20 tells the rumor to someone that already knows it (q20 = 1).

Of course, q1 = q2 = p20 = 0.

Let X be the stochastic variable “number of people that hear the rumor” (I include Bob within the persons that can be told the rumor); the probability that exactly k > 2 people hear the rumor is given by

P(X = k) = p1p2p3…pk-1qk = 1 x 1 x 17/18 x 16/18 x… x (21-k)/18 x (k-2)/18 =

17!(k-2)/(20-k)!18k-2

The calculations can now be easily made with the help of a spreadsheet:

  • P(X=20) = 17!/1817 = 1,63 10-7 = 0,000000163
  • Since P(X=6) = 0,155 is the maximum probability, 6 is the mode
  • The expected number E(X) of people that hear the rumor is the mean or average, not the median. The median, defined as the value above and below which are half (or, more generally, no more than half) of the total frequencies or probabilities, is 7 (it is the first value whose cumulated probability is more than 0,5); the mean is computed as the sum of the products of each value by its probability, which gives 7,007.
  • If 20 is replaced by n, the formula becomes P(X=k) = (n-3)!(k-2)/(n-k)!(n-2)k-2. Let Rk = P(X=k+1)/P(X=k) = (k-1)(n-k)/(k-2)(n-2). It can be shown that Rk is a decreasing function of k (see Appendix 1), so the mode will be the first integer value of k for which the ratio Rk is less than 1. By solving the 2nd degree inequality (k-1)(n-k)/(k-2)(n-2) < 1 we find k > (3 + sqrt(4n – 7))/2, which, for n, gives k > 1.5 + sqrt(n), so the mode is approximately sqrt(n).
  • The expected value for n people is E(X) = k P(X=k) =  (n-3)!(k2-2k)/(n-k)!(n-2)k-2 =  (n-3)!(n-2)n-k(k2-2k)/(n-k)!(n-2)n-2 (sums from k = 0 to k = n). Let (n-2)!/(n-2)n-2 = L and n-k = j, whence k2-2k = n(n-2) –2(n-1)j + j2. After substitution and rearrangement, we find

E(X) = Ln (n-2)j/j! – 2L(n-1)/(n-2) j(n-2)j/j! + L/(n-2) j2(n-2)j/j! (sums from j = 0 to j = n-3) = LnA – 2LB(n-1)/(n-2) + LC/(n-2)

It can be shown (see Appendix 2) that

B = (n-2) [A- (n-2)n-3/(n-3)!] and C = (n-2) [A+B - (n-2)n-2/(n-3)!], so

E(X) = LA + 2L(n-2)n-3/(n-3)! = (n-2)!/(n-2)n-2 (n-2)j/j! + 2

For n, (j=0 to n-3) (n-2)j/j! (j=0 to n-2) (n-2)j/j! = exp(n-2) (n-2)j/j! exp(-(n-2))

This is the sum of the probabilities from 0 to n-2 of a Poisson distribution whose parameter (and also mean and variance) is n-2. For very large values of the parameter, the mean and the median of a Poisson distribution tend to the same value (normal convergence), so the sum amounts to 1/2. By Stirling’s formula, (n-2)!/(n-2)n-2 sqrt(2(n-2))exp(-(n-2)), and, finally, E(X)  sqrt(2(n-2))/2 +2.

Being sqrt(/2)  1,253, it follows that the expected value tends to approximately 1,25 sqrt(n).

I couldn’t find a formula for the asymptotic median. It can be shown that the cumulative probability from X = 3 to X = x is given by F(x) = [(n-2)!/(n-2)n-2] (n-2)x/ x!; the median is the value x which satisfies F(x) = 0.5.

Appendix 1

Let’s prove that Rk is a decreasing function of k.

dRk/dk =Rk’ = [(n-2k+1)(nk-2k-2n+4) – (n-2)(nk-k2-n+k)] / [(k-2)(n-2)]2. The sign of Rk’ tells us where Rk increases or decreases. Some elementary algebra shows that Rk’ has the same sign as the trinomial –(n-2)k2 +4(n-2)k –(n2-4), or, being n > 2, -k2 +4k –(n+2). Since the discriminant 16 – 4(n+2) is always negative for any n > 3, and being negative the coefficient of the 2nd degree term, the trinomial is negative, and accordingly Rk’ < 0 for n > 3, so the ratio Rk is a decreasing function of k.

Appendix 2

B = 0,n-3j(n-2)j/j! = 1,n-3 (n-2)j/(j-1)! = (n-2) 1,n-3 (n-2)j-1/(j-1)! = (n-2) 0,n-4 (n-2)j/j! =

= (n-2) [0,n-3 (n-2)j/j! – (n-2)n-3/(n-3)!] = (n-2) [A – (n-2)n-3/(n-3)!]

C = 0,n-3j2(n-2)j/j! = 1,n-3j(n-2)j/(j-1)! = (n-2) 1,n-3j(n-2)j-1/(j-1)! = (n-2) 0,n-4 (j+1)(n-2)j/j! =

= (n-2) [0,n-4j(n-2)j/j! + 0,n-4 (n-2)j/j!] = (n-2) [0,n-3j(n-2)j/j! – (n-3)(n-2)n-3/(n-3)! +

+ 0,n-3 (n-2)j/j! – (n-2)n-3/(n-3)!] = (n-2) [A+B - (n-2)n-2/(n-3)!].