Practical 3: Applications of probability

Before starting, make sure you have this problem sheet, a needle, and a piece of lined paper (the needle and paper are not needed until question 2).

1. DNA evidence is often used in UK criminal trials. In one example of this, Denis John Adams was arrested for rape. Apart from living in the local area, the only evidence linking him to the rape was a DNA match between him and semen from the victim. The probability of such a match by chance was estimated at 1 in 200 million. There were approximately 150,000 males between 18 and 60 in the local area who might have committed the crime. The chance of someone from outside the area committing the crime was assessed at about 25%.

If there were no evidence at all against the suspect except where he lived, what is the probability Adams was guilty?

ANSWER: P(Guilty)=1/150,000*0.75=1/200,000

a)  Assume a chance DNA match probability of 1 in 200 million, and that if the suspect is the origin of the semen at the scene, the match probability is 1. Use Bayes’ theorem, and your answer above, to calculate the probability of the suspect being guilty (if the semen is from the rapist), conditional on his matching this DNA.

ANSWER:

P(Guilty | Match)=P(Match | Guilty)*P(Guilty)/P(Match)

P(Match)= P(Match | Guilty)*P(Guilty)+ P(Match | Not)*P(Not)

P(Match|Guilty)=1, P(Match| Not)=1/200,000,000, P(Guilty)=1/200,000 and P(Not)=199,999/200,000 giving

P(Guilty | Match)= 0.999001

b)  Recalculate this probability if the match probability was instead 1 in 20,000,000, or 1 in 2,000,000, as claimed by the defence.

ANSWER: 0.990099, 0.9090913 respectively

c)  Repeat (a) and (b) if instead of 150,000 local suspects, the whole UK population of 30,000,000 males were considered equally likely as suspects so the original probability of guilt is now 1 in 30,000,000.

ANSWER: 0.8695652, 0.4, 0.0625 respectively
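Parts (a) to (c) all apply the same Bayes' theorem formula with different inputs, so the answers can be checked with a short script. The following is a minimal Python sketch and not part of the original practical; the function name posterior_guilt and the variable names are illustrative assumptions.

```python
def posterior_guilt(prior, chance_match, match_if_guilty=1.0):
    """Return P(Guilty | Match) using Bayes' theorem.

    prior           -- P(Guilty) before the DNA evidence
    chance_match    -- P(Match | Not guilty)
    match_if_guilty -- P(Match | Guilty), taken as 1 in the question
    """
    numerator = match_if_guilty * prior
    # Total probability of a match: guilty and matches, or innocent and matches.
    evidence = numerator + chance_match * (1.0 - prior)
    return numerator / evidence


# Prior probabilities of guilt from the question.
priors = {
    "local": 0.75 / 150_000,   # 1 in 200,000, as in the first answer
    "UK": 1 / 30_000_000,      # whole UK male population, part (c)
}

# Chance match probabilities: prosecution figure, then the two defence figures.
chance_matches = [1 / 200_000_000, 1 / 20_000_000, 1 / 2_000_000]

results = {
    label: [posterior_guilt(prior, m) for m in chance_matches]
    for label, prior in priors.items()
}

for label, values in results.items():
    print(label, [round(v, 6) for v in values])
```

The "local" row should agree with the answers to (a) and (b), and the "UK" row with the answer to (c), to the number of digits quoted there.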

d)  (Harder) The victim did not pick out the suspect in an identity parade. Assume the probability of this happening is 10% if the suspect is the source of the semen, and 90% if he is innocent. How would the probability in (a) change if this additional evidence were included?

ANSWER: We now seek

P(Guilty | Match, not picked) = P(Match, not picked | Guilty)*P(Guilty)/P(Match, not picked)

P(Match, not picked) = P(Match, not picked | Guilty)*P(Guilty) + P(Match, not picked | Not)*P(Not)

Assuming that the DNA match and the identity parade outcome are independent, both given that the suspect is guilty and given that he is not guilty, then

P(Match, not picked | Guilty) = 1*0.1,

P(Match, not picked | Not) = 1/200,000,000*0.9, P(Guilty) = 1/200,000 and P(Not) = 199,999/200,000 giving

P(Guilty | Match, not picked) = 0.99108, which is lower than the 0.999001 found in (a), so this evidence counts in the defendant's favour
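The calculation in (d) can be checked in the same way. The sketch below is an illustration only, assuming the conditional independence stated in the answer; the function name posterior_with_parade and its parameter names are hypothetical.

```python
def posterior_with_parade(prior, chance_match,
                          not_picked_if_guilty, not_picked_if_innocent):
    """Return P(Guilty | Match, not picked).

    Assumes the DNA match and the parade outcome are independent
    given guilt, and independent given innocence.
    """
    # P(Match, not picked | Guilty) = 1 * P(not picked | Guilty)
    like_guilty = 1.0 * not_picked_if_guilty
    # P(Match, not picked | Not) = P(Match | Not) * P(not picked | Not)
    like_innocent = chance_match * not_picked_if_innocent
    numerator = like_guilty * prior
    return numerator / (numerator + like_innocent * (1.0 - prior))


prior = 1 / 200_000                # prior probability of guilt from question 1
chance_match = 1 / 200_000_000     # chance DNA match probability used in (a)

with_parade = posterior_with_parade(prior, chance_match, 0.1, 0.9)
# DNA evidence alone, for comparison with the answer to (a).
dna_only = prior / (prior + chance_match * (1.0 - prior))

print(round(with_parade, 5), round(dna_only, 6))
```

The first printed value should be lower than the second, which is the sense in which the identity parade result helps the defendant.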

e)  Suppose in a case like this it were revealed that a database search of 10,000 individuals had been conducted and found the match (such database searches are common in practice). Should this influence the probabilities calculated above? Does this knowledge add weight in favour of, or against, the defence?

ANSWER: In our framework, this actually counts against the defence (slightly) because we have eliminated 10,000 other suspects. The reason it does not count for the defence is that we already took into account the fact that a random member of the population might test positively by chance, in calculating the probability of guilt.

2. Buffon’s needle is an experiment that can be used to estimate π. The idea is simple. Take the horizontally lined paper and throw the needle into the air so that it lands on the paper (putting a bit of spin on the needle helps). After a few throws, you should see that the needle sometimes crosses one of the lines and sometimes does not. The experiment is to find the fraction of throws in which the needle crosses a line. Working in pairs if you like, throw the needle 30 times and record the fraction of throws in which a line is crossed.

Now, as shown in Figure 1, suppose the needle has length L and the lines are a distance d apart (with d>L). After a single throw, let θ be the acute angle between the needle and the horizontal lines (assume any value of θ between 0 and π/2 is equally likely). From part (b) onwards, let x be the vertical distance of the centre of the needle from the nearest horizontal line (and assume any value of x between 0 and d/2 is equally probable).

Figure 1: Buffon’s needle

(a)  Measuring relative to the centre of the needle, what is the vertical distance of the top of the needle above this centre in terms of θ?

(b)  What is the condition on x for the needle to cross the line?

ANSWER: x<=L/2*sin(θ)

(c)  Given the angle θ, what is the chance of the needle crossing the line, assuming any value between 0 and d/2 is equally likely for x?

ANSWER: L/d*sin(θ)

(d)  Draw a graph with horizontal axis θ (angle) and vertical axis x (distance of centre from nearest line).

ANSWER: see below

Using the answer for (b), shade the area corresponding to combinations of θ and x where the needle will intersect the line.

I. What is the equation of the boundary curve?

ANSWER: x=L/2*sin(θ)

II.  What is the total area of the region where the needle can land? ANSWER: π/2*d/2

III.  What is the shaded area? (You will need to perform a simple integration to get this result)

ANSWER: Integrate L/2*sin(θ) from θ=0 to θ=π/2, to get L/2
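If you want to check the integration numerically, the area under the boundary curve can be approximated with the midpoint rule. This is a minimal sketch for illustration; the function name shaded_area, the step count, and the needle length of 1.0 are assumptions, not values from the practical.

```python
import math


def shaded_area(needle_length, steps=100_000):
    """Approximate the area under x = L/2*sin(theta) for theta in [0, pi/2]."""
    width = (math.pi / 2) / steps          # width of each strip
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * width          # midpoint of the i-th strip
        total += 0.5 * needle_length * math.sin(theta) * width
    return total


L = 1.0  # hypothetical needle length, any positive value will do
area = shaded_area(L)
print(area, L / 2)
```

The two printed values should agree closely, consistent with the exact result L/2.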

(e)  Assuming when the needle is dropped that any point of the region shown in the diagram you produced for (d) where the needle can land is equally likely, what is the probability that the needle intersects a line? ANSWER: divide the last two results to give 2L/(πd)

(f)  Use your answer to (e) and the results of the 30 throws to estimate π.
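Thirty throws give only a rough estimate, so it can be instructive to simulate many more. The sketch below is one possible simulation under the assumptions stated in the question (θ uniform between 0 and π/2, x uniform between 0 and d/2, and d>L); the function name estimate_pi, the seed, and the example values of L and d are illustrative assumptions. Rearranging the answer to (e), if a fraction f of throws cross a line then π is estimated by 2L/(d*f).

```python
import math
import random


def estimate_pi(throws, needle_length, line_gap, seed=None):
    """Estimate pi from simulated Buffon's needle throws."""
    if needle_length > line_gap:
        raise ValueError("the formula 2L/(pi*d) assumes the needle is no longer than the gap")
    rng = random.Random(seed)
    crossings = 0
    for _ in range(throws):
        # Note: drawing theta uses math.pi, so this checks the formula
        # rather than computing pi from scratch.
        theta = rng.uniform(0.0, math.pi / 2)
        x = rng.uniform(0.0, line_gap / 2)
        # Crossing condition from part (b): x <= L/2*sin(theta)
        if x <= 0.5 * needle_length * math.sin(theta):
            crossings += 1
    if crossings == 0:
        return float("nan")                # no crossings, so no estimate
    fraction = crossings / throws
    # From (e): fraction is approximately 2L/(pi*d), so pi is about 2L/(d*fraction).
    return 2 * needle_length / (line_gap * fraction)


# Hypothetical needle of length 1 on lines 2 apart.
print(estimate_pi(30, 1.0, 2.0, seed=1))        # same number of throws as the experiment
print(estimate_pi(200_000, 1.0, 2.0, seed=1))   # many more throws
```

With only 30 throws the estimate can be some way from π; it should settle much closer as the number of throws increases.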