
Umpires, Referees and the Public Understanding of Decision Technology: Why cricket uses Hawk-Eye well while tennis uses it badly.

Introduction[1]

Public understanding of the capabilities of new technologies is a pressing problem with portentous political consequences. At the time of the first Gulf War the conclusion of a publicised debate over the accuracy of the Patriot anti-missile missile had consequences for the credibility of all star-wars-type defence systems (Collins and Pinch 1998 ch1). With increased computer power and speed it will not be long before it is impossible for the ordinary TV viewer to tell the difference between synthetic reality and photography, with huge consequences for news-making should a government choose to use the power corruptly. These and related technologies need to be understood by the public. Sport is at the cutting edge of the introduction of such technologies to a wide audience.

In an earlier paper in this journal (Collins and Evans 2008) we argued that decision support aids such as `Hawk-Eye’, which are used to assist lbw decisions in cricket and line calls in tennis, are less accurate than they seem. We now take the argument forward in the light of new developments in the use of this technology and new information about decision aids in general. In particular we argue that, in terms of public understanding, an example of correct application of technology to sport is Hawk-Eye as it is currently applied to cricket; a case of incorrect application is Hawk-Eye as it is currently applied to tennis. We will show how cricket’s approach could be applied to tennis. We will also discuss a number of other simple practical changes that can and should be made to the use of decision-aid technology.

In another place (Collins, under submission) we introduce certain technical terms that are part of the philosophical approach that frames the argument. We use the term `ontological authority’ to refer to the power of match officials (or machines) to define reality (for example, whether the ball was `in’ or `out’); `epistemological privilege’ implies some advantage, due to vantage point or special skills, in respect of knowing what happened; `presumptive justice’ is the assumption that if you were the match official (or the judge in an ordinary courtroom at which you were not present) you would make the same decisions, or that you could do no better; `transparent justice’ implies that the correct decision is seen to be made directly; `transparent injustice’ is the complement of transparent justice; and `false transparency’ is the impression that justice is being seen to be done whereas it is not (as in, say, the early days of Stalinist show trials). We now show how these terms are employed in the case of sports judgements.

Traditionally, umpiring and refereeing have been the domain of presumptive justice: spectators were ready to accept the ontological authority of match officials because they were seen to have epistemological privilege. With the introduction of new technology such as television replays, epistemological privilege has, in the case of many decisions, shifted to the television viewer, bringing instances of transparent injustice. Some sports have tried to rectify the situation by introducing `off-field’ officials who review television replays and advise the on-field official, usually replacing transparent injustice with transparent justice. Certain sports decision aids, which present a clear and sharp reconstruction of the path of the ball, its impact footprint, and the line, give the impression that they can make exact determinations of ball and line position to a millimetre. Given that the admitted average error of the Hawk-Eye device, as used in tennis, is 3.6mm, this can give rise to false transparency.

Through the use of examples, and such information we can gather about available decision aid technologies, we now try to develop a series of recommendations that will maintain the credibility of match officials while allowing the smooth introduction of new technologies without misleading the public.

Failures in the off-field official system and suggested resolution

The first set of examples of decision-making in sport is concerned with the relationship between technology, off-field officials and on-field officials. As intimated above, off-field officials, when they use technology to advise on-field officials, can eliminate cases of transparent injustice and maintain transparent justice. In the case of incidents on the 15th and 16th of January 2010 during the South Africa-England test cricket match this process was seen by the public to fail (the examples are described in detail in Appendix 1). The only technologies in use in respect of the set of examples described were television replays and a `stump’ microphone – a microphone embedded in the wicket to pick up the sound caused by any marginal impact. This set of examples and recommendations has more to do with the smooth introduction of technology to sport, and with public satisfaction with it, than with the public understanding of technology.

Over two days of this test match three decisions that were referred to the off-field umpire were considered by commentators and, one must assume, most viewers (they certainly included this paper’s authors), to have delivered transparent injustice (one case being marginal). A fourth decision could not be referred to the off-field umpire because the England team had exhausted its quota of `reviews’ (at least one of which should have been successful rather than unsuccessful and so not deducted from the quota), but television replays revealed that the wrong decision had been made and that a review should have been successful. That makes a total of four (or at least 3.5) cases of transparent injustice over two days.

An important feature of the review system as implemented in this cricket series was that the umpire’s decision was to be taken as final unless it could be seen to be clearly wrong by the off-field umpire using the available technology. In the first case, the off-field umpire should probably have over-ruled the on-field umpire because the bowler’s foot was not clearly behind the relevant line, as could be seen by television viewers, but he did not. In the second case the off-field umpire should have over-ruled the on-field umpire because the fact that the ball hit the bat was evident from the sound transmitted from the stump microphone (the output of which was not available to the on-field umpire) but though television viewers and commentators heard the sound the off-field umpire did not hear it -- it appears he had the volume turned down. In the third case the off-field umpire did over-rule the on-field umpire on the basis of television replays but should not have done as the replays were indecisive – as everyone could see. In the fourth case replays showed the ball clearly deviating as it hit the bat whereas the on-field umpire’s decision was that it had not hit the bat.

The way to avoid these problems and use off-field officials successfully, at least in terms of many kinds of decision, is demonstrated by their use in rugby football. Here the on-field referee engages in a choreographed and audible (to television viewers) conversation with the off-field referee. The tenor of the conversation is set by the question `Is there anything to stop me awarding a try?’ The on-field referee then chooses to ask a series of questions which the off-field referee answers audibly. One may imagine a similar procedure applied to cricket which, as far as one can see, would have avoided the South-Africa problem. The audible dialogue, accompanied by replays visible to everyone at the ground and at home, might go like this (adjusted for different circumstances):

On-field-umpire (ONFU): I have given an `out’ decision on the grounds of caught by the wicket-keeper. Was it a legal delivery or a no-ball?

Off-field Umpire (OFFU): It was a legal delivery: some part of the bowler’s foot was definitely behind the line.

ONFU: Do you have sufficient reason to believe it did not touch bat or glove so that you can confidently overturn my decision?

OFFU: Yes

ONFU: What is the evidence?

OFFU: There was no sound nor was there a mark on the bat visible with the use of infra-red technology (see below). Finally, there was no clear deflection of the ball visible on television replays. Though I cannot be sure the ball did not touch the bat through viewing the television replay the lack of supporting evidence from other technology makes me as sure as I can be that it did not touch the bat.

ONFU: In that case I will change my decision to `not out’.

This procedure restores ontological authority to the on-field umpire who uses the technological aids indirectly via the off-field umpire. The audibility of the exchange allows for technicians or the on-field umpire to intervene where residual doubts remain (as in the case of the inaudible sound); this gives the best chance that the conclusion will satisfy everyone except the most partisan.

It might be still better if the on-field umpire choreographed the replays, giving instructions to the third umpire to present different views which were visible on the field of play. This would finally recognize the truth of the matter – which is that nowadays the crowd has epistemological parity in respect of at least some decisions and is in a good position to follow the reasoning of the umpires in those cases. Nevertheless, in virtue of the fact that the default, in case of any doubt, is the original umpire’s decision, there must be a clear over-rule or no change, the ontological authority stays where it has traditionally been – with the on-field umpire. In most cases this procedure would lead to transparent justice with readily acceptable presumptive justice in marginal cases. In such a case it might be that the on-field umpire could call on the use of technology at will (as in the case of rugby football) as well as a limited number of reviews to be called for by players.[2]

Reconstructed Track Devices (RTDs)

`Hawk-Eye’ is a well-known example of what we call a `Reconstructed Track Device’ or RTD. RTDs use visible-light TV cameras to follow the path of the ball and a procedure to filter pixels in each frame. Certain pixels are taken to represent an indicator of the position of the ball and certain others an indicator of the position of the line or of other features of the playing arena, with between-frame consistency helping to extract significant pixels from the background. The space and time coordinates of these pixels are represented numerically and a statistical algorithm reconstructs the flight and impact point of the ball and crucial features of the playing area by combining information about the pixels in the different frames with information about the size of the ball, the physics of its distortion (in the case of tennis), the width of the line (in tennis), and so forth. The output of the calculations can then be used to make an `in/out’ decision in tennis and/or to construct an image of the playing area and the flight of the ball using colours that give an appearance approximating to the real setting but with sharpened edges and idealised precision. In the case of tennis, information about the likely distortion of the ball upon bouncing can be used to estimate the size and shape of the contact footprint and the visually reconstructed bounce point can be elongated to represent what is known of the physics. Every such system is, of course, subject to statistical uncertainty. However sharp the reconstructed images, they represent an `estimate’ which has errors, whether their distribution is fully understood or not. The errors will affect the accuracy of the estimated bounce point, the shape of the reconstructed footprint and the reconstruction of the line.
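The sense in which a reconstructed track is an estimate with errors can be illustrated with a toy calculation. The sketch below is not the algorithm of Hawk-Eye or of any other RTD, the details of which we do not know; it assumes, purely for illustration, that the ball's path is a straight line fitted by least squares to a handful of noisy observations, and it uses the textbook formula for the standard error of such a fitted line. The function name `prediction_std_error`, the ten observation points and the 5mm noise figure are all hypothetical.

```python
import math

def prediction_std_error(xs, sigma, x0):
    """Standard error of a least-squares straight-line fit evaluated at x0.

    xs    -- positions along the flight at which the ball was observed
    sigma -- assumed standard deviation of the error in each observation
    x0    -- position at which the fitted line is evaluated
    """
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    # Textbook result for simple linear regression: the uncertainty grows
    # with the distance of x0 from the centre of the observed positions.
    return sigma * math.sqrt(1.0 / n + (x0 - mean_x) ** 2 / sxx)

# Hypothetical numbers: ten observations 0.5 m apart, each with 5 mm of noise.
observed = [0.5 * i for i in range(10)]  # 0.0 m to 4.5 m
noise_mm = 5.0

# Evaluate inside the observed track, at its end, and beyond it.
for x0 in (2.25, 4.5, 6.0, 9.0):
    se = prediction_std_error(observed, noise_mm, x0)
    print(f"position {x0:.2f} m: standard error about {se:.1f} mm")
```

Under these assumptions the uncertainty is smallest in the middle of the observed track and grows steadily as the estimate is pushed beyond the last observation, however sharp the resulting picture is drawn. A real ball follows a curved path and real devices combine several cameras, so the numbers here carry no weight; only the qualitative behaviour matters.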

As mentioned, in an earlier paper we discussed the accuracy of Hawk-Eye. In this paper we will simply note certain facts:

1)While it is sometimes claimed that Hawk-Eye has passed its tests for accuracy of line-calling in tennis with a 100% record, the International Tennis Federation’s tests of line calling accuracy have a tolerance of 5mm in respect of the edge of the line and 10mm in respect of the true position of the ball. This appears to mean that a call would be deemed accurate even when a ball that was 5mm `in’ was called 5mm `out’.[3]

2)Hawk-Eye Innovations’ website agrees that Hawk-Eye has an average accuracy in tennis of 3.6mm.[4] An average accuracy of 3.6mm implies that there are larger errors as well as smaller errors. For example the following `triangular’ distribution of 132 errors has an average accuracy of 3.3mm.[5]

3)While the ITF’s testing procedure rules out errors larger than 5mm when they affect a line call, the testing procedure means that few impacts within 5mm occur in the course of testing (around 10% of around 100 bounces).

4)It is in the nature of statistical inference that on a very few occasions large errors will occur. Such occasions might well not be detected by a limited calibration test such as is used by the ITF. We cannot say if this actually happens in the case of any particular RTD.

5)Devices, such as RTDs, that use ambient light as the source of their signal are likely to perform less well as conditions become darker. This is simply because there are fewer photons and fewer illuminated pixels on which to base the statistical inferences. We cannot say if this problem actually applies to any particular RTD and note that the Hawk-Eye website states that testing occurs in a range of conditions, including ‘dark or overcast conditions’.[6]

6)We do not know if the ITF’s testing procedure includes conditions as dark as, say, those which pertained at the end of the Federer-Nadal Wimbledon final in 2008.[7] It seems unlikely, however, since the ITF uses high-speed cameras as one method of establishing the true bounce point of the ball and these perform badly in the dark.

7)In the case of that part of the lbw rule that involves projecting the flight of the ball after it has hit the pad in order to determine whether it would hit the stumps, relatively larger errors are likely to be more frequent because, in general, a longer flight path has to be estimated than in the case when the ball strikes the ground or some other object. In general, the longer the projected flight in relationship to the recorded flight the greater the uncertainty is likely to be.

8)It is believed by some that no projection or estimation is involved in the case of tennis.[8] Cameras have a finite frame-speed, however. If the frame-speed is, say, 50 frames per second, and the ball is moving at about 100 mph it will travel about 3 feet between frames. Therefore, at that ball speed the average distance between the last frame and the bounce of a tennis ball will be 1.5 feet. Sometimes the distance that has to be predicted will be 3 feet, or even more if the ball is travelling faster (tennis balls can travel at 150mph and perhaps faster).[9] The difference between cricket and tennis is, then, a matter of degree. One can imagine that a combination of circumstances might mean that the last frame capture of a tennis ball is 4 feet from the bounce point and here the prediction required might be still greater than that needed for the projected flight path of some lbw decisions.
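The arithmetic in point 8 can be checked with a few lines of code. The sketch below simply converts miles per hour to feet per second and divides by the frame rate; the 50 frames per second figure is the hypothetical value used in the text, not a known specification of any device, and the function name `feet_between_frames` is ours. The average gap between the last frame and the bounce is taken to be half the inter-frame distance, which assumes the bounce is equally likely to fall anywhere between two frames.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600

def feet_between_frames(speed_mph, frames_per_second):
    """Distance in feet that the ball travels between two consecutive frames."""
    feet_per_second = speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    return feet_per_second / frames_per_second

# 50 fps is the illustrative frame rate from the text, not a measured value.
for mph in (100, 150):
    gap = feet_between_frames(mph, 50)
    # Assumes the bounce falls uniformly at random between two frames.
    print(f"{mph} mph at 50 fps: {gap:.2f} ft between frames, "
          f"{gap / 2:.2f} ft on average from last frame to bounce")
```

At 100 mph the gap comes out a little under 3 feet, and at 150 mph it exceeds 4 feet, which is consistent with the figures given above.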

Hawk-Eye in cricket

We now compare the use of Hawk-Eye in tennis and in cricket. We begin with cricket and the use of Hawk-Eye in the South Africa-England test match series. In these matches the umpire’s decision was taken to be the default and was only over-ruled if Hawk-Eye indicated that a big mistake had been made. How this works is most easily exemplified in the case of the `lbw’ decision, which effectively asks the umpire to decide whether or not the ball would have gone on to hit the stumps if it had not hit the batsman’s pad.[10] The decision is notoriously difficult as the umpire has to extrapolate from what did happen to what would have happened if the ball’s flight had continued uninterrupted. Where the umpire cannot be sure, the convention in cricket is that the batsman gets the benefit of the doubt and is given ‘not out’ even if the umpire thinks it is possible that the ball might have gone on to hit the outer edge of the stumps.

Figure 1: The grey box illustrates the zone of uncertainty on one edge of the wicket, equal to half a stump and half a ball (not to scale).[11]

In the review system, where Hawk-Eye is used to check the umpire’s decision, the crucial element in cricket is what we are going to call `the zone of uncertainty’. This is an area about 55 millimetres, or a little more than 2 inches, wide (half the width of the ball plus half the width of one stump) around the edge of the stumps (see Figure 1).[12] In this area, the RTD does not overrule the on-field umpire’s decision. Thus, in the case where an umpire gives a `not out’ decision and there is an appeal, the umpire is not over-ruled unless the inside edge of the ball is shown to be striking a point c. 55mm inside the outer edge of the wicket. The ICC’s guidelines express this as follows, saying that, when the inside edge of the ball is shown in what we have called the ‘zone of uncertainty’: