The Favourite-Longshot Bias

2. The Favourite-Longshot Bias Across Different Classes of Horse Races: Evidence From the 2003 UK Flat Turf Season.

2.1 – INTRODUCTION

The favourite-longshot bias (FL-Bias) in betting is the anomaly whereby the realised returns on bets on favourites are superior to those on longshots. This chapter has two inter-related objectives. The first is to analyse the degree of the FL-Bias in the UK horse racing betting market using ‘first show’ (opening) odds and ‘at the off’ (starting) odds data taken from the 2003 flat season. Previous studies (with the exception of Smith, Paton and Vaughan Williams (2006)) have examined data from the 1970s and 1980s. Numerous changes have taken place since then, including the advent of betting exchanges, the abolition of off-course betting duty and readier access to information for the average bettor.

One would expect any mis-pricings to correct themselves through at least two channels. The first is the market mechanism: information flows during the course of betting should move prices in the correct direction. The second is a reduced need for protection against insiders: according to Shin (1991, 1992 and 1993), one of the purposes of bookmakers setting odds with a FL-Bias is to protect themselves against agents with inside information. It is argued that insiders should bet early in bookmaker betting markets to lock in guaranteed returns and to avoid the risk of information leaking if they wait until the later stages of the market. This shield is therefore likely to come down as the start of the race approaches, so the FL-Bias should be less apparent in starting prices. In other words, if insiders bet early, odds should adjust to reflect this. Evidence is found supporting this proposition.

The second objective is to investigate how the FL-Bias varies between different classes of horse races. One might intuitively expect insider trading to be more pertinent in the lower class races, where more information is likely to be private. Sidney (2003) points out that any horse in a low grade race which is a ‘trier’ is very likely to be in the winners’ enclosure, hence bookmakers need to put up a bigger shield when offering odds against outsiders in low grade races. Likewise, under the Shin model, where the presence of insiders causes bookmakers to set odds with a FL-Bias (assuming that insider activity is more likely to occur on an outsider), one would expect the returns to bets in lower class races to exhibit a stronger FL-Bias. This investigation classifies races using the official categories in place during 2003, something that has not been done before. Additionally, classifications based on attendance levels and the number of bookmakers present at the meeting are also attempted. This ‘across class’ motivation was also the driving force behind a study by Vaughan Williams and Paton (1997), who argue that high class handicap races (handicaps where horses are rated above 100) are subject to less private information than other races, and a study by Smith, Paton and Vaughan Williams (2006), who estimated the incidence of insider trading for different classes of races (defined by turnover) using online bookmaker data for 2003. Cain, Law and Peel (2003) conducted a similarly motivated investigation across different sports and found evidence to support the claim that there is a stronger FL-Bias for sports with higher perceived levels of insider trading (discussed later).

To distinguish between different classes of races, this chapter and Chapter 3 use the official categories in place in 2003. Smith, Paton and Vaughan Williams (2006) argue that official categories may not be related to the amount of public information, especially with regard to high class two year old races (horses begin their racing careers at two), where many horses are unexposed to public scrutiny. A counter argument is that these high class two year old races are likely to attract a high volume of bets (which is how their classes are distinguished). For example, a championship two year old race is likely to attract more bets than a Class B handicap with old campaigners at the same meeting. Official categories are essentially distinguished by the prize money they offer: races of greater value are least likely to be fixed, whereas races with little prize money are likely to have more non-triers and stables setting up horses for gambles (see Section 3.3). The use of official categories, as opposed to categories based on bet volumes, allows the impact of a different issue to be investigated.

Three methods will be employed to test the hypotheses. Firstly, the relationship between the returns from backing horses in different odds groups is investigated for opening and starting prices and across the different classes. The patterns of returns are compared with data from the 1987 season to investigate whether and how the nature of the FL-Bias has changed. Secondly, the commonly used z-statistic (e.g. see Asch, Malkiel and Quandt (1982), and the works by Busche and Walls) is used to test whether the implied win-probabilities in certain price bands differ from the objective win-probabilities as defined by the race outcomes. Neither of these two methods has previously been applied to recent data or across different classes of races.

Busche and Hall (1988) noted a potential problem with estimating objective win-probabilities when grouping horses into price bands. If more than one horse from the same race falls into the same price band, only one of them can win, so the relative win frequency for that price band could be unrepresentative of the true win-probability of a typical horse in the band. This problem and a solution are discussed in Section 2.2 and Appendix 2.A.?.

A comparison with the pattern of returns from the 1987 season is also presented. Horse race betting in the UK has witnessed many changes since 2000, so one may expect to see changes in the nature of the FL-Bias since the previous studies into the matter. Firstly, there is the birth of betting exchanges (which took off around 2001/2002), where bettors can make or accept bets at low cost on the internet. This affects traditional bookmakers through an information flow dimension and a competition dimension, but it also provides them with another mechanism to hedge their liabilities. Secondly, the punter has witnessed the advent of information mines such as the internet and (at least for the 2003 season) the satellite/cable specialist horse racing television channel attheraces. The increased availability of public information may result in less private information, so the degree of the FL-Bias may have narrowed. Finally, there is the removal of betting taxes for off-course bettors (October 2001). The tax could have had a bearing on the choice of bets for bettors who were not at the racetrack, so removing this distortion could have an effect on their betting patterns.

Finally, an analysis of the same phenomenon with a linear probability model (LPM) is developed and conducted. This allows for formal testing of the hypothesis that the FL-Bias is stronger in some markets than others, whereas the z-statistic only indicates whether the implied win-probabilities of groups of horses have been under- or over-estimated. Implied win-probabilities are estimated using the odds data, and the binary WIN variable is regressed on the implied win-probabilities, controlling for the fact that in races with larger fields there is likely to be a stronger FL-Bias. This is a finding made by Sobel and Raines (2003) for pari-mutuel betting, in a setting where bettors use Bayesian updating to decipher information signals when deciding which horse to bet on. The LPM provides a new measure of the extent of the FL-Bias which controls for the number of runners in a race. This is particularly useful for the UK, where there is large variance in the number of runners[1]. Additionally, the potential problem raised by Busche and Hall (1988) could also bias the estimates from the LPM. A solution (discussed in Section 2.4) is proposed, and the results suggest that the problem causes no systematic differences to the results.

The remainder of this chapter is organised as follows. In the next section, the issue of the FL-Bias is investigated in a little more detail, looking at the work by Shin (1991, 1992 and 1993), a similarly motivated study by Cain, Law and Peel (2003), and the aforementioned studies by Vaughan Williams and Paton (1997) and Smith, Paton and Vaughan Williams (2006). The methodology of this investigation is then presented along with the results. It is found that a FL-Bias clearly persists, especially at starting odds, but the differences across classes are not clear.

2.2 – THE FAVOURITE-LONGSHOT BIAS

When investigating the FL-Bias, economists have studied the odds of runners and the returns resulting from backing them at those odds (see Hausch, Lo and Ziemba (1994) for a selection of these papers, in particular Snyder (1978), where the results from early empirical work are shown). Generally there are no positive returns (after taxes/deductions) but, as noted earlier, there is a FL-Bias. The returns to favourites (lower variance bets) are superior to those to longshots (higher variance bets); in other words, the incidence of the overround falls on the longshots. This is the opposite of the pattern in capital markets, where the investor would generally expect a higher expected return as compensation for holding a riskier asset.

Apart from investigating the returns generated from backing horses at different odds, another way to check for the presence of a FL-Bias is to classify the horses by their odds, or by other measures of their chances of winning, and then evaluate whether the implied number of winners (subjective win-probability) differs from the realised number of winners (objective win-probability). A commonly used test statistic (e.g. by Busche and Walls (2001)) is the z-statistic. The z-statistic tests for market efficiency based on a null hypothesis that the number of winners from a group of n horses with similar implied win-probabilities are realisations of independent binomial trials with success probability equal to the mean implied win-probability. Using the normal approximation, discrepancies between the theoretical and actual number of winners can be readily tested for. If the true/objective win-probability of a group of n horses is p, then the expected number of winners is np, and the proportion of winners has mean p and variance p(1-p)/n. Thus the null hypothesis that the subjective and objective probabilities are equal for a sample of n observations can be tested with the statistic $z = (\overline{IP} - p)/\sqrt{p(1-p)/n}$, where $\overline{IP}$ denotes the mean of the implied probabilities. z has a limiting normal distribution[2] with a mean of zero and a variance of 1. Large positive z statistics would suggest over-betting, a phenomenon expected for longshots in the presence of a FL-Bias, and large negative z statistics would suggest under-betting. Busche and Walls actually group their horses by their rank of favouritism, but in this chapter horses are grouped by their implied win-probabilities. The advantage of grouping by implied win-probability is that, under the hypothesis that the probabilities are correct, the range of p is limited to the range of the (self-defined) probability classes.
When ranking by favouritism, the range of probabilities can exhibit large variation, especially with different sized fields. For example, the implied win-probability of the first favourite could be large in races with a small number of competitors, or small in open races with many runners. In fact, Collier and Peirson (2005) write that the odds on the first favourite can vary from as little as 25/1 on to 14/1 against. Therefore, the variance of the implied probability for the nth favourite is likely to be larger, and comparisons across investigations or cases cannot be made. The z-statistics for horses in various price bands for the 2003 data are presented in Section 2.3 and discussed in Section 2.4.
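
As a rough illustration of the test described above, the following Python sketch computes a z-statistic for a single price band. It assumes the statistic takes the form z = (mean IP - p) / sqrt(p(1-p)/n), with p estimated by the observed win frequency, which matches the sign convention in the text (positive values indicate over-betting). The function name and the toy inputs are hypothetical and are not taken from the 2003 data.

```python
import math


def z_statistic(implied_probs, wins):
    """z-statistic for one price band (illustrative sketch, assumed form).

    implied_probs: implied win-probabilities of the horses in the band
    wins:          1 if the corresponding horse won its race, else 0
    """
    n = len(implied_probs)
    if n == 0 or n != len(wins):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_ip = sum(implied_probs) / n   # mean implied (subjective) probability
    p_hat = sum(wins) / n              # observed (objective) win frequency

    if p_hat in (0.0, 1.0):
        # The binomial standard error is zero, so z is undefined.
        raise ValueError("win frequency of 0 or 1 gives a zero standard error")

    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return (mean_ip - p_hat) / std_err


# Hypothetical band: 100 longshots each priced at an implied 10% chance,
# of which only 5 won. A positive z points towards over-betting.
z_longshots = z_statistic([0.10] * 100, [1] * 5 + [0] * 95)
```

Under the null hypothesis, values of z outside roughly plus or minus 1.96 would be rejected at the 5% level in a two-sided test.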

A potential drawback with ranking by implied probability classes is noted by Busche and Hall (1988, page 343), and subsequently by Vaughan Williams and Paton (1998), who refer to it as a measurement error. The problem is that the observed relative frequency in a bin may be unrepresentative of a typical horse in the bin, because it is possible to have more than one horse from the same race appearing in the same bin. In other words, the winners are not drawn from independent binomial trials. If this is the case, then only one of the horses from that race can win, effectively putting an upper limit on the number of winners recorded in the bin and potentially biasing the estimated objective probability downwards. To solve this problem, all the horses in a price bin which are from the same race can be filtered out, but at the cost of the efficiency of the new estimates. This method is employed together with the un-filtered sample, allowing the significance of this potential problem to be investigated. In fact, the results indicate that the potential measurement error problem does not cause any systematic deviations of the point estimates when comparing the filtered/bootstrapped results with the original results. A brief discussion of why it has no impact is presented in Appendix 2.A.?.
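
The filtering step can be sketched as follows. This is only one possible reading of the procedure: it assumes that every horse sharing a price bin with another horse from the same race is dropped (an alternative would be to keep one such horse at random). The band edges, tuple layout and function names are hypothetical.

```python
from collections import Counter


def assign_band(implied_prob, edges):
    """Index k of the band [edges[k], edges[k + 1]) containing implied_prob."""
    for k in range(len(edges) - 1):
        if edges[k] <= implied_prob < edges[k + 1]:
            return k
    raise ValueError(f"implied probability {implied_prob} is outside the bands")


def filter_shared_race_bins(runners, edges):
    """Drop runners that share a (race, band) cell with another runner.

    runners: list of (race_id, implied_prob, won) tuples
    Returns a list of (race_id, band, implied_prob, won) tuples in which each
    race contributes at most one horse to any given band.
    """
    banded = [
        (race_id, assign_band(ip, edges), ip, won)
        for race_id, ip, won in runners
    ]
    # Count how many runners from each race fall into each band.
    cell_counts = Counter((race_id, band) for race_id, band, _, _ in banded)
    return [row for row in banded if cell_counts[(row[0], row[1])] == 1]


# Hypothetical example: race "r1" has two horses in the lowest band,
# so both are removed; the remaining three runners are kept.
toy_runners = [
    ("r1", 0.05, 0), ("r1", 0.07, 1), ("r1", 0.30, 0),
    ("r2", 0.06, 0), ("r2", 0.40, 1),
]
toy_edges = [0.0, 0.1, 0.2, 0.5, 1.0]
filtered = filter_shared_race_bins(toy_runners, toy_edges)
```

The filtered sample is smaller than the original, which is the efficiency cost mentioned in the text.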

Many reasons have been proposed for the existence of the FL-Bias. Weitzman (1965) and Ali (1977) estimated the form of the utility function of the representative bettor from the actions of the average bettor, assuming expected utility maximisation. It was found that bettors were risk loving: they prefer an uncertain prospect with a particular expected value to the prospect of obtaining the same expected value with certainty, so their utility functions are convex[3]. Bettors favouring bets on longshots will contract the dividends on those bets.

One explanation for fixed-odds bookmaker betting markets is that betting on favourites is much more competitive than on longshots. Henery (1988) proposes that this is because bettors who bet on favourites are more knowledgeable and are able to shop around for the best odds. A more prominent explanation is provided by Shin (1991, 1992 and 1993), where the presence of insiders causes bookmakers to set odds with a FL-Bias. Shin constructed a model of optimal odds determination with a fixed-odds, risk neutral bookmaker facing a fraction $z$ of ‘insiders’ who will always bet on the winning horse. In his model, bookmakers compete to gain monopoly rights to the market by submitting prices $\pi_i$ with an overround (the sum of which is $\beta = \sum_i \pi_i$). Bookmakers set these prices knowing $z$ and $p_i$, the correct winning probabilities of all the horses. The bookmaker offering the lowest $\beta$ earns monopoly rights to the market (in the event of a tie, the rights go to the incumbent bookmaker); this ensures a competitive element to the model so that the bookmaker makes zero profits on average. The bookmaker faces one bettor drawn from the population who will place a one dollar bet with him. The bookmaker does not set a zero margin because he may be trading against an insider; on the other hand, the bookmaker does not decline bets because he may be trading against an outsider. The margin is there to protect him against the insiders. There is a probability $z$ that the bettor is an insider, in which case the bookmaker has to make a payout of the odds of the horse. On the other hand, there is a probability $(1-z)$ that the dollar bet will be placed by an outsider whose decisions are based on the correct winning probabilities ($p_i$) of all the horses, in which case the expected payout is $p_i \times \text{odds}_i$. The bookmaker sets odds by minimizing the expected payout subject to $\sum_i \pi_i \le \beta$ and $0 \le \pi_i \le 1$ (i = 1, 2, …, n). The solution of this optimization problem is:

$$\pi_i = \frac{\beta \sqrt{z p_i + (1-z) p_i^2}}{\sum_{s=1}^{n} \sqrt{z p_s + (1-z) p_s^2}} \qquad (2.1)$$

There will be a FL-Bias if, when horse i is a favourite relative to horse j ($p_i > p_j$), the relative prices satisfy $\pi_i/\pi_j < p_i/p_j$. This condition is satisfied if:

(2.2)

i.e. a FL-Bias exists provided there is more insider trading when a longshot is subject to positive inside information, which accords with intuition. The intuition behind the FL-Bias in the Shin model is that “the bookmaker will require a greater risk premium to insure himself against the possibility of inside information on a longshot”[4]. In other words, there will be a greater markup on the prices of longshots. Cain, Law and Peel (2001) find that this measure of insider trading is closely related to another measure suggested by Crafts (1985), which is based on significant contractions in the odds of a particular horse on the day of the race (see Chapter 3). Barring bookmakers artificially contracting the odds for collusive purposes (which is unlikely), a significant contraction in the odds offered by bookmakers for a certain horse indicates an abnormal level of support for that horse, because some bettors (who may be insiders) believe that bets placed on it are good value.
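
To make the mechanism concrete, the following Python sketch computes prices under a simplified version of the Shin model. It rests on several assumptions that are not confirmed by the text: a single insider fraction z that is the same for every horse, prices pi_i chosen to minimise the expected payout sum_i (z p_i + (1-z) p_i^2) / pi_i subject to the prices summing to beta, and the constraint that each price lies between 0 and 1 being slack. The function names and the example probabilities are hypothetical.

```python
import math


def shin_prices(probs, z, beta):
    """Prices minimising expected payout for a given price sum beta.

    Assumed objective: sum_i (z * p_i + (1 - z) * p_i**2) / pi_i,
    subject to sum_i pi_i = beta. The minimiser is proportional to
    sqrt(z * p_i + (1 - z) * p_i**2).
    """
    roots = [math.sqrt(z * p + (1.0 - z) * p * p) for p in probs]
    total = sum(roots)
    return [beta * r / total for r in roots]


def zero_profit_beta(probs, z):
    """Price sum at which the expected payout on a one dollar bet equals one."""
    roots = [math.sqrt(z * p + (1.0 - z) * p * p) for p in probs]
    return sum(roots) ** 2


# Hypothetical three-horse race with 2% insider trading.
true_probs = [0.6, 0.3, 0.1]
insider_share = 0.02
beta = zero_profit_beta(true_probs, insider_share)
prices = shin_prices(true_probs, insider_share, beta)

# Markup of each price over the true probability; under these assumptions
# it rises as the true probability falls, i.e. it is largest for the longshot.
markups = [price / p for price, p in zip(prices, true_probs)]
```

With a common z greater than zero, the ratio of prices between a favourite and a longshot is smaller than the ratio of their true probabilities, which is the FL-Bias pattern described above.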

Another consequence of the Shin model is that the overround should be positively correlated with the number of competitors in the event; “ceteris paribus, a larger field of competitors leads to higher odds against any individual winning the event and thus higher winnings for insiders. In these circumstances, bookmakers need enhanced margins to protect themselves”[5]. This is the case for the 2003 data, where the correlation coefficient between the number of runners and the opening overround is 0.72, and that for the starting overround is 0.81.
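
A correlation of this kind could be computed along the following lines. The sketch assumes that odds are quoted as fractional odds against (so 4/1 is 4.0), that the implied probability is 1/(1 + odds), and that the overround is the sum of implied probabilities in a race minus one. The races shown are invented for illustration and are not drawn from the 2003 data.

```python
import math


def implied_probability(odds_against):
    """Implied win-probability from fractional odds against (e.g. 4/1 -> 4.0)."""
    return 1.0 / (1.0 + odds_against)


def overround(odds_list):
    """Sum of implied probabilities in a race, minus one (assumed convention)."""
    return sum(implied_probability(o) for o in odds_list) - 1.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Invented races: each inner list holds the odds of every runner in one race.
toy_races = [
    [1.5, 2.5, 4.0, 6.0],
    [2.0, 3.0, 4.0, 6.0, 8.0, 10.0],
    [3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 12.0, 16.0],
    [2.0, 4.0, 5.0, 6.0, 8.0, 10.0, 12.0, 14.0, 20.0, 25.0],
]
field_sizes = [len(race) for race in toy_races]
overrounds = [overround(race) for race in toy_races]
correlation = pearson(field_sizes, overrounds)
```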