Maximum Likelihood Estimation of Sports Team Capabilities and Prediction of Competition Outcomes

David J. Murrow

2/8/2014

It is often desirable to predict the outcome of an upcoming game or series of games between two sports teams, based both on the results of previous games between those two teams and on the results of all games among the teams in the league. The theory presented in this paper applies to competitions between sports teams such as baseball, football, basketball, and hockey teams. The prediction is based on a probabilistic model of the two teams' scoring, which uses a quantitative measure of each team's inherent offensive and defensive capabilities. The basic theory uses all available previous game scores in the league to form a maximum likelihood estimate (MLE) of those capabilities. It should also be easily extensible to account for secondary factors such as personnel injuries, home team advantage, and playing surface.

The inherent offensive capability, $\alpha$, of a sports team is defined in this paper as the expected number of points the team would score against an ideal reference defensive team in the same sport. The reference team need not exist. It is used only to form an initial a priori assessment of a team's offensive and defensive capabilities before any game scores are available. The inherent defensive capability, $\delta$, of a sports team is defined here as the number of points that the ideal reference offense would be expected to score against its own reference defense, divided by the expected number of points the team would allow against that reference offense, so that a larger $\delta$ denotes a stronger defense and the expected score of an offense $\alpha$ against a defense $\delta$ is $\alpha/\delta$. Note that by this definition, the reference team's defensive capability is unity, i.e., $\delta = 1$.

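The actual number of points a team would score in a given game can be reasonably modeled as a Poisson random variable. For a random variable $n$ with mean $\mu$, the Poisson probability density function (PDF) and distribution function (DF) are described by

$$p(n) = \frac{\mu^{n} e^{-\mu}}{n!}, \quad n = 0, 1, 2, \ldots \qquad F(N) = \sum_{n=0}^{N} p(n) = 1 - P(N+1, \mu)$$

where $P(N,x)$ is Pearson's Incomplete Gamma Function, defined by

$$P(N, x) = \frac{1}{\Gamma(N)} \int_{0}^{x} t^{N-1} e^{-t} \, dt$$

The expected value of the random variable, $n$, is

$$E[n] = \sum_{n=0}^{\infty} n \, p(n) = \mu$$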

The Poisson PDF models the scoring of a sports team well, as indicated in Figure DJM-1, which compares the scoring of the Boston Celtics NBA basketball team against all opponents in the 1999-2000 season to a Poisson PDF with the same mean.

Figure DJM-1 Poisson PDF vs Actual Scoring of Boston Celtics
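As a numerical check on the Poisson model described above, the following Python sketch evaluates the PDF and DF and confirms that the mean is recovered. The function names and the mean of 60 points are illustrative assumptions, not values taken from the Celtics data.

```python
import math

def poisson_pmf(n, mu):
    """Probability of exactly n points when the expected score is mu (mu > 0)."""
    # Work in log space so that large means (basketball-sized scores) do not overflow.
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def poisson_cdf(n_max, mu):
    """Probability of scoring n_max points or fewer."""
    return sum(poisson_pmf(n, mu) for n in range(n_max + 1))

mu = 60.0  # hypothetical expected score, chosen only for illustration
# Truncating the sum at 400 is an assumption; the tail beyond it is negligible here.
mean = sum(n * poisson_pmf(n, mu) for n in range(400))
print(f"numerical mean = {mean:.6f}")
print(f"P(score <= 60) = {poisson_cdf(60, mu):.4f}")
```

Summing the PDF directly keeps the sketch dependency-free; a library routine for the incomplete gamma function could replace `poisson_cdf` where speed matters.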

Consider two sports teams A and B with offensive capabilities $\alpha_A$, $\alpha_B$ and defensive capabilities $\delta_A$, $\delta_B$. For example, suppose $\alpha_A = 75$ and $\delta_A = 1.12$, while $\alpha_B = 70$ and $\delta_B = 1.25$.

Then the expected score by team A against team B is $\alpha_A/\delta_B = 75/1.25 = 60$, while the expected score by team B against team A is $\alpha_B/\delta_A = 70/1.12 = 62.5$.

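In the theory offered in this paper, if these two teams play, the scores of the game are random variables drawn from Poisson PDFs. Let $n_{A/B}$ be the number of points scored by team A against team B and $n_{B/A}$ be the number scored by team B against team A. Then, with $\alpha$ denoting offensive capability and $\delta$ denoting defensive capability, the Poisson PDFs of $n_{A/B}$ and $n_{B/A}$ proposed here are

$$p_{A/B}(n) = \frac{(\alpha_A/\delta_B)^{n} \, e^{-\alpha_A/\delta_B}}{n!}, \qquad p_{B/A}(n) = \frac{(\alpha_B/\delta_A)^{n} \, e^{-\alpha_B/\delta_A}}{n!}$$

Treating the two scores as independent, these two PDFs can be used to calculate the probability that team A beats team B and vice versa, as

$$\Pr(A \text{ beats } B) = \sum_{n=1}^{\infty} p_{A/B}(n) \sum_{m=0}^{n-1} p_{B/A}(m), \qquad \Pr(B \text{ beats } A) = \sum_{n=1}^{\infty} p_{B/A}(n) \sum_{m=0}^{n-1} p_{A/B}(m)$$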

Continuing the example above, the probability that team A beats team B is 0.42, while the probability that team B beats team A is 0.53. The probability that the two teams tie is 0.05. In some sports, ties are accepted game outcomes and are not resolved (e.g., soccer and ice hockey); in others they must be resolved. Modeling that resolution would require the two PDFs to be interdependent, which would complicate the math somewhat without changing the central idea that game results can be predicted from previous game scores. Furthermore, the probability of a tie under independent Poisson PDFs is small in fairly high-scoring games and does not contribute much to the overall prediction probabilities.
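The win, loss, and tie probabilities for two independent Poisson scores can be computed by direct summation, as in the Python sketch below. The function names and the truncation point `n_max` are assumptions made for illustration. The figures it prints depend on the independence assumption and on how the sums are truncated, so they need not match the rounded values quoted above.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability of exactly n points given expected score mu."""
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def outcome_probabilities(off_a, def_a, off_b, def_b, n_max=400):
    """Return (P(A wins), P(B wins), P(tie)) for independent Poisson scores.

    n_max truncates the infinite sums; it is an assumption, chosen far above
    any plausible score for the means used here.
    """
    mu_ab = off_a / def_b  # expected points by A against B
    mu_ba = off_b / def_a  # expected points by B against A
    p_a_wins = p_b_wins = p_tie = 0.0
    below_a = below_b = 0.0  # P(score < n) for A and for B, updated as n grows
    for n in range(n_max + 1):
        pa = poisson_pmf(n, mu_ab)
        pb = poisson_pmf(n, mu_ba)
        p_a_wins += pa * below_b  # A scores n, B scores fewer
        p_b_wins += pb * below_a  # B scores n, A scores fewer
        p_tie += pa * pb          # both score exactly n
        below_a += pa
        below_b += pb
    return p_a_wins, p_b_wins, p_tie

# Example capabilities from the text: offense 75 and 70, defense 1.12 and 1.25.
p_a, p_b, p_t = outcome_probabilities(75.0, 1.12, 70.0, 1.25)
print(f"A wins {p_a:.3f}, B wins {p_b:.3f}, tie {p_t:.3f}")
```

The running sums `below_a` and `below_b` avoid a nested loop, so the cost is linear in `n_max`.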

To predict game scores, we must know or estimate the offensive and defensive capabilities of each team in the league. This estimate should be based on all available previous game score data.

The game data can be entered as each occurs, or later in groups at the convenience of the statistician. The teams in the league must be numbered 1 to N.

An $N \times 1$ offensive capability vector may also be defined for the league of teams, with entries $\alpha_n$, along with an $N \times 1$ defensive capability vector with entries $\delta_n$.

Each game is also numbered sequentially, in the order entered, by date played, or otherwise. Suppose that in game $r$ the two teams are $n_a(r)$ and $n_b(r)$, and the corresponding scores are $x_a(r)$ and $x_b(r)$. Then after $R$ games, the team numbers and scores constitute four $R \times 1$ vectors, $n_a$, $n_b$, $x_a$, $x_b$. The corresponding four $R \times 1$ team capability vectors are then $\alpha(n_a(r))$, $\delta(n_a(r))$, $\alpha(n_b(r))$, $\delta(n_b(r))$.

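With $\alpha(n)$ and $\delta(n)$ denoting the offensive and defensive capabilities of team $n$, the joint PDF of all the game data would be

$$L = \prod_{r=1}^{R} \frac{\mu_a(r)^{x_a(r)} \, e^{-\mu_a(r)}}{x_a(r)!} \cdot \frac{\mu_b(r)^{x_b(r)} \, e^{-\mu_b(r)}}{x_b(r)!}, \qquad \mu_a(r) = \frac{\alpha(n_a(r))}{\delta(n_b(r))}, \quad \mu_b(r) = \frac{\alpha(n_b(r))}{\delta(n_a(r))}$$

In order to find the MLEs of the team capabilities, this likelihood function should be maximized. For this purpose, it is equivalent to maximize its natural logarithm, given below.

$$\Lambda = \ln L = \sum_{r=1}^{R} \left[ x_a(r) \ln \mu_a(r) - \mu_a(r) - \ln x_a(r)! + x_b(r) \ln \mu_b(r) - \mu_b(r) - \ln x_b(r)! \right]$$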
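A minimal Python sketch of the log-likelihood of a set of game results is shown here, assuming independent Poisson scores whose means are the scoring team's offensive capability divided by the opposing team's defensive capability. The tuple layout, the zero-based team numbering, and all sample numbers are hypothetical and chosen only for illustration.

```python
import math

def log_likelihood(games, offense, defense):
    """Log of the joint Poisson PDF of all game scores.

    games   : list of (team_a, team_b, score_a, score_b), teams numbered from 0
    offense : list of offensive capabilities, one per team
    defense : list of defensive capabilities, one per team
    """
    total = 0.0
    for team_a, team_b, score_a, score_b in games:
        mu_a = offense[team_a] / defense[team_b]  # expected points by team A
        mu_b = offense[team_b] / defense[team_a]  # expected points by team B
        # lgamma(x + 1) equals ln(x!), which avoids huge factorials.
        total += score_a * math.log(mu_a) - mu_a - math.lgamma(score_a + 1)
        total += score_b * math.log(mu_b) - mu_b - math.lgamma(score_b + 1)
    return total

# Hypothetical two-team league with a single game, for illustration only.
sample_games = [(0, 1, 62, 58)]
good_fit = log_likelihood(sample_games, [62.0, 58.0], [1.0, 1.0])
poor_fit = log_likelihood(sample_games, [40.0, 90.0], [1.0, 1.0])
print(good_fit > poor_fit)  # capabilities that match the scores are more likely
```

Working with the logarithm turns the product over games into a sum, which is numerically stable even for a full season of scores.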

Let $r_a(n,:)$ be the $k_a(n) \times 1$ vector of game numbers where team $n$ is in the A team vector, i.e., $n_a(r_a(n,k)) = n$. Let $r_b(n,:)$ be the $k_b(n) \times 1$ vector of game numbers where team $n$ is in the B team vector, i.e., $n_b(r_b(n,k)) = n$.

Let n be the sum of all the terms in the log likelihood function above where n appears, so

The value of  can be maximized wrt n by maximizing n wrt n , which can be accomplished by setting the derivative of n wrt n equal to zero, i.e.,

Let n be the sum of all the terms in the log likelihood function above where n appears, so

The value of  can be maximized wrt n by maximizing n wrt n , which can be accomplished by setting the derivative of n wrt n equal to zero, i.e.,

These $2N$ equations for the MLEs of the offensive capabilities $\alpha_n$ and defensive capabilities $\delta_m$ are non-linear in the $2N$ variables and are difficult to solve directly. However, the equations must be satisfied if the variable values are to represent MLEs. Fortunately, they can be easily solved iteratively. Let $\alpha_n^{(i)}$ and $\delta_m^{(i)}$ be the results of $i$ iterations. Then let

The initial values for the iteration can be the results from the previous rounds of play if those are available. This improves iteration convergence times. The above iteration equations have been coded into a Matlab computer program designed to assimilate actual game score data and determine the MLEs of the offensive and defensive capabilities of a league of sports teams in competition. For the example cases tried to date, the iteration process above converges rapidly within 2 to 3 iterations to a set of estimates that very nearly satisfies the MLE equations.
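One fixed-point scheme consistent with setting the derivatives of the Poisson log-likelihood to zero is sketched below in Python. It is not the Matlab program mentioned above: the alternating update order, the function name, the zero-based team numbering, and the sample scores are all assumptions made for illustration. Under the model, each team's offensive capability equals its total points scored divided by the sum of reciprocal opposing defensive capabilities, and its defensive capability equals the sum of opposing offensive capabilities divided by its total points allowed.

```python
def fit_capabilities(games, n_teams, defense0=1.0, n_iter=50):
    """Alternating fixed-point estimate of offensive and defensive capabilities.

    games : list of (team_a, team_b, score_a, score_b), teams numbered from 0.
    Assumes every team has played and has both scored and allowed points.
    """
    defense = [float(defense0)] * n_teams
    offense = [0.0] * n_teams
    scored = [0.0] * n_teams
    allowed = [0.0] * n_teams
    for a, b, xa, xb in games:
        scored[a] += xa
        allowed[a] += xb
        scored[b] += xb
        allowed[b] += xa
    for _ in range(n_iter):
        # Offense update: total points scored / sum of 1/defense over opponents.
        inv_def = [0.0] * n_teams
        for a, b, xa, xb in games:
            inv_def[a] += 1.0 / defense[b]
            inv_def[b] += 1.0 / defense[a]
        offense = [scored[n] / inv_def[n] for n in range(n_teams)]
        # Defense update: sum of opponents' offense / total points allowed.
        opp_off = [0.0] * n_teams
        for a, b, xa, xb in games:
            opp_off[a] += offense[b]
            opp_off[b] += offense[a]
        defense = [opp_off[n] / allowed[n] for n in range(n_teams)]
    return offense, defense

# Hypothetical three-team schedule, for illustration only.
games = [(0, 1, 100, 90), (1, 2, 95, 99), (2, 0, 88, 104), (0, 1, 97, 93)]
off, dfn = fit_capabilities(games, n_teams=3)
print([round(v, 2) for v in off], [round(v, 3) for v in dfn])
```

Because only the ratios of offensive to defensive capability enter the model, multiplying every capability by the same constant leaves the likelihood unchanged; in this sketch the starting defensive values fix that overall scale.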

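Once we have the estimates of the offensive capabilities $\alpha$ and defensive capabilities $\delta$, the next problem is to predict the outcome of an upcoming game between two teams, say $n$ and $m$. Let $x_n$ be the score of team $n$ and $x_m$ be the score of team $m$. The corresponding PDFs are

$$p(x_n) = \frac{(\alpha_n/\delta_m)^{x_n} \, e^{-\alpha_n/\delta_m}}{x_n!}, \qquad p(x_m) = \frac{(\alpha_m/\delta_n)^{x_m} \, e^{-\alpha_m/\delta_n}}{x_m!}$$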

The MLE of the number of points scored by each team may be found by maximizing the above likelihood functions or, equivalently, by maximizing their natural logarithms, i.e.,

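Another problem is to rank the teams relative to each other using a single number that reflects their relative overall capability. A logical ranking could be based on the ratio of a team's average scoring by its offense against the rest of the teams to the average scoring against its defense by the rest of the teams. With $\alpha$ denoting offensive capability and $\delta$ denoting defensive capability, these can be calculated as

$$\bar{s}_n = \frac{1}{N-1} \sum_{m \ne n} \frac{\alpha_n}{\delta_m}, \qquad \bar{a}_n = \frac{1}{N-1} \sum_{m \ne n} \frac{\alpha_m}{\delta_n}, \qquad \rho_n = \frac{\bar{s}_n}{\bar{a}_n}$$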
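The ranking ratio described above can be sketched in Python as follows, assuming the averages are taken over expected scores computed from the estimated capabilities. The function name is hypothetical, and the two-team input simply reuses the example values from earlier in the paper.

```python
def power_ratings(offense, defense):
    """Ratio of average expected points scored to average expected points allowed.

    offense, defense : lists of estimated capabilities, one entry per team.
    """
    n_teams = len(offense)
    ratings = []
    for n in range(n_teams):
        others = [m for m in range(n_teams) if m != n]
        # Average expected score of team n's offense against every other defense.
        avg_for = sum(offense[n] / defense[m] for m in others) / len(others)
        # Average expected score of every other offense against team n's defense.
        avg_against = sum(offense[m] / defense[n] for m in others) / len(others)
        ratings.append(avg_for / avg_against)
    return ratings

# Teams A and B from the earlier example: offense 75 and 70, defense 1.12 and 1.25.
ratings = power_ratings([75.0, 70.0], [1.12, 1.25])
print([round(r, 3) for r in ratings])  # 60/62.5 and 62.5/60
```

A rating above 1 means the team is expected to outscore an average opponent; sorting teams by this value gives the ranking.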

Some example results from the NBA and the NFL are included next to show the performance of the estimation and prediction processes.

National Basketball Association(NBA)—Example Results

The NBA as structured in the 1999-2000 season had 29 teams divided into 4 divisions as shown below. The season consisted of 82 regular season games for each team. Game scores from that season were downloaded for analysis from the internet CBSsportsline website. Unfortunately, the 1189 game scores from the regular season were not in a format that could be readily imported into a Matlab computer analysis program, so some of the data were typed into the program manually from the attached print-out of all of the game scores. As a consequence, only the scores from 80 games were entered into the Matlab analysis program database. The defensive and offensive capabilities of each team were estimated after each round of 16 games, during which each team played at least once. Initial offensive capability estimates and defensive capability estimates of 1 were used in the MLE iteration process. The MLEs of the offensive and defensive capabilities of selected teams vs. time are shown in Figure DJM-2.

An overall power rating of the teams based on the average estimated offensive to defensive capabilities is shown in Table DJM-2, indicating that Los Angeles had the highest power ranking, even based on as little as five rounds of play sampled over the regular season.

Table DJM-2 Summary of Offensive and Defensive Capability MLEs for NBA—1999-2000 Season

Based on the 75 games sampled, the probability that LA would beat Philadelphia in a game is

8/5/01