Abbreviations

Because I have used many baseball abbreviations throughout the paper, I have included this list to clarify any confusion.

1B: singles

2B: doubles

3B: triples

AB: at bats

BB: walks

CS: caught stealing

ERA: earned run average

GIDP: ground into double play

HBP: hit by pitch

HR: home run

IBB: intentional walk

PA: plate appearances

RBI: runs batted in

RBOE: reached base on error

RC: runs created

SB: stolen bases

SF: sacrifice fly

SH: sacrifice hit

TB: total bases

uBB: unintentional walk

WAR: wins above replacement

wOBA: weighted on-base average

wRAA: weighted runs above average

WS: win shares

Emma Baccellieri

Dr. Bray

Math 89S—Game Theory and Democracy

November 6th, 2012

Sabermetrics Versus Subjectivity in the Search for Baseball’s Greatest Player

Baseball is anchored in tradition, a game beautifully steady and unwavering through the change and tumult of American history: “the one constant through all the years,” in the dialogue of Field of Dreams. But despite its aura of stability and permanence, baseball inherently contains many discrepancies that make it difficult to compare the abilities of players of different positions, ballparks, or eras. The major statistics (batting average, RBIs, and home runs for hitting ability; ERA and wins for pitching; errors and mere individual opinion for fielding) function relatively well as raw, basic indicators of performance, but they fail to account for random occurrences and for nuances of the game, and they do not offer any compensation for variations across seasons or stadiums. Consequently, for the majority of the game’s history, the debate concerning the all-time greatest players was one based largely on opinion and subjectivity: while numbers were undeniably important, how could one fairly compare the hitting statistics of a player in the dead-ball era of the early 1900s to those of a modern player? The statistics of a player before and after the pitching mound was lowered in 1969? Of players with home fields of completely different dimensions? The problems facing the assessment of defensive ability were even more complicated, due to the lack of statistics that could accurately measure fielding prowess for any position, let alone one that would allow positional comparison.

In the late 1970s, however, a more sophisticated method of quantifying the game was developed: sabermetrics, described by its founder, Bill James, as “the search for objective knowledge about baseball” (Grabiner). From James’s publication of the 1977 Baseball Abstract: Featuring 18 Categories of Statistical Information That You Just Can’t Find Anywhere Else to the present day, sabermetrics have grown more and more advanced, allowing for more specific and precise measures not only of offensive and defensive ability, but also of overall player quality (Gray). These statistics of all-around player value allow for a clearly objective comparison of players: essentially, in theory, definitive conclusions to the age-old dilemma of determining the game’s all-time greats. They have been far from universally accepted, however, and perhaps rightfully so: baseball, critics of sabermetrics argue, is much more than arithmetic; it is impossible for a single number to describe a player’s entire career, for one cold statistic to answer the eternal question of who the game’s greatest players are.

These opposing views highlight an extremely compelling subject: in deciding baseball’s best players, how does the absolute objectivity of sabermetrics compare to the long-utilized (albeit subjective) combination of simple statistics, opinion, and gut instinct? An analysis of the ten greatest position players of the twentieth century, comparing lists determined by various sabermetric statistics with those decided on by baseball experts and fans, reveals that the two schools of thought produce very different conclusions.

One of the earliest sabermetric attempts to measure overall player quality with a single statistic was the total player rating, or TPR, designed in 1984 by Pete Palmer and John Thorn (Cale). Though TPR has largely fallen out of favor in the past decade as more advanced statistics have been developed, it is significant in that it laid the foundation for many of these more recent statistics. TPR gives a weight to each type of offensive and defensive play that a player can make according to how likely it is that each given play will result in a run. These weighted values are then used to calculate the player’s performance as being above or below zero runs produced for his team, with zero representing the performance of the average player (Lichtman). Offensive ability is measured through batting runs, defensive ability is measured through fielding runs, and base stealing runs are an additional category. The batting runs value is adjusted for differences in ballpark environments by multiplying by a scoring factor that is calculated by comparing runs scored in the given stadium to the league average. A further adjustment to the batting runs figure accounts for fielding position: a number is added to or subtracted from the final value based on the defensive difficulty of the player’s position. The current scale is as follows (this scale was not yet designed when TPR was developed; it was created by Tom Tango in 2007 and has since been well established as the best version):

1) Catcher: +12.5 runs

2) Shortstop: +7.5 runs

3) Second base: +2.5 runs

3) Third base: +2.5 runs

3) Center field: +2.5 runs

6) Left field: -7.5 runs

6) Right field: -7.5 runs

8) First base: -12.5 runs

9) Designated hitter: -17.5 runs

The reasoning of the scale is that if two players have equal offensive statistics, the player with the position higher on the ladder is more valuable to his team. For example, catcher is recognized as the most difficult position and occupies the highest place on the scale as a result—if a catcher and, for example, a left fielder have identical hitting records, the catcher will be more valuable to his team by 20 runs (12.5 – [-7.5]) (Cameron).
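The catcher and left fielder comparison can be checked with a short sketch. The dictionary below simply copies the scale listed above; the name `positional_gap` and the lowercase position labels are illustrative choices, not part of any published formula.

```python
# Positional adjustments in runs, copied from the scale listed in the text.
# The dictionary keys are illustrative labels chosen for this sketch.
POSITION_ADJUSTMENT = {
    "catcher": 12.5,
    "shortstop": 7.5,
    "second base": 2.5,
    "third base": 2.5,
    "center field": 2.5,
    "left field": -7.5,
    "right field": -7.5,
    "first base": -12.5,
    "designated hitter": -17.5,
}


def positional_gap(position_a: str, position_b: str) -> float:
    """Runs by which position_a outranks position_b, hitting records being equal."""
    return POSITION_ADJUSTMENT[position_a] - POSITION_ADJUSTMENT[position_b]


print(positional_gap("catcher", "left field"))  # 20.0
```

The same function shows, for instance, that a shortstop is credited with 20 more runs than a first baseman with an identical batting line.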

After batting runs have been adjusted for differences in ballpark and position, they are added to fielding runs and base stealing runs. The resulting sum is the average number of runs that a player can be said to contribute to their team as compared to the average. This sum is divided by ten to convert runs to wins, using the standard conversion factor of 10 runs by one player = 1 win. The overall TPR formula, according to baseball writer Trace Wood, is as follows:

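\[
\mathrm{TPR} = \frac{\text{adjusted batting runs} + \text{fielding runs} + \text{base stealing runs}}{10},
\]

where adjusted batting runs are batting runs corrected for ballpark and fielding position.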
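As a rough illustration of the steps just described, the following sketch multiplies batting runs by a park factor, adds the positional adjustment, sums the three run categories, and divides by ten. The function and parameter names are hypothetical, the season figures are made up, and the way the adjustments are combined follows only the prose description in this paper rather than Palmer and Thorn's full method.

```python
RUNS_PER_WIN = 10.0  # conversion factor stated in the text


def total_player_rating(batting_runs: float, park_factor: float,
                        position_runs: float, fielding_runs: float,
                        stealing_runs: float) -> float:
    """Sketch of TPR as described in the text (an assumption, not the official formula)."""
    # Adjust batting runs for ballpark (multiplicative) and position (additive).
    adjusted_batting = batting_runs * park_factor + position_runs
    # Sum the three run categories, then convert runs to wins.
    total_runs = adjusted_batting + fielding_runs + stealing_runs
    return total_runs / RUNS_PER_WIN


# Made-up season: 30 batting runs in a neutral park, a shortstop (+7.5),
# 5 fielding runs, and 2.5 base stealing runs.
print(total_player_rating(30.0, 1.0, 7.5, 5.0, 2.5))  # 4.5
```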

TPR has become less popular in recent years for several reasons. One of the formula’s mistakes is its equal weighting of base stealing runs with fielding runs and batting runs: players generally attempt to steal bases more often in times of pressure, so the statistic can be skewed by the types of games in which players find themselves, a relatively random factor. An additional problem is that the average player produces more than zero runs for his team, meaning that the whole comparative foundation of the statistic is problematic. However, the formula’s major error lies in its calculation of fielding runs, which is a relatively simplistic measure of defensive ability: it does not account for the number of balls per inning that are hit in the player’s range, it does not account for differences in ballpark dimensions, and it is generally thought to overemphasize infield putouts and double plays. While TPR is far from perfect, it serves as an example of how sabermetrics can be flawed and makes for an interesting comparison with more advanced statistics (Lichtman).

A more sophisticated overall player value statistic is win shares (WS), developed by Bill James in 2002. While WS are somewhat similar to TPR in that both statistics determine the number of wins a player contributes to his team, the formula for WS is far more detailed. A win share is defined as one-third of a win, and the statistic is simply a measure of how many win shares a player is responsible for; therefore, a player with thirty win shares is responsible for ten wins. James begins calculating the statistic by determining what percentage of a team’s wins is accounted for by offense and what percentage by defense: the team’s offensive and defensive win shares. This is done with a comparison of the team’s wins, losses, and runs scored and given up to the league’s average number of wins, losses, and runs in the given season. The offensive and defensive win shares are then assigned to the team’s individual players (Studeman, Baseball Graphs).

To determine the offensive win shares of a player, a statistic known as runs created (RC), also developed by James, is used. RC measures essentially all aspects of a player’s offensive ability, and the formula is rather complicated as a result. It encompasses an on-base factor (A), an advancement factor (B), and an opportunity factor (C). The formula is the following (Appelman):

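\[
\mathrm{RC} = \frac{A \times B}{C},
\]

where A is the on-base factor, B is the advancement factor, and C is the opportunity factor.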

All of the above values are adjusted for a player’s home ballpark, just as they would be for calculating TPR. After the RC for a given player is calculated, the league average RC for that year is subtracted so that the player can be compared to the average. To determine the player’s offensive win shares, RC must be calculated for every member of the team so that each player’s percentage of the team’s offensive production can be found. The player’s percentage is then multiplied by the team’s total offensive win shares to determine the player’s individual offensive win shares.
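The proportional allocation in the last step can be sketched in a few lines. This is a simplified illustration only: it takes each player's runs created as given, skips the ballpark and league-average adjustments, and uses hypothetical names (`allocate_offensive_win_shares`, `Player A`) and made-up numbers.

```python
WIN_SHARES_PER_WIN = 3  # a win share is defined in the text as one-third of a win


def allocate_offensive_win_shares(player_rc: dict, team_offensive_ws: float) -> dict:
    """Split a team's offensive win shares in proportion to each player's runs created."""
    team_rc = sum(player_rc.values())
    return {
        name: team_offensive_ws * rc / team_rc
        for name, rc in player_rc.items()
    }


# Made-up three-player "team" with 60 offensive win shares to distribute.
shares = allocate_offensive_win_shares(
    {"Player A": 50, "Player B": 30, "Player C": 20}, 60
)
print(shares)  # {'Player A': 30.0, 'Player B': 18.0, 'Player C': 12.0}

# Thirty win shares correspond to ten wins, as in the example above.
print(shares["Player A"] / WIN_SHARES_PER_WIN)  # 10.0
```

Because every player's share is a fraction of the same team total, the individual values always sum back to the team's offensive win shares.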

The formula for a player’s defensive win shares is quite intensive; essentially, the team’s defensive win shares are divided between pitching and fielding using a variety of complicated statistics, and the fielding win shares are then allocated to specific players based on position-specific factors such as assists, put-outs, errors, arm ratings, and double plays. As with offensive win shares, these calculations must be carried out for every member of a team so that the individual players’ percentages of the shares may be determined. A player’s offensive and defensive win shares are then added to find his total win shares (Studeman, Baseball Graphs).

As the calculations are done as percentages of a whole, a player’s individual win share value is not affected by his team’s wins and losses, making the statistic fair for players on both winning and losing teams. Due to this fairness and the detail of its equations, win shares are generally viewed as a relatively accurate indicator of a player’s effectiveness (Studeman, Hardball Times).

The most popular sabermetric measure of all-around player quality is, by far, wins above replacement, or WAR. WAR gauges the number of wins that a player generates in comparison to the number that would be produced by the typical replacement player, usually defined as the average AAA player. While there are several methods of calculating WAR, all involve summing an offensive statistic and a defensive statistic, normalizing the values for ballpark and position (in the same fashion as was done with TPR), and converting runs to wins by a factor of ten. The only difference among the methods lies in the calculation of the offensive and defensive statistics, with various values being used, though the final product is typically relatively similar. Thus, while only Baseball Reference uses the following formula, it is very comparable to other WAR formulas and is conceptually identical, though not mathematically so (Slowinski).

Baseball Reference calculates WAR by summing batting runs, baserunning runs, grounded into double play runs, fielding runs, and positional adjustment runs—essentially, all runs a player contributes through both his offense and defense (“WAR Explained”).

Batting runs are calculated with weighted runs above average (wRAA), which is a statistic derived from weighted on-base average (wOBA). wOBA is an offensive statistic that attempts to give a weight, in terms of the likelihood that a run will score, to all possible batting outcomes. The weights fluctuate from year to year based on batting trends. The formula for wOBA, using the weights from 2007, the original year in which they were calculated, is as follows:

\[
\mathrm{wOBA} = \frac{0.72\,\mathrm{uBB} + 0.75\,\mathrm{HBP} + 0.90\,\mathrm{1B} + 0.92\,\mathrm{RBOE} + 1.24\,\mathrm{2B} + 1.56\,\mathrm{3B} + 1.95\,\mathrm{HR}}{\mathrm{PA}}
\]

wRAA is simply a version of wOBA that is normalized to the league in the given season and converts on-base average into runs. Its calculation is:

\[
\mathrm{wRAA} = \frac{\mathrm{wOBA} - \text{league wOBA}}{\text{wOBA scale}} \times \mathrm{PA},
\]

where the wOBA scale is a constant that converts wOBA points into runs per plate appearance.
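The two statistics can be sketched together as follows. The weights are the commonly cited 2007 values and should be treated as an assumption here, as should the league wOBA (0.320) and wOBA scale (1.15) in the example, which are illustrative figures rather than values for any particular season. The function names and the batting line are hypothetical.

```python
# Commonly cited 2007 wOBA weights (an assumption; weights vary by season).
WOBA_WEIGHTS_2007 = {
    "uBB": 0.72, "HBP": 0.75, "1B": 0.90, "RBOE": 0.92,
    "2B": 1.24, "3B": 1.56, "HR": 1.95,
}


def woba(counts: dict, plate_appearances: int) -> float:
    """Weighted on-base average: weighted sum of batting outcomes per plate appearance."""
    weighted = sum(w * counts.get(event, 0) for event, w in WOBA_WEIGHTS_2007.items())
    return weighted / plate_appearances


def wraa(player_woba: float, league_woba: float,
         woba_scale: float, plate_appearances: int) -> float:
    """Weighted runs above average: wOBA difference converted to runs over all PA."""
    return (player_woba - league_woba) / woba_scale * plate_appearances


# Made-up batting line over 650 plate appearances.
season = {"uBB": 60, "HBP": 5, "1B": 100, "RBOE": 5, "2B": 30, "3B": 5, "HR": 25}
player_woba = woba(season, 650)
print(round(player_woba, 3))                          # 0.362
print(round(wraa(player_woba, 0.320, 1.15, 650), 1))  # 23.7
```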

The WAR formula’s other components all involve very intensive calculations. Baserunning runs are determined from calculations of stolen bases, times caught stealing, and other, more subtle indicators of baserunning ability; ground into double play runs are calculated in relation to the average number of ground-outs into double plays. Fielding runs are calculated with defensive runs saved, a system that measures how many successful plays a fielder makes compared to the average at his position, and with total zone rating, a statistic that measures how well a player performs in his designated zone of coverage. Positional adjustment runs are drawn from the same scale as was discussed earlier with TPR. A player’s total run production is the sum of the aforementioned runs, and the runs for a “replacement” player are then subtracted from this value to determine the runs above replacement. Runs are converted into wins using the previously discussed factor of ten (“WAR Explained”).
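Conceptually, the final assembly reduces to a sum, a subtraction, and a division. The sketch below illustrates only that last step under the description given here; it is not Baseball Reference's actual computation, the names are hypothetical, and the run values are made up. The replacement level is expressed as runs relative to average, so it is negative.

```python
RUNS_PER_WIN = 10.0  # conversion factor used throughout the text


def wins_above_replacement(batting: float, baserunning: float, gidp: float,
                           fielding: float, positional: float,
                           replacement_runs: float) -> float:
    """Sum the five run components, subtract the replacement level, convert to wins."""
    runs_above_average = batting + baserunning + gidp + fielding + positional
    runs_above_replacement = runs_above_average - replacement_runs
    return runs_above_replacement / RUNS_PER_WIN


# Made-up season: a replacement player is assumed to be 20 runs below average.
print(wins_above_replacement(25.0, 2.0, -1.0, 4.0, 2.5, -20.0))  # 5.25
```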

Though WAR is somewhat mathematically complex, it is relatively simple on a conceptual level, which has allowed it to become one of the most popular of the advanced sabermetric statistics with fans, as well as with scouts and other professionals.

Interestingly enough, while TPR, WS, and WAR all aim to measure overall player quality, they yield markedly different results. Though all have Babe Ruth at number one, the rankings vary widely otherwise:

| Rank | TPR | WS | WAR |
|------|-----|----|-----|
| 1 | Babe Ruth (7.874) | Babe Ruth (39.920) | Babe Ruth (159.2) |
| 2 | Nap Lajoie (5.891) | Mickey Mantle (36.948) | Willie Mays (150.8) |
| 3 | Rogers Hornsby (5.882) | Ted Williams (56.739) | Ty Cobb (144.9) |
| 4 | Ted Williams (5.759) | Honus Wagner (36.159) | Hank Aaron (137.3) |
| 5 | Mike Schmidt (5.062) | Ty Cobb (35.788) | Tris Speaker (127.8) |
| 6 | Mickey Mantle (4.767) | Rogers Hornsby (34.332) | Honus Wagner (126.2) |
| 7 | Lou Gehrig (4.756) | Tris Speaker (34.051) | Rogers Hornsby (124.6) |
| 8 | Honus Wagner (4.532) | Joe Jackson (33.482) | Stan Musial (123.4) |
| 9 | Tris Speaker (4.437) | Willie Mays (33.303) | Ted Williams (119.8) |
| 10 | Willie Mays (4.388) | Lou Gehrig (32.802) | Eddie Collins (118.5) |

Sources: TPR and WS—Morong; WAR—Baseball Reference

There is nearly as much disparity within the subjective rankings as there is within the sabermetric ones. Listed below are the ten best position players of the twentieth century as determined by The Sporting News in 1998, members of the Society for American Baseball Research in 1999, Mark McGuire and Michael Sean Gormley in their 2000 book The 100 Greatest Baseball Players of the 20th Century Ranked, and a 2011-2012 ESPN fan vote that garnered over 8,000 votes, as well as the ten position players who have been elected to the Hall of Fame with the highest percentages of votes:

| Rank | TSN | SABR | McGuire/Gormley | ESPN | Hall of Fame |
|------|-----|------|-----------------|------|--------------|
| 1 | Babe Ruth | Babe Ruth | Babe Ruth | Babe Ruth | Cal Ripken (98.53%) |
| 2 | Willie Mays | Lou Gehrig | Willie Mays | Willie Mays | Ty Cobb (98.23%) |
| 3 | Ty Cobb | Ted Williams | Hank Aaron | Hank Aaron | George Brett (98.19%) |
| 4 | Hank Aaron | Hank Aaron | Ty Cobb | Ted Williams | Hank Aaron (97.83%) |
| 5 | Lou Gehrig | Stan Musial | Lou Gehrig | Ty Cobb | Tony Gwynn (97.61%) |
| 6 | Ted Williams | Joe DiMaggio | Joe DiMaggio | Lou Gehrig | Mike Schmidt (96.52%) |
| 7 | Rogers Hornsby | Ty Cobb | Ted Williams | Mickey Mantle | Johnny Bench (96.42%) |
| 8 | Stan Musial | Willie Mays | Stan Musial | Stan Musial | Babe Ruth (95.13%) |
| 9 | Joe DiMaggio | Rogers Hornsby | Rogers Hornsby | Honus Wagner | Honus Wagner (95.13%) |
| 10 | Honus Wagner | Honus Wagner | Honus Wagner | Rogers Hornsby | Rickey Henderson (94.81%) |

Sources: TSN, SABR and Hall of Fame—Baseball Almanac.

If the results of each list, sabermetric and subjective, are viewed as a preferential ballot and the answers are tallied with a Borda count, with the first place player being given ten points and the tenth place player being given one, the results are as follows:

1) Babe Ruth (73)

2) Ty Cobb (48)

3) Hank Aaron (44)

4) Willie Mays (42)

5) Ted Williams (41)

6) Lou Gehrig (31)

7) Rogers Hornsby (26)

8) Honus Wagner (22)

9) Stan Musial (18)

9) Mickey Mantle (18)

11) Joe DiMaggio (12)

11) Tris Speaker (12)

13) Mike Schmidt (11)

14) Cal Ripken (10)

15) Nap Lajoie (9)

16) George Brett (8)

17) Tony Gwynn (6)

18) Johnny Bench (4)

19) Joe Jackson (3)

20) Rickey Henderson (1)

20) Eddie Collins (1)
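The Borda tally can be reproduced with a short sketch. The eight ballots are copied from the two tables above in rank order (the Hall of Fame ballot follows the order in which the table lists its players); the names `RANKINGS` and `borda_count` are illustrative. Each player's points are summed across every list on which he appears.

```python
from collections import Counter

# Top-ten lists copied from the two tables above, first place first.
RANKINGS = {
    "TPR": ["Babe Ruth", "Nap Lajoie", "Rogers Hornsby", "Ted Williams",
            "Mike Schmidt", "Mickey Mantle", "Lou Gehrig", "Honus Wagner",
            "Tris Speaker", "Willie Mays"],
    "WS": ["Babe Ruth", "Mickey Mantle", "Ted Williams", "Honus Wagner",
           "Ty Cobb", "Rogers Hornsby", "Tris Speaker", "Joe Jackson",
           "Willie Mays", "Lou Gehrig"],
    "WAR": ["Babe Ruth", "Willie Mays", "Ty Cobb", "Hank Aaron",
            "Tris Speaker", "Honus Wagner", "Rogers Hornsby", "Stan Musial",
            "Ted Williams", "Eddie Collins"],
    "TSN": ["Babe Ruth", "Willie Mays", "Ty Cobb", "Hank Aaron",
            "Lou Gehrig", "Ted Williams", "Rogers Hornsby", "Stan Musial",
            "Joe DiMaggio", "Honus Wagner"],
    "SABR": ["Babe Ruth", "Lou Gehrig", "Ted Williams", "Hank Aaron",
             "Stan Musial", "Joe DiMaggio", "Ty Cobb", "Willie Mays",
             "Rogers Hornsby", "Honus Wagner"],
    "McGuire/Gormley": ["Babe Ruth", "Willie Mays", "Hank Aaron", "Ty Cobb",
                        "Lou Gehrig", "Joe DiMaggio", "Ted Williams",
                        "Stan Musial", "Rogers Hornsby", "Honus Wagner"],
    "ESPN": ["Babe Ruth", "Willie Mays", "Hank Aaron", "Ted Williams",
             "Ty Cobb", "Lou Gehrig", "Mickey Mantle", "Stan Musial",
             "Honus Wagner", "Rogers Hornsby"],
    "Hall of Fame": ["Cal Ripken", "Ty Cobb", "George Brett", "Hank Aaron",
                     "Tony Gwynn", "Mike Schmidt", "Johnny Bench", "Babe Ruth",
                     "Honus Wagner", "Rickey Henderson"],
}


def borda_count(rankings: dict, top_points: int = 10) -> Counter:
    """Give first place top_points, second place one fewer, and so on; sum per player."""
    totals = Counter()
    for ballot in rankings.values():
        for place, player in enumerate(ballot):
            totals[player] += top_points - place
    return totals


totals = borda_count(RANKINGS)
for player, points in totals.most_common(5):
    print(player, points)
```

Run as written, this prints Babe Ruth 73, Ty Cobb 48, Hank Aaron 44, Willie Mays 42, and Ted Williams 41. Across eight ballots of 55 points each, the 21 distinct players share 440 points in total.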

While Babe Ruth is clearly the consensus best player, there are not many definitive findings to be drawn from the rest of the ranking. The conclusion that ultimately seems most fitting is perhaps that, while sabermetric measures of overall player value are thought-provoking for fans and practical for professional use, they only add another dimension to the debate of the all-time greats, rather than settling it. The language of baseball is numbers, but the game is ultimately one of intangibles: of improbable comebacks and devastating losses, of twenty-seven-up-twenty-seven-down perfect and miserable-blown-save disastrous, of idiosyncratic players and individualistic managers and flawed umpires. There are some aspects that math will never be able to capture.