Curve Ball: Baseball, Statistics, and the Role of Chance in the Game, Jim Albert and Jay Bennett. Copernicus/Springer-Verlag, New York, 2001. $29.00 (350 pp.). ISBN 0-387-98816-5
The game of baseball produces a plethora of statistical information relating to both team and individual performance: hitters’ batting and slugging averages, pitchers’ earned run averages, and the like. To translate these data into evaluations of the true ability of a player or team is a difficult challenge, one that is often handled poorly by baseball journalists and TV commentators.
Jim Albert and Jay Bennett’s Curve Ball is an attempt to apply the techniques of statistical analysis to the understanding of baseball statistics. The authors are both professional statisticians. Both (like the reviewer) are members of the Society for American Baseball Research. Both (definitely unlike the reviewer) are lifelong fans of the Philadelphia Phillies.
Among the questions addressed in their book are: When a player or team experiences periods of poor and of good performance during a season, is this necessarily (or probably) due to ups and downs of actual ability, or can it be explained by chance? Are some hitters significantly better than others at hitting in particular situations, such as with runners on base, or in night games, or on artificial turf? What offensive statistics are most useful in evaluating a player’s true contribution to the scoring of runs? Can the “clutch” performance of a player be objectively evaluated? Is the winner of the World Series really the year’s best team in true ability?
The book seems to be addressed primarily to baseball fans who are not necessarily educated in probability theory. The authors do not use even elementary statistical concepts, such as standard deviation, without explanation. Using various models, they treat examples of player and team performance by computer simulations of entire seasons rather than by proving theorems about the expected spread of this or that result. Physicists who understand baseball will find the mathematical reasoning quite easy to follow; in fact, they may find themselves skimming over some of the explanatory text.
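To give a flavor of the simulation approach described above, here is a minimal sketch; it is not taken from the book. It assumes a hypothetical team whose true chance of winning any game is exactly 0.5, plays it through many 162-game seasons, and reports how widely the win totals scatter by chance alone. The function name, the number of simulated seasons, and the random seed are all illustrative choices.

```python
import random
import statistics

def simulate_win_totals(win_prob=0.5, games=162, seasons=2000, seed=1):
    """Return win totals for many simulated seasons of a fixed-ability team.

    All parameters are illustrative assumptions, not values from the book.
    """
    rng = random.Random(seed)
    return [
        sum(rng.random() < win_prob for _ in range(games))
        for _ in range(seasons)
    ]

totals = simulate_win_totals()
mean_wins = statistics.mean(totals)
spread = statistics.pstdev(totals)  # standard deviation across seasons
print(f"mean wins: {mean_wins:.1f}, standard deviation: {spread:.1f}")
```

Under this simple coin-flip model, the binomial formula puts the standard deviation at sqrt(162 × 0.5 × 0.5), a little over six wins, so a team of perfectly average ability will quite often finish several games above or below .500 without any change in its true strength.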
Some idea of the content of the book may be conveyed by citing a few examples of the specific problems treated:
In the treatment of situational effects, the book uses data from Player Profiles (Stats Publishing, 1998) for the 1998 season to reach the following conclusions: (a) There is no evidence that players’ batting averages differ appreciably between grass and turf, the observed differences being explainable as chance fluctuations. (b) Differences in batting averages between home and road games can be explained by a bias, assuming that every player a priori has a batting average 12 points better at home than on the road; further differences between individuals can be explained by chance. (c) Neither pure chance nor bias is capable of explaining the spread in batting averages with runners in scoring position versus bases empty; the evidence indicates that some players are truly better than others when batting with bases occupied.
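The "bias plus chance" reasoning in conclusion (b) can be illustrated with a short sketch, offered here under stated assumptions rather than as the authors' own procedure. Every simulated player is given the same 12-point home advantage quoted above; the underlying .270 average, the 250 at bats in each setting, and the number of players are hypothetical values chosen only for illustration.

```python
import random
import statistics

HOME_EDGE = 0.012  # 12-point home advantage, the figure quoted in the review

def simulate_home_road_gaps(players=2000, true_avg=0.270,
                            at_bats_each=250, seed=2):
    """Observed home-minus-road batting average for identical players.

    Every player has the same true home edge, so any variation in the
    returned gaps is due to chance. Parameter values are assumptions.
    """
    rng = random.Random(seed)
    p_home = true_avg + HOME_EDGE / 2
    p_road = true_avg - HOME_EDGE / 2
    gaps = []
    for _ in range(players):
        home_hits = sum(rng.random() < p_home for _ in range(at_bats_each))
        road_hits = sum(rng.random() < p_road for _ in range(at_bats_each))
        gaps.append((home_hits - road_hits) / at_bats_each)
    return gaps

gaps = simulate_home_road_gaps()
mean_gap = statistics.mean(gaps)
gap_spread = statistics.pstdev(gaps)
print(f"mean gap: {mean_gap:.3f}, spread of gaps: {gap_spread:.3f}")
```

With these assumed numbers the chance spread in an individual's home-road gap is roughly 40 points, several times larger than the 12-point bias itself, which is why large differences between individual players need not signal any real difference in ability.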
In discussing the use of batting statistics to predict run production, the authors apply least squares linear regression to various models in order to arrive at correlations between various measures (batting and slugging average, on-base percentage, and more complex modern measures) and team run production. In the process, they also demonstrate pitfalls that can arise, such as a spurious correlation between sacrifice flies and runs scored. As they explain, teams that often have runners at third base with fewer than two out will have many sacrifice flies and will also have many runs.
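The sacrifice-fly pitfall can be reproduced with synthetic data. The sketch below is not the authors' analysis: it invents 300 hypothetical team seasons in which a single hidden quantity, the number of chances with a runner on third and fewer than two out, drives both sacrifice flies and runs. All coefficients and noise levels are made up for illustration. It then compares the raw correlation with the correlation that remains after a least squares fit removes the effect of the shared cause from both variables.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def correlation(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residuals(x, y):
    """What is left of y after removing its least squares dependence on x."""
    intercept, slope = fit_line(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Synthetic team seasons; every number here is a made-up assumption.
rng = random.Random(3)
chances = [rng.gauss(100, 15) for _ in range(300)]        # hidden common cause
sac_flies = [0.4 * c + rng.gauss(0, 4) for c in chances]  # driven by chances
runs = [200 + 5.0 * c + rng.gauss(0, 40) for c in chances]  # also driven by chances

raw = correlation(sac_flies, runs)
controlled = correlation(residuals(chances, sac_flies),
                         residuals(chances, runs))
print(f"raw correlation: {raw:.2f}, after controlling for chances: {controlled:.2f}")
```

In this toy model sacrifice flies have no direct effect on runs, yet the raw correlation is strong; once the common cause is accounted for, the correlation should fall to near zero.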
These are only some of the topics treated in this book. There are a number of other intriguing analyses of team results and individual batting data. The book makes no claim of being an exhaustive treatise on statistical analysis of baseball (for example, very little is said about pitching statistics and nothing about fielding), but it is a most interesting and useful introduction to the subject. It should make enjoyable reading for physicists who are also baseball fans, and it ought to be required reading for baseball managers, executives, and commentators. Unfortunately, these are probably the people least likely to buy and read this book.