NL Players Stats Overrate Performance Friday, Aug 15 2008 

It turns out that the D.H. rule doesn’t just discourage comparison of pitchers; it also causes NL hitters to be overrated. Pitchers in the AL obviously have higher ERAs because they have to face the D.H. But think about the further ramifications of the rule. Because teams don’t pinch-hit for pitchers in the AL, they use fewer relief pitchers, and hence the pitchers they do use are superior. Therefore, hitters in the AL don’t get to face the lesser pitchers that NL hitters do when the opposing team pinch-hits for its pitcher and brings in a new reliever. This reasoning reveals a whole new level of interleague player comparison.

Note that AL players are not intrinsically better; their stats simply underrate them relative to NL players.

Adjusting for Park Effects in Baseball Monday, Aug 4 2008 

December 2007

Analyzing the effects of home parks is one of the biggest problems that baseball statisticians face. There is a huge difference between playing in Denver’s Coors Field, a high run-scoring park because of its altitude, and Washington’s RFK Stadium, whose giant dimensions make it the lowest-scoring park in baseball. Seventy-four home runs in Coors is the equivalent of only fifty-five in RFK. How can the bias introduced by home parks be eliminated? The most popular methods of adjusting statistics, the doubling method and the Park Factor method, have serious flaws. I invented the Equal Games method to adjust statistics without most of these errors.

The most common method of getting around the statistical distortions, doubling a player’s road statistics, is actually erroneous. The doubling method sets out to remove the advantage that players in parks like Wrigley and Fenway receive, and it does that well. Unfortunately, it goes too far in adjusting the stats. For example, suppose there is a four-team league with parks A, B, C, and D. A is an extreme pitcher’s park, allowing 2/3 of the league average in runs. B and C are average, and D allows 1 1/3 of the league average. A player on team D, therefore, has his stats inflated by 22%. Players on team A, though, have theirs reduced by 22%. Now adjust the statistics by doubling the road stats. The player from D is now below the league average by 11%, and A is over the average by this margin. Since the doubling method takes away from D players the opportunity to play in their home park and lets A players not play in A, it actually adds bias in favor of players in tough home parks.
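The road-only half of this example can be checked with a few lines of Python. This is a minimal sketch that assumes a balanced schedule, with each team’s road games split evenly among the other three parks; the park factors are the hypothetical ones from the example, and the function name is made up for illustration.

```python
from fractions import Fraction

# Hypothetical run factors from the four-park example (league average = 1).
PARKS = {
    "A": Fraction(2, 3),
    "B": Fraction(1),
    "C": Fraction(1),
    "D": Fraction(4, 3),
}


def road_only_factor(home):
    """Average park factor a player sees once his home games are thrown out.

    Assumes road games are split evenly among the other parks.
    """
    road = [factor for name, factor in PARKS.items() if name != home]
    return sum(road) / len(road)


for team in ("A", "D"):
    factor = road_only_factor(team)
    # Show the factor and its distance from the league average of 1.
    print(team, factor, f"{float(factor - 1):+.1%}")
```

Under these assumptions the player from D is left with parks that average 8/9 of the league norm and the player from A with 10/9, which is the 11% swing in favor of tough home parks described above.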

There are other problems with the doubling method. Teams that play in divisions with hitter-friendly parks have better adjusted stats than teams from other divisions: the doubling method removes the games in a hitter-friendly home park, but it does not remove the extra road games played in the division’s other high-scoring parks. The same problem exists if the rest of the division has particularly good or bad pitchers. Playing time can also be distorted when statistics are adjusted. If a player is injured for a short time during a homestand, doubling the road statistics will give him more playing time than he actually got. Injuries or suspensions during road trips reduce adjusted games and at-bats, as well as all other statistics. Finally, since most players play about 3% better at home than on the road, the general level of production declines for both hitters and pitchers. After adjustment, runs scored and runs allowed, and most other paired statistics, no longer match. When these four distortions are combined, the doubling method is very inaccurate.

The Park Factor method avoids the deflation of statistics as well as problems with playing time. It assigns a factor to each home park depending on runs scored in that park compared to the league average. Since this method eliminates stadium effects but preserves the number of games played at the home park, overall statistical levels stay the same. Also, since only the statistics and not games played in different parks are changed, players have the right number of at-bats. But the method has disadvantages as well. For example, Houston’s Minute Maid Park is great for right-handed hitters but a nightmare for lefties. The Park Factor method can’t account for biases like these when analyzing Houston players. Similarly, a stadium can raise home runs but decrease triples. This causes major problems when the Park Factor method is used. The home park still changes half the statistics and the Park Factor method doesn’t adjust individual statistics like this. This method also causes problems in the evaluation of players. It can cause power hitters to be seen as speedy, or vice versa. To avoid this problem, a separate factor has to be calculated for every statistic.

Both the doubling method and Park Factor method have serious flaws, so a different method must be used to properly adjust statistics for park effects. Such a method must not eliminate home parks but weight them similarly to other parks, and it should not give teams in high or low run scoring divisions an edge. Using these guidelines, I invented the Equal Games method. It works by making a player play an equal number of games in every stadium. Then the home park isn’t allowed to dominate the statistics. The formula is:

  • Find the number of opposing teams a player faced (NT)
  • Find the number of at-bats against each team (AB1, AB2, etc.)
  • Adjusted Statistics = (NT / AB1) S1 + (NT / AB2) S2 +…, where SN is the stats against the appropriate team
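
For readers who want to try the formula on real splits, here is a minimal Python sketch that applies it literally as written above. The function and argument names are made up for illustration, and the requirement that every opponent has at least one at-bat is an assumption added to avoid dividing by zero.

```python
def equal_games_adjusted(at_bats, stats):
    """Apply the Equal Games formula literally as stated in the text.

    at_bats[i] is the player's at-bats against opponent i (AB1, AB2, ...).
    stats[i] is the raw total of the statistic against that opponent (S1, S2, ...).
    """
    if len(at_bats) != len(stats):
        raise ValueError("need one at-bat count and one stat total per opponent")
    if any(ab <= 0 for ab in at_bats):
        # Assumption: an opponent with no at-bats was not really "faced".
        raise ValueError("every opponent must have at least one at-bat")
    nt = len(at_bats)  # NT: number of opposing teams faced
    return sum((nt / ab) * s for ab, s in zip(at_bats, stats))


# Hypothetical splits: 3 hits in 10 at-bats against one team, 4 in 20 against another.
print(equal_games_adjusted([10, 20], [3, 4]))
```

Because each term divides by the at-bats against that opponent, every opponent contributes its per-at-bat rate with the same weight, no matter how many games were played against it.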

This method has a few problems. As with the doubling method, statistical levels are reduced because most players play better at home. But as long as the statistic is confined to its main task of head-to-head player comparisons, it is much superior to the doubling method and the Park Factor method. Also, if comparing an entire league, all the statistics can be multiplied by (NT - 1)/NT x 1.015.

Home parks often distort player statistics and make analyses almost impossible. To avoid these changes, many methods try to adjust the statistics according to home parks. But these methods have problems with general statistical deflation, playing time, and imbalance between different statistics. The Equal Games method avoids these problems by rating every park similarly.

Seven Tips For Picking Baseball Playoff Winners Monday, Aug 4 2008 

October 2007

With the baseball playoffs approaching, a favorite occupation of baseball fans is predicting the results of the series and the eventual champion. It can be hard to analyze talent at all positions and all the factors that influence the result. This article contains seven guidelines for predicting baseball champions.

1. Park Effects are Key

In the long run, teams play about the same number of games in pitcher’s parks as they do in hitter’s parks. In the playoffs, teams often play a greater share of their games in one park because they play fewer total games. Therefore, park effects are of more importance in the playoffs than in the regular season.

How can you figure out which teams are aided by what kinds of parks? A team that depends mostly on singles would do best in a large park where it is easier to hit balls between the outfielders and where their lack of power doesn’t hurt them. A power-hitting team, on the other hand, would obviously benefit from playing in a small park. It is easy to determine whether a team has power or not. First, look at the hitters and divide the batting average of the starting lineup by the slugging average. If the number is greater than three-fifths then the team is like Ichiro Suzuki in that they generally use singles and stolen bases to score runs. If the number is less than three-fifths then the team is more similar to Adrian Beltre or Barry Bonds in that they count more on the long ball and walks. A team’s success in the playoffs can depend on park effects, so it is important to account for whether a team uses singles or power to win games.
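
The three-fifths test is simple enough to script. This is only a sketch: the function name and the sample numbers are invented, and treating a ratio of exactly three-fifths as a power team is an assumption, since the rule above doesn’t say what to do with a tie.

```python
def offensive_style(batting_avg, slugging_avg, threshold=0.6):
    """Classify a lineup by the ratio of batting average to slugging average.

    A ratio above the threshold (three-fifths) suggests a singles-and-speed
    team; anything else is labeled a power team.
    """
    ratio = batting_avg / slugging_avg
    return "singles" if ratio > threshold else "power"


# Hypothetical lineups, not real teams.
print(offensive_style(0.270, 0.400))  # ratio 0.675
print(offensive_style(0.260, 0.480))  # ratio of about 0.542
```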

If a park has asymmetrical dimensions, the outcome of a game may hinge on whether a team has right- or left-handed talent. Just remember that right-handed hitters generally hit to left field and southpaws to right. Even the pitching staff can be influenced by a park. If a team has pitchers who allow a lot of home runs, they do best in large fields, since balls that would be homers in a smaller stadium turn into long flyouts.

2. Teams Need Balance in Hitting

Which team did better in this World Series between the New York Yankees and the Pittsburgh Pirates?

Game     NYY Runs   PIT Runs   Winner
1        4          6          PIT
2        16         3          NYY
3        10         0          NYY
4        2          3          PIT
5        2          5          PIT
6        12         0          NYY
7        9          10         PIT
Series   55         27         PIT

New York scored more than twice as many runs as Pittsburgh yet lost the series 4-3. Why was this? Bill Mazeroski’s home run in the bottom of the ninth inning of the seventh game might have had something to do with it, but notice the pattern here. With three blowouts and four close losses, New York’s number of runs varied wildly. Pittsburgh’s offense was remarkably consistent; only the Game Seven win was out of place. We can measure the amount of variation between games with standard deviation. The Yankees had a 5.37 standard deviation while the Pirates had only 3.53, and just 2.48 excepting Game Seven. The typical standard deviation of a major league team is about 3.5. Pittsburgh won because of their low standard deviation, despite the fact that New York scored more than twice as many runs.
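
The standard deviations quoted above can be reproduced from the game-by-game scores in the table. A short Python check, using the sample standard deviation from the standard library (the variable names are just for illustration):

```python
from statistics import stdev

# Runs scored per game, Games 1 through 7, taken from the table above.
nyy = [4, 16, 10, 2, 2, 12, 9]
pit = [6, 3, 0, 3, 5, 0, 10]

print(round(stdev(nyy), 2))       # Yankees: 5.37
print(round(stdev(pit), 2))       # Pirates: 3.53
print(round(stdev(pit[:-1]), 2))  # Pirates without Game Seven: 2.48
```

Note that these are sample standard deviations (dividing by the number of games minus one), which is the version that matches the figures in the text.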

Having a low standard deviation can drive a team all the way to the World Series. Think of it this way: Which result is better, a 15-2 win or a 5-4 win? Both count the same, since as long as you win, it doesn’t matter how many runs you score. For teams to be successful in the playoffs, they must have a high winning percentage, not a high number of runs scored. To measure the effects of standard deviation, I conducted a statistical study of two hypothetical teams, each scoring the league average number of runs. Team 1, however, had a 3.68 standard deviation, while Team 2 had 4.55. Because of their low standard deviation, Team 1 had a .552 winning percentage over more than 3,000 games. This translates to an incredible .611 winning percentage in the World Series. A low standard deviation of runs scored is a major factor in a team’s success.
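
The jump from a single-game winning percentage to a seven-game series can be sketched in Python. This assumes every game is independent and has the same win probability, which is a simplification; the function name is made up for illustration.

```python
from math import comb


def best_of_series_probability(p, wins_needed=4):
    """Chance of reaching `wins_needed` wins before the opponent does.

    Assumes independent games with a constant single-game win probability p.
    The series ends on the team's final win, after k losses (k = 0..wins_needed-1).
    """
    q = 1 - p
    return sum(
        comb(wins_needed - 1 + k, k) * p**wins_needed * q**k
        for k in range(wins_needed)
    )


print(best_of_series_probability(0.552))
```

With a .552 single-game percentage this comes out at roughly .61, in line with the World Series figure quoted above.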

An easy way to predict success in the playoffs is to look at a team’s fluctuation in runs. To evaluate a team’s lineup in this way, get a record of a team’s games and calculate the standard deviation. If this method isn’t practicable, then just look for balance in a team’s lineup. Teams with a high standard deviation have greater fluctuation in the number of runs scored. There is also more fluctuation among a few players than many. Teams that depend on a small core of batters have higher standard deviations than teams who have balanced lineups where most players can hit fairly well.

The St. Louis Cardinals of 2004 clearly demonstrate why balance in a lineup is key to success. Although they had the spectacular hitting quartet of Albert Pujols, Jim Edmonds, Larry Walker, and Scott Rolen, the Redbirds were weak at catcher, second base, and left field. Because of this imbalance in their lineup, they were swept by the Red Sox in the World Series. In comparison, another great team, the 1998 edition of the Yankees, had one of the most balanced lineups of all time. Their worst regular player, Chad Curtis, had a reasonable .360 on-base percentage and scored 79 runs in just 148 games. The Yankees, of course, swept the Padres in the World Series. The key factor here is the balance of the lineup.

One of the factors that most affects team performance is fluctuation. Because a team’s direct objective is to win games, not score runs, the standard deviation can be used to forecast the performance. Since this is a lot of work, though, another way is just to look for a balanced lineup that doesn’t depend too much on any one player. Don’t forget that in addition to having low standard deviation teams must also have a high average of runs scored.

3. You Really Don’t Need Five Pitchers for the World Series!

The biggest misconception about starting pitching in the playoffs is that all five pitchers in a rotation are important. In predicting the playoffs, however, it is only important to look at four of the starting pitchers. Since the two teams are playing only seven games in nine days, it is easy to have four pitchers take care of the series. Four days rest is standard for a pitcher, with some pitchers being able to do three days. Consider these schedules for pitchers, with 1 being the #1 starter and so forth, and x being a day off.

1 2 x 3 1 4 x 2 1

1 2 x 4 1 2 x 3 1

Both of these schedules assume the #1 starter can pitch on three days’ rest, and the second asks the same of the #2 starter. Even if no hurlers can throw on three days’ rest, there are still plenty of ways:

1 2 x 3 4 1 x 2 3

3 2 x 1 4 3 x 2 1
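
If you want to test a rotation of your own, the rest days can be counted mechanically. This is a small Python sketch; the function name is invented, and it assumes the same notation as above, with one entry per calendar day and x for a day off.

```python
def min_rest_days(schedule):
    """Return the shortest rest each starter gets between starts.

    schedule: a string such as "1 2 x 3 1 4 x 2 1", one token per calendar day.
    Starters who pitch only once do not appear in the result.
    """
    last_start = {}
    shortest_rest = {}
    for day, starter in enumerate(schedule.split()):
        if starter == "x":
            continue  # travel day, nobody pitches
        if starter in last_start:
            rest = day - last_start[starter] - 1  # full days between starts
            shortest_rest[starter] = min(rest, shortest_rest.get(starter, rest))
        last_start[starter] = day
    return shortest_rest


print(min_rest_days("1 2 x 3 1 4 x 2 1"))  # {'1': 3, '2': 5}
print(min_rest_days("1 2 x 3 4 1 x 2 3"))  # {'1': 4, '2': 5, '3': 4}
```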

What do all these schedules show? A team only needs four good starting pitchers to succeed in the playoffs. While fifth starters may be important in the regular season because of injuries and fewer days off, they are not needed in the playoffs. In trying to predict the playoffs, don’t bother to look at the fifth starters. The first four starters are the only important ones. Two pitchers alone can carry a team to the championship. Consider Curt Schilling and Randy Johnson, the key players in the Arizona Diamondbacks’ 2001 World Series win. Even though reliever Byung-Hyun Kim allowed two game-winning home runs, Johnson and Schilling led the D-backs to their first World Series championship and had the best starting performance in the playoffs since the Dodgers’ pitching staff in 1963.

4. Three Relief Pitchers Especially Important

Notice that with both hitters and starting pitchers it isn’t necessary to have more than a certain number of good players. With hitters, the lineup is by far the most important factor. Similarly, only four starters are needed in the playoffs. It’s the same with relief pitchers, since three are enough for a series. This means that bullpen depth is not key when looking at a team and analyzing its chances.

There are three types of relief pitchers. There are closers, players responsible for getting out the side in the last inning like Trevor Hoffman and Mariano Rivera. To set them up, there are long relief pitchers. The rest of the relief pitchers are versatile swingmen who can pitch in short or long relief and even start if necessary.

It’s easy to show that a team only needs one of each kind of these relief pitchers in the playoffs. Assume that they need a closer for five games. Since pitchers like this often pitch on very little rest, one man should be able to handle this workload. The long relief pitcher should come in three or four times, two or three to set up the closer and one alone. Finally, the third relief pitcher can take care of extra-inning duties (about 44% of World Series contain an extra-inning game) and anything else.

5. Watch for Designated Hitter Opportunities in World Series

The only key difference between the AL and NL is the designated hitter, and when teams from the two leagues play in the World Series, adapting to the DH rule can be decisive. How can each team cope with changing their lineup and try to make the best of this situation? This factor can play a major role in the outcome of the World Series and therefore it is important to take it into account when comparing the two pennant winners and predicting the overall champion.

How can the AL put the bat of the DH into their lineup but not destroy their defense? The usual solution is to put the DH at first base. What can the manager do, however, when a good-hitting, poor-fielding player already mans the first sack? There are several solutions to this problem. One way to get out of the dilemma is to just put the DH at first and hope for the best. This doesn’t hurt the defense too much and does improve the offense slightly. It takes a good hitter out of the lineup, though, and is not a good solution if the former first-baseman’s bat is desperately needed. Therefore, the method that should be used is to put the first baseman at a position where he will do the least damage and then put the DH at first. With this method a team keeps both hitters in the lineup and gets a weaker bat out of the game. If a team has a poor hitter in left or right field, this can be the optimal situation for them. If not, then you can degrade their chances for games 3-5.

It is much easier for the NL to adapt than it is for the AL. All they have to do is take the player with the best bat and worst glove in their lineup, move him to DH, and put a slick-fielding and hopefully good-hitting player in his place.

6. Relief Pitchers Dominant in Division Series

The bullpen is the key factor in the Division Series. The fire squad is important to prevent late rallies. If the relief pitchers are not able to protect against a late loss, a team can rarely recover since the series is short and every game counts. They must come back with a rally of their own off the opponent’s bullpen to win another game. In predicting Division Series victories, the bullpen should be the foremost factor.

A good example is the Division Series between the Texas Rangers and the New York Yankees in 1996. In every single game, Texas had an early lead. Then why did they lose the series 3-1? Their bullpen had an ERA of 2.40, mediocre for relief pitchers. New York, on the other hand, had a brilliant 0.42 ERA for their relief pitchers, including 4.2 innings of scoreless pitching from Mariano Rivera.

7. Division Series Organization Is Key, World Series is Not

In the division series, the #1 or #2 seed hosts games 3, 4, and 5 while the #4 or #3 seed plays games 1 and 2 at home. Does the organization help one team and if so, how can you use this information to help predict the winner?

Since home teams usually win 53% of the games, two evenly matched teams would see the higher seed, with its three home games, win the series .511 of the time. This is a fairly significant advantage. We also have to take into account that the higher seed is a better team. Assuming an advantage of 5 wins during the regular season for the 2-3 seed series and a 12-win advantage for the 1-4 seed series, here is what I found:

  • #1 seeds should win .669 of the time
  • #2 seeds should win .550 of the time
  • Both series come down to a final 5th game about .376 of the time
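
The .511 figure for evenly matched teams can be checked by brute force in Python. This sketch assumes independent games and a flat 53% home-field edge; the function name and the "A"/"H" notation are made up for illustration. It does not attempt the .669 and .550 figures, which also depend on how a regular-season win gap is converted into a per-game edge.

```python
from itertools import product


def seeded_series_probability(home_pattern, p_home, p_away):
    """Chance that the team of interest wins a best-of-N series.

    home_pattern: one character per game from that team's point of view,
    "H" for a home game and "A" for an away game (e.g. "AAHHH").
    p_home / p_away: its single-game win probability at home / on the road.
    Every game is played out on paper; that does not change who wins the series.
    """
    needed = len(home_pattern) // 2 + 1
    total = 0.0
    for outcome in product((True, False), repeat=len(home_pattern)):
        if sum(outcome) < needed:
            continue  # the team of interest lost this hypothetical series
        prob = 1.0
        for won, site in zip(outcome, home_pattern):
            p = p_home if site == "H" else p_away
            prob *= p if won else 1 - p
        total += prob
    return total


# Higher seed: on the road for games 1 and 2, at home for games 3, 4, and 5.
print(round(seeded_series_probability("AAHHH", 0.53, 0.47), 3))  # 0.511
```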

Why is this last piece important? In the division series, a team can start their top four pitchers in order and then their #1 pitcher in the last game. Since this happens more than a third of the time, the ace of the staff can be a very important player.

It’s clear that home field advantage has an effect in the Division Series. In the World Series, though, the home advantage has little or no effect. Teams with the advantage should actually have only a .508 winning percentage, nothing special. Because the winner of the All-Star Game has their pennant winner host the first two and last two games of the Series, the system has recently gotten a lot of publicity, but the statistical evidence does not suggest that it has any effect. Also, since there is very little correlation between winning the All-Star Game and the World Series, the better team does not necessarily have the advantage.

With these tips, predicting the winners in the playoffs should be easy. Best of luck, and may the team you pick win!

Adjusted Stats: Modern Techniques Applied to Sports Monday, Aug 4 2008 

August 2006

Adjusted statistics are one of the newest and most helpful things to grace the world of sports statistics. These statistics correct flawed numbers to account for differences in era, league, and even games played.

To better understand, let’s compare Mark McGwire’s stellar 1998 season, in which he hit seventy homers, to Babe Ruth in 1919, who led the league with twenty-nine. Back then, in the dead ball era, home runs were still scarce, and Ruth didn’t even play full time, pitching seventeen games and going 9-5! After adjusting for this fact, assuming Ruth had 550 AB’s, we can make a chart of their home runs:

Name         Adj. HR’s   League HR’s   Pct. of league   1998 HR’s
M. McGwire   70          2565          2.7%             70
B. Ruth      37          497 (adj.)    7.4%             190

First of all, the NL in 1998 hit 2,565 home runs, and the AL in 1919 hit 240. However, you see it in the chart as 497. Why is this? The NL had sixteen teams, and the AL had eight. Since we are calculating the percentage compared to the league, then the AL should be doubled. Next we calculate the percent compared to league, and then multiply by 2,565 to get the expected number of homers in 1998. Note that Ruth would have had almost three times as many homers! This example, if extreme, does show the power of adjusted statistics.
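
The arithmetic behind the last column is just a share of one league applied to another. A minimal Python sketch, with a made-up function name, using the adjusted figures from the chart:

```python
def era_adjusted(player_total, league_total, target_league_total):
    """Translate a player's total into another league-season.

    The player's share of his own (already size-adjusted) league total is
    applied to the league total of the target season.
    """
    share = player_total / league_total
    return share * target_league_total


# Ruth's adjusted 37 homers in a league total of 497, moved to the 1998 NL.
print(era_adjusted(37, 497, 2565))
```

This lands at roughly 191, within a homer of the chart; the small gap comes down to where the rounding is done.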

Although adjusted statistics can be used to account for era, they can also be used to correct for position. If we are comparing a corner outfielder to a third baseman, it becomes necessary to adjust for the fact that outfielders have much higher offensive expectations than third basemen. Seeing this, it becomes clear that since Mike Schmidt was a third baseman, his eight home run titles are one of the greatest achievements in baseball history.

Although adjusted statistics are only used in baseball, there is no reason why it should not be possible to use them for other sports like football. For example, which conference sack leader was greater: Michael Strahan in 2003 with eighteen and a half sacks, or Dwight Freeney in 2004 with sixteen?

Name         Sacks   Conference Sacks   Pct. of conference   2004 Sacks
M. Strahan   18.5    544                3.4%                 20
D. Freeney   16.0    583                2.7%                 16

This time the adjusted statistics don’t reverse the margin; rather, they augment it.

Of course, this is expected to happen half the time, and can even be helpful in making a statistical argument.

For a powerful example of this augmentation, take Babe Ruth in 1927 with sixty homers versus Roger Maris in 1961, with sixty-one, which broke Ruth’s record. This is obviously neck-and-neck, and so every factor must be taken into account. Both batted left-handed. Both were corner outfielders. Both played their home games at Yankee Stadium. Maris played in a few more games, and Ruth was walked more often. However, the time in which they played is undoubtedly the decisive factor.

Name       HR’s   Adj. League HR’s   Pct. of league   1961 HR’s
R. Maris   61     1086               5.6%             61
B. Ruth    60     549                10.9%            118

The 1927 league stats are adjusted for the fact that Maris’ league had ten teams, but Ruth’s had eight. After doing the necessary calculations, we find that Ruth would have hit almost twice as many homers as Maris in 1961. This resolves one of the greatest statistical arguments in all of baseball.

This method works very well for seasons, but when adjusting for careers we need to be more cautious. Say you are comparing Jackie Robinson to Pete Rose. Most ballplayers reach their prime around 26-28, and then tail off. However, Robinson entered the majors at age 29, and only then do we have close-to-full statistics of his performance. This obviously favors Rose, and the only ways to close the gap are to use Robinson’s Negro League statistics and slightly raise them for the war years that he missed, or do the same with his major league stats. Unfortunately, this method risks having misleading estimations and can give a distorted picture.

You also need to keep your data sets in mind. Say you are comparing two top-level equine sprinters’ six-furlong times. If you took the usual approach of averaging the year’s times, your statistics would be incorrect. Since two far-apart years will have different ratios of low-level claimers to higher-class allowances and stakes, you will end up with times skewed in one direction or the other. Because of this, since these are first-rate horses, the obvious solution is to just use stakes races. Still, this is a common error and one that can have a large effect.

For adjusted baseball statistics using the method outlined in this column, I find the book “Leveling The Field” by G. Scott Thomas to be a very complete resource. It uses these statistics to simulate playoffs, answer questions like “What was the greatest baseball team of all time” and even compute what players’ salaries would be like in today’s world. It also includes career adjusted statistics for more than 400 of the greatest players of all time.

If used correctly, adjusted statistics can give a sizeable boost to the knowledgeable fan’s position. They are one of the most dangerous and most satisfying tools in today’s world of sports statistics.

New Innovations in Baseball Statistics Monday, Aug 4 2008 

June 2006

Although modern electronic means of transmitting the statistics of many major sports are now common, innovations in the statistics themselves are extremely rare. Baseball is virtually the only sport where statistics are frequently being discovered. This column details five new statistics invented by the author:

  • Isolated Power
  • 2×2 Rate Distribution
  • 2×2 Volume Distribution
  • Pitcher Measure
  • Volume Pitcher Measure

The first of these creations, Isolated Power, is the simplest. Although this statistic is not very strong for actual analysis of how good a player is, it can be very helpful for comparing different types of players. The formula is:

(1B + 2 x 2B + 3 x 3B + 4 x HR)/H = TB/H = (SLG x AB)/H = IPow

(Note: an index of statistical abbreviations can be found at the end of this column.)
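
For anyone computing this from a stat line, here is a minimal Python sketch of the formula. The function name and the sample season are hypothetical.

```python
def isolated_power(singles, doubles, triples, home_runs):
    """Bases per hit: total bases divided by hits."""
    hits = singles + doubles + triples + home_runs
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / hits


# Hypothetical season: 100 singles, 30 doubles, 5 triples, 25 home runs.
print(isolated_power(100, 30, 5, 25))  # 275 total bases / 160 hits = 1.71875
```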

The typical IPow now is about 1.62, and surprisingly, due to the designated hitter, is slightly higher in the National League. Isolated Power is also useful for analyzing differences between eras. For example, in the baseball dark ages of 1880, there were only 1.28 bases per hit, a ridiculously low number. Also, IPow usually responds quite well to pitcher dominance, as the following chart of National League isolated powers and ERA’s shows:

Year          1967    1968   1969
ERA           3.38    2.99   3.60
IPow x 2.25   3.285   3.15   3.325

In fact, ERA and IPow x 2.25 have a Pearson correlation of .939, where 0 means no correlation and 1 means perfect positive correlation. IPow, though deceptively simple, can thus be very useful.

The second of these discoveries is 2×2 Rate Distribution. This statistic attempts to measure overall hitting prowess with double weight to both percentage hitting and slugging. The formula is:

SQRT (AVG x OBP x SLG^2)=2×2 Rate

Some career 2×2 Rates:

Babe Ruth:         .277

Ted Williams:    .258

Ty Cobb:            .203

Mickey Mantle: .197

Willie Mays:      .190
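
To compute a 2×2 Rate for any player, the formula can be sketched in Python as follows, treating the slugging term as squared, which is what gives slugging its double weight. The function name and the sample batting line are hypothetical.

```python
from math import sqrt


def two_by_two_rate(avg, obp, slg):
    """2x2 Rate: square root of AVG x OBP x SLG squared."""
    return sqrt(avg * obp * slg**2)


# Hypothetical .300/.400/.500 hitter.
print(round(two_by_two_rate(0.300, 0.400, 0.500), 3))  # 0.173
```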

One nice thing about this statistic is that it’s easy to adjust, because you can calculate adjusted AVG’s, OBP’s, and SLG’s, and plug them in the formula. This statistic is also the means for calculating Mean Differential (MDif), which is found by computing the player’s 2×2 Rate and then subtracting the league average.

For a volume interpretation of this statistic that reflects how many times a player has gone to bat, we can use 2×2 Volume Distribution. It is found by the following formula:

(MDif x PA)/20=2×2 Volume

At this point, it is very helpful to use MDif and not 2×2 Rate. Why? If you use adjusted 2×2 Rate to find the MDif, you automatically get a built-in comparison to the average player, at 0. This is especially helpful in evaluating careers, since to find a career 2×2 Volume, you simply add all the seasonal 2×2 Volumes together.

Although these 2×2 methods do not factor in base running, stealing, and fielding, and give no way whatsoever to rate pitchers, they are an easy and accurate way to compare hitting. Since these statistics are calculated by multiplication, they also reward players who are more balanced, which is an additional benefit.
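
Here is a short Python sketch of MDif, 2×2 Volume, and the career total. The function names and the sample seasons are hypothetical; each season is given as the player’s 2×2 Rate, the league’s 2×2 Rate, and the player’s plate appearances.

```python
def mean_differential(player_rate, league_rate):
    """MDif: the player's 2x2 Rate minus the league average 2x2 Rate."""
    return player_rate - league_rate


def two_by_two_volume(player_rate, league_rate, plate_appearances):
    """2x2 Volume for one season: (MDif x PA) / 20."""
    return mean_differential(player_rate, league_rate) * plate_appearances / 20


def career_volume(seasons):
    """Career 2x2 Volume: the sum of the seasonal volumes."""
    return sum(two_by_two_volume(*season) for season in seasons)


# Hypothetical seasons: (player 2x2 Rate, league 2x2 Rate, plate appearances).
seasons = [(0.200, 0.150, 600), (0.180, 0.150, 500), (0.140, 0.150, 400)]
print(career_volume(seasons))
```

A below-average season, like the third one here, contributes a negative volume, so a long decline pulls the career total down instead of padding it.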

Now let’s explore a new method for evaluating hurlers called Pitcher Measure. The formula is:

[(WP/TWP) x (LG ERA - ERA) x (K/BB)]/40=Pitcher Measure=PM

WP is in fact a poor statistic, but when it is divided by the team winning percentage, it is a simple and fairly accurate measure. Nobody doubts that ERA is a definitive measure of pitching excellence, and so it obviously merits inclusion. The adjustment to the league has two benefits. The first is that this automatically gives a statistic in which an average performance is 0. Also, this is an adjustment of ERA so regular adjustment is not needed. K/BB may look strange but is just an assessment of control. An average PM is obviously 0, but for a pitcher slightly above the mean .03 would be typical. A Cy Young Award candidate would be about .2. Also, it’s useful to note that this statistic is in fact quite comparable to 2×2 Rate.

The alteration of this statistic to volume type gives us Volume Pitcher Measure, or:

PM/5=VPM

In transforming this stat to volume form, note that we don’t have to make any alteration like we did above in finding 2×2 Volume. Since PM automatically adjusts to the mean, i.e., an average performance is 0, above average is >0, etc., there is no reason to have to make an adjustment to the mean like in 2×2 Distribution. Again, to find the career VPM, just add up all the seasonal values.
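
Both pitching formulas can be sketched in a few lines of Python. The function names and the sample pitcher are hypothetical, and WP is passed in as whatever figure you compute from the index definition at the end of this column.

```python
def pitcher_measure(wp, team_wp, league_era, era, strikeouts, walks):
    """PM: [(WP/TWP) x (LG ERA - ERA) x (K/BB)] / 40."""
    return (wp / team_wp) * (league_era - era) * (strikeouts / walks) / 40


def volume_pitcher_measure(pm):
    """VPM: PM / 5, as given in the text."""
    return pm / 5


# Hypothetical ace: WP .650 on a .500 team, 2.50 ERA in a 4.00 league,
# 200 strikeouts and 50 walks.
pm = pitcher_measure(0.650, 0.500, 4.00, 2.50, 200, 50)
print(round(pm, 3))  # 0.195
print(round(volume_pitcher_measure(pm), 3))  # 0.039
```

The sample line comes out near .2, which is the Cy Young candidate range mentioned above.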

In conclusion,

Statistical Index

1B: Singles.

2B: Doubles.

3B: Triples.

AB: At-bats.

AVG: Batting average, H/AB.

BB: Walks.

ER: Earned runs, runs allowed without errors.

ERA: (ERx9)/IP

H: Hits.

HR: Home runs.

IP: Innings pitched.

K: Strikeouts.

L: Losses.

LG ERA: League ERA.

OBP: On base percentage, (H+BB)/(AB+BB)

PA: Plate appearances, AB+BB

SLG: Slugging percentage, TB/AB

TB: Total bases, 1B + 2 x 2B + 3 x 3B + 4 x HR.

TWP: Team winning percentage.

W: Wins.

WP: For a pitcher, W/L.