Thursday I compared five projection systems on their projections for weighted on-base average (wOBA). Today I’m looking at two pitching categories: runs allowed per 9 innings (RA9) and WHIP, walks plus hits per inning pitched.
Tom Tango kindly highlighted my post yesterday, but suggested that one of my charts was useless, because it did not compare the systems on the same players. So today I’ll just have two charts per stat, one using only those players projected by all systems, and another filling in a “missing” value for any player a system did not project.
The systems are the same five:
- CAIRO – from S B of the Replacement Level Yankees Weblog.
- Marcel – the basic projection model from Tom Tango, coauthor of The Book. This year I’m using Marcel numbers generated by Jeff Zimmerman, using Jeff Sackmann’s Python code.
- MORPS – A projection model by Tim Oberschlake.
- Steamer/Razzball – Rate projections by Jared Cross, Dash Davidson, and Peter Rosenbloom, and playing time projections from Rudy Gamble of Razzball.com.
- RotoValue – my current model, based largely on Marcel, but with adjustments for pitching decision stats and assuming no pitcher skill in BABIP.
And, as before, I’m calculating RMSE and MAE, and sorting by the former. The error is bias-adjusted: I first compare each player’s projected stat to that system’s average, and then compare that delta with the player’s actual delta from the actual average. I’m using actual innings pitched to weight the averages. First up, RA9.
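As a rough illustration of the bias adjustment described above, here is a minimal Python sketch. The function name and the toy numbers are hypothetical, and it assumes both the system and actual averages are weighted by actual innings pitched; it is not the code that generated the tables.

```python
import math

def bias_adjusted_errors(projected, actual, innings):
    """Return (MAE, RMSE), weighted by actual innings pitched.

    Each projection is first expressed as a delta from the system's own
    IP-weighted average, and each actual value as a delta from the actual
    IP-weighted average. The error is the difference between those deltas.
    """
    total_ip = sum(innings)
    proj_mean = sum(p * w for p, w in zip(projected, innings)) / total_ip
    act_mean = sum(a * w for a, w in zip(actual, innings)) / total_ip
    errors = [(p - proj_mean) - (a - act_mean)
              for p, a in zip(projected, actual)]
    mae = sum(w * abs(e) for e, w in zip(errors, innings)) / total_ip
    rmse = math.sqrt(sum(w * e * e for e, w in zip(errors, innings)) / total_ip)
    return mae, rmse

# Toy data (made up): a system that is uniformly 0.50 too high for everyone.
mae_shifted, rmse_shifted = bias_adjusted_errors(
    [3.5, 4.5, 5.0], [3.0, 4.0, 4.5], [200.0, 100.0, 50.0])

# Toy data (made up): same average as reality, but no spread between pitchers.
mae_flat, rmse_flat = bias_adjusted_errors(
    [4.0, 4.0], [3.0, 5.0], [100.0, 100.0])
```

In the first toy case the constant offset is removed by the bias adjustment, so both errors come out as zero. In the second, the system gets the average right but misses each pitcher by a full run, and that shows up in both MAE and RMSE.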
These are the 373 pitchers projected by all the systems:
Source | Num | Avg RA9 | MAE | RMSE |
---|---|---|---|---|
Actual | 373 | 4.1006 | 0.0000 | 0.0000 |
Steamer/Razzball | 373 | 4.2557 | 0.8416 | 1.1176 |
Consensus | 373 | 4.2288 | 0.8465 | 1.1293 |
Marcel | 373 | 4.1553 | 0.8836 | 1.1620 |
MORPS | 373 | 4.1651 | 0.8810 | 1.1735 |
RotoValue | 373 | 4.2009 | 0.8842 | 1.1809 |
CAIRO | 373 | 4.3407 | 0.9230 | 1.2250 |
y2012 | 373 | 4.0932 | 1.1731 | 1.5998 |
Steamer again is the best, followed by the Consensus. There’s a much higher level of errors here, at 25% or more of the statistic, compared to about 10-12% when projecting wOBA. Also, the systems vary more in how well they do. Pitching is indeed harder to predict.
Next I’ve assumed an RA9 of 0.50 above the league average for any players not projected by a system:
Source | Num | RA9 | MLB | MLB RA9 | StdDev | MAE | RMSE | Missing |
---|---|---|---|---|---|---|---|---|
Actual | 664 | 4.1765 | 664 | 4.1752 | 1.4593 | 0.0000 | 0.0000 | 0 |
Steamer/Razzball | 515 | 4.2975 | 664 | 4.3477 | 0.5198 | 0.9656 | 1.3825 | 198 |
Consensus | 872 | 4.3286 | 664 | 4.3403 | 0.5480 | 0.9836 | 1.4049 | 113 |
MORPS | 499 | 4.2385 | 664 | 4.2846 | 0.6549 | 1.0162 | 1.4381 | 226 |
Marcel | 841 | 4.3405 | 664 | 4.2872 | 0.5407 | 1.0230 | 1.4390 | 136 |
CAIRO | 489 | 4.3972 | 664 | 4.4681 | 0.6919 | 1.0400 | 1.4644 | 234 |
RotoValue | 836 | 4.3470 | 664 | 4.3361 | 0.7283 | 1.0521 | 1.4916 | 133 |
y2012 | 630 | 4.3382 | 664 | 4.4439 | 1.9624 | 1.4250 | 2.3074 | 168 |
This comparison changes the order somewhat compared to using only the pitchers projected by every system. Steamer and the Consensus are still the top two, while MORPS moves ahead of Marcel, and CAIRO passes RotoValue. Including more, and worse, pitchers raises the errors of all the systems.
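The fill-in for unprojected pitchers can be sketched like this. The function name, player names, and the league-average figure are hypothetical, and the post does not say exactly which league average is used, so the caller supplies it here.

```python
def fill_missing(projections, mlb_players, league_average, penalty):
    """Return a projection for every MLB player.

    Players the system did not project get league_average + penalty,
    i.e. a worse-than-average placeholder (0.50 for RA9, 0.100 for WHIP).
    """
    fill_value = league_average + penalty
    return {player: projections.get(player, fill_value)
            for player in mlb_players}

# Hypothetical system that projected only two of three pitchers.
system_ra9 = {"Pitcher A": 3.50, "Pitcher B": 4.80}
mlb_players = ["Pitcher A", "Pitcher B", "Pitcher C"]

filled = fill_missing(system_ra9, mlb_players, league_average=4.00, penalty=0.50)
missing = [p for p in mlb_players if p not in system_ra9]
```

Pitchers a system projected who never appeared in the majors simply drop out, since only players in `mlb_players` are kept.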
Now on to WHIP, for the 373 pitchers projected by all:
Source | Num | Avg WHIP | MAE | RMSE |
---|---|---|---|---|
Actual | 373 | 1.2816 | 0.0000 | 0.0000 |
Consensus | 373 | 1.3032 | 0.1359 | 0.1838 |
RotoValue | 373 | 1.2848 | 0.1360 | 0.1844 |
Steamer/Razzball | 373 | 1.3045 | 0.1348 | 0.1845 |
Marcel | 373 | 1.2716 | 0.1396 | 0.1870 |
MORPS | 373 | 1.2913 | 0.1405 | 0.1925 |
CAIRO | 373 | 1.3582 | 0.1735 | 0.2292 |
y2012 | 373 | 1.2591 | 0.1798 | 0.2448 |
The RMSEs here are a little tighter than for RA9, but still much wider than for wOBA. My model edged out Steamer/Razzball in RMSE here, but was a bit behind it in MAE, while the Consensus did even better than mine.
Now here’s adding league-average WHIP plus 0.100 for any players not projected:
Source | Num | WHIP | MLB | MLB WHIP | StdDev | MAE | RMSE | Missing |
---|---|---|---|---|---|---|---|---|
Actual | 664 | 1.2995 | 664 | 1.2993 | 0.2354 | 0.0000 | 0.0000 | 0 |
Steamer/Razzball | 515 | 1.3139 | 664 | 1.3250 | 0.0895 | 0.1522 | 0.2176 | 198 |
Marcel | 841 | 1.3055 | 664 | 1.2952 | 0.0911 | 0.1589 | 0.2232 | 136 |
Consensus | 872 | 1.3231 | 664 | 1.3280 | 0.1101 | 0.1579 | 0.2233 | 113 |
RotoValue | 836 | 1.3136 | 664 | 1.3097 | 0.1100 | 0.1592 | 0.2246 | 133 |
MORPS | 499 | 1.3129 | 664 | 1.3226 | 0.1335 | 0.1626 | 0.2307 | 226 |
CAIRO | 489 | 1.3720 | 664 | 1.3874 | 0.1733 | 0.1864 | 0.2529 | 234 |
y2012 | 630 | 1.3049 | 664 | 1.3219 | 0.3179 | 0.2175 | 0.3630 | 168 |
Steamer comes out on top in this test, while my model and the Consensus do drop back. Pitching statistics take longer to stabilize, so it’s not so surprising to see the relatively higher errors here. Very observant readers might have noticed a slight difference between the two actual RA9 values, and between the two actual WHIP values, in the second table for each stat. That’s an artifact of a small number of pitchers who did not record an out, and thus had 0 IP. When I’m computing errors, these pitchers aren’t included (since I weight by IP, they have a weight of 0), but their runs, hits, and walks allowed do slightly increase the aggregate actual statistics.
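To see why a pitcher with 0 IP opens that small gap, here is a minimal sketch with made-up numbers. The helper names are hypothetical, and innings are treated as plain decimals rather than baseball's thirds notation.

```python
def aggregate_ra9(pitchers):
    """RA9 from league totals: every run counts, even from 0-IP pitchers."""
    total_runs = sum(runs for runs, ip in pitchers)
    total_ip = sum(ip for runs, ip in pitchers)
    return 9 * total_runs / total_ip

def ip_weighted_ra9(pitchers):
    """IP-weighted mean of individual RA9; 0-IP pitchers carry zero weight."""
    total_ip = sum(ip for runs, ip in pitchers)
    weighted = sum(ip * (9 * runs / ip) for runs, ip in pitchers if ip > 0)
    return weighted / total_ip

# (runs allowed, innings pitched); the last pitcher never recorded an out.
pitchers = [(80, 200.0), (50, 100.0), (2, 0.0)]

agg = aggregate_ra9(pitchers)
weighted = ip_weighted_ra9(pitchers)
```

The aggregate figure is slightly higher because it still counts the two runs charged to the pitcher with no innings, while the weighted mean gives him no weight at all.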
7 February 2014: I found and fixed a bug in my code that generated the tables using a worse than league average value for missing players, so the affected tables in this post were changed to reflect that.