{"id":1330,"date":"2015-02-13T13:38:54","date_gmt":"2015-02-13T18:38:54","guid":{"rendered":"http:\/\/blog.rotovalue.com\/?p=1330"},"modified":"2015-02-13T13:38:54","modified_gmt":"2015-02-13T18:38:54","slug":"comparing-2014-projections-woba","status":"publish","type":"post","link":"https:\/\/blog.rotovalue.com\/index.php\/2015\/02\/13\/comparing-2014-projections-woba\/","title":{"rendered":"Comparing 2014 Projections &#8211; wOBA"},"content":{"rendered":"<p>In the\u00a0<a href=\"http:\/\/blog.rotovalue.com\/reviewing-five-2012-mlb-projection-systems\/\">past<\/a> <a href=\"http:\/\/blog.rotovalue.com\/batting-around-crystal-baseballs\/\">three<\/a>\u00a0<a href=\"http:\/\/blog.rotovalue.com\/2013-projection-systems-review-woba\/\">years<\/a>\u00a0I&#8217;ve done reviews of baseball projection systems against actual data for those systems for which I could get <a href=\"http:\/\/blog.rotovalue.com\/2013-projection-systems-review-woba\/\">data<\/a>. Will Larson\u00a0maintains a valuable site of <a href=\"http:\/\/www.bbprojectionproject.com\/\">projections<\/a> from many different sources, and most of the sources I&#8217;m comparing come from it.<br \/>\nAs in the past, I&#8217;m computing\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Root-mean-square_deviation\">root mean square error<\/a>\u00a0(RMSE) and\u00a0<a href=\"http:\/\/en.wikipedia.org\/wiki\/Mean_absolute_error\">mean absolute error<\/a>\u00a0(MAE) for each source compared to actual data. For these tests, I am doing a bias adjustment, so the errors are relative to the average of a source.\u00a0I care more about how a system projects players relative to its own projected averages than about how well it projects the overall league average.<br \/>\nI have\u00a0data from these systems:<\/p>\n<ul>\n<li><a href=\"http:\/\/www.cs.virginia.edu\/~rjg7v\/AggPro.pdf\">AggPro<\/a> \u2013 A projection aggregation method from Ross J. Gore, Cameron T. 
Snapp, and Timothy Highley.<\/li>\n<li>Bayesball &#8211; Projections from Jonathan Adams.<\/li>\n<li><a href=\"http:\/\/www.rlyw.net\/\">CAIRO<\/a>\u00a0\u2013 from S B of the\u00a0<a href=\"http:\/\/www.rlyw.net\/\">Replacement Level Yankees Weblog<\/a>.<\/li>\n<li><a href=\"http:\/\/fantasynews.cbssports.com\/fantasybaseball\/stats\/sortable\/points\/SP\/standard\/projections\" target=\"_blank\" rel=\"noopener\">CBS<\/a>\u00a0Projections from CBS Sportsline.<\/li>\n<li><a href=\"http:\/\/claydavenport.com\/projections\/PROJHOME.shtml\">Davenport<\/a>\u00a0Clay Davenport&#8217;s projections.<\/li>\n<li><a href=\"http:\/\/games.espn.go.com\/frontpage\/baseball\" target=\"_blank\" rel=\"noopener\">ESPN<\/a>\u00a0Projections from ESPN.<\/li>\n<li><a href=\"http:\/\/www.fangraphs.com\/projections.aspx?pos=all&amp;stats=bat&amp;type=fan\" target=\"_blank\" rel=\"noopener\">Fans<\/a>\u00a0Fans&#8217; projections from <a href=\"http:\/\/www.fangraphs.com\/\">Fangraphs.com<\/a>.<\/li>\n<li><a href=\"http:\/\/www.bbprojectionproject.com\/\" target=\"_blank\" rel=\"noopener\">Larson<\/a>\u00a0Will Larson&#8217;s projections.<\/li>\n<li><a href=\"http:\/\/www.tangotiger.net\/marcel\/\">Marcel<\/a>\u00a0\u2013 the basic projection model from Tom Tango, coauthor of\u00a0<a href=\"http:\/\/www.insidethebook.com\/\">The Book<\/a>. 
This year I\u2019m using Marcel numbers generated by <a href=\"http:\/\/www.baseballheatmaps.com\/marcel-database-download\/\">Jeff Zimmerman<\/a>, using <a href=\"http:\/\/summerofjeff.wordpress.com\/2011\/01\/14\/python-code-for-marcel-projections\/\">Jeff Sackmann\u2019s<\/a>\u00a0Python code.<\/li>\n<li><a href=\"http:\/\/morps.mlblogs.com\/\">MORPS<\/a> \u2013 A projection model by Tim Oberschlake.<\/li>\n<li><a href=\"http:\/\/bats.blogs.nytimes.com\/author\/dan-rosenheck\/\" target=\"_blank\" rel=\"noopener\">Rosenheck<\/a>\u00a0Projections by Dan Rosenheck.<\/li>\n<li><a href=\"http:\/\/www.hardballtimes.com\/introducing-oliver\/\">Oliver<\/a>\u00a0&#8211; Brian Cartwright&#8217;s projection model.<\/li>\n<li><a href=\"http:\/\/steamerprojections.com\/\">Steamer<\/a>\u00a0\u2013 Projections by Jared Cross, Dash Davidson, and Peter Rosenbloom.<\/li>\n<li><a href=\"http:\/\/steamerprojections.com\/\">Steamer\/Razzball<\/a> \u2013 Steamer rate projections, but\u00a0playing time projections from Rudy Gamble of <a href=\"http:\/\/razzball.com\/\">Razzball.com<\/a>.<\/li>\n<li><a href=\"http:\/\/blog.rotovalue.com\/?p=181\">RotoValue<\/a>\u00a0\u2013 my current model, based largely on Marcel, but with adjustments for <a href=\"http:\/\/blog.rotovalue.com\/projecting-wins-saves-and-holds\/\">pitching decision stats<\/a> and assuming no pitcher skill in BABIP.<\/li>\n<li><a href=\"http:\/\/www.rotovalue.com\/cgi-bin\/Search?year=2014&amp;league=51&amp;source=RV%20Pre-Australia\">RV Pre-Australia<\/a>\u00a0&#8211; The RotoValue projections taken just before the first Australia games last year. 
Before the rest of the regular season I continued to tweak projections slightly.<\/li>\n<li><a href=\"http:\/\/www.baseballthinkfactory.org\/oracle\/discussion\/2012_zips_projections_spreadsheets_v._1\">ZiPS<\/a>\u00a0\u2013 projections from Dan Szymborski of\u00a0<a href=\"http:\/\/www.baseballthinkfactory.org\/\">Baseball Think Factory<\/a>\u00a0and\u00a0<a href=\"http:\/\/search.espn.go.com\/dan-szymborski\/\">ESPN<\/a>.<\/li>\n<\/ul>\n<p>In addition, I&#8217;ve computed a source &#8220;All Consensus&#8221;, which is a simple average of all of the above (ignoring a source if it doesn&#8217;t project some particular category).<br \/>\nNot all the models had enough data to compute wOBA, so the tables\u00a0(below the jump)\u00a0only include those sources that did. The other sources do affect the All Consensus values for those stats where they do have data.<br \/>\n<!--more-->First, as an &#8220;apples-to-apples&#8221; comparison, I&#8217;m comparing only those players projected by every system (279 total):<\/p>\n<table>\n<tbody>\n<tr>\n<th>Source<\/th>\n<th>Num<\/th>\n<th>Avg wOBA<\/th>\n<th>MAE<\/th>\n<th>RMSE<\/th>\n<\/tr>\n<tr>\n<td>Actual<\/td>\n<td>279<\/td>\n<td>0.3270<\/td>\n<td>0.0000<\/td>\n<td>0.0000<\/td>\n<\/tr>\n<tr>\n<td>All Consensus<\/td>\n<td>279<\/td>\n<td>0.3387<\/td>\n<td>0.0236<\/td>\n<td>0.0303<\/td>\n<\/tr>\n<tr>\n<td>Steamer<\/td>\n<td>279<\/td>\n<td>0.3354<\/td>\n<td>0.0240<\/td>\n<td>0.0307<\/td>\n<\/tr>\n<tr>\n<td>Steamer\/Razzball<\/td>\n<td>279<\/td>\n<td>0.3364<\/td>\n<td>0.0242<\/td>\n<td>0.0308<\/td>\n<\/tr>\n<tr>\n<td>Zips<\/td>\n<td>279<\/td>\n<td>0.3386<\/td>\n<td>0.0245<\/td>\n<td>0.0309<\/td>\n<\/tr>\n<tr>\n<td>RotoValue<\/td>\n<td>279<\/td>\n<td>0.3353<\/td>\n<td>0.0242<\/td>\n<td>0.0313<\/td>\n<\/tr>\n<tr>\n<td>RV 
Pre-Australia<\/td>\n<td>279<\/td>\n<td>0.3351<\/td>\n<td>0.0241<\/td>\n<td>0.0313<\/td>\n<\/tr>\n<tr>\n<td>CAIRO<\/td>\n<td>279<\/td>\n<td>0.3361<\/td>\n<td>0.0247<\/td>\n<td>0.0315<\/td>\n<\/tr>\n<tr>\n<td>Davenport<\/td>\n<td>279<\/td>\n<td>0.3343<\/td>\n<td>0.0243<\/td>\n<td>0.0316<\/td>\n<\/tr>\n<tr>\n<td>MORPS<\/td>\n<td>279<\/td>\n<td>0.3432<\/td>\n<td>0.0250<\/td>\n<td>0.0318<\/td>\n<\/tr>\n<tr>\n<td>Marcel<\/td>\n<td>279<\/td>\n<td>0.3200<\/td>\n<td>0.0247<\/td>\n<td>0.0318<\/td>\n<\/tr>\n<tr>\n<td>Oliver<\/td>\n<td>279<\/td>\n<td>0.3394<\/td>\n<td>0.0250<\/td>\n<td>0.0325<\/td>\n<\/tr>\n<tr>\n<td>CBS<\/td>\n<td>279<\/td>\n<td>0.3410<\/td>\n<td>0.0253<\/td>\n<td>0.0326<\/td>\n<\/tr>\n<tr>\n<td>Fans<\/td>\n<td>279<\/td>\n<td>0.3470<\/td>\n<td>0.0257<\/td>\n<td>0.0329<\/td>\n<\/tr>\n<tr>\n<td>y2013<\/td>\n<td>279<\/td>\n<td>0.3374<\/td>\n<td>0.0314<\/td>\n<td>0.0427<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The lowest errors came from the Consensus, so there may be some marginal improvement from averaging multiple sources. But the spread among systems was rather small. Steamer did best among actual systems, but they all did markedly better than my simple benchmark of 2013 data. ZiPS was second-best, followed by the two RotoValue models (yay!). 
Marcel remains quite competitive here, though, which shows that a basic model can still do quite well.<br \/>\nNext I&#8217;m rerunning the analysis using 20 points worse than league average wOBA for any player not projected, and now comparing the 643 players projected by at least 1 system:<\/p>\n<table>\n<tbody>\n<tr>\n<th>Source<\/th>\n<th>MLB<\/th>\n<th>wOBA<\/th>\n<th>StdDev<\/th>\n<th>MAE<\/th>\n<th>RMSE<\/th>\n<th>Missing<\/th>\n<\/tr>\n<tr>\n<td>Actual<\/td>\n<td>643<\/td>\n<td>0.3186<\/td>\n<td>0.0427<\/td>\n<td>0.0000<\/td>\n<td>0.0000<\/td>\n<td>0<\/td>\n<\/tr>\n<tr>\n<td>Steamer<\/td>\n<td>643<\/td>\n<td>0.3271<\/td>\n<td>0.0260<\/td>\n<td>0.0270<\/td>\n<td>0.0364<\/td>\n<td>21<\/td>\n<\/tr>\n<tr>\n<td>Zips<\/td>\n<td>643<\/td>\n<td>0.3291<\/td>\n<td>0.0274<\/td>\n<td>0.0275<\/td>\n<td>0.0365<\/td>\n<td>49<\/td>\n<\/tr>\n<tr>\n<td>Steamer\/Razzball<\/td>\n<td>643<\/td>\n<td>0.3288<\/td>\n<td>0.0251<\/td>\n<td>0.0272<\/td>\n<td>0.0366<\/td>\n<td>166<\/td>\n<\/tr>\n<tr>\n<td>All Consensus<\/td>\n<td>643<\/td>\n<td>0.3305<\/td>\n<td>0.0250<\/td>\n<td>0.0272<\/td>\n<td>0.0366<\/td>\n<td>2<\/td>\n<\/tr>\n<tr>\n<td>Davenport<\/td>\n<td>643<\/td>\n<td>0.3264<\/td>\n<td>0.0250<\/td>\n<td>0.0276<\/td>\n<td>0.0376<\/td>\n<td>158<\/td>\n<\/tr>\n<tr>\n<td>CAIRO<\/td>\n<td>643<\/td>\n<td>0.3267<\/td>\n<td>0.0284<\/td>\n<td>0.0283<\/td>\n<td>0.0377<\/td>\n<td>32<\/td>\n<\/tr>\n<tr>\n<td>Oliver<\/td>\n<td>643<\/td>\n<td>0.3294<\/td>\n<td>0.0315<\/td>\n<td>0.0282<\/td>\n<td>0.0379<\/td>\n<td>25<\/td>\n<\/tr>\n<tr>\n<td>MORPS<\/td>\n<td>643<\/td>\n<td>0.3353<\/td>\n<td>0.0257<\/td>\n<td>0.0285<\/td>\n<td>0.0380<\/td>\n<td>111<\/td>\n<\/tr>\n<tr>\n<td>Fans<\/td>\n<td>643<\/td>\n<td>0.3403<\/td>\n<td>0.0255<\/td>\n<td>0.0285<\/td>\n<td>0.0384<\/td>\n<td>320<\/td>\n<\/tr>\n<tr>\n<td>RV 
Pre-Australia<\/td>\n<td>643<\/td>\n<td>0.3285<\/td>\n<td>0.0247<\/td>\n<td>0.0285<\/td>\n<td>0.0385<\/td>\n<td>11<\/td>\n<\/tr>\n<tr>\n<td>CBS<\/td>\n<td>643<\/td>\n<td>0.3339<\/td>\n<td>0.0261<\/td>\n<td>0.0287<\/td>\n<td>0.0387<\/td>\n<td>295<\/td>\n<\/tr>\n<tr>\n<td>RotoValue<\/td>\n<td>643<\/td>\n<td>0.3288<\/td>\n<td>0.0246<\/td>\n<td>0.0286<\/td>\n<td>0.0387<\/td>\n<td>7<\/td>\n<\/tr>\n<tr>\n<td>Marcel<\/td>\n<td>643<\/td>\n<td>0.3121<\/td>\n<td>0.0234<\/td>\n<td>0.0288<\/td>\n<td>0.0387<\/td>\n<td>104<\/td>\n<\/tr>\n<tr>\n<td>y2013<\/td>\n<td>643<\/td>\n<td>0.3255<\/td>\n<td>0.0474<\/td>\n<td>0.0364<\/td>\n<td>0.0503<\/td>\n<td>137<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The errors are a bit bigger, as this set includes more players, among them those who play less (and are thus less likely to perform close to their true talent). Steamer is again the best single system, this time edging out ZiPS slightly, with the Consensus now just behind Steamer\/Razzball. Oliver, CBS, and Fangraphs Fans, which all lagged Marcel in the smaller set, now do better, as all systems now have lower errors than Tango&#8217;s monkey system. My model, however, dropped back relative to the other systems, which suggests my projections for weaker players may be worse than other systems&#8217;.<br \/>\nThe spread in RMSE between the best and worst system is just 0.0023, even smaller than last year&#8217;s spread, while the gap from the weakest system to 2013 data is over 5 times as large. So using projections is better than simply relying on last year&#8217;s data. 
Steamer\u00a0also came out on top in the comparison\u00a0I did\u00a0<a href=\"http:\/\/blog.rotovalue.com\/expanded-2013-woba-projections-comparison\/\">last year<\/a>, but the spread between systems is smaller this time, so which projections you use matters far less than that you use projections.<br \/>\n<em><strong>Update:\u00a0<\/strong><\/em>Rudy Gamble of Razzball.com asked if I could rerun the analysis for players with 500 or more PA. So here&#8217;s the table:<\/p>\n<table>\n<tbody>\n<tr>\n<th>Source<\/th>\n<th>MLB<\/th>\n<th>wOBA<\/th>\n<th>StdDev<\/th>\n<th>MAE<\/th>\n<th>RMSE<\/th>\n<th>Missing<\/th>\n<\/tr>\n<tr>\n<td>Actual<\/td>\n<td>149<\/td>\n<td>0.3352<\/td>\n<td>0.0316<\/td>\n<td>0.0000<\/td>\n<td>0.0000<\/td>\n<td>0<\/td>\n<\/tr>\n<tr>\n<td>All Consensus<\/td>\n<td>149<\/td>\n<td>0.3422<\/td>\n<td>0.0221<\/td>\n<td>0.0222<\/td>\n<td>0.0276<\/td>\n<td>0<\/td>\n<\/tr>\n<tr>\n<td>Steamer\/Razzball<\/td>\n<td>149<\/td>\n<td>0.3400<\/td>\n<td>0.0241<\/td>\n<td>0.0226<\/td>\n<td>0.0280<\/td>\n<td>0<\/td>\n<\/tr>\n<tr>\n<td>Steamer<\/td>\n<td>149<\/td>\n<td>0.3391<\/td>\n<td>0.0237<\/td>\n<td>0.0227<\/td>\n<td>0.0281<\/td>\n<td>0<\/td>\n<\/tr>\n<tr>\n<td>Davenport<\/td>\n<td>149<\/td>\n<td>0.3382<\/td>\n<td>0.0229<\/td>\n<td>0.0229<\/td>\n<td>0.0284<\/td>\n<td>2<\/td>\n<\/tr>\n<tr>\n<td>Zips<\/td>\n<td>149<\/td>\n<td>0.3428<\/td>\n<td>0.0240<\/td>\n<td>0.0233<\/td>\n<td>0.0287<\/td>\n<td>0<\/td>\n<\/tr>\n<tr>\n<td>MORPS<\/td>\n<td>149<\/td>\n<td>0.3470<\/td>\n<td>0.0242<\/td>\n<td>0.0229<\/td>\n<td>0.0288<\/td>\n<td>1<\/td>\n<\/tr>\n<tr>\n<td>RV 
Pre-Australia<\/td>\n<td>149<\/td>\n<td>0.3387<\/td>\n<td>0.0239<\/td>\n<td>0.0229<\/td>\n<td>0.0292<\/td>\n<td>0<\/td>\n<\/tr>\n<tr>\n<td>CAIRO<\/td>\n<td>149<\/td>\n<td>0.3401<\/td>\n<td>0.0252<\/td>\n<td>0.0237<\/td>\n<td>0.0294<\/td>\n<td>1<\/td>\n<\/tr>\n<tr>\n<td>RotoValue<\/td>\n<td>149<\/td>\n<td>0.3388<\/td>\n<td>0.0239<\/td>\n<td>0.0232<\/td>\n<td>0.0294<\/td>\n<td>0<\/td>\n<\/tr>\n<tr>\n<td>Marcel<\/td>\n<td>149<\/td>\n<td>0.3225<\/td>\n<td>0.0230<\/td>\n<td>0.0233<\/td>\n<td>0.0302<\/td>\n<td>2<\/td>\n<\/tr>\n<tr>\n<td>CBS<\/td>\n<td>149<\/td>\n<td>0.3444<\/td>\n<td>0.0268<\/td>\n<td>0.0241<\/td>\n<td>0.0305<\/td>\n<td>4<\/td>\n<\/tr>\n<tr>\n<td>Fans<\/td>\n<td>149<\/td>\n<td>0.3516<\/td>\n<td>0.0256<\/td>\n<td>0.0241<\/td>\n<td>0.0307<\/td>\n<td>6<\/td>\n<\/tr>\n<tr>\n<td>Oliver<\/td>\n<td>149<\/td>\n<td>0.3441<\/td>\n<td>0.0283<\/td>\n<td>0.0241<\/td>\n<td>0.0313<\/td>\n<td>0<\/td>\n<\/tr>\n<tr>\n<td>y2013<\/td>\n<td>149<\/td>\n<td>0.3429<\/td>\n<td>0.0411<\/td>\n<td>0.0307<\/td>\n<td>0.0425<\/td>\n<td>3<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>This is very much like the apples-to-apples table above, as the systems were missing very few projections for these players. This is a smaller set of better players, and the overall errors are lower, but the ordering remains about the same.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the\u00a0past three\u00a0years\u00a0I&#8217;ve done reviews of baseball projections systems with actual data for those systems for which I could get data. Will Larson\u00a0maintains a valuable site of projections from many different sources, and most of the sources I&#8217;m comparing are from that. 
As in the past, I&#8217;m computing\u00a0root mean square error\u00a0(RMSE) and\u00a0mean absolute error\u00a0(MAE) for&hellip; <a class=\"more-link\" href=\"https:\/\/blog.rotovalue.com\/index.php\/2015\/02\/13\/comparing-2014-projections-woba\/\">Continue reading <span class=\"screen-reader-text\">Comparing 2014 Projections &#8211; wOBA<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[6,11,14],"tags":[],"_links":{"self":[{"href":"https:\/\/blog.rotovalue.com\/index.php\/wp-json\/wp\/v2\/posts\/1330"}],"collection":[{"href":"https:\/\/blog.rotovalue.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.rotovalue.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.rotovalue.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.rotovalue.com\/index.php\/wp-json\/wp\/v2\/comments?post=1330"}],"version-history":[{"count":0,"href":"https:\/\/blog.rotovalue.com\/index.php\/wp-json\/wp\/v2\/posts\/1330\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.rotovalue.com\/index.php\/wp-json\/wp\/v2\/media?parent=1330"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.rotovalue.com\/index.php\/wp-json\/wp\/v2\/categories?post=1330"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.rotovalue.com\/index.php\/wp-json\/wp\/v2\/tags?post=1330"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}