Improving Future Statistics

While we can’t know the future, we can make educated guesses and projections about it. RotoValue does this by displaying projected stats (both my own projections and, for baseball, projections contributed by outside sources: Steamer, Marcel, and MORPS), and also by letting you choose prorated stats for the current year or the previous year.
When I first implemented prorated numbers, I simply divided a player’s current stats by the number of games his team had played, and then multiplied by the number of games in the season. If you were showing stats for less than the full season, I’d prorate to the number of games his team had scheduled over that time. That was a decent first cut, but for players who had missed much of the season so far, the prorated future numbers were too low. For example, Clayton Kershaw won his first start, pitching very well, but then went on the DL. Under my old model, since he had pitched in just one of the Dodgers’ 26 games so far, I’d prorate him to start just 1 out of every 26 games, or about 6 starts over a full season. His fantasy owners (not to mention the Dodgers) surely hope he pitches a lot more than that!
Now, however, I’m tracking when players are actually on a team’s active roster[ref]I started tracking this in MLB for 2014, so I don’t have the data for prior seasons[/ref], and using that information to better prorate statistics. Kershaw has made just 1 start, but since he was on the disabled list most of the time, I currently prorate him to make a total of 25 starts[ref]Since the link is prorating, his ERA and WHIP will be his actual 2014 data[/ref].
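To make the difference concrete, here is a minimal Python sketch of the two prorating ideas. The function names are made up for illustration, and the count of games Kershaw spent on the active roster is a hypothetical input, since the post does not give it. This shows the general idea, not RotoValue's actual code, and it is not meant to reproduce the 25-start figure.

```python
def prorate_by_team_games(stat, team_games_played, games_in_period):
    """Old approach: scale by all team games played so far."""
    if team_games_played <= 0:
        raise ValueError("team_games_played must be positive")
    return stat * games_in_period / team_games_played


def prorate_by_active_games(stat, team_games_while_active, games_in_period):
    """Roster-aware sketch: count only the team games played while the
    player was on the active roster (an assumed simplification)."""
    if team_games_while_active <= 0:
        raise ValueError("team_games_while_active must be positive")
    return stat * games_in_period / team_games_while_active


# Kershaw under the old model: 1 start in the Dodgers' 26 games,
# prorated over a 162-game season.
old_starts = prorate_by_team_games(1, 26, 162)

# Hypothetical: suppose he was active for only 6 of those 26 team games.
new_starts = prorate_by_active_games(1, 6, 162)

print(f"old model: {old_starts:.1f} starts")
print(f"roster-aware sketch: {new_starts:.1f} starts")
```

The only change between the two functions is the denominator: the time a player spent off the active roster no longer counts against him.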
Simply prorating stats has another bias, though, one that also affects searches based on projected statistics. Now that the season is under way, players get injured, and preseason projections could not reflect that. Josh Hamilton tore a ligament in his thumb and is out for 6-8 weeks according to my injury reports. The preseason projections for Hamilton don’t account for this new knowledge, so I now try to adjust for it. I compute a “target return” date for injured players, based on the data shown in the injury reports I receive. In Hamilton’s case, I’m adding 7 weeks, the middle of that range, to the April 9th date listed for his injury, which sets the target return to May 28th. So rather than showing Hamilton’s stats assuming he’ll play all the team’s remaining games, I prorate the projections as if he’ll only play from May 28th onward.
[Image: HamiltonSearch]
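The Hamilton adjustment can be sketched in a few lines of Python. The helper names, the sample projection, and the game counts below are all hypothetical; only the April 9th injury date, the 6-8 week estimate, and the May 28th result come from the text above. Taking the midpoint of the range is an assumption that happens to match the 7 weeks used here.

```python
from datetime import date, timedelta


def target_return_from_range(injury_date, min_weeks, max_weeks):
    """Add the midpoint of an estimated absence (in weeks) to the injury date."""
    midpoint_days = round((min_weeks + max_weeks) / 2 * 7)
    return injury_date + timedelta(days=midpoint_days)


def scale_counting_stats(counting_stats, games_after_return, games_remaining):
    """Shrink counting stats to the share of remaining games the player is
    expected to play. Rate stats (AVG, OBP, ...) are left out on purpose,
    since they stay as projected."""
    if games_remaining <= 0:
        raise ValueError("games_remaining must be positive")
    return {
        name: value * games_after_return / games_remaining
        for name, value in counting_stats.items()
    }


hamilton_return = target_return_from_range(date(2014, 4, 9), 6, 8)
print(hamilton_return)  # 2014-05-28

# Hypothetical rest-of-season projection and game counts, for illustration.
projection = {"HR": 20, "RBI": 70}
adjusted = scale_counting_stats(projection, games_after_return=60, games_remaining=100)
print(adjusted)
```

Only the playing time shrinks; the per-game rates behind the projection are untouched.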
Here I’ve reduced the playing time for Hamilton, but kept his rate numbers the same as the original preseason Steamer projections.
This should better reflect the future value Hamilton might have to fantasy owners. So when you’re viewing projections in a Search page, or as part of a projected standings page, I’ll adjust projections based on a player’s target return from injury. The Player Detail page will continue to show the original projections as given to me by the source (or computed by me).
The target return date is used not only when showing projections, but also when showing prorated statistics. Where the injury report gives an estimate of the player’s return, I use that to get the target return date. If he’s on the disabled list, but without any other guidance, I use the first date he’s eligible to come off, unless that date is in the past, in which case I’ll arbitrarily say he’ll miss 10 more days. At this point I’m updating based on the injury reports themselves, and not other news stories about a player’s return. So while I’ve seen reports that Bryce Harper will be out until July, because the injury report currently just lists him on the 15-day DL, I’ve set his target return to May 11th, 15 days after he was put on the DL.
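The fallback rules in the paragraph above can be written as a short decision chain. This is a sketch under assumptions: the function and parameter names are invented, and the April 26th placement date for Harper is inferred from "May 11th, 15 days after he was put on the DL" rather than stated directly.

```python
from datetime import date, timedelta


def target_return_date(today, estimated_return=None, dl_placed=None, dl_days=15):
    """Pick a target return date using the rules described above.

    1. If the injury report gives an estimated return, use it.
    2. Otherwise, if the player is on the DL, use the first eligible date.
    3. If that eligible date is already in the past, assume 10 more days.
    Returns None when there is no injury information at all.
    """
    if estimated_return is not None:
        return estimated_return
    if dl_placed is not None:
        eligible = dl_placed + timedelta(days=dl_days)
        if eligible >= today:
            return eligible
        return today + timedelta(days=10)
    return None


# Harper: 15-day DL, no other guidance in the injury report.
# The value of "today" is a hypothetical date shortly after the placement.
harper = target_return_date(today=date(2014, 4, 28), dl_placed=date(2014, 4, 26))
print(harper)  # 2014-05-11

# A player whose eligibility date has already passed gets today + 10 days.
overdue = target_return_date(today=date(2014, 4, 28), dl_placed=date(2014, 4, 1))
print(overdue)  # 2014-05-08
```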
These enhancements are also used when I compute projected standings for a league, so those values should be improved overall.
One caveat, however: by ignoring time a player is not on the active roster for prorated stats, I do expose small sample size issues. Because he has made only one start so far, Kershaw prorates to have the 1.35 ERA and 0.900 WHIP he had in that one start, while Hamilton hit .444 in the 8 games he played before going on the DL. While those rate stats are far too optimistic, the cumulative totals are better than they would be if I simply assumed both players would only play a tiny fraction of the season, which is what the old prorating model did.
Numbers should never drive all your decisions in fantasy sports, but getting better numbers can better inform your decisions. And the projected and prorated statistics shown at RotoValue have just gotten better.