In my last series of posts, I looked at the predictive ability of Wins Produced, ezPM, RAPM, and APM. It appeared that RAPM did the best. One possible reason, which I suggested but didn't think was a good explanation at the time, is that RAPM assumes that more players are average. This would essentially add a certain amount of regression to the mean across seasons, which may help with predictions. I decided to look into it a bit more and found some interesting results.
As part of the post, I thought I would walk through another team as an example, which gives me an opportunity to double-check that the numbers are right. That turned out to be a good thing, because I noticed that under the average rookie assumption my code accidentally put all rookies at replacement level for APM (since they had no rating in the previous season). That means the APM numbers I reported previously for that case were off. Don't worry, I'll give the correct ones below (and APM is still the worst). This time I'll look at the 2010 Denver Nuggets. Here are their actual performance numbers:
You can see that the metrics generally agree that Afflalo, Allen, Balkman, Carter, Graham, and Petro were below average, while Andersen, Anthony, Billups, Nene, Lawson, Martin, and Smith were above average (although there are some disagreements, both in direction and in degree). And here are their performances in 2009:
Ty Lawson is missing since he was a rookie in 2009; I’m just going to assume average rookie performance today (.045 WP48, -1.92 points per 100 possessions). And everyone gets 0 for ezPM because there are no (usable) numbers for that season. As you remember, those performances are used to predict performance in 2010. In this case I’m going to present the player predictions in terms of total points produced over the course of the season. For ezPM, RAPM, and APM, that’s just the rating times number of possessions divided by 100. For WP the formula is (WP48-.1)*82*100/(1.927*2.54*48) to get points per 100 possessions, then converted to total points the same way as the other metrics. Here are the 2010 predictions based on 2009:
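To make those conversions concrete, here's a quick sketch of the two formulas (the function names are mine; the constants are the ones quoted above):

```python
def wp48_to_pts100(wp48):
    # Convert WP48 (relative to the .100 average) into points per 100
    # possessions using the conversion from the post:
    # (WP48 - .1) * 82 * 100 / (1.927 * 2.54 * 48)
    return (wp48 - 0.1) * 82 * 100 / (1.927 * 2.54 * 48)

def season_points(pts_per_100, possessions):
    # Total points produced over the season: rating times possessions
    # divided by 100. This is the conversion used directly for ezPM,
    # RAPM, and APM ratings.
    return pts_per_100 * possessions / 100
```

As a sanity check, the average rookie's .045 WP48 comes out to about -1.92 points per 100 possessions, matching the rookie number used for Lawson above.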
Then I sum up the points and divide by 82 to get a predicted per-game point differential, do this for every team, and find the mean absolute error for each metric. As presented previously, WP does worse than RAPM (and ezPM in the one season available) at predicting future team point differential. As I mentioned, one suggestion was that WP has too wide a range: the good players are rated too good and the bad players too bad. So I took the predicted points for each player and multiplied them by .8. For example, Afflalo would be predicted to produce -115 points, Allen -52.75, and so on. As it turns out, if you do this for all six years of data that I have team performance for, WP improves in every season. Here are the mean absolute errors by season:

2011: WP 2.83, RAPM 2.51, ezPM 2.68, APM 3.47
2010: WP 2.92, RAPM 2.97, APM 4.58 (no ezPM)
2009: WP 3.16, RAPM 3.04, APM 4.68 (no ezPM)
2008: WP 3.89, RAPM 3.33 (no ezPM or APM)
2007: WP 2.41, RAPM 2.48 (no ezPM or APM)

If you look at the WP average rookie errors from my previous post, you see that WP improves by .13 to .61 points. This is enough for it to beat RAPM in two seasons, and it is only behind by .1 to .5 points in the others. Previously, with average rookie performance, it was always behind, typically by closer to a point.
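The team-level evaluation above can be sketched as follows (a minimal version with hypothetical names; the .8 multiplier is applied to each player's predicted points before summing):

```python
def predicted_margin(player_points, shrink=1.0):
    # Predicted per-game point differential for one team: optionally shrink
    # each player's predicted points toward zero, sum, and divide by 82 games.
    return sum(pts * shrink for pts in player_points) / 82

def mean_abs_error(predictions, actuals):
    # Mean absolute error of predicted vs. actual per-game point
    # differentials across all teams.
    return sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(predictions)
```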
That result suggests that regression to the mean is helpful for WP. I only checked .8; it's possible some other number would allow it to beat RAPM consistently. Here's the rub: what if you use those regressed ratings to explain the current year (e.g., use 2010's ratings to explain 2010's results)? WP does *worse* in all six seasons. Pure WP has the best mean absolute error every year (ranging from .123 to .222) while RAPM has the worst (1.69 to 2.28); APM ranges from .705 to .874, and ezPM in its two years has .486 and .491. But the regressed WP ranges from .506 to .811: it has fallen behind ezPM, declined by over half a point in accuracy, and now overlaps somewhat with APM.
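Since .8 was the only multiplier I tried, one natural follow-up is to sweep a grid of shrink factors and keep whichever minimizes the prediction error. A sketch (hypothetical names and toy structure, not my actual code):

```python
def best_shrink(team_player_points, actual_margins, candidates=None):
    # Try a grid of shrink factors and return the one that minimizes the
    # mean absolute error of predicted per-game point differentials.
    if candidates is None:
        candidates = [x / 20 for x in range(10, 21)]  # 0.50, 0.55, ..., 1.00

    def mae_for(shrink):
        preds = [sum(p * shrink for p in pts) / 82 for pts in team_player_points]
        return sum(abs(pr - a) for pr, a in zip(preds, actual_margins)) / len(preds)

    return min(candidates, key=mae_for)
```

With more seasons of data this risks overfitting the grid search, so the winning factor would ideally be validated on a held-out season.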
From this quick look, it appears that regressing to the mean improves future predictions at the cost of current accuracy. That pattern shows up in the results for RAPM (best at predicting upcoming seasons, worst at explaining current seasons) and in the Wins Produced test above (multiplying the player ratings by .8 improves prediction but hurts explanation of current results). If true, it suggests I was wrong to dismiss the idea: the assumption that more players are average could be enough for RAPM to make the best predictions. However, it also implies that RAPM does so at the cost of getting the ratings wrong, since it does a relatively poor job of describing current team performance. This is problematic because it implies the player ratings aren't correct per se, but merely point enough in the right direction that the forced regression to the mean smooths things over and produces good predictions. I think it also suggests that there must actually be a good amount of regression to the mean for players across seasons.
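The tradeoff itself is easy to demonstrate with a toy simulation (entirely synthetic, not the NBA data): if an observed rating equals true skill plus season noise, the raw rating "explains" the season it was measured in perfectly, while a rating shrunk toward average predicts the next season better, because the noise component won't repeat.

```python
import random

def shrinkage_tradeoff(n=10000, skill_sd=1.0, noise_sd=1.0, shrink=0.5, seed=0):
    # Toy model: each player's observed rating is true skill plus season noise.
    rng = random.Random(seed)
    skill = [rng.gauss(0, skill_sd) for _ in range(n)]
    this_season = [s + rng.gauss(0, noise_sd) for s in skill]
    next_season = [s + rng.gauss(0, noise_sd) for s in skill]

    def mae(pred, actual):
        return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)

    shrunk = [shrink * x for x in this_season]
    return {
        "raw_explains_current": mae(this_season, this_season),  # 0 by construction
        "shrunk_explains_current": mae(shrunk, this_season),    # worse than raw
        "raw_predicts_next": mae(this_season, next_season),
        "shrunk_predicts_next": mae(shrunk, next_season),       # better in expectation
    }
```

With equal skill and noise variance, a shrink factor of .5 is the theoretically optimal amount of regression to the mean, which is why shrinking helps prediction here even as it ruins the in-season fit.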