In part 1 I laid out the methods for generating retrodictions for Wins Produced, ezPM, RAPM, and APM. Here in part 2, you get the goods.
Let’s start with my original version, where every rookie gets the same average productivity. For 2011, the mean absolute errors are 2.51 for RAPM, 2.68 for ezPM, 3.32 for WP, and 3.95 for APM. If rookies instead get their actual productivity, the errors are 2.72 for RAPM, 2.76 for ezPM, 2.77 for WP, and 3.50 for APM. A couple of things to note: RAPM and ezPM get numerically worse when they know actual rookie production. This is only somewhat interesting for RAPM, since ‘actual’ rookie production for it in 2011 is an assumption of 0 (and you know why if you read part 1). For ezPM it’s only a difference of 0.08 points on average, so probably not a big deal. On the other hand, WP and APM get a decent amount better. If we had to choose, RAPM seems to make the best predictions, followed by ezPM, WP, and APM. Looking at one year, though, it’s hard to draw firm conclusions. So…
2010: I can’t use ezPM any more because, according to Evan, the 2009 numbers are not to be trusted. The average-rookie errors are 2.97 for RAPM, 3.39 for WP, and 5.08 for APM. The actual-rookie errors are 3.22 for RAPM, 3.15 for WP, and 4.36 for APM. Again we see WP and APM improve while RAPM gets worse. APM again brings up the rear. RAPM is the best with average rookies, but with actual rookie production WP edges it out (3.15 to 3.22). The predictions also seem to be generally worse for 2010 than for 2011.
2009: The average-rookie errors are 3.04 for RAPM, 3.77 for WP, and 5.02 for APM; the actual-rookie errors, in the same order, are 2.93, 3.27, and 4.37. The same ranking is popping up, although this time RAPM improved slightly by knowing rookie production.
2008: There is no 2007 APM data, so APM drops out here. RAPM has average and actual rookie errors of 3.33 and 3.57 while WP has errors of 4.23 and 4.19.
2007: This is the last year I can look at with more than one metric. RAPM has average and actual errors of 2.48 and 2.50 while WP has errors of 2.54 and 2.68. This is the only season where WP does worse using actual rookie performance.
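With that many numbers flying around, it can help to put them in one place and look at how much each metric's error moves when actual rookie production replaces the average-rookie assumption. Here is a minimal Python sketch. The error values are copied from the seasons above; the dictionary layout and the names `MAE` and `rookie_effect` are just my illustration, not anything from the original analysis.

```python
# Mean absolute errors quoted in the text, stored as
# (average-rookie error, actual-rookie error) per season and metric.
MAE = {
    2011: {"RAPM": (2.51, 2.72), "ezPM": (2.68, 2.76),
           "WP": (3.32, 2.77), "APM": (3.95, 3.50)},
    2010: {"RAPM": (2.97, 3.22), "WP": (3.39, 3.15), "APM": (5.08, 4.36)},
    2009: {"RAPM": (3.04, 2.93), "WP": (3.77, 3.27), "APM": (5.02, 4.37)},
    2008: {"RAPM": (3.33, 3.57), "WP": (4.23, 4.19)},
    2007: {"RAPM": (2.48, 2.50), "WP": (2.54, 2.68)},
}


def rookie_effect(year, metric):
    """Change in error when actual rookie production is used.

    Negative means the metric predicted better with actual rookie numbers.
    """
    average_rookie, actual_rookie = MAE[year][metric]
    return round(actual_rookie - average_rookie, 2)


for year in sorted(MAE, reverse=True):
    for metric in MAE[year]:
        print(f"{year} {metric:5s} {rookie_effect(year, metric):+.2f}")
```

Positive values are seasons where knowing the rookies' real production made the prediction worse, which is the odd RAPM behavior noted above.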
Summary: there’s only one season where we can look at ezPM, but it does pretty well that year. It comes in second to RAPM, and is essentially tied with RAPM and WP if actual rookie production is used. Assuming that pattern would hold up, it seems like ezPM does a good job of both explaining existing results and predicting future performance.
APM covers three seasons, and consistently does the worst. There’s some chance that this is due to my replacement player assumption of -3.8. However, since replacement players by definition don’t play many minutes, they shouldn’t affect the results too strongly. APM consistently got better when allowed to use actual rookie performance, suggesting that it can describe within-season performance better than guessing the average. But given its overall performance, APM would not be my first choice as a player metric.
RAPM covers most of the timespan I looked at, and consistently comes in first. What’s potentially more interesting is that RAPM also did worse when given actual rookie performance in every season except 2009. This is excusable in 2011, when the ‘actual’ values were just my default filler ‘guess’ that all rookies were average, but it makes less sense for ’07, ’08, and ’10. Using WP as the most consistently available alternative, RAPM is better by about half a point of average absolute differential.
Finally, Wins Produced was consistently better than APM but generally worse than RAPM. It improved when allowed to use actual rookie production in every season except 2007, which allowed it to make the best predictions in 2010. It was also close to RAPM in 2011 and 2007. So while RAPM is usually ahead, it is not a wide gap; for the four years where both methods can use actual rookie performance, RAPM has an average error of 3.06 compared to WP’s 3.32.
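The two RAPM-versus-WP comparisons in this summary are just averages of the season errors listed earlier, so they are easy to re-derive. The sketch below rests on two assumptions that are my reading of the text rather than something stated outright: the ‘about half a point’ gap uses the average-rookie errors over all five seasons, and the ‘four years’ are 2007 through 2010 (2011 is left out because RAPM’s ‘actual’ rookie value that year is a placeholder). The names `ERRORS` and `mean_error` are hypothetical.

```python
# Errors quoted in the text for the two metrics available in every season:
# (average-rookie error, actual-rookie error).
ERRORS = {
    "RAPM": {2011: (2.51, 2.72), 2010: (2.97, 3.22), 2009: (3.04, 2.93),
             2008: (3.33, 3.57), 2007: (2.48, 2.50)},
    "WP":   {2011: (3.32, 2.77), 2010: (3.39, 3.15), 2009: (3.77, 3.27),
             2008: (4.23, 4.19), 2007: (2.54, 2.68)},
}


def mean_error(metric, years, use_actual):
    """Average a metric's error over the given seasons."""
    idx = 1 if use_actual else 0
    return sum(ERRORS[metric][y][idx] for y in years) / len(years)


all_years = [2011, 2010, 2009, 2008, 2007]
no_2011 = [2010, 2009, 2008, 2007]  # assumption: the "four years" in the text

# Five-season gap with every rookie treated as average.
gap = mean_error("WP", all_years, False) - mean_error("RAPM", all_years, False)
print(f"average-rookie gap, WP minus RAPM: {gap:.3f}")

# Four-season averages with actual rookie production.
for metric in ("RAPM", "WP"):
    print(f"{metric} actual-rookie mean: {mean_error(metric, no_2011, True):.4f}")
```

Under those assumptions the four-year averages work out to roughly 3.055 for RAPM and 3.3225 for WP, in line with the 3.06 and 3.32 quoted, and the five-season gap is roughly 0.58.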
In part 3, I’ll talk about some conclusions I drew from the retrodiction results and some methodological issues.