Last time I looked at the ability of three different metrics (ezPM, WP, and RAPM) to predict how teams would do this past season. That post has all the details, so you should give it a read. It turned out that ezPM did the best and WP the worst, although performance was very close among the three. This time I’ll look at 2010.
In order to predict 2010, I’m primarily using player performance from 2009. Unfortunately, that means that I can’t use ezPM because Evan has told me that the data is likely flawed. Double unfortunately, that means I can’t use RAPM either, because I used Evan’s data to get the number of possessions played. So I’m left with Wins Produced. In 2011, WP had a mean absolute error of 8.3 for wins and 3.32 for point differential. In 2010, those numbers were 8.47 and 3.39. Very similar. The biggest mistakes in 2010 were for Phoenix and OKC, followed by New Jersey, Milwaukee, Golden State, Minnesota, Sacramento, Toronto, and Boston (that sounds like a lot, but that group is just clumped together without a clear cut-off line). Phoenix was better than expected due to Channing Frye and Robin Lopez not being terrible, Goran Dragic improving, and improvements by Nash, Richardson, and Stoudemire. OKC was also better than expected for the reasons you would think: all their young players got better. Durant alone added 8 wins, and there were also unexpected additions from Harden, Ibaka, and Westbrook.
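For anyone who wants to check the arithmetic, mean absolute error is just the average size of the miss per team, ignoring direction. Here is a minimal sketch in Python; the win totals are made up for illustration and are not the actual predictions, and the function name is my own.

```python
def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| across teams."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical predicted and actual win totals for four teams (illustrative only).
predicted_wins = [50, 41, 30, 55]
actual_wins = [54, 35, 33, 50]

# Misses are 4, 6, 3, and 5 wins, so the average miss is 18 / 4 = 4.5.
mae = mean_absolute_error(predicted_wins, actual_wins)
print(mae)
```

The same function works for point differential; you would just pass in predicted and actual point differentials instead of wins.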
Since I don’t have much else to say at this point (although I do hope to do these retrodictions for WP as I have time to type in team performance for past seasons), I’ll mention a couple other things I saw in the data. One is the correlation between the three measures. For 2011, WP48 and ezPM100 had a correlation of .73, which I find surprisingly high, although that still leaves plenty of room for disagreement. The correlations of WP48 and ezPM100 with RAPM are unknown since I didn’t use that data. Looking at 2010, the WP48/ezPM100 correlation is down to .55; the WP48/RAPM correlation is .33 and the ezPM100/RAPM correlation is only .16. I have a few other seasons of RAPM, so here are the WP48/RAPM correlations for 2009, 2008, 2007, and 2006: .22, .32, .29, and .29. If you think that players are fairly separable from their teams (which is part of the reason we do this whole statistical analysis thing in the first place), then these correlations are actually artificially inflated to a certain degree. That’s because each player is listed once for each team he plays for in a season. Thus any player who plays for multiple teams contributes multiple data points. If he plays about the same for each team, and each system picks up on that, the correlation between the systems will increase.
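To make the inflation point concrete, here is a small sketch in Python. The ratings are invented numbers on an arbitrary scale (not real WP48, ezPM100, or RAPM values), and the hand-rolled `pearson` helper is my own. The toy data is built so that a traded player rates the same with both teams under both systems; counting him twice pushes the correlation up in this example, though a duplicate will not always move it in that direction.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings for four one-team players under two systems.
system_a = [1, 2, 3, 4]
system_b = [2, 1, 4, 3]

# A fifth player who was traded midseason and rated 6 in both systems
# with both of his teams (again, made-up numbers).
traded_player = (6, 6)

# Counting the traded player once.
r_counted_once = pearson(system_a + [traded_player[0]],
                         system_b + [traded_player[1]])

# Counting him once per team, as the data set described above does.
r_counted_twice = pearson(system_a + [traded_player[0]] * 2,
                          system_b + [traded_player[1]] * 2)

print(round(r_counted_once, 3))   # about 0.865
print(round(r_counted_twice, 3))  # about 0.906
```

One way around this would be to combine each player's stints into a single row before computing the correlation, but I have not done that for the numbers quoted here.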
Assuming that the drop in correlation for WP48 and ezPM100 from 2011 to 2010 holds more generally, the three systems should be much less similar for 2010 and previous seasons. If I can get my hands on some good data, I’m looking forward to filling in some more retrodictions. I’m guessing there will be bigger differences, and with enough seasons one method may pop out as the preferred option.