The debate on rebounding and Wins Produced (for example) got me thinking about a few things, one of which is the predictive value of models. A model that is able to predict future performance should, everything else being equal, be preferred over models that don’t predict as well. One of the benefits of Wins Produced is that it is a stable measure; a player’s WP48 this year has a correlation of around .8 with his WP48 next year. That means that if a player is good this year, we can be fairly (although not completely) confident that he will be good next year. On the other hand, adjusted plus/minus only has a year-to-year correlation of around .25; it appears to be a very inconsistent measure. Why might this be?
I can’t say for APM, because there isn’t a known equation for APM player values, but I can look into it for WP. The discussion on rebounding led me to think about that feature; WP places a higher weight on rebounding than other measures (notably PER, and public opinion). So I did a little data exercise. I made a fake data set with 1000 ‘players’. Each player generates four box score stats, which are supposed to stand in for field goal percentage, rebounding, steals, and turnovers. These stats are also generated for year two, and the values are pulled from a multivariate normal distribution with all means set to 0. What that means is that each player has those four stats for two ‘years’, and the average value for each stat is 0 (I set the variance to 1, so each stat is basically a Z score). The multivariate part means that I can allow the stats to covary. In this case, each statistic is independent of the others (so a player’s rebound value and his turnover value are uncorrelated) within a year. From year 1 to year 2, however, each stat correlates with itself at a value I got from the Stumbling on Wins book: .47 for field goal percentage, .9 for rebounding, .68 for steals, and .61 for turnovers. So a player’s FG% in year 1 correlates with his FG% in year 2, but neither correlates with any other stat, and the strongest correlation across years is for rebounding (which I’ll get back to later). These labels are arbitrary, but their correlations are not; what I’m calling FG% could be any ‘statistic’ so long as it has a correlation of .47 with itself across seasons. (Side note: I don’t think these stats are uncorrelated in the actual NBA; big men tend to rebound a lot and shoot efficiently whereas guards rebound less and shoot less efficiently, for example, so I would expect FG% and rebounds to be correlated with each other. But I don’t have specific numbers for those values and it should be unimportant to this discussion, so I set the correlations to 0.)
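For anyone who wants to try the setup described above, here is a minimal sketch of how such a data set could be generated. It assumes Python with NumPy, which is not necessarily what was used for the numbers in this post; the function name `simulate_players` and the seed are made up for illustration. The covariance matrix is the identity within a year (independent Z scores) with the year-to-year correlations on the diagonal of the between-year blocks.

```python
import numpy as np

# Year-to-year correlations quoted in the post, in the order
# FG%, rebounds, steals, turnovers.
YEAR_TO_YEAR_R = np.array([0.47, 0.90, 0.68, 0.61])


def simulate_players(n_players=1000, r=YEAR_TO_YEAR_R, seed=0):
    """Draw year-1 and year-2 Z scores for each fake player.

    Hypothetical helper: returns two arrays of shape (n_players, n_stats).
    """
    r = np.asarray(r, dtype=float)
    k = len(r)
    # 2k x 2k covariance: identity within a year (stats are independent,
    # variance 1), diag(r) between years (each stat only correlates with
    # itself across seasons).
    cov = np.block([[np.eye(k), np.diag(r)],
                    [np.diag(r), np.eye(k)]])
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(np.zeros(2 * k), cov, size=n_players)
    return draws[:, :k], draws[:, k:]


year1, year2 = simulate_players()

# Sample year-to-year correlation for each stat; these drift a little from
# the specified values because of random sampling.
sample_r = np.array([np.corrcoef(year1[:, i], year2[:, i])[0, 1]
                     for i in range(year1.shape[1])])
print(np.round(sample_r, 3))
```

Changing the seed changes the sample correlations slightly, which is the same kind of drift from the specified values that shows up in any one random draw.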
Having made up these numbers, I calculated three ‘summary stats’. The first was meant to be something like PER; I created it using the equation .5*FG% + .15*rebounds + .15*steals - .2*turnovers. Thus it puts the most weight on shooting efficiency; it’s not quite PER, but more importantly it places lower weight on rebounding. The next was meant to fill in for WP and has weights of [.1, .3, .3, -.3] (in the same order: FG%, rebounds, steals, turnovers); there’s less emphasis on shooting and more (and equal) weight on possession stats. The third was meant to reflect NBA Efficiency and thus has equal weights: [.25, .25, .25, -.25]. Again, these are obviously simplifications. These summary stats were created for each player for both years.
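The three weightings can be written down as a small sketch, again assuming Python with NumPy. The dictionary keys (`per_like`, `wp_like`, `eff_like`) and the two example players are hypothetical labels for illustration, not real PER, WP, or NBA Efficiency values.

```python
import numpy as np

# Weights in the order FG%, rebounds, steals, turnovers, as given in the post.
WEIGHTS = {
    "per_like": np.array([0.50, 0.15, 0.15, -0.20]),
    "wp_like": np.array([0.10, 0.30, 0.30, -0.30]),
    "eff_like": np.array([0.25, 0.25, 0.25, -0.25]),
}


def summary_stats(z_scores):
    """Weighted sum of the four Z scores under each weighting.

    Accepts one player (length 4) or many (shape (n, 4)); returns a dict
    of arrays with one value per player.
    """
    z = np.atleast_2d(np.asarray(z_scores, dtype=float))
    return {name: z @ w for name, w in WEIGHTS.items()}


# Two made-up players: one two standard deviations above average in
# rebounding only, one two standard deviations above average in FG% only.
rebounder = [0.0, 2.0, 0.0, 0.0]
shooter = [2.0, 0.0, 0.0, 0.0]

for label, player in [("rebounder", rebounder), ("shooter", shooter)]:
    scores = summary_stats(player)
    print(label, {name: round(float(v[0]), 2) for name, v in scores.items()})
```

The pure rebounder scores twice as high on the WP-like stat as on the PER-like one, the pure shooter scores five times as high on the PER-like stat as on the WP-like one, and the equal-weight stat rates them the same.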
So what happens? Well, since this was done with random sampling, I’ll first note that the correlations I specified changed a little bit in the sample, from [.47, .9, .68, .61] to [.457, .909, .696, .647]. I don’t think that’s important. Second, I can look at the correlation of each statistic with the summary statistics. As you might expect, each summary stat reflects what it weights most strongly. PER correlates around .86 with FG% but only about .27 with rebounding and steals and -.35 with turnovers. WP has correlations of .2, .56, .55, and -.56. NBA Efficiency has flat correlations of .51, .5, .5, and -.49 (you can see the same kinds of patterns in actual data, like slide 8 here). More importantly, how do the summary stats correlate with themselves across years? Are they predictive? Well, PER has a correlation of .53 (95% confidence interval .485 to .574 for those interested), NBA Efficiency .683, and WP .745. So my fill-in for WP, which weights rebounding highly, is most predictive of future performance. Why? Because it values a consistent statistic. NBA Efficiency also correlates well because it has a fairly high weight on rebounding. PER emphasizes an inconsistent statistic and so is itself inconsistent.
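These numbers can also be checked without simulating anything. Under the assumptions of this exercise (stats independent within a year, variance 1, each stat correlating only with itself across years), a summary stat with weights w_i has a year-to-year correlation of sum(w_i^2 * r_i) / sum(w_i^2), and its correlation with stat i is w_i / sqrt(sum(w_i^2)). The sketch below, assuming Python with NumPy and a hypothetical helper name, just evaluates those two expressions; it would not hold if the stats were correlated with each other.

```python
import numpy as np

# Year-to-year correlations and weights from the post, in the order
# FG%, rebounds, steals, turnovers.
YEAR_TO_YEAR_R = np.array([0.47, 0.90, 0.68, 0.61])
WEIGHTS = {
    "per_like": np.array([0.50, 0.15, 0.15, -0.20]),
    "wp_like": np.array([0.10, 0.30, 0.30, -0.30]),
    "eff_like": np.array([0.25, 0.25, 0.25, -0.25]),
}


def expected_correlations(weights, r=YEAR_TO_YEAR_R):
    """Theoretical correlations for a weighted sum of independent Z scores.

    Returns (year_to_year, with_components):
      year_to_year    = sum(w^2 * r) / sum(w^2)
      with_components = w / sqrt(sum(w^2))
    """
    w = np.asarray(weights, dtype=float)
    r = np.asarray(r, dtype=float)
    w2 = w ** 2
    year_to_year = float((w2 * r).sum() / w2.sum())
    with_components = w / np.sqrt(w2.sum())
    return year_to_year, with_components


for name, w in WEIGHTS.items():
    across, within = expected_correlations(w)
    print(name, round(across, 3), np.round(within, 2))
```

With the book’s correlations plugged in, this gives roughly .53 for the PER-like stat, .665 for the equal-weight stat, and .72 for the WP-like stat, close to the simulated values, which were computed from the slightly different sample correlations. The year-to-year formula is a weighted average of the individual r values with weights w_i^2, so a summary stat’s consistency is pulled toward whichever stats it weights most heavily.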
Again, the major conclusion here is that a summary statistic that puts relatively high weight on a consistent statistic will itself be consistent. This makes that statistic more valuable because of its predictive value. Even if you disagree with how WP attributes rebounds, it is still more valuable than something like PER or adjusted plus/minus. Why? Because you know why it values certain players. If you disagree with giving a player full credit for his rebound, and some player has a high WP mostly due to rebounding, you can choose to discount that player’s value. This will lead to you having a noisier measure, more akin to PER, but you’re welcome to do it. If you start with PER and further discount rebounds, I think the measure will quickly become unusable. You don’t even have the option with adjusted plus/minus because no one knows why a player gets the rating he has.
An important thing to note is that WP does not emphasize rebounds because they are consistent; it values rebounds because they gain possession. This is important because you might be tempted to simply create a summary stat that is as predictive as possible. That’s an idea I had, and then saw it in one of the comments somewhere (I think from EvanZ). I think this is a bad idea, however, because it assumes the conclusion. WP takes the right approach, which is to ask what is important (in this case, winning) and work backwards. It just so happens that the things it finds important are also consistent. PER has decided what it finds important, and those things are not as consistent. In an interesting twist, if you were to create a measure that is as consistent as possible, it would rely heavily (if not solely) on rebounds, since they have the highest year-to-year correlation of the box score statistics.
In my next post (to be finished at some indeterminate point), I’m hoping to look at the effect of divvying up rebounds the way some people have suggested. In the meantime, I have a question: if rebounding is so dependent on teammates, why is it the most consistent measure in the book?