Consistent Pieces Lead to Predictive Wholes

The debate on rebounding and Wins Produced (for example) got me thinking about a few things, one of which is the predictive value of models.  A model that is able to predict future performance should, everything else being equal, be preferred over models that don’t predict as well.  One of the benefits of Wins Produced is that it is a stable measure; a player’s WP48 this year has a correlation of around .8 with his WP48 next year.  That means that if a player is good this year, we can be confident that he will be good next year (although not certain, of course).  On the other hand, adjusted plus/minus (APM) has a year-to-year correlation of only around .25; it appears to be a very inconsistent measure.  Why might this be?

I can’t say for APM, because there isn’t a known equation for APM player values, but I can look into it for WP.  The discussion on rebounding led me to think about that feature; WP places a higher weight on rebounding than other measures do (notably PER, and public opinion).  So I did a little data exercise.  I made a fake data set with 1000 ‘players’.  Each player generates four box score stats, which are supposed to stand in for field goal percentage, rebounding, steals, and turnovers.  These stats are also generated for year two, and the values are pulled from a multivariate normal distribution with all means set to 0.  What that means is that each player has those four stats for two ‘years’, and the average value for each stat is 0 (I set the variance to 1, so each stat is basically a Z score).  The multivariate part means that I can allow the stats to covary.  In this case, each statistic is independent of the others (so a player’s rebound value and his turnover value are uncorrelated) within a year.  From year 1 to year 2, however, each stat correlates with itself at a value I got from the Stumbling on Wins book: .47 for field goal percentage, .9 for rebounding, .68 for steals, and .61 for turnovers.  So a player’s FG% in year 1 correlates with his FG% in year 2, but neither correlates with any other stat, and the strongest correlation across years is for rebounding (which I’ll get back to later).  These labels are arbitrary, but their correlations are not; what I’m calling FG% could be any ‘statistic’ so long as it has a correlation of .47 with itself across seasons. (side note: I don’t think these stats are uncorrelated in the actual NBA; big men tend to rebound a lot and shoot efficiently whereas guards rebound less and shoot less efficiently, for example, so I would expect a positive correlation for FG% and rebounds.  But I don’t have specific numbers for those values and it should be unimportant to this discussion, so I set the correlations to 0.)
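For anyone who wants to replay the setup, here is a minimal Python sketch of the data-generating step. It is not the code behind the post: the names `YEAR_TO_YEAR_R`, `simulate_players`, and `pearson` are made up for illustration, and the seed is arbitrary. Because the four stats are independent within a year, drawing each stat's two years as a correlated pair should be equivalent to one draw from the multivariate normal described above.

```python
import math
import random

# Year-to-year correlations quoted in the post (from Stumbling on Wins).
YEAR_TO_YEAR_R = {"fg_pct": 0.47, "rebounds": 0.90, "steals": 0.68, "turnovers": 0.61}


def simulate_players(n_players=1000, seed=0):
    """Return (year1, year2): dicts mapping stat name -> list of z-scores."""
    rng = random.Random(seed)  # arbitrary seed, fixed only for repeatability
    year1 = {stat: [] for stat in YEAR_TO_YEAR_R}
    year2 = {stat: [] for stat in YEAR_TO_YEAR_R}
    for _ in range(n_players):
        for stat, r in YEAR_TO_YEAR_R.items():
            x = rng.gauss(0.0, 1.0)      # year 1 value, mean 0, variance 1
            noise = rng.gauss(0.0, 1.0)  # independent of x
            # y has mean 0, variance r^2 + (1 - r^2) = 1, and corr(x, y) = r.
            y = r * x + math.sqrt(1.0 - r * r) * noise
            year1[stat].append(x)
            year2[stat].append(y)
    return year1, year2


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


year1, year2 = simulate_players()
for stat, target in YEAR_TO_YEAR_R.items():
    print(stat, "target", target, "sample", round(pearson(year1[stat], year2[stat]), 3))
```

With 1000 players the sample correlations should land near the targets but not exactly on them, since each run is a random draw.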

Having made up these numbers, I calculated three ‘summary stats’.  The first was meant to be something like PER; I created it by using the equation .5*FG% + .15*rebounds+.15*steals-.2*turnovers.  Thus it puts the most weight on shooting efficiency; not quite PER, but more importantly it places lower weight on rebounding.  The next was meant to fill in for WP and has weights of [.1, .3, .3, -.3]; there’s less emphasis on shooting and more (and equal) weight on possession stats.  The third was meant to reflect NBA Efficiency and thus has equal weights: [.25, .25, .25, -.25].  Again, these are obviously simplifications.  These summary stats were created for each player for both years.
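To make the three weightings concrete, here is a small Python sketch that applies them to one made-up player. The weights come straight from the paragraph above; the labels, the `summary_stat` helper, and the example player are hypothetical and only there for illustration.

```python
# Weights in the order (FG%, rebounds, steals, turnovers), as given in the post.
WEIGHTS = {
    "PER-like": (0.50, 0.15, 0.15, -0.20),
    "WP-like": (0.10, 0.30, 0.30, -0.30),
    "NBA Efficiency-like": (0.25, 0.25, 0.25, -0.25),
}


def summary_stat(stats, weights):
    """Weighted sum of one player's four z-scored stats."""
    return sum(w * s for w, s in zip(weights, stats))


# Hypothetical player: league-average shooter (z = 0), strong rebounder
# (z = +2), league-average in steals and turnovers.
rebounder = (0.0, 2.0, 0.0, 0.0)
scores = {name: summary_stat(rebounder, w) for name, w in WEIGHTS.items()}
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

The same rebounding specialist scores 0.30 on the PER-like measure, 0.60 on the WP-like measure, and 0.50 on the NBA Efficiency-like measure, which is the difference in emphasis the weights are meant to capture.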

So what happens?  Well, since this was done with random sampling, I’ll first note that the correlations I specified changed a little bit from [.47, .9, .68, .61] to [.457, .909, .696, .647].  I don’t think that’s important.  Second, I can look at the correlation of each statistic with the summary statistics.  As you might expect, each summary stat reflects what it weighs most strongly.  PER correlates around .86 with FG% but only about .27 with rebounding and steals and -.35 with turnovers.  WP has correlations of .2, .56, .55, and -.56.  NBA Efficiency has flat correlations of .51, .5, .5, and -.49 (you can see the same kinds of patterns in actual data, like slide 8 here).  More importantly, how do the summary stats correlate with themselves across years?  Are they predictive?  Well, PER has a correlation of .53 (95% confidence interval .485 to .574 for those interested), NBA Efficiency .683, and WP .745.  So my fill-in for WP, which weighs rebounding highly, is most predictive of future performance.  Why?  Because it values a consistent statistic.  NBA Efficiency also correlates well because it has a fairly high weight on rebounding.  PER emphasizes an inconsistent statistic and so is itself inconsistent.
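The simulated numbers can also be checked against what the setup implies on paper. Assuming the stats are independent with variance 1, as specified, a weighted sum has variance equal to the sum of the squared weights, its covariance with stat i is weight i, and its covariance with itself across years is the sum of squared weight times that stat's year-to-year correlation. The sketch below just evaluates those ratios; the function names are hypothetical.

```python
import math

# Order: FG%, rebounds, steals, turnovers (values quoted in the post).
YEAR_TO_YEAR_R = (0.47, 0.90, 0.68, 0.61)
WEIGHTS = {
    "PER-like": (0.50, 0.15, 0.15, -0.20),
    "WP-like": (0.10, 0.30, 0.30, -0.30),
    "NBA Efficiency-like": (0.25, 0.25, 0.25, -0.25),
}


def expected_self_correlation(weights, r):
    """Expected year-to-year correlation of a weighted sum of independent,
    unit-variance stats: sum(w^2 * r) / sum(w^2)."""
    num = sum(w * w * ri for w, ri in zip(weights, r))
    den = sum(w * w for w in weights)
    return num / den


def expected_component_correlations(weights):
    """Expected correlation of the weighted sum with each stat: w / ||w||."""
    norm = math.sqrt(sum(w * w for w in weights))
    return [w / norm for w in weights]


for name, w in WEIGHTS.items():
    components = [round(c, 2) for c in expected_component_correlations(w)]
    print(name, round(expected_self_correlation(w, YEAR_TO_YEAR_R), 3), components)
```

Under these assumptions the expected year-to-year correlations come out to roughly 0.53 for the PER-like measure, 0.72 for the WP-like measure, and 0.665 for the NBA Efficiency-like measure, close to the simulated .53, .745, and .683; the remaining gaps are the kind you would expect from sample correlations that drifted slightly from their targets. Because the weights enter squared, a summary stat's consistency is a weighted average of its components' consistencies, tilted toward whatever it weighs most heavily.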

Again, the major conclusion here is that a summary statistic that puts relatively high weight on a consistent statistic will itself be consistent.  This makes that statistic more valuable because of its predictive value.  Even if you disagree with how WP attributes rebounds, it is still more valuable than something like PER or adjusted plus/minus.  Why?  Because you know why it values certain players.  If you disagree with giving a player full credit for his rebound, and some player has a high WP mostly due to rebounding, you can choose to discount that player’s value.  This will lead to you having a noisier measure, more akin to PER, but you’re welcome to do it.  If you start with PER and further discount rebounds, I think the measure will quickly become unusable.  You don’t even have the option with adjusted plus/minus because no one knows why a player gets the rating he has.

An important thing to note is that WP does not emphasize rebounds because they are consistent; it values rebounds because they gain possession.  This is important because you might be tempted to simply create a summary stat that is as predictive as possible.  That’s an idea I had, and then saw it in one of the comments somewhere (I think from EvanZ).  I think this is a bad idea, however, because it assumes the conclusion.  WP takes the right approach, which is to ask what is important (in this case, winning) and work backwards.  It just so happens that the measures it finds important are also consistent.  PER has decided what it finds important, and those things are not as consistent.  In an interesting twist, if you were to create a measure that is as consistent as possible, it would rely heavily (if not solely) on rebounds, since they have the highest year-to-year correlation of the box score statistics.

In my next post (to be finished at some indeterminate point), I’m hoping to look at the effect of divvying up rebounds the way some people have suggested.  In the meantime, I have a question: if rebounding is so dependent on teammates, why is it the most consistent measure in the book?



15 Responses to Consistent Pieces Lead to Predictive Wholes

  1. Pingback: Blogging on the road and some Fanservice « Arturo's Silly Little Stats

  2. Guy says:

    Very interesting. It occurs to me that you could use a similar simulation to finally settle the whole debate on diminishing returns on rebounds. Run a simulation for 5-man teams, assuming that each player’s rebound performance has no impact on his teammates’, and then compare the variance in team rebounding to what we observe in the actual NBA. For each position use the average DReb% and appropriate SD. I believe average DReb% is roughly C .22, PF .20, SF .14, SG .10, PG .08, and the SD of Dreb% is about .030 for the guard positions and .042 for C and F (Arturo could probably provide more exact estimates, but small differences won’t materially change the results). Run a simulation and tell us the resulting SD for team DReb%. If it’s reasonably close to what we observe in the NBA, that will be very strong evidence that diminishing returns are either small or non-existent. But if the actual NBA SD is much smaller than the SD in your simulation — which is modeled on no diminishing returns — then there must be substantial diminishing returns. Care to put your convictions to the test?

    • Alex says:

      Guy, I’ll definitely be taking a look at some point; as I said in a previous post, I completely believe there are diminishing returns. I more have a question about how much they matter, and even more importantly what effects would come about from divvying up rebounds in some manner other than just giving them to the player who got them. But I can tell you right now, whenever I get to it I will not be using rebound percentage. In the same post I said that percentages are simply not a good thing to use when adding things together because you quickly run into boundary problems (more so with defensive than offensive rebounds, but I’ll probably just do total rebounds, which will have the same issue). If someone can convince me to use percentages I’ll think about it. I understand their benefit for comparing players to each other, but I don’t think this is a good place for them.

      • Guy says:

        Alex: The reason to use reb% is that rebounding opportunities vary quite a bit, so Reb48 creates an illusion of much more variance in rebounding ability than actually exists. That said, it won’t matter for this purpose: you can just use DReb48 instead since that variation in opportunities affects both players and teams. You will get the same basic result, I think.

        I don’t know why you question “how much they matter.” The year-to-year consistency you like so much is only valuable IF there is a real link to winning. If most of a high-Reb48 player’s rebs come from his opponents, he is generating wins; if they come mainly from teammates, then he is not generating wins (and whether he does it “consistently” is then largely irrelevant). So whether these truly represent net additional rebounds for the team is extremely important.

        Just FYI, the boundary problem you mentioned was not in fact a problem in Eli W’s analysis, as you could see simply by looking at his graphs. The diminishing returns on DRebs he found were just as strong for lineups with below-average expected DReb% as for squads of strong rebounders. Obviously, a lineup of 3 centers and 2 PFs would encounter the problem you describe, as it would be impossible for them to reach their projected reb% (although that is still a form of diminishing returns, just an extremely obvious one). But since Witus used real lineups, not “5 Rodman” scenarios, there was no problem with his analysis.

  3. nerdnumbers says:

    You code in R, you write awesome posts and use Mario and Luigi as code names for your models. I think these are pretty good attributes for predicting awesomeness! Looking forward to the next post!

  4. Pingback: Late Friday Bullets | The Wages of Wins Journal

  5. Guy says:

    Alex: A question for you: As you note here, WP48 weights rebounds very heavily. The table you link to shows a .68 correlation between rebounds and WP48, and in Arturo’s 2009-2010 data it’s about .75. Rebounds are much more important than shooting efficiency in determining a player’s WP48. However, wins are not highly correlated with rebounds. At the team level, the correlation between WP48 and rebounds is only about half as large, around .37 (and much lower than the correlation with shooting efficiency). Do you have a theory that would reconcile these two findings? How can rebounds simultaneously account for most of the “wins” produced by players, yet account for relatively few actual wins and losses? I have a hard time reconciling these two facts.

    • some dude says:

      I was going to touch on this. Alex, you praise the value of consistency for consistency’s sake irrespective of predictability at the team level and I don’t understand that. Just because something is consistent doesn’t make it predictable.

      Perhaps a better question would be why is rebounding so consistent over time? Why is the variance so small? And what does that mean about our evaluations of players when at the team level the difference is so slight?

      Another issue is the assumption that consistency is good. Why assume players don’t vary in quality over time and assume they are consistent year in and year out?

  6. EvanZ says:

    What if we just value players by height? That will be extremely consistent. Therefore, must be valuable.

  7. Westy says:

    I definitely look forward to seeing the impacts of divvying up the DREBs. I’ve long been curious exactly what would happen. 0.6 for the one who grabs the ball and 0.1 for each of the other defenders seems appealing for the ease of its logic. Thanks for the post, Alex.

  8. Pingback: Arturo's Silly Little Stats

  9. Pingback: Consistency of Metrics | Sport Skeptic
