Apologies for how quiet it’s been around here; it turns out that even grad students have to do stuff sometimes. It seems to have paid off, though: I have a postdoctoral position lined up for the fall, which is great. In the meantime, I’ll try to be better; it shouldn’t be too hard as the playoffs (both NBA and NHL) get closer.
For now, I wanted to look at adjusted plus/minus (APM) and see if I could provide a bit of a summary and ask a few questions. APM is an alternative to boxscore measures of player productivity. Instead of looking at how many points, rebounds, steals, etc., a player accumulates and then weighting them in some manner to get a single number that measures his production (the method used for Wins Produced, Win Shares, PER, etc.), APM uses regression to directly estimate how a team does when he is on the floor. The ‘pure’ APM format is to create a data set where each line covers a snippet of a game: a timespan in which no players are subbed in or out. The dependent variable is the point differential for the home team in that timeframe (typically adjusted by the number of possessions to get a points per possession measure). The independent variables are dummy codes for the players on the floor in that timeframe. So at the start of a game between the away team Knicks and home team Celtics, you would have a 1 in each column for Kevin Garnett, Rajon Rondo, and the other Celtic starters; a -1 in each column for Amar’e Stoudemire, Carmelo Anthony, and the other Knicks starters; a 0 for everyone else; and the points per possession value would be however much those five Celtic players outscore (or are outscored by) those five Knicks players divided by the number of possessions. Once a player is substituted in on either team, a new data line is started, so the data set ends up covering every possession from every game played all season long. APM has been discussed and described in a few places; you can look at posts by Arturo, Aaron at basketballvalue.com, Dan Rosenbaum, Eli Witus, a number of places in the APBR community, Dave Berri, and I’m sure elsewhere. Obviously a lot of credit goes to them; I’m summarizing their work and thoughts here.
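To make the data layout concrete, here is a minimal sketch of how one of those data lines might be encoded. The function name, the placeholder player names ("Celtic 3", "Knick 3", and so on), and the score and possession counts are all made up for illustration; only the 1 / -1 / 0 coding and the points-per-possession outcome come from the description above.

```python
def stint_row(home_on, away_on, home_pts, away_pts, possessions, players):
    """Encode one substitution-free stretch of a game as (x, y).

    x has one entry per player in the league: 1 if he is on the floor for
    the home team, -1 if on the floor for the away team, 0 otherwise.
    y is the home team's point differential per possession.
    """
    column = {name: i for i, name in enumerate(players)}
    x = [0] * len(players)
    for name in home_on:
        x[column[name]] = 1
    for name in away_on:
        x[column[name]] = -1
    y = (home_pts - away_pts) / possessions
    return x, y


# Hypothetical opening stint: Celtics (home) vs. Knicks (away).
# Only Garnett, Rondo, Stoudemire and Anthony are named in the post;
# the other names are placeholders.
celtics = ["Garnett", "Rondo", "Celtic 3", "Celtic 4", "Celtic 5"]
knicks = ["Stoudemire", "Anthony", "Knick 3", "Knick 4", "Knick 5"]
players = celtics + knicks + ["Bench guy"]  # someone not on the floor

# Made-up numbers: Celtics outscore the Knicks 12-9 over 10 possessions.
x, y = stint_row(celtics, knicks, 12, 9, 10, players)
print(x)  # five 1s, five -1s, then a 0 for the bench player
print(y)
```

Stack one such row per stint for the whole season and you have the regression's inputs; each fitted coefficient is that player's APM.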
This regression is meant to calculate each player’s contribution to the team’s bottom line (outscoring the opponent or being outscored) while accounting for his teammates and his opponents. The main benefit is that APM should account for things that don’t appear in the boxscore; does player X set good screens, does he space the floor, does he close out on shooters, does he disrupt the opponents’ offense. Defensive value is perhaps the best part of APM, since the only individual boxscore measures of defense are steals and blocks. The second best part is the fact that regression is meant to account for the other variables, in this case teammates and opponents. Sure, Kevin Love gets a lot of rebounds, but maybe it’s because his teammates force opponents into bad shots, and that’s where the value is? Maybe he scores so much because defenses key in on Beasley? In theory, APM gives a measure of a player’s value completely separated from other players in the league, regardless of how they might contribute.
That last sentence also summarizes the downside to APM. One big problem is theoretical; APM is a black box. The data goes in and the numbers come out, but we can’t say why they turn out the way they do. If Kobe is above average, is it due to his scoring? Is it his clutch ability? APM can be separated for offense and defense, so there’s some value there, but if someone is an above-average defender you can’t say why. With box score measures, you can point to where a player gets value and declare that to be why he is producing.
The other issue is a practical matter: players tend to play with the same guys over and over. Starters are a good example; they are often on the court at the same time. An extreme example from the same technique in hockey comes from the Sedin brothers; Daniel appears to share over 90% of his ice time with Henrik. What this means is that those players (which are variables in the regression) are highly collinear: their values follow each other very closely across observations. Players who play together a lot have virtually identical contributions to the model (they are both 1 or 0 most of the time), and thus the model cannot tell them apart. This leads to two issues mathematically: unstable coefficients, meaning that players may be given incorrect APM scores, and large errors, meaning that we can’t be very certain about how good a player actually is. The solution, practically speaking, is to add more data: if you include previous seasons to add more data points and gain some leverage from players being separated due to injuries and trades, the estimates become better. Kobe Bryant is a good example. His APM this year is -5.23 with an error of 6.86. If we had to guess, Kobe is a very bad player (a score of 0 means that his team would play even on a neutral court if he were on the court and the other 9 players were equally matched to each other). But we can’t be sure because the error is so big; we can only be somewhat sure that he’s somewhere between awful and slightly above average. If you add in last year as well, though, he has a score of 4.06 with an error of 3.59. Over the past season and so far this year, Kobe is a positive contributor, and we can be somewhat sure that he’s above average.
It also turns out, as described in Arturo’s post, that the APM regression does a very bad job of describing what happens on the court. For whatever reason (noisy data or otherwise), the R squared is very low; you would not be terribly wrong if you just declared every player equally good.
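For the Kobe numbers above, the arithmetic behind "somewhere between awful and slightly above average" can be spelled out. This is a sketch that assumes the reported "error" is a standard error and that "somewhat sure" means roughly a one-standard-error band around the estimate; both of those readings are my assumptions, and `error_band` is just a helper name made up for this example.

```python
def error_band(estimate, error, width=1.0):
    """Return (low, high) = estimate -/+ width * error, rounded to 2 places."""
    low = round(estimate - width * error, 2)
    high = round(estimate + width * error, 2)
    return low, high


# One-year APM: -5.23 with an error of 6.86.
one_year = error_band(-5.23, 6.86)
print(one_year)  # (-12.09, 1.63): from awful to slightly above average

# Two-year APM: 4.06 with an error of 3.59.
two_year = error_band(4.06, 3.59)
print(two_year)  # (0.47, 7.65): the whole band sits above zero

# A stricter two-standard-error band on the two-year number.
two_year_wide = error_band(4.06, 3.59, width=2.0)
print(two_year_wide)  # (-3.12, 11.24): zero is back inside the band
```

So the second season roughly halves the error, but how sure you are that he's above average still depends on how wide a band you insist on.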
A few methods have been suggested for dealing with these issues (beyond adding more seasons). One is to try statistical plus-minus (SPM), which uses regression to predict APM from box score metrics. The Rosenbaum link above does this as part 2 of his final APM measure, and Evan has done something similar with regularized APM and his model. Since the boxscore tells us why someone is effective (e.g., we can see that they shoot a good percentage, or get a lot of steals), connecting that to APM can be informative. Another option is the regularized APM (RAPM) I just mentioned; it’s also called ridge regression. What this does in practice is move all players closer to average (0). However, even with multiple years of data, RAPM is not as predictive as you might like.
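Here is a small sketch of what the ridge penalty does, using the textbook closed form (X'X + lambda * I) beta = X'y rather than anyone's actual RAPM code. The three-player data set is invented: players A and B are always on the floor together (think of the Sedins), so ordinary regression can't separate them at all, while C plays some minutes apart from them.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ridge(X, y, lam):
    """Ridge estimate: solve (X'X + lam * I) beta = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up stints. Columns are players A, B, C; A and B are identical.
X = [[1, 1, 0],
     [1, 1, 1],
     [1, 1, -1],
     [0, 0, 1]]
y = [0.2, 0.3, 0.1, 0.1]  # invented points-per-possession differentials

# With lam = 0 this is plain least squares, and X'X is singular because
# columns A and B are the same; solve() would hit a division by zero.
mild = ridge(X, y, lam=1.0)
heavy = ridge(X, y, lam=10.0)
print(mild)   # A and B get the same value: the penalty splits the credit
print(heavy)  # every coefficient is pulled closer to 0
```

The penalty is what makes the problem solvable when teammates overlap, but notice how it does it: not by learning anything new about A versus B, just by splitting the credit evenly and shrinking everyone toward average.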
In summary, APM is a statistic that has great promise but big practical issues. These issues have not gone unnoticed; beyond Arturo and Dave Berri’s posts, some people at the APBR site have been very cautious about its use (including RAPM). But other people are not; it’s used as the basis for various SPM models and the same approach is used to analyze rebounding. This leads me to the bleg portion of the post, aimed mostly at people who do use APM: in short, why keep using it? The one-year results, even for RAPM, are so noisy as to be unusable. It has very little predictive power; the people you think are good this year could be great, terrible, or anywhere in between next year. Despite the noise, some people use it to evaluate their own model or build new ones; why rely on something so unreliable to determine your model? Has anyone attempted to see if APM becomes more predictive with more non-overlapping years? For example, if you created a 2-year APM from 07-08 and 08-09 and used it to predict the 2-year APM from 09-10 and 10-11, how well would that turn out? Comparing APM and boxscore metrics is common in evaluating a player and my sense is that APM is given the benefit of the doubt. For example, a player who scores highly on APM but not WP or WS or whatever *must* be a good defender or spread the floor; rarely is it assumed that his score is a mistake (unless he’s perceived to be good but scores poorly, like Kobe this year). If you only use multiple-year APM, how do you know who was good just last year, or this year so far? Weighting seasons is meant to cover that issue, but I bet it does little to improve the errors.
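The non-overlapping-years check I'm asking about could be run along these lines. This is only a sketch of one way to do it (a plain correlation across players who appear in both windows); the function names are mine, and the player labels and values are invented placeholders, not real APM numbers.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def window_to_window_r(apm_early, apm_late):
    """Correlate APM across two windows, using only players in both."""
    shared = sorted(set(apm_early) & set(apm_late))
    return pearson([apm_early[p] for p in shared],
                   [apm_late[p] for p in shared])


# Invented stand-ins for 2-year APM from 07-08/08-09 and 09-10/10-11.
early = {"Player A": 3.0, "Player B": -1.0, "Player C": 0.5, "Player D": 2.0}
late = {"Player A": 1.0, "Player B": 0.5, "Player C": -2.0, "Player E": 4.0}

r = window_to_window_r(early, late)  # Players D and E are dropped
print(r)
```

A correlation near 1 would mean the earlier window tells you who will be good in the later one; something near 0 would mean the multi-year numbers are no more predictive than the one-year ones.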
So help me out guys: why use it?
(A quick P.S.: I understand the boxscore metrics all have their own drawbacks. I know why people use them, though; I’m less clear on why people continue to use variations of APM or its method.)