Phil Birnbaum at Sabermetric Research has a new post up about the FAQ over at Wages of Wins. I don’t have the time to talk about it in detail because a guy’s got to sleep sometime, but I wanted to get something up. As it happens, I disagreed with one of Phil’s posts in the past and wrote a post on an old blog. I’ve copied it below with only minor editing; keep in mind I wrote it in early fall ’09. I don’t read Birnbaum’s blog very often because I’m not interested in baseball, but on the occasions I have read it I always come away with one impression: the man is not to be trusted. At the very least, his grasp on the meaning of R squared is tenuous.

I read something on the internet and it was wrong. Advanced NFL Stats, which I enjoy and usually agree with, posted a round-up of recent articles he liked. One of them was this, about how R squared is not a useful measure. And it’s mostly wrong. In essence, it’s a response to a post by one of the authors of Wages of Wins (which I’ve read, liked, and agreed with) in which he claims that there is little relationship between wins and salary in any of the major sports. The evidence is that while there is a positive relationship between the two, the R squared value isn’t very high. Phil Birnbaum says that the real question is, does spending more money lead to more wins? And Phil believes that when you run the regression, in the NBA for example, and find the result wins = .61*salary (in millions) – .76, and the slope (.61) is significant, you have your answer. Now Phil says a number of things I disagree with, so I’m going to step in at different points. Let’s start here.

The conclusion (for Phil) at this point is that there clearly is a relationship, a positive one, between salary and wins. Also, it would be hard to argue otherwise since the equation says that the highest-spending team would be expected to win 60 games while the lowest would win 27. This is, of course, an early and unwarranted conclusion. The regression can only tell you about what is in the regression equation. It’s possible, and in fact likely, that salary is confounded with other factors that might explain wins. For example, maybe better players are paid more, and so (generally speaking) teams that spend more will win more. In fact, when variables are confounded, you can have some crazy stuff happen. An example (taken from a class) would be predicting body fat from tricep skin thickness, thigh size, and midarm size. These variables will obviously be related to each other. If you run a regression containing all three variables, you’ll find in fact that none of the slopes for those variables are significant, but the regression itself is highly significant (p = 7×10^-7, R squared = .80). Why are none of the slopes significant? The covariance between the predictors inflates the standard errors of the coefficients. So if we were to simply trust the regression equation we would be in trouble, even though obviously we have explained something here.
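Here is a minimal sketch of that effect. It does not use the class data set; the three "measurements" are synthetic, built from one shared latent size plus a little noise, and the sample size, noise levels, and the `ols` helper are all my own assumptions for illustration. The point is only that the overall fit can be good while each slope's standard error is inflated by the overlap between predictors.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20

# Synthetic stand-ins for three body measurements (NOT the class data set):
# each one is a shared latent "size" plus a little independent noise.
size = rng.normal(size=n)
X = np.column_stack([size + 0.1 * rng.normal(size=n) for _ in range(3)])
y = X.sum(axis=1) + rng.normal(size=n)  # simulated body fat


def ols(X, y):
    """Least squares with an intercept; returns coefficients, standard errors, R squared."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (len(y) - A.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta, se, r2


beta, se, r2 = ols(X, y)
print("R squared:", round(r2, 3))
print("t statistics for the three slopes:", np.round(beta[1:] / se[1:], 2))

# Variance inflation factor for the first predictor: 1 / (1 - R squared)
# from regressing it on the other two predictors.
_, _, r2_first = ols(X[:, 1:], X[:, 0])
vif_first = 1 / (1 - r2_first)
print("VIF of first predictor:", round(vif_first, 1))
```

A large variance inflation factor is the usual warning sign that the individual slopes cannot be read on their own, whatever the overall R squared says.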

Next Phil argues that R squared doesn’t tell you much about the relationship between salary and wins; the fact that in the NBA data from last year the R squared is only 25.6% is inconsequential because the variance could be really big. He makes an analogy to buying a car and points out that the same number can be expressed as different percentages of different numbers (like it could be 700% of your monthly salary, or .01% of Bill Gates’ pocket change). This is simply wrong. Once you have a data set, the variance is set. In terms of wins in the NBA last year, that number is about 199. Now it’s true that the number doesn’t intuitively mean anything, and thus the fact that salary explains 51 is somewhat meaningless. But, what the 25.6% (51/199) tells us is that a lot (i.e., the other 75%) of what causes different teams to win different numbers of games is *not* explained by salary (in fact, R squared is also called ‘coefficient of determination’, and is defined as telling you how much of the variance in Y you have explained). More importantly, there are not different numbers that could come in here. We have our data, our number is 199, and we can only explain 51 with the current model.
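To make the arithmetic concrete, here is a small sketch. The 199 and 51 are the figures quoted above; the `r_squared` helper is just the textbook definition written out, with names I picked for illustration.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y) / len(y)
    ss_total = sum((v - mean_y) ** 2 for v in y)
    ss_resid = sum((v - p) ** 2 for v, p in zip(y, y_hat))
    return 1 - ss_resid / ss_total


# Figures quoted in the post: total variance in wins and the part salary explains.
total_variance = 199
explained_by_salary = 51

share_explained = explained_by_salary / total_variance
share_unexplained = 1 - share_explained
print(round(share_explained, 3), round(share_unexplained, 3))
```

Both numbers in the ratio are fixed by the data set, which is why there is no second denominator to swap in the way the car-price analogy suggests.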

Or maybe I’m wrong? Phil says that you can actually play around with R squared. For example, if you group NBA teams into triplets and work with their combined salary and wins, you change the R squared. For example, you now have the team “Knicks-Cavs-Mavericks”, who spent 276 million dollars to win 148 games. If you run that regression, you get the equation wins = .68*salary -17.5. So the relationship between salary and wins is pretty much the same – still .68 wins per million dollars. But now the R squared is .49! (These numbers are different from their article, not sure why. But we’ll see soon that it’s irrelevant.) Phil notes that the regression equation hasn’t changed “because we arranged the data differently”, but we have “arbitrarily” increased the R squared. This is also wrong, and in a couple of places. Let’s start with combining the data. Unless there’s a really good reason to do this, and in this case there isn’t, you should never combine your data. Why? Because now you’ve sucked variance out of your data. For example, we’ve removed any differences between the Knicks, Cavs, and Mavericks. You’ve taken information out of the system. This is what we would call a “no-no”, or possibly “data massaging at a level that would get you kicked out of your profession”. It gets worse for Phil: the regression equation only stayed about the same because he grouped teams in order of salary. This means that he has maintained the ordering between salary and wins, and so it maintains the positive relationship. There are still consequences, however – while the slope stays about the same and the R squared goes up, the significance of the slope drops. Let’s say instead that I “arbitrarily” grouped into fives instead of threes. The equation is now wins = .73*salary-45, the significance on the slope is only a trend (.055), and the R squared is .64. The slope is getting to be kind of different from what we started with, and the significance is dropping quickly. 
And it still looks somewhat OK only because we kept the salary ordering (it should also work if you ordered by wins and grouped teams that way). Let’s say instead that I randomize the teams and then group them. The regression will fly all over the place across randomizations, becoming super-significant, non-significant, and everything in between. Both the slope and the R squared will change. This is because you have started messing with how the variance within teams is being ‘hidden’ by grouping them. If you treat your data properly, you *cannot* massage R squared. If you don’t, you can change whatever you want, not just R squared, and the regression equation is not immune.
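If you want to try the grouping experiment yourself, here is a rough sketch. I don’t have the salary data in a form I can paste, so this uses a synthetic 30-team league generated from the wins = .61*salary – .76 equation plus noise; the salary range, the noise level (SD of 10 wins), and the helper names are assumptions of mine, so the printed numbers will not match the ones quoted above.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic 30-team league (NOT real NBA data): salary in $ millions,
# wins generated from the post's equation plus assumed noise (SD of 10 wins).
salary = rng.uniform(45, 100, size=30)
wins = 0.61 * salary - 0.76 + rng.normal(0, 10, size=30)


def fit(x, y):
    """Return (slope, R squared) of a simple least-squares regression."""
    slope, _intercept = np.polyfit(x, y, 1)
    r2 = np.corrcoef(x, y)[0, 1] ** 2
    return slope, r2


def group(x, y, order, size):
    """Sum salary and wins within consecutive blocks of `size` teams, taken in `order`."""
    idx = np.asarray(order).reshape(-1, size)
    return x[idx].sum(axis=1), y[idx].sum(axis=1)


print("ungrouped:      ", fit(salary, wins))
print("sorted triplets:", fit(*group(salary, wins, np.argsort(salary), 3)))
for _ in range(5):
    print("random triplets:", fit(*group(salary, wins, rng.permutation(30), 3)))
```

Changing the seed, the group size, or the ordering rule lets you see how much the slope and the R squared move from one grouping to the next.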

Let me give another example as to why grouping teams is nonsensical. Let’s say you’re Mark Cuban, owner of the Mavs. You’ve hired Phil, who runs his regression with team groups. Phil walks in one day with a big smile and says “Hey Mark! Dan Gilbert, owner of the Cavs, just ok’ed a signing which increases the Cavs’ salary by 10 million!”. If you’re Mark, what do you think? The extra 10 million that one team spent means that your group should win an extra 6 games. Will you win any of those games? Will the Cavs win them all? Could the Knicks win more games because the Cavs spent more money? The answer, which should be evident, is that only the Cavs will be affected. However, the regression equation is agnostic. All it says is that the group will win, on average, 6 more games. Stacey Brook, the Wages of Wins guy, would instead walk right in and say “Well, if you’re just going by my equation, you’d better spend some more money to catch up”. Although he’d probably actually tell you to sign people who play well.

However, it should be immediately evident that it is a mistake to follow either equation. Let’s say now that I’m Joe Dumars, in charge of the Pistons. Last year the team spent $71 million (10th in the league) but only won 39 games (good for 8th in the East, but something like 16th or 17th overall in the league). Let’s say you figure that you need to win about 65 games to get first in the East and return to prominence. Following the regression equation, you could figure out that you need to spend about $108 million to expect to win 65 games. So you could decide to take your exact same team from last year and give each of the 15 players a $2 million (and change) raise. This would be the extra $37 million or so you need to get your salary to $108 million and win 65 games. Does this make sense at all? It shouldn’t. And that’s because salary does not in fact explain much about winning. Instead, salary is an intermediate variable that covaries with player quality, and player quality determines who wins. Now you wouldn’t know this if all you knew in the world was wins and salary, which is the case with the regression equation. But the fact that the R squared is relatively low gives you a hint that you might want to look into some other variables and see if they explain more about wins. For example, if you were the Wages of Wins authors, you might explain wins with team offensive and defensive efficiency. It turns out that this explains something like 98% of the variance in wins. That is, if you know a team’s efficiency values, you know almost everything there is to know about how many games they’ll win. There is some other factor which influences outcomes a little bit, but not very much.
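For anyone who wants to check the Pistons arithmetic, here is the equation run forwards and backwards. The slope and intercept are the ones from the regression quoted earlier (wins = .61*salary – .76, salary in millions); the function names are mine. The gap between the current payroll and the target comes to a bit under $37 million, or roughly $2.5 million per player.

```python
SLOPE = 0.61       # wins per $1 million of payroll, from the regression above
INTERCEPT = -0.76


def predicted_wins(salary_millions):
    """Wins the regression line predicts for a given payroll."""
    return SLOPE * salary_millions + INTERCEPT


def salary_for_wins(target_wins):
    """Payroll (in $ millions) at which the regression line hits the target."""
    return (target_wins - INTERCEPT) / SLOPE


current_payroll = 71   # Pistons payroll in $ millions, per the post
roster_size = 15

needed_payroll = salary_for_wins(65)
extra_needed = needed_payroll - current_payroll
raise_per_player = extra_needed / roster_size

print(round(needed_payroll, 1), round(extra_needed, 1), round(raise_per_player, 2))
```

Nothing in the code knows who the players are, which is exactly the problem: the line only relates dollars to wins, so it will happily "predict" 65 wins from handing the same roster a raise.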

So to summarize, salary does have something to do with wins, but only if you don’t consider other factors. Even if you do leave it just at salary, you don’t explain very much – much of the difference in team wins is due to something other than how much money they spend. And if you start massaging your data to try and make a point, you should probably know what you’re doing.

Thanks for the response! Will read it through later when I have a bit more time to concentrate.

Hi, Alex,

Your main point seems to be that even if payroll is correlated with wins, it’s because there is an intermediate variable causing the correlation. Specifically, more money buys better players, and better players produce the wins. Well, yes, of course that’s how it happens. I wasn’t suggesting otherwise, and I didn’t mention it explicitly because I thought it was obvious.

You come up with examples like giving all your players a raise and expecting their wins to go up. And, of course, that’s silly. But again, it’s not something I ever argued.

And neither did Berri, in his original piece that I was critiquing. My argument is simply: Berri says an r-squared of .256 says that salary isn’t important. I say, no, on the contrary, you can’t tell *just by the .256* that salary isn’t important. The question I’m addressing is whether salary is important. The question you’re addressing is, once you’ve decided whether salary is important, how do you interpret that correlation? I have no beef with you on your question. But it’s irrelevant to my question.

Same thing for your example of when I grouped the teams into threes. When you group teams into threes — say, Mavs, Cavs, and someone else — and the Cavs spend more money, the group wins more games. And then you object that even though that’s true, the Cavs spending doesn’t affect the Mavs winning. Yes, of course. Who said otherwise? Certainly not me.

One point on which I think we very much disagree: you say that if I grouped the teams randomly instead of by salary, the regression equation would change. Yes, it would, of course, depending on the grouping. But, *on average*, the regression equation will be the same as if you used the entire league. Try it, if you like. You’ll see.

However, the r-squared will almost always be higher with grouping than without. Why? Because, when you combine three teams together, you reduce the variance that’s due to luck. Remember when the r-squared was only .256? That’s because there was a lot of random luck in the remaining .744. Combine the teams, and maybe the ratio of signal:luck goes from 256:744 to 256:500 (making that number up). That leads to the r-squared increasing to .34.

That’s my argument. That the .256 changes predictably depending on what data you use, but the regression equation stays the same (subject to random fluctuation around the same expected equation).

I absolutely guarantee you that if you run the same regression, but use the number of wins in only the team’s 41 odd-numbered games, you’ll get an r-squared lower than .256. And I guarantee you that if you use the number of wins in two seasons (164 games) instead of just one, you’ll get an r-squared higher than .256. But I bet you that both other regression equations will be reasonably close to the one you get in the .256 case.

That’s my point: that the r-squared depends on the number of games for each team that you put in your regression. If you use one game, you get close to .000. If you use 82 games, you get .256. If you use between 0 and 82 games, you’ll get something between .000 and .256. And if you use 164 games, you get something higher than .256.
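The claim in this comment can be sketched with a toy formula. It rests on a strong assumption that is not settled in this thread: that the entire unexplained .744 at 82 games is per-game luck, whose variance (measured on win rate) shrinks in proportion to the number of games, while the part explained by salary stays fixed. The function name and defaults are made up for illustration.

```python
def r2_for_games(n_games, r2_at_base=0.256, base_games=82):
    """R squared expected at n_games, ASSUMING all unexplained variance is per-game luck."""
    signal = r2_at_base
    # Luck variance of a win rate scales like 1 / n_games.
    luck = (1 - r2_at_base) * base_games / n_games
    return signal / (signal + luck)


for n in (1, 41, 82, 164):
    print(n, round(r2_for_games(n), 3))
```

If some of the unexplained variance is real, stable differences between teams that salary does not capture, the curve flattens out below 1 no matter how many games are added.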

If spending more money leads to more wins over 82 games, then spending more money had to lead to more wins in odd-numbered games. But if the expected r-squared is different in both cases, when the answer is the same, then it can’t be a very reliable guide to the answer, can it?

P.S. when you say “the Man [me] is not to be trusted,” I hope you mean to say that my logic is wrong, and not trying to imply that I’m deliberately lying.

No, I didn’t mean to say you were lying.

My main point is that grouping changes what you’re looking at. If you leave everyone alone, you have a certain amount of variance, say 500. When you group, you hide variance. As you say, some of that is random noise, but it also contains useful information. In the example from my post, the Mavs and Cavs may be efficient spenders, but the Knicks certainly are not. When you group them, the Mavs and Cavs hide some of the Knicks’ wasteful spending. It isn’t clear to me that you can tell a priori how much of the hidden variance will be noise versus signal.

While you didn’t make the claims I described (adding payroll to the same team, the Mavs benefiting from the Cavs’ spending), your description of the regression equation made it sound, to me, as if that’s what you would claim. If you believe that salary leads to winning, how could it be otherwise? That’s what the model coefficients would tell you. My point is that the coefficients are not all-important, you need to properly evaluate them along with R squared. Typically, I would look at R squared first. If it’s small, you know that your model is not doing a good job of describing what you want it to describe regardless of what the coefficients say.

One other question Phil – I don’t have salary data handy, but I imagine you do. If you were to run your regression with as many years of NBA data as you can (say 30 seasons?), what does the R squared get up to? I’m going to take a wild guess and say no higher than .5. This would still be an indication that salary only somewhat gets at what makes teams win.

Alex,

I don’t have the exact numbers in front of me, but they’re something like this:

The SD of team wins is about 11. The SD of binomial variation in wins over 162 games is about 6. That means the SD in talent is about 9 (6^2 + 9^2 ~= 11^2).

So, if salary was 100% correlated with talent (which it isn’t), the maximum r-squared you could get would be 9^2/11^2, which would be about .67.

Since salary is not 100% correlated with talent, your guess of about .5 is probably reasonable.

However, if you combine 2, or 5, or 10, or 20 seasons, the r-squared would go up and up and up. If salary were 100% correlated with talent, over 50 seasons (ignoring inflation and such), the r-squared would be very close to 1.

And that’s my point. The r-squared depends on the sample size. You cannot base any conclusions on the intuitive size of the r-squared. The statement, “The r-squared is only .25, so therefore payroll doesn’t have much to do with wins” is nonsense.
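Here is a sketch of the variance arithmetic in this comment, using the rough baseball figures given above (talent SD of about 9 wins, luck SD of about 6 over 162 games). It assumes talent is constant across seasons and that luck averages out as 1/seasons, which are simplifications; the function name is hypothetical. With these rounded inputs the one-season ceiling comes out near .69 rather than .67, because 6^2 + 9^2 is 117 rather than 11^2 = 121.

```python
import math

TALENT_SD = 9.0   # rough SD of team talent in wins, from the comment above
GAMES = 162       # baseball season length

# Binomial luck for a .500 team over one season: sqrt(n * p * (1 - p)).
luck_sd = math.sqrt(GAMES * 0.5 * 0.5)
print("binomial luck SD:", round(luck_sd, 2))


def r2_ceiling(seasons, talent_sd=TALENT_SD, luck_sd=6.0):
    """Largest possible R squared when wins are averaged over `seasons`,
    ASSUMING talent is fixed and a predictor correlates perfectly with it."""
    luck_var = luck_sd ** 2 / seasons
    return talent_sd ** 2 / (talent_sd ** 2 + luck_var)


for k in (1, 2, 5, 10, 20, 50):
    print(k, round(r2_ceiling(k), 3))
```

This is a ceiling for a predictor that tracks talent perfectly; a predictor that only partly tracks talent would level off at a lower value as seasons are added.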

I think my post from last night demonstrates that to be false, but you can let me know if you disagree with what I did. Increasing the sample size will only get you closer to the true value, not closer to the maximum possible value.

Sorry, the above was baseball, as you probably figured out. I don’t have the corresponding numbers for basketball.

Alex, do you really look at R^2 first, as if the real world is only made of models where 95% of variation is explained? I know you are relatively young, but that is extremely naive thinking.

I hope you don’t ever plan to go into biology or politics, you will find the real world to be quite a hairy place! Full of “unexplained” variance.

I’ve actually been in the hairy world of psychology, where correlations are typically low (especially social psych, although that isn’t my area). However, I’m naive enough to think that if one model explains variable Y with R squared .95 and another explains Y with R squared .5, the first model should be preferred (all else being equal).

Phil,

http://en.wikipedia.org/wiki/Multicollinearity

Payroll does not explain wins. Winning teams have to pay players. Connect the dots.

Good link. You should read it. Under the section “Remedies for Multicollinearity”:

“Try seeing what happens if you use independent subsets of your data for estimation and apply those estimates to the whole data set. Theoretically you should obtain somewhat higher variance from the smaller datasets used for estimation, but the expectation of the coefficient values should be the same. Naturally, the observed coefficient values will vary, but look at how much they vary.”

For both Evan and Arturo – this isn’t a case of multicollinearity per se, because it’s a simple regression. Either way, that particular suggestion (and a couple others in the same section of the wiki article) isn’t so much a remedy as a check. If your subset coefficients are very different from those in the whole set, you know something bad is happening. The concern with the salary analysis is that there are obvious missing variables, and the R squared tells you that you should be looking for better predictors.

This is sophistry. Dave Berri always starts from the assumption that general managers in a given sport don’t know how to value players. But he never considers the hypothesis that perhaps Dave Berri doesn’t know how to value players.

It’s pretty much how everyone works. If you develop a model and you didn’t think it was the best available (at least for the questions you’re trying to answer), why would you ever use it? Berri developed a model, and analysis with the model suggests that GMs aren’t properly evaluating players. The people who prefer adjusted +/- would presumably disagree with GMs as well; Miami spent a ton of money to get LeBron, and he hasn’t even played as well as Boobie Gibson this year (or maybe he has, hard to say for sure with the errors). http://basketballvalue.com/topplayers.php?year=2010-2011&mode=summary&sortnumber=94&sortorder=DESC . GMs presumably have their models (if not numerical ones), and I would hope that they think they’re properly evaluating players.

I don’t really get why you use the offensive/defensive efficiency as an alternative example to salary. These values themselves depend on a million other things and cannot be tweaked directly (they even depend on salary). I could tell you that point margin explains 99% of wins too… In order to win, teams just have to score more points than their opponents. It is almost too easy! ;)

Money is still the single most important factor in influencing wins. If you disagree, find another single factor which is more important (besides luck). If you think it is the capability of GMs to recognize performance, then you should try to prove that. I would argue that one of the desperate teams would follow the Moneyball example and hire a lot of statisticians to have some success. (However, I doubt they had much success, because in basketball players are too interdependent to just analyse the box scores.)

I would further point out that salary seems to have a smaller influence because of the salary cap. But it doesn’t change the underlying paradigm. Big market teams still have advantages in attracting players because they have an attractive city and often a better team (path dependency). Then, if you look at the rules, you recognize that many players are underpaid at their peak and overpaid later. If Kobe gets 35M a year he will be 35 and in his least productive year. This totally screws up the correlation, but is totally rational to do. If you cannot pay players their real value now, then overpay them in the future. And then all the rookie rules further screw up the correlation. Basically, players are most of the time not paid for their current performance, but for past or future performance. This can mess up the results in the short term and I therefore agree with Phil that you should look more long term. (But then you also measure path dependency …)

I think the real point is that salary isn’t as critical to winning as people make it out to be. Teams win by having good players, and in a ‘perfect’ world good players would be paid more than bad players. But for a variety of reasons, that isn’t true and so the relationship between team salary and winning isn’t particularly strong (although what counts as ‘strong’ may be a matter of opinion).

In regards to teams hiring statisticians, that is an increasing trend. Two of the teams best-known for doing that are the Rockets and the Mavs, and both have had pretty good runs of long-term success. But I believe about half the league has, or recently has had, a stats guy known to work for them and their record does tend to be better than the other teams in the league.

That is probably true, but I would argue that statisticians use not just box score based measures for their decision making (I think the Mavs are more into +/- statistics). In principle I believe that you can use statistics to make better decisions, but I don’t think you can base your decisions on WP48 alone, because it has some loopholes.