Phil Birnbaum recently took everyone to task for a paper that came out in Psychological Science looking at the influence of round numbers on goals and motivation. His main complaint is that the NY Times article reports that this Psych Science paper (I can only find a preview version) claims that MLB hitters sitting at .298 or .299 batting average at the end of the season hit .463 in their last at-bat. Phil spent three posts demonstrating that .463 is far too high and that the finding is likely caused by players being pulled after getting a hit to go over .300, thus creating a sampling bias that leads to such a high number. In no particular order, Phil’s complaints are that economists don’t know anything about baseball, academic sports studies are usually wrong, the media reports these studies before the study is released, the media doesn’t correct reports that end up being false, and the media should turn to the sabermetrics community as experts in this field.
I won’t try to address all of these concerns, but as a member of academia (and psychology in particular) I think it’s important to note a few things. First, Phil is right: the overly high batting average is likely due to some kind of sampling bias. But that finding is not due to the authors being inexperienced when it comes to baseball; it comes from a numbers mistake. Economists are trained mathematicians and statisticians; they know all about numbers and should know about sampling bias. You might argue that if they had talked to someone who knew more about baseball or sabermetrics, they would have been more likely to catch their mistake, but they also might not have. It seems like the authors did talk to someone, because on page 5 they note that boosts in performance are not found in close games, for example. So, yes, someone missed the ball on this one, but I don’t think it should be taken as indicative of econometric or statistical work in general, and I doubt that every “published” article by a sabermetrician has been correct.
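Phil’s selection-bias story is easy to demonstrate with a toy simulation (a sketch with made-up numbers, not the paper’s data): give a .299 hitter a true 30% chance of a hit in every at-bat, but bench him the moment his average touches .300. Because only a hit can push the average over the line, benching freezes a hit in place as the “last at-bat,” and the last-at-bat average comes out far above .300 even though nobody actually hit better.

```python
import random

def simulate_last_ab(n_players=50000, p=0.3, hits=150, at_bats=502,
                     remaining=5, bench_at_300=True, seed=1):
    """Fraction of simulated players whose final at-bat was a hit.

    Each player starts at 150-for-502 (a .299 average) with a true
    hit probability of .300 per at-bat. If bench_at_300 is True, the
    player is pulled as soon as his average reaches .300, so the hit
    that pushed him over becomes his recorded 'last at-bat'.
    """
    rng = random.Random(seed)
    last_ab_hits = 0
    for _ in range(n_players):
        h, a = hits, at_bats
        last = 0
        for _ in range(remaining):
            hit = rng.random() < p
            h, a = h + hit, a + 1
            last = int(hit)
            if bench_at_300 and h / a >= 0.300:
                break  # manager sits him to protect the .300 mark
        last_ab_hits += last
    return last_ab_hits / n_players

biased = simulate_last_ab(bench_at_300=True)
control = simulate_last_ab(bench_at_300=False)
print(f"with benching: {biased:.3f}, without: {control:.3f}")
```

Without the benching rule the last-at-bat average sits right at the true .300; with it, the number jumps well above .450, even though every simulated player has identical, unchanging talent. No clutch motivation required.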
Second, the authors of a paper often have only limited control over what appears in media accounts of their work. I’ve only been second author on a paper that was picked up at all, so I don’t know all the details of what goes on, but my understanding is that there is essentially a media office at the university. That office decides which forthcoming papers are interesting, and the wider media can then pick them up. I don’t think that authors are running around trying to get their stories out; 99% of the time the media wouldn’t care. With this being the case, the paper must already exist and, I would think, have been accepted at a journal. That doesn’t mean it’s available to everyone, due to publishing lags, but journalists should certainly have access to it if they want it. Once the media has it, they can call or email the author(s) with questions or to get quotes, but the authors have little control over what the media story says in the end. In this case, the NY Times article says that these motivated hitters bat .463, but I can’t find that number anywhere in the early release of the actual paper. I find .43, which is also probably too high, but either this number was changed in a later version of the paper, was misquoted, or the media just got it wrong.
Perhaps more important, the media is allowed to focus on whatever they want. In this case, they took that one number and ran with it. The whole Times article is about baseball players trying desperately to break .300, and that’s all Phil focused on as well. But if you read the actual paper, there are two other parts to the study, and the paper is not really about baseball players suddenly batting really well. The paper is actually about how round numbers serve as goals that motivate people to try to improve. Baseball players are one example, and even Phil’s sampling issue still supports the conclusion: managers and batters keep players in if they could break .300 and take them out if they might drop under, even though there isn’t really any meaningful difference in performance between .300 and .299 or .298. The other evidence includes high school students taking the SAT; there seems to be a lack of juniors with scores that end in 90 (1090, 1190, 1290, etc.) compared to seniors, which implies that the juniors take the test again and report the second score. There’s also an experiment asking people how likely they would be to continue doing some activity depending on how much they had done so far, with some subjects getting a number far below a round number, others getting close to a round number, and a final group getting the round number or barely over. In each case, the evidence suggests that people will continue an activity or make another attempt if they are just below a round number, even though the difference is negligible, suggesting that round numbers become goal markers that drive motivation. The latter two studies say nothing about people actually doing better when just below a round number, just that they try again, but doing better is all that the media article and Phil focused on. In general, authors can of course try to guide what the media will say or try to point out the limitations of their study, but the media will print what they want in the end.
At the end of the day, I don’t think it matters much if the media gets sports stats articles correct. Unless someone is analyzing injury data, or perhaps other broadly important topics like discrimination, this is all really for fun. So I think that the more important part of Phil’s discussion comes toward the end. Phil would like sabermetricians to be viewed as the experts and to get the final say. He doesn’t want to sound like he’s bashing academia, but he’s totally bashing academia. I don’t think that should be the case. Sports stats, and statistics in general, make up one of the few fields these days open to absolutely anyone. With very few exceptions, you can’t study physics or chemistry in your basement and do new work; you can’t make breakthroughs in medicine or genetics or astronomy. But anyone can get R or Excel or some other software, grab data from Yahoo or ESPN or basketball-reference, and start digging around the numbers.
This raises two relevant issues. First, who qualifies as an ‘expert’? Is it the guy with the most blog hits? The most posts? The guy that the most other guys think is good? This problem exists in academia too, but there are obvious markers: number of publications, the quality of the journals those publications appear in, the school that hired the person, and invited talks at other schools or quality conferences. These aren’t guaranteed to tell you who the best is, but you’re very likely to pick an expert in the field. I’m not sure that you have anything like this in the sabermetrics community. On top of that, there are literally no barriers to being a sabermetrician. Like I said, all you need is Excel and an internet connection. I understand that the community is self-correcting (much like academia) and that some, perhaps many, of its members do have training in statistics, but there is zero guarantee that any particular person is doing things correctly. In academia, on the other hand, researchers are either Ph.D.s or grad students on their way to one. You know that they have statistical training because it’s required. They’ve been trained by, and their work has been checked by, other qualified individuals with training, and it’s happened at multiple points. As this article makes obvious, the system isn’t foolproof, but the end result of academic work (a journal article) is far more likely to be correct than the end result of sabermetric work (a blog post or the like).
The other relevant issue, given that anyone can do sports stats, is that, despite the animosity between academics and sabermetricians, everyone is working toward the same goal: a better understanding or description of their sport(s) of choice. Importantly, everyone can contribute to that goal. It would be better if both sides viewed the others’ work as valid. I don’t see any particular reason to view either side as better in the end, and good work comes from both, so I’m not sure why sabermetricians should be viewed as the arbiters of baseball information. It would be nice if the media would turn to sports stats guys for info as well, but I don’t think they should be faulted for asking academics. But if academics and sabermetricians can’t agree on anything else, I’m sure both would agree that the media could do a better job.