## The Signal and the Noise: The Benefits of Theory

I got Nate Silver’s book ‘The Signal and the Noise’ as an early Christmas present (thanks Dad!) and just finished it last night. I’m hoping to touch on a few different points from the book over time, but today I’m going to focus on one where I agree with Nate: the importance of having some theoretical background to guide a predictive model. That sounds overly fancy, but it will be clear as day in a second.

When people make predictions, they use a model. Sometimes that’s a mathematical model, like the ones I tend to talk about, but sometimes it’s just a rule or heuristic, like ‘always bet on black’. That model can really be anything. You could make predictions based on a wild guess, but then your model would be something like flipping a coin or randomly generating a number in your head. There’s always some basis for a prediction.

However, there isn’t always a guiding reason for that model. Let’s say you want to predict the winner of NFL games. You decide to flip a coin and take the home team if it comes up heads. There isn’t much of a guiding reason there. At best you could be making a grand statement that single NFL games are basically random, so you feel fine just flipping a coin, but there’s no real theory or reasoning behind it. You would actually be ignoring established knowledge, like the fact that the home team wins more than 50% of the time (and you can do even better than that). You could decide to follow the choices of a soccer-loving octopus, but you’d have to ask yourself how the octopus is making decisions.
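To make the coin-flip point concrete, here is a minimal simulation sketch. The 57% home-win rate is a number I made up for illustration (all the argument needs is that it is above 50%), and the function names are hypothetical. The idea is that a coin flip lands near 50% accuracy no matter how strong the home edge is, while "always take the home team" picks up the whole edge.

```python
import random

# Assumed, illustrative home-win rate; not a measured NFL figure.
HOME_WIN_PROB = 0.57


def coin_flip(rng):
    """Pick the home team only when the coin comes up heads."""
    return rng.random() < 0.5


def always_home(rng):
    """Pick the home team every time."""
    return True


def simulate_accuracy(pick_home, home_win_prob, n_games, seed=0):
    """Return the fraction of simulated games where the pick was right.

    pick_home is a function that takes a random.Random instance and
    returns True to pick the home team, False to pick the visitor.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_games):
        home_wins = rng.random() < home_win_prob
        if pick_home(rng) == home_wins:
            correct += 1
    return correct / n_games


if __name__ == "__main__":
    for name, rule in [("coin flip", coin_flip), ("always home", always_home)]:
        acc = simulate_accuracy(rule, HOME_WIN_PROB, n_games=100_000)
        print(f"{name}: {acc:.3f}")
```

The coin flipper is right half the time whether the home team wins 51% or 90% of its games, which is exactly what ignoring established knowledge looks like.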

A great place to look at various models is in predicting the winner of an election. Nate Silver is (now) (somewhat) famous for his 538 blog, where he predicts which way each state will swing in the presidential race, as well as various governor and Senate/House races. To do this, he combines polling information in some manner. The theory is fairly straightforward: if you ask people who they’re going to vote for, and adjust for noise and bias in the sample, then you have a good idea of who they’ll vote for in November. But you could also try other models, like those listed in this Cracked article. Perhaps you like the Redskins rule: if the Redskins lose their last home game before the election, the incumbent party (or perhaps the last party to win the popular vote) will also lose. Or the Summer Olympics rule, where the incumbent nearly always wins as long as the most recent Games were hosted by a country that had hosted an Olympics before.

This is where having a theoretical reason for your model comes in handy: what if some of the models disagree? London hosted the most recent Summer Olympics and had hosted them before, so that rule predicted an Obama victory. Nate Silver’s polling data also went for Obama. But the Redskins lost their game against Carolina, predicting a Romney victory. Before the election happened, could we form a preference for one of the models over the others? After finding out the result, could we say what happened and what went right (or wrong)? Particularly with rare events like presidential elections, you don’t get a lot of feedback. One wrong prediction isn’t a death knell for a model unless you think it should be 100% correct; maybe the Redskins rule works but happened to be off this year? The Olympics rule was wrong once too.

Let’s apply this to the Redskins rule. First, what theory would generate that model? The Cracked article notes that you might think about public excitement at a football game, which reflects happiness with the state of the universe, giving the Redskins some extra home-field advantage. However, the article also notes that it’s hard to think of why it would apply to that one game. The incumbent has been in office for over three years at that point; why not all Redskins home games? Should this apply to the Ravens, who are close enough to presumably benefit from happiness in DC? Should it apply to the Nationals since they’ve moved to town, or the Senators? How about teams everywhere, since this is a national election that we’re talking about? The answer, of course, is that there wasn’t really a theory that produced the rule. Someone happened to notice that the pattern held, and then it (kind of) worked in the two elections prior to this year’s. In other words, this was not a theory-based prediction but a data-based prediction. I’m sure I’ll have more to say about the theory/data distinction in the future.
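Here is a rough sketch of why a rule found purely by noticing a pattern deserves suspicion. Suppose each candidate "rule" (a team’s last home game, a host city, whatever) is really just an independent coin flip with no connection to the election. The function names and the specific counts below are hypothetical, chosen only for illustration, and treating rules as independent fair coins is a simplifying assumption. Under that assumption, if you sift through enough candidates, one with a perfect record is close to guaranteed.

```python
import random


def chance_of_perfect_rule(n_rules, n_elections):
    """Probability that at least one of n_rules coin-flip 'rules'
    matches every one of n_elections outcomes purely by luck."""
    p_single = 0.5 ** n_elections  # one meaningless rule gets them all right
    return 1 - (1 - p_single) ** n_rules


def count_perfect_rules(n_rules, n_elections, seed=0):
    """Simulate n_rules meaningless rules against random election
    outcomes and count how many end up with a perfect record."""
    rng = random.Random(seed)
    outcomes = [rng.random() < 0.5 for _ in range(n_elections)]
    perfect = 0
    for _ in range(n_rules):
        rule = [rng.random() < 0.5 for _ in range(n_elections)]
        if rule == outcomes:
            perfect += 1
    return perfect


if __name__ == "__main__":
    # Hypothetical numbers: 15 elections, 100,000 patterns someone might check.
    print(chance_of_perfect_rule(n_rules=100_000, n_elections=15))
    print(count_perfect_rules(n_rules=100_000, n_elections=15))
```

A perfect track record is only impressive if you know how many other patterns were tried and quietly discarded. A theory tells you in advance which pattern to look at, which is what the Redskins rule never had.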