Yes, the time has finally arrived to talk about the football model (although if anyone wants to talk about Flight of the Conchords, that’s also cool). Here’s the deal: over the past two years I’ve been working on a model to predict NFL winners against the spread. I have a model that works, in that it would make money (if I lived somewhere where I could bet on NFL games), but then I bought a new computer and didn’t want to install MATLAB on it, which is what the program was written in. So I decided to rewrite things in R, and while I was at it, see if I could improve the model. I tested three models: my previous one (call it the ‘sleek’ model, since it’s stripped down in terms of the number of parameters); a ‘kitchen sink’ model with everything I could throw in there; and an intermediate model. My data set covers all games from 2004 to present, including playoffs. I include playoff games with the rest of the data because I don’t think they are fundamentally different from other football games, perhaps with the exception of week 17 games for teams that have nothing to play for. While I was checking on the models, I thought I would also expand from predicting score differences to predicting total points (used for the over/under) and winners via a logistic regression that produces a probability (used for the money line). I don’t have historical over/unders or money lines, but I do have the spreads and accompanying odds (because not every game is run at -110) from Bodog for 2008 and 2009, and the lines from mrnfl.com for the earlier seasons. Given the interest in potential legitimate betting, I’m going to focus on the past two seasons.
The first thing I wanted to check was how the models fit the data. For the logistic regression on win probability, the AIC (a measure of fit; smaller means better) drops as I move from the sleek to the middle to the kitchen sink model, which you would expect: more predictors means a better fit. But a likelihood ratio test also says that each bigger model is better even after accounting for the extra predictors, so in this case bigger really is better. The point difference fit is a little different since it’s a linear model (not logistic). The best R squared and adjusted R squared values actually belong to the middle model; perhaps the kitchen sink has too many collinear predictors, or more of its predictors are simply irrelevant to point differential. The same is true for the total points prediction.
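To make the model-comparison logic concrete, here’s a minimal sketch of an AIC comparison and likelihood ratio test. The log-likelihoods and parameter counts below are made-up illustrative numbers, not the actual fits from the post (which were done in R), and the chi-square survival function is written out by hand for even degrees of freedom so the example has no dependencies.

```python
import math

# Hypothetical log-likelihoods and parameter counts for two nested
# models -- illustrative numbers only, not the post's actual fits.
ll_sleek, k_sleek = -850.0, 5
ll_sink,  k_sink  = -835.0, 15

def aic(ll, k):
    """Akaike information criterion: 2k - 2*log-likelihood (smaller = better)."""
    return 2 * k - 2 * ll

def chi2_sf(x, df):
    """Chi-square survival function, closed form for even df."""
    m = df // 2
    return math.exp(-x / 2) * sum((x / 2) ** i / math.factorial(i) for i in range(m))

# Likelihood ratio test: twice the log-likelihood gap, compared to a
# chi-square with df equal to the number of extra parameters.
lr_stat = 2 * (ll_sink - ll_sleek)
p_value = chi2_sf(lr_stat, df=k_sink - k_sleek)

print(aic(ll_sleek, k_sleek), aic(ll_sink, k_sink))  # 1710.0 1700.0
print(p_value)  # ~0.0009: the extra predictors earn their keep
```

The point is the structure of the test, not the numbers: a bigger model always has a higher likelihood, so the question is whether the improvement beats the chi-square penalty for the extra parameters.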
Of course, the fit itself isn’t too interesting. What we’d really like to know is how well these models predict the future. Since the future hasn’t happened yet, I did the next best thing: predict the past. I deleted the 2008 and 2009 seasons from my data and ran through the algorithm like I would a new season. The models all started with the 2004–2007 seasons as known data, plus the results from week 1 of 2008, which were used to predict the results of week 2 of 2008. Then the models were given the week 2 data and used to predict week 3, and so on through last year’s Super Bowl. Using the predicted score difference and the line from Bodog, I can decide which way to bet a game and then tell whether that decision was correct. Using something similar to my betting rule (a modified Kelly criterion), I can also see how much money each model would have bet and made over the course of the past two years. I can’t evaluate the win probability and total points predictions as thoroughly, since I don’t have money lines or over/unders, but I’ll take a look at those too.
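The week-by-week procedure above is a walk-forward backtest, and its skeleton is simple enough to sketch. This is a toy Python version, not the post’s actual R code: `fit_model` and `predict_week` are hypothetical stand-ins, and `games` is just a list of dicts with a week number and final margin.

```python
# Walk-forward backtest sketch: refit on all prior results, predict the
# next week, then fold that week's results into the training data.

def walk_forward(games, fit_model, predict_week, first_week=2):
    """Train on everything before each week, predict that week."""
    predictions = []
    for week in sorted({g["week"] for g in games}):
        if week < first_week:
            continue  # need at least one week of current-season results
        history = [g for g in games if g["week"] < week]
        current = [g for g in games if g["week"] == week]
        model = fit_model(history)
        predictions.extend(predict_week(model, current))
    return predictions

# Toy usage: the "model" is just the mean historical margin.
games = [{"week": w, "margin": m} for w, m in [(1, 7), (1, -3), (2, 10), (3, -6)]]
fit = lambda hist: sum(g["margin"] for g in hist) / len(hist)
pred = lambda model, cur: [(g["week"], model) for g in cur]
out = walk_forward(games, fit, pred)
print(out)  # [(2, 2.0), (3, 4.666...)]
```

The key property is that each prediction only ever sees data from before the games being predicted, which is what makes this an honest stand-in for predicting the future.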
Most football spread bets pay odds of -110, meaning you have to bet $110 to win $100. To know how often you have to pick correctly to break even, you need to solve 100p − 110(1 − p) = 0, because you win $100 when you’re right and lose $110 when you’re wrong. Solving gives p ≈ 52.38%, so you need to do better than that to make money against the spread. The number isn’t exact, though, because not all games go off at -110, and because of occasional pushes, where the point difference equals the spread and you get your money back. In practice, the worst model against the spread was the middle one at 53% accuracy, and it made a little money over the two years. The kitchen sink and sleek models were both over 54% correct and made money, with the sink model doing better. But how much money? There are two ways to look at it. With the bet sizing method I use, the models were acting as if they had a bankroll of around $1200 (a guess, since I didn’t use a set bankroll to pick the bet size). Over two years, the kitchen sink model would have made $800. So assuming you pocketed your 2008 profit and rolled the same $1200 into 2009, that’s a return of 66%, which you will not see offered at the bank. Even the middle model, which did worst, returned 11.5%. The other way to look at it is to say that my bet sizing method is too conservative: it’s designed to make sure you don’t go broke, but maybe you’re willing to take that risk. The most money actually bet in a week, if you bet on every game, is about $600. So if you decided to take your chances and run with, say, $1000 (holding back $400 to cover losses and let you keep betting $600 or so a week), your profits are 80% at best and 14% at worst. So it looks like any of the models would be an OK choice in that they all have positive returns, but the kitchen sink model is the favorite right now.
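The break-even math above, plus a plain Kelly fraction for comparison, fits in a few lines. To be clear, the Kelly function here is the textbook formula, not my modified betting rule, and the 54% plugged in is just the sink model’s reported accuracy used for illustration.

```python
# Break-even win rate at -110, and a textbook Kelly bet fraction.
# This is standard Kelly, not the post's modified rule.

def breakeven(american_odds=-110):
    """Win probability where expected profit is zero at the given odds."""
    risk = abs(american_odds)   # bet $110...
    win = 100.0                 # ...to win $100
    # Solve win*p - risk*(1 - p) = 0 for p:
    return risk / (risk + win)

def kelly_fraction(p, american_odds=-110):
    """Kelly criterion: f* = p - (1 - p)/b, where b = net payout per $1 risked."""
    b = 100.0 / abs(american_odds)
    return p - (1 - p) / b

print(round(breakeven(), 4))            # 0.5238
print(round(kelly_fraction(0.54), 4))   # 0.034: stake ~3.4% of bankroll
```

Full Kelly is notoriously aggressive in practice, which is one reason a modified (fractional or capped) version is the safer betting rule.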
Another way you might pick against the spread is to use the win probability. It turns out that the team that wins the game also covers the spread about 80% of the time; since you need to be about 53% accurate against the spread to make money, you can come out ahead if you can pick the winner 53%/80% ≈ 66% of the time. Since an ROC curve shows that my middle model predicts winners best using logistic regression, I’ll use it to pick a winner and then bet the spread with that prediction. Unfortunately, the model maxes out at about 65% accuracy, so it’s only at chance against the spread. That’s OK, though, because the point difference predictor, as just discussed, does pretty well.
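The back-of-envelope calculation in that paragraph can be written out directly: if the straight-up winner covers the spread about 80% of the time, then picking winners at rate w translates to covering at roughly 0.80·w, so w has to clear the break-even rate divided by 0.80.

```python
# Required winner-pick rate to break even against the spread, assuming
# the winner covers ~80% of the time (the figure reported in the post).

spread_breakeven = 110 / 210        # ~52.38% needed at -110 odds
cover_given_win = 0.80              # winner also covers ~80% of the time
needed_winner_rate = spread_breakeven / cover_given_win
print(round(needed_winner_rate, 3))  # 0.655
```

A model that tops out around 65% on winners therefore sits right at the break-even line, which is why the win-probability route doesn’t beat the spread here.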
As I just said, the middle model appears to do best at picking a winner outright, so I’ll keep it around and list its predictions as I track the money line this upcoming season. The last thing to check is the total points prediction for the over/under. Again, I don’t have any historical lines there, so the best thing I could think to do is correlate the predicted and actual total points scored. The middle model does best there as well, but while the correlation is significant, it’s fairly small. The kitchen sink model does second best, so I’ll track that one too.
If you’ve hung around this long, congratulations! The plan for the season is to list my predictions for each game: the winner (money line), point difference (spread), and total points scored (over/under) from both the kitchen sink and middle models. The sleek model never does best, so I’m going to ignore it for now unless it makes a miraculous turnaround this year. As I go, I’ll talk more about my thoughts on bet sizing, which bets to take, and other things as I think of them. One important caveat, however: I have a hope, perhaps delusional, that some day someone will want to pay me for my picks. For that reason, I’m not going to discuss my models in detail (beyond the predictions they make and their accuracy), and I’m going to post the predictions just after kickoff for the relevant games so as not to give the information away for free. On the plus side, I’m assuming most of you are in the U.S. but not Las Vegas, so it shouldn’t hurt you too much anyway.