## Predicting NFL Winners in Week 1

Predicting the winners of week 1 games in the NFL should be, in theory, really hard. As has been shown a few times, it’s almost impossible to know how many wins a team will have this year based on how they did last year (here’s a link to an Advanced NFL Stats post on the subject); you can do as well as or better than almost anybody by predicting that everyone will go 8-8. If a team’s quality last year tells you nothing about its quality this year, how would you know whom to pick?

Before I dig into this, I need an aside about point differential. It comes up a lot on my blog because it’s so darn useful. From 2004 to 2009, for example, a regression predicting team wins from point differential is highly significant, with an R squared of .84 (interestingly, this isn’t nearly as high as the same regression for the NBA; a discussion for another time). The equation predicts an average 8-win season at a differential of +1 (although I don’t think the intercept at differential=0 is significantly different from 8), and you would expect to win one extra game for each 35.5 points of differential. To have a perfect season you would want a differential of about 283; the Patriots actually put up a 315 in their perfect regular season (and the Lions a -249 when they went winless). The main thing to note here is that we can use point differential as a stand-in for team wins.
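To make the differential-to-wins relationship concrete, here is a minimal sketch built only from the rounded figures quoted above (8 wins at a +1 differential, one win per 35.5 points). The function and constant names are my own, and because the coefficients are rounded it will not reproduce the fitted regression exactly.

```python
# Rounded figures quoted in the text, not the exact fitted coefficients.
BASELINE_WINS = 8.0     # predicted wins at the baseline differential
BASELINE_DIFF = 1.0     # differential that corresponds to an 8-win season
POINTS_PER_WIN = 35.5   # points of differential per additional win


def predicted_wins(point_diff: float) -> float:
    """Season wins implied by a point differential (linear approximation)."""
    return BASELINE_WINS + (point_diff - BASELINE_DIFF) / POINTS_PER_WIN


if __name__ == "__main__":
    examples = [("Patriots, perfect season", 315), ("Lions, winless season", -249)]
    for label, diff in examples:
        print(f"{label}: {predicted_wins(diff):.1f} predicted wins")
```

Since the model is a straight line, nothing caps it at 16 or floors it at 0, which is why the Patriots’ 315 maps to slightly more than 16 wins and the Lions’ -249 to slightly more than zero.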

Is it really impossible to predict how a team will do this year based on how they did last year? If you read Brian’s article, the result is actually mixed. The regression of this year’s wins on last year’s wins is significant, with an equation of wins this year = 5.96 + 0.254 × wins last year, but the R squared is very low (those are my numbers, but they closely mirror Brian’s). This suggests that knowing last year’s win total is useful, but only weakly so. Additionally, Brian uses the mean absolute error (MAE) as a measure of prediction accuracy; it’s the average of the absolute value of the prediction error. So if I thought a team would win 5 games and they won 8, the error is 3; if I had predicted 11 wins, the error would still be 3. Averaging those errors over my data set gives the MAE. The regression produces an MAE of 2.509; predicting 8 wins for everyone produces an MAE of 2.544. So the regression does ever so slightly better than predicting that everyone is the same, which is about what the low R squared told us.
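The MAE calculation and the wins-from-wins equation are easy to sketch. The helper names below are mine; the coefficients are the ones quoted above, and the two-team example is the 5-versus-8 and 11-versus-8 case from the text, not real data.

```python
def mean_absolute_error(predicted, actual):
    """Average of |prediction - outcome| over all team-seasons."""
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    return sum(errors) / len(errors)


def wins_from_last_year(last_year_wins):
    """Regression quoted in the text: 5.96 + 0.254 * last year's wins."""
    return 5.96 + 0.254 * last_year_wins


if __name__ == "__main__":
    # Guessing 5 or 11 wins for a team that wins 8 is an error of 3 either way.
    print(mean_absolute_error([5, 11], [8, 8]))  # 3.0

    # The small slope pulls every prediction toward the league average.
    for last in (4, 8, 12):
        print(last, round(wins_from_last_year(last), 2))
```

Note how strongly the equation regresses to the mean: a 12-win team is predicted for only about 9 wins and a 4-win team for about 7, which is why it barely beats predicting 8-8 for everyone.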

Using team wins is only a little helpful, but the regression predicting this year’s point differential from last year’s differential is a better fit than predicting wins from wins.  If we put predicted point differentials through the reliable differential-to-wins regression equation (effectively using last year’s point differential to predict this year’s differential, then converting that to wins), the MAE is now 2.472.  This isn’t thrilling compared to the 2.544 from saying everyone’s the same, but we’ve done a bit better.
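The two-step prediction can be sketched as a simple composition of two straight lines. I don’t report the coefficients of the differential-to-differential regression in this post, so the sketch fits them from data you supply; the toy numbers are made up purely to show the mechanics and are not NFL data. The second step reuses the rounded differential-to-wins figures (8 wins at +1, one win per 35.5 points).

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def diff_to_wins(point_diff):
    """Rounded differential-to-wins line quoted earlier in the post."""
    return 8.0 + (point_diff - 1.0) / 35.5


def two_stage_wins(last_diff, intercept, slope):
    """Step 1: predict this year's differential. Step 2: convert it to wins."""
    predicted_diff = intercept + slope * last_diff
    return diff_to_wins(predicted_diff)


if __name__ == "__main__":
    # Made-up (last year, this year) differentials, NOT real NFL data.
    toy_last = [-120, -60, 0, 60, 120]
    toy_this = [-50, -10, 5, 15, 40]
    intercept, slope = fit_line(toy_last, toy_this)
    print(round(two_stage_wins(100, intercept, slope), 2))
```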

So what is all this building to? I ran a logistic regression predicting whether the home team would win its week 1 game based on its point differential from last year (or wins, which is about the same) and the away team’s point differential. Both the model fit and an ROC analysis show that away team point differential isn’t a significant predictor, which is surprising. It seems like the team that you play against would matter, but apparently not in a statistical sense. Still, using the prediction from the logistic regression, we can pick the week 1 winner 65.6% of the time using only the home team’s point differential from last year. If we just took the home team every time, as we would if we thought we knew nothing about the teams, our accuracy would be the typical home team win percentage, which is 56.25%. So we have rocketed up almost 10 percentage points in picking the winner!
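For readers who want the mechanics, here is a rough sketch of how a one-predictor logistic model turns a differential into a win probability and how pick accuracy is scored. The fitted coefficients aren’t given in this post, so `intercept` and `slope` are placeholder parameters you would have to estimate yourself, and the function names are mine.

```python
import math


def home_win_probability(home_diff, intercept, slope):
    """Logistic curve: P(home win) = 1 / (1 + exp(-(intercept + slope * diff)))."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * home_diff)))


def pick_accuracy(probabilities, home_won):
    """Share of games where picking the home team iff p > 0.5 was correct."""
    correct = sum((p > 0.5) == won for p, won in zip(probabilities, home_won))
    return correct / len(probabilities)


if __name__ == "__main__":
    # Accuracy figures quoted in the text.
    model_accuracy = 0.656
    always_home = 0.5625
    gain = (model_accuracy - always_home) * 100
    print(f"Improvement: {gain:.2f} percentage points")
```

Passing a probability of 1.0 for every game to `pick_accuracy` reproduces the always-pick-the-home-team baseline, which makes the two strategies directly comparable on the same set of games.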

As I noted in my last post, this accuracy isn’t quite high enough to use to pick against the spread.  But we can still make some guesses as to who will win this weekend, if you’re inclined to do anything with that information, starting with the Saints and Vikings tonight.  The table is below:

| Home team | Away team | Home win probability (winpred) |
| --- | --- | --- |
| Saints | Vikings | 0.7338815 |
| Bills | Dolphins | 0.4795665 |
| Bears | Lions | 0.5026814 |
| Titans | Raiders | 0.5026814 |
| Patriots | Bengals | 0.7087920 |
| Giants | Panthers | 0.5292432 |
| Steelers | Falcons | 0.6073649 |
| Bucs | Browns | 0.3801759 |
| Jaguars | Broncos | 0.4542446 |
| Texans | Colts | 0.6194294 |
| Rams | Cardinals | 0.2739937 |
| Eagles | Packers | 0.6588697 |
| Seahawks | 49ers | 0.4314249 |
| Redskins | Cowboys | 0.4772582 |
| Jets | Ravens | 0.6793457 |
| Chiefs | Chargers | 0.4088926 |

So it looks like the Saints have a good chance of repeating their win over the Vikings in last year’s conference championship game, and the Rams have a good chance of looking just like they did last year as well.
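If you want straight-up picks rather than probabilities, a small sketch like the following converts the table into a predicted winner per game. The probabilities are copied from the table; the 0.5 cutoff and treating one minus the home probability as the away team’s chance (that is, ignoring ties) are my simplifying assumptions.

```python
# (home team, away team, predicted home win probability) from the table.
PREDICTIONS = [
    ("Saints", "Vikings", 0.7338815),
    ("Bills", "Dolphins", 0.4795665),
    ("Bears", "Lions", 0.5026814),
    ("Titans", "Raiders", 0.5026814),
    ("Patriots", "Bengals", 0.7087920),
    ("Giants", "Panthers", 0.5292432),
    ("Steelers", "Falcons", 0.6073649),
    ("Bucs", "Browns", 0.3801759),
    ("Jaguars", "Broncos", 0.4542446),
    ("Texans", "Colts", 0.6194294),
    ("Rams", "Cardinals", 0.2739937),
    ("Eagles", "Packers", 0.6588697),
    ("Seahawks", "49ers", 0.4314249),
    ("Redskins", "Cowboys", 0.4772582),
    ("Jets", "Ravens", 0.6793457),
    ("Chiefs", "Chargers", 0.4088926),
]


def pick(home, away, home_win_prob):
    """Return (predicted winner, probability assigned to that winner)."""
    if home_win_prob >= 0.5:
        return home, home_win_prob
    return away, 1.0 - home_win_prob  # assumes no ties


picks = [pick(*row) for row in PREDICTIONS]
home_picks = sum(1 for row, (winner, _) in zip(PREDICTIONS, picks) if winner == row[0])

if __name__ == "__main__":
    for (home, away, _), (winner, prob) in zip(PREDICTIONS, picks):
        print(f"{away} at {home}: {winner} ({prob:.1%})")
    print(f"Home teams picked: {home_picks} of {len(PREDICTIONS)}")
```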