Tonight we have what has traditionally been an entertaining game, the Patriots and the Jets. Since it’s the beginning of the season, I thought I’d take the opportunity to run through the game as a little reminder of how my models work in terms of picking who’s going to win. You can click on the link in the header to go to an article that spells it out a bit more, but then you’ll miss all the AFC East-y goodness!
My models were inspired by Brian Burke’s at Advanced NFL Stats, which I’m sure most of you already read religiously (or will soon after clicking on that link). There are two broad ideas: win-loss record is not a good indicator of team quality, because it’s really one summary number affected by all sorts of noisy things, and even points scored can be improved on, because points are also affected by all sorts of noisy things that might be unrepeatable. For example, last year both the Seahawks and the Colts were 11-5. Does that mean that they’re equally good teams? I think most people would say no. And we can improve on that impression a bit by looking at their points scored. Seattle outscored their opponents by 167 over the course of the season while the Colts were actually outscored by 30. Point differential is great and I use it often, but in the NFL it can still be misleading because there are so few games and points scored can be seriously impacted by a few unusual events. If a given team happens to get a couple of kick return touchdowns, or recover a high number of fumbles, or maybe just have their opponents miss more field goals than they should (for the most part, rare events that aren’t repeatable but that have a big impact on a game when they do happen), they will win for reasons that are unlikely to happen again in the future.
If you don’t use win-loss or points scored, how do you evaluate a team? You use the stuff that leads to those outcomes: passing ability, running ability, pass defense, run defense, etc. I use a bunch of numbers for a team from the standard NFL boxscore (going back to 2004) as the predictors for a regression model. Those predictors are used, in three different models, to predict the probability of the home team winning, how many points the home team should win by, and the total number of points scored, given the numbers for the home and away teams and the average numbers produced by their opponents. To be fair, the total score predictions have been crap, so I may report them but I often ignore them at this point. But the win probabilities and point differentials have been ok. I also have three different regression ‘styles’, leading to three model names. The one I’ve used the longest is Luigi, which is a typical regression (logistic for win probability, standard linear for point differential). Last year I tried out two regularized regression models, Yoshi 1 and Yoshi 2, to see if that helped at all. These models use an out-of-sample procedure to pick regression weights that theoretically do better at predicting than a standard regression. One runs a regression on all the data while the other runs a regression only on the data from a given week of the season, in case there’s a different relationship from week to week (for example, maybe the stats after week 1 are so noisy that you just kind of take the home team all the time, whereas after week 10 they’re reliable and the weights look more like Luigi’s). One season isn’t a lot to go on, so I’ll have a better idea of how the Yoshis look at the end of this year.
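To show the general shape of a Luigi-style model, here is a minimal Python sketch. It is an illustration of how logistic and linear regression turn weighted stats into a win probability and a point margin, not the actual model: the function names, the two-feature setup, and every weight and intercept below are invented assumptions, and the real models use many more predictors plus opponent averages.

```python
import math

def linear_score(features, weights, intercept=0.0):
    """Weighted sum of the predictors plus an intercept (linear regression form)."""
    return intercept + sum(w * x for w, x in zip(weights, features))

def win_probability(features, weights, intercept=0.0):
    """Logistic regression form: squash the weighted sum into a 0-1 probability."""
    return 1.0 / (1.0 + math.exp(-linear_score(features, weights, intercept)))

# Home-minus-away differences using two stats quoted in this post:
# yards per pass (5.5 vs 6.6) and yards per rush (4.5 vs 3.1).
features = [5.5 - 6.6, 4.5 - 3.1]

# HYPOTHETICAL weights and intercepts, made up purely for illustration.
win_weights, win_intercept = [0.4, 0.2], 0.3
margin_weights, margin_intercept = [3.0, 1.5], 2.5

home_win_prob = win_probability(features, win_weights, win_intercept)
home_margin = linear_score(features, margin_weights, margin_intercept)
print(f"home win probability: {home_win_prob:.2f}, home margin: {home_margin:+.1f}")
```

In a fitted model the weights would come from the historical games rather than being typed in by hand, and a regularized version would shrink them toward zero by an amount chosen on held-out data.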
Alright, so what about the Jets and Patriots? According to SBR, the Patriots started as about 13 point favorites and now are closer to 11. The moneyline is Patriots -545/Jets +465, meaning you’d have to bet $545 on the Patriots to win $100 if they win the game, or you could bet $100 on the Jets to win $465 if they win the game. And the over/under is 43 points. How do those numbers seem? Let’s take a look at the two teams.
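To make the moneyline arithmetic concrete, here is a small Python sketch that converts an American moneyline into the break-even win probability it implies. The function name is just for illustration and is not part of any betting library.

```python
def implied_probability(moneyline):
    """Break-even win probability implied by an American moneyline."""
    if moneyline < 0:
        # Favorite: risk -moneyline dollars to win 100.
        risk = -moneyline
        return risk / (risk + 100)
    # Underdog: risk 100 dollars to win `moneyline`.
    return 100 / (moneyline + 100)

patriots_implied = implied_probability(-545)  # 545 / 645
jets_implied = implied_probability(465)       # 100 / 565

print(f"Patriots: {patriots_implied:.1%}, Jets: {jets_implied:.1%}")
print(f"Sum: {patriots_implied + jets_implied:.3f}")
```

The two probabilities add up to a bit more than 1; that excess is the sportsbook's built-in margin, so each side's break-even probability is slightly higher than the book's true estimate.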
The Patriots are at home, which is in their favor. Last week they beat the Bills on a last-second field goal, so in terms of point differential they look ok but not great. Moving to their play, it’s kind of a mixed bag (as you would expect in a close game). The Patriots passed a lot, but didn’t get much from any individual play; their 5.5 yards per pass was only 27th in the league last week. Brady threw an interception, but one interception on 52 passes isn’t that bad. Similarly, Brady was sacked twice, but on 50+ dropbacks that isn’t so bad. In terms of passing defense, the Pats held the Bills to 5.6 yards per pass, which is good, but they didn’t get any sacks or interceptions. New England had a good day on the ground, with their 4.5 yards per rush coming in 7th in the league. However, they fumbled twice and lost the ball both times. They also let the Bills get 4.0 yards per run, above average, but forced three fumbles and recovered two. The recoveries aren’t as critical as the initial forces, but it’s good to know. So, after one week, we could say that the Pats have an above-average run game to go with a mediocre-at-best passing game, and a good pass defense to go with a mediocre run defense. This is all speculation on the part of the model, because it only has one game to go on, but that’s what the numbers say so far. The model also uses some other info, like penalties, but these kinds of efficiency numbers are the main drivers.
Moving to the Jets, they also won on a last-second field goal against the Bucs. The Jets’ 6.6 yards per pass was below average, but better than the Pats’. However, they gave up 6.8 to Tampa. The Jets generated an interception and three sacks, which is good, but they gave up one and five of their own. The Jets were also below-average at running, getting only 3.1 yards a pop, but held the Bucs to the third-worst rate at 2.6. Coincidentally enough, there were 5 fumbles in this game too. So, again going by one game, we might say the Jets have a below-average offense but an above-average defense.
The next question is how to combine all this information, and that’s what the regression does. It takes these stats from previous games and how those games turned out and weights each stat to try to make the best guess possible. The three models, Luigi, Yoshi 1, and Yoshi 2, just make those guesses in different ways, leading to different weights and thus different guesses about new games. When you put everything in and fire up the ol’ stats machine, Luigi says that the Patriots should win by about 4, winning 64% of the time overall. Yoshi 1 has the numbers at more like 5 and change/62%, and Yoshi 2 says 4/61%. So there’s broad agreement that the Patriots should win most of the time (less importantly, they all also agree that the game should have under 43 points scored). Thus if you wanted to make a bet on the game, or just wanted to have your models live up to a strict standard, you would actually take the Jets to cover, since 4 or 5 points is well under the SBR number of 11. You might also consider betting on the Jets to win outright. You don’t necessarily expect it to happen, but if they should really win about 38% of the time, then on a $100 bet you expect to win $465 38% of the time (+$176.70) while losing $100 62% of the time (-$62), for an expected return of $114.70. That is, if the models are doing a good job and you were able to place a lot of bets on games like this one, in the long run you should make about $115 per $100 bet.
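The expected-return arithmetic can be checked with a few lines of Python. This is only a sketch: the function name and the default $100 stake are illustrative choices, and the 38% and 62% figures are the rough model win probabilities from the paragraph above.

```python
def expected_profit(win_prob, moneyline, stake=100.0):
    """Average profit per bet at an American moneyline, given a win probability."""
    if moneyline > 0:
        # Underdog: a 100 dollar stake wins `moneyline` dollars.
        profit_if_win = stake * moneyline / 100
    else:
        # Favorite: a stake of -moneyline dollars wins 100 dollars.
        profit_if_win = stake * 100 / -moneyline
    return win_prob * profit_if_win - (1 - win_prob) * stake

# Jets at +465, winning about 38% of the time.
jets_ev = expected_profit(0.38, 465)
# Patriots at -545, winning about 62% of the time.
patriots_ev = expected_profit(0.62, -545)

print(f"Jets: {jets_ev:+.2f} per $100 bet")
print(f"Patriots: {patriots_ev:+.2f} per $100 bet")
```

With those probabilities the Jets bet comes out to the same $114.70 as the hand calculation, while the Patriots bet has a negative expected return: a 62% favorite is not worth laying -545, which only breaks even at roughly 84.5%.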
Aside from the numbers, the Pats and Jets have disliked each other enough historically to make the games somewhat entertaining even when the Pats have been clearly better. Add in some questions about what Tom Brady can do with his new receiving corps and how Geno Smith is going to turn out and hopefully we’ll have a good game tonight.