Last night I was watching the Lakers-Jazz game but went to bed before it ended. This morning I got up, turned on SportsCenter, and heard something like “Kobe Bryant had the ball at the end of the game. You’ll never guess what happened.” Sadly, this was enough to make me think of Bayes’ Theorem.
Bayes’ Theorem is used to determine the probability of one event happening given that another event happens. The linked Wikipedia article uses an example of determining if the person you saw (from far away) was a male or a female given that the person was wearing pants. You need to know a few things to figure this out and use Bayes’ equation; the equation is P(A|B) = P(B|A)*P(A)/P(B). It says that the probability of A given B is equal to the probability of B given A times the probability of A divided by the probability of B. We want to know the probability that the person was a girl (A) given that the person was wearing pants (B). Let’s say the school you’re at is 60% male and 40% female. Then P(A) = .4. We also need to know the probability that a person wears pants given that she’s a girl; this is P(B|A). Let’s say it’s .5. Finally we need to know the probability of anyone wearing pants, P(B). Say all boys wear pants; half of girls wear pants, so we have P(B) = 1*.6+.5*.4 = .8; 80% of the students at the school wear pants. So now we just plug in the numbers: P(A|B) = P(B|A)*P(A)/P(B) = .5*.4/.8 = .25. There is a 25% chance the person you saw was a girl given that the person was wearing pants.
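Here is the pants example as a few lines of Python, in case seeing it run helps. This is just a sketch of the arithmetic above; the function name bayes and the variable names are mine, and the inputs are the made-up school numbers from the example.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) given P(B|A), P(A), and P(B)."""
    return p_b_given_a * p_a / p_b


# Made-up school numbers from the example (assumptions, not real data)
p_girl = 0.4              # P(A)
p_boy = 0.6
p_pants_given_girl = 0.5  # P(B|A)
p_pants_given_boy = 1.0   # every boy wears pants

# P(B): add up the pants-wearers from both groups
p_pants = p_pants_given_boy * p_boy + p_pants_given_girl * p_girl

p_girl_given_pants = bayes(p_pants_given_girl, p_girl, p_pants)

print(f"P(pants) = {p_pants:.2f}")
print(f"P(girl | pants) = {p_girl_given_pants:.2f}")
```

The only step that is easy to skip is the P(B) line: it has to count pants-wearers from both groups, which is where the boys enter the picture.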
Bayes’ theorem is nice because it takes all relevant information into account and adjusts for the base rates of events A and B. Just given the question ‘how likely is it that pants-wearing person was a girl?’, you might think that combining the probabilities is enough: 40% of people are girls and half of those wear pants, so .4*.5 = .2. But that is the probability that a random student is a girl wearing pants, not the probability that someone wearing pants is a girl; it doesn’t take the boys at the school into account at all. A more common example for base rates has to do with medical diagnosis. If a test is 95% accurate and you get a positive result, are you actually sick? It depends on what the 5% errors are (false positives, misses, or a mix) and how common the disease is. If the disease is rare, you could still be more likely to be healthy than sick, even after testing positive.
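To put rough numbers on the medical example, here is a small sketch. The figures are my own assumptions for illustration: I read “95% accurate” as the test catching 95% of sick people and correctly clearing 95% of healthy people, and I try a disease that 1% of people have and one that 30% of people have. The function name is hypothetical too.

```python
def p_sick_given_positive(prevalence, sensitivity, specificity):
    """P(sick | positive test) via Bayes' theorem.

    prevalence:  P(sick), the base rate
    sensitivity: P(positive | sick)
    specificity: P(negative | healthy)
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    # P(positive) is the sum of true and false positives
    return true_pos / (true_pos + false_pos)


# Assumed numbers, for illustration only
rare = p_sick_given_positive(prevalence=0.01, sensitivity=0.95, specificity=0.95)
common = p_sick_given_positive(prevalence=0.30, sensitivity=0.95, specificity=0.95)

print(f"rare disease:   P(sick | positive) = {rare:.2f}")
print(f"common disease: P(sick | positive) = {common:.2f}")
```

Under these assumptions, a positive result for the rare disease leaves you at about a 16% chance of being sick, while the same test for the common disease puts you at about 89%. Same test, different base rate.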
So what does this have to do with Kobe and ESPN? Well, you probably came across Henry Abbott’s takedown of Kobe’s clutchness a while back, or some of the aftermath. Kobe isn’t as great a clutch shooter as everyone thinks he is. So when I heard that the game came down to Kobe taking a shot, I figured the Lakers probably lost. But then I thought, would SportsCenter really hype it if he missed? ESPN loves Kobe and the media is usually pretty happy to show him doing great things. So maybe he made it. Here comes Bayes’ theorem!
We want to know P(A|B), the probability that Kobe made the shot given that ESPN mentioned the end of the game. Let’s start with P(A), the probability that Kobe made the shot. Henry’s article says Kobe has hit 31% of his clutch shots since he entered the league. The numbers on 82games.com from the past couple years seem a little higher, so let’s say he’s at 40% right now. P(A) = .4. Now we need to know P(B), the probability that ESPN mentions the end of a game. There are a lot of games, but this one did involve the Lakers, so I think it’s fairly high; let’s say that it’s .8. Finally we need to know the probability that ESPN would mention the end of the game given that he made a game-winning shot, P(B|A). ESPN covers pretty much any game-winner, so let’s say P(B|A) = .99. So what’s the probability that Kobe won the game last night? P(A|B) = .99*.4/.8 = .495. Despite Kobe not being a great clutch shooter, the fact that ESPN mentioned the end of the game raises the likelihood that he made the shot from 40% to just under 50%.
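The same arithmetic in Python, using my guessed numbers from above. As a sanity check, the sketch also backs out what those guesses imply about how often ESPN would mention the ending when Kobe misses; that value is not something I estimated directly, it just falls out of P(B) = P(B|A)*P(A) + P(B|miss)*P(miss). All variable names are mine.

```python
# Guessed inputs from the post (estimates, not measured values)
p_make = 0.4                 # P(A): Kobe makes the clutch shot
p_mention = 0.8              # P(B): ESPN mentions the end of the game
p_mention_given_make = 0.99  # P(B|A)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_make_given_mention = p_mention_given_make * p_make / p_mention

# Implied P(B | miss), solved from the law of total probability
p_miss = 1 - p_make
p_mention_given_miss = (p_mention - p_mention_given_make * p_make) / p_miss

print(f"P(made it | ESPN mentioned it) = {p_make_given_mention:.3f}")
print(f"implied P(ESPN mentions it | missed) = {p_mention_given_miss:.2f}")
```

The implied value comes out around .67, so my guesses amount to saying ESPN would still talk about the ending about two times in three after a miss. That lands between 0 and 1, so at least the three guesses don’t contradict each other.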
This analysis has a couple of holes in it. First, I was a little loose with my definitions. It turns out that the game ended with Kobe turning the ball over instead of taking a shot at all. So I probably should have looked at the probability of points being produced by a Kobe possession (shooting or passing, and including free throws) instead of just shooting percentage. But that’s really just a quibble.
The second issue is more fundamental to any use of Bayes’ theorem, and has to do with the values of P(B|A), P(B), and P(A). They aren’t always known, and when they aren’t they need to be estimated. For example, I adjusted the probability of Kobe making a clutch shot and I made educated (?) guesses at probabilities for ESPN mentioning a game and specifically mentioning a game with a clutch situation. Those numbers are probably knowable, but I don’t know them. This leads to subjective values that people can argue over and that influence the results. But in general, Bayes’ theorem is a great tool. If you have actual measures of all the necessary probabilities, it is without a doubt the way to go.
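To see how much the guesses matter, here is a quick sketch that reruns the Kobe calculation over a few alternative inputs. The grid is my own choice for illustration: P(A) at Henry’s 31% and at my 40%, and P(B) at .7, .8, and .9, with P(B|A) held at .99.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) given P(B|A), P(A), and P(B)."""
    return p_b_given_a * p_a / p_b


p_mention_given_make = 0.99  # P(B|A), held fixed

# Alternative guesses (illustrative values, not data)
p_make_options = [0.31, 0.40]        # P(A)
p_mention_options = [0.7, 0.8, 0.9]  # P(B)

results = {}
for p_make in p_make_options:
    for p_mention in p_mention_options:
        posterior = bayes(p_mention_given_make, p_make, p_mention)
        results[(p_make, p_mention)] = posterior
        print(f"P(A)={p_make:.2f}  P(B)={p_mention:.1f}  ->  P(A|B)={posterior:.3f}")
```

With these inputs the answer ranges from roughly 34% to 57%, which is a wide spread for what felt like small changes in the guesses.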