The (Usual) Folly of Probability Matching

Generally speaking, people are fairly decent at learning how often things happen.  For example, if you asked someone to predict a bunch of coin flips, their guesses would come out pretty close to 50% heads.  They aren’t especially good at doing other, closely related things, like shifting quickly if those odds change or making the sequence of predictions properly random, but at least they do a good job of probability matching – that is, they will respond in proportions equal to the actual outcome probabilities.  Why in the world would you care about that?  Because it’s usually a bad idea – unless you’re filling out your bracket.

Let’s stick with the coin flipping for a second.  Let’s say I told you I was going to flip a coin 50 times, and I wanted you to predict each flip.  How would you do it?  You would probably try to randomly guess heads or tails each time, and in the end you’d have a sequence of heads and tails in about equal proportion.

You can mock this up in Excel pretty easily instead of actually flipping coins (and I encourage you to do this for the next part, when you’ll need to convince yourself of something).  Leave the first row for labels.  In the first column, type =rand() in cell A2, and then copy it down for 100 cells or so.  You should see a number between 0 and 1 in each cell.  Then do the same thing in the next column.  What we’re going to pretend is that each of those numbers is a probability.  We’ll say that the first column is the coin, and every cell that has a number under .5 is a flip that came up heads.  The second column is our guess, and every cell that has a number under .5 is a guess of heads.  To make this a bit more clear, in the third column type =if(A2<.5,"heads","tails") and in the fourth column type =if(B2<.5,"heads","tails"), and then copy those down for however many flips and guesses you created.  (Type the quotation marks as plain straight quotes; Excel won’t accept curly ones.)  As a comparison, make a fifth column that just says heads in every cell.

Then we’ll check to see if we were right; make a sixth column that starts with the cell =if(C2=D2,1,0) and copy it down.  Since the coin result and your guess are in C and D, this will mark a 1 every time you guessed the flip right and a 0 every time you were wrong.  Do the same thing in a seventh column for the comparison column, using =if(C2=E2,1,0).  Finally, pick two empty cells off to the side and type =average(F2:F101) in one and =average(G2:G101) in the other (assuming you did 100 cells under a label row like I did, and made the columns I described).  These numbers tell you how often your guess was correct and how often guessing ‘heads’ every time was correct.  Even with 100 flips there’s going to be a little noise, but both averages should be around .5, or 50%.  (The random numbers are redrawn every time the sheet recalculates, so the averages will jump around a bit.)
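If you’d rather not use a spreadsheet, the same experiment can be sketched in a few lines of Python.  This is only a rough sketch: the function name simulate and its parameters are made up for illustration, and it assumes every flip and every guess is an independent random draw, just like the two =rand() columns.

```python
import random

def simulate(p_heads=0.5, n_flips=100, seed=None):
    """Return (random_guess_accuracy, always_heads_accuracy).

    The coin lands heads with probability p_heads, and the random guesser
    says heads with the same probability, independently of the coin.
    """
    rng = random.Random(seed)
    guess_hits = 0
    heads_hits = 0
    for _ in range(n_flips):
        flip_is_heads = rng.random() < p_heads    # like columns A and C
        guess_is_heads = rng.random() < p_heads   # like columns B and D
        if flip_is_heads == guess_is_heads:       # like column F
            guess_hits += 1
        if flip_is_heads:                         # like column G (always heads)
            heads_hits += 1
    return guess_hits / n_flips, heads_hits / n_flips

if __name__ == "__main__":
    guessing, always_heads = simulate(p_heads=0.5, n_flips=100)
    print(f"random guessing: {guessing:.2f}")
    print(f"always heads:    {always_heads:.2f}")
```

Run it a few times and both numbers should bounce around .5, the same as the two averages in the spreadsheet.  Raising n_flips cuts down on the noise.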

Ok, so what did we figure out?  If you guess coin flips and probability match (say heads or tails equally often), you’ll be at 50%.  Not surprising.  But if you made that comparison column where you guessed heads every time, you’ll also be at 50%.  What’s the big deal?  Well, this is a special case where those are both equally good strategies.  What if we had a rigged coin that came up heads 75% of the time?  We can model this by going to our flip and guess columns (columns C and D) and making them <.75 instead of <.5.  Now the coin will come up heads 75% of the time, and we’ll probability match and guess heads 75% of the time.  What you should find now is that our guesses are almost never correct as often as guessing heads every time.  What gives?  The problem is that we’re guessing randomly.  If there’s a 75% chance of the coin coming up heads, and a 75% chance of us guessing heads, but those two events are uncorrelated, then we’ll only say heads when the coin is heads .75*.75 = 56.25% of the time.  Similarly, we’ll get the tails call correct .25*.25 = 6.25% of the time.  Put them together and we’re right 56.25% + 6.25% = 62.5% of the time.  In contrast, by guessing heads every time we know we’re going to hit on the 75% of the time the coin comes up heads.  When you get away from 50/50 events, probability matching starts to become a bad idea.
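The arithmetic above works for any rigged coin, not just the 75% one.  Here is a small sketch of it in Python; the function names are invented for illustration, and it assumes, as above, that the guesses are independent of the flips.

```python
def matching_accuracy(p_heads):
    """Chance of being right when guessing heads with probability p_heads,
    independently of a coin that lands heads with probability p_heads."""
    p_tails = 1 - p_heads
    # Right when both say heads, or when both say tails.
    return p_heads * p_heads + p_tails * p_tails

def favorite_accuracy(p_heads):
    """Chance of being right when always guessing the more likely side."""
    return max(p_heads, 1 - p_heads)

if __name__ == "__main__":
    for p in (0.5, 0.6, 0.75, 0.9):
        print(f"P(heads)={p:.2f}  "
              f"matching={matching_accuracy(p):.4f}  "
              f"always favorite={favorite_accuracy(p):.4f}")
```

The two strategies tie at exactly 50/50, and for the 75% coin the sketch reproduces the 62.5% versus 75% worked out above.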

What does this have to do with your bracket?  I think that people intuitively want to probability match to pick their games.  One of the ‘rules’ I hear every year is that there’s always one 5-12 upset, so you should pick one 12 seed to get out of the first round.  But here’s the thing; there are four 5-12 games.  Which one do you pick?  If those four games are equally likely to provide the upset, then we’re in the coin flipping situation where heads comes up 75% of the time, but instead of heads we have the 5 seed advancing.  In the long run, you’re going to be correct more often by picking every 5 seed to advance as opposed to picking one 12 seed.  The same thing goes for 4-11 and 6-10 games and so on.  The exception would be the 8 versus 9 seed; those are roughly 50/50 and as we saw before, it doesn’t matter much if you go with all 8s or mix it up.

Of course, the bracket adds a little something extra to the situation.  If all you had to do was be correct as often as possible, it would make sense to pick every favorite.  In the long run, that would give you the best outcome.  But if your goal is to win a bracket in a particular year, this no longer makes sense.  Imagine that everyone uses that strategy; everyone would end up tied.  If you can make an unusual pick that happens to be correct, you would get those points that no one else does and win the bracket.  You’re less likely to be right, but you have increased your chances of being an individual winner.  So game theory plays a bit of a role in picking brackets.

Most brackets also have rules that make different games worth different numbers of points; for example, there’s usually an ‘upset bonus’.  In the bracket I play, you get points for the seed of the team that advances: if you pick a 5 seed and they win, you get 5 points.  If you picked the 12 seed and they win, you get 12.  What does this do to our pick strategy?  Well, say we pick all four 5 seeds to advance.  We expect three to do so on average, so in an average bracket we’ll get 5*3 = 15 points.  Say we probability match and pick one 12 seed.  If we get them right, and the three 5 seeds we picked also win, we’ll get 5*3+12 = 27 points, but that may not happen that often.  All four 5 seeds might actually win, in which case we’re back to 15 points.  Or maybe a 12 seed wins, but not the one we picked.  In that case, we only get 10 points for the two 5 seeds we picked correctly.  Since any particular 12 seed only wins about 25% of the time, we’re unlikely to hit that big 27 point round.  And if you go the other way and pick all four 12 seeds to advance, to try and make sure you hit the bonus?  If only one actually wins you only get 12 points, and miss out on the expected 15 from picking all four 5 seeds.

You can envision a payoff where it would make sense to pick upsets more often, but the payoff alone in this scenario (ignoring the game theory idea above) doesn’t cut it.  And this ignores the potential points later in the bracket; a 12 seed is unlikely to get 2 upsets in a row, but a 5 seed might win and then upset a 4 seed (Wisconsin if you’re pro-Big Ten?  VCU if you’re anti-Big Ten?), which you can never pick if you take all the 12s to advance.
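To put numbers on the seed-points payoff, here is a rough sketch of the expected first-round points from the four 5-12 games.  It assumes every 5 seed wins with the same 75% chance and that the games are independent; those are simplifying assumptions carried over from the coin example, not real tournament odds, and the function and constant names are made up for illustration.

```python
FIVE_SEED_POINTS = 5
TWELVE_SEED_POINTS = 12
N_GAMES = 4  # four 5-12 games in the first round

def expected_points(n_twelve_picks, p_five_wins=0.75):
    """Expected points from the 5-12 games when picking n_twelve_picks
    12 seeds and taking the 5 seed in the remaining games.

    Assumes independent games and the same win probability for every
    5 seed (an illustrative assumption, not measured data).
    """
    if not 0 <= n_twelve_picks <= N_GAMES:
        raise ValueError("n_twelve_picks must be between 0 and 4")
    n_five_picks = N_GAMES - n_twelve_picks
    from_fives = n_five_picks * p_five_wins * FIVE_SEED_POINTS
    from_twelves = n_twelve_picks * (1 - p_five_wins) * TWELVE_SEED_POINTS
    return from_fives + from_twelves

if __name__ == "__main__":
    for k in range(N_GAMES + 1):
        print(f"{k} twelve-seed picks: {expected_points(k):.2f} expected points")
```

Under these assumptions each 5 seed pick is worth .75*5 = 3.75 points on average and each 12 seed pick is worth .25*12 = 3, so every upset pick costs about three quarters of a point.  The 12 seed pick would only pay for itself if 5 seeds won less than 12/17 (about 70.6%) of the time.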

So, in general it’s a bad idea to probability match if you expect non-50/50 results.  Another relevant example would be weather forecasts; if there’s a 70% chance of rain, do you take your umbrella 70% of the time?  You’ll end up wet more often than if you took the umbrella every time.  But there are some situations where game theory says not to make the same guess every time, and there are payoff structures that might make it worthwhile to move closer to probability matching or even to guess unlikely outcomes.  The best idea is to figure out what situation you’re in and adjust your guessing strategy appropriately.  Good luck with your bracket!
