Bad Data, Bad!: March Madness, Data Style...or not

Well, my alma mater didn't make it into March Madness this year, but we did have a good night last night. I caught a bit of the game, but spent most of the night watching Georgetown get crushed by a team that didn't even exist before 1997. I normally like underdog stories, but I was watching the game with a rabid Georgetown alum...so it was a little awkward.

Anyway, I've been pondering the role of data in March Madness predictions ever since my husband got home from his March Madness auction earlier this week. Unlike the well-known "pick a bracket" setup, the auction is a fun twist where everyone throws in $50 and then bids on the teams they want. Payouts get progressively larger the further your team(s) get. You can get as many teams as your $50 will buy, but if you want to go all in on one team, you can throw in more money to bid higher than $50 (you can't have spent anything previously if you want to do this). You do not get any money back if you don't spend it all. Teams are auctioned off in random order.
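For the arithmetically inclined, here is a rough sketch of how those rules translate into what a bidder actually nets. The function name, the team labels, and every dollar figure other than the $50 entry fee are made up for illustration; the real payout schedule isn't something I have, so payouts are simply passed in as an assumption.

```python
ENTRY_FEE = 50  # from the rules above: everyone throws in $50


def net_result(spent_per_team, payout_per_team, entry_fee=ENTRY_FEE):
    """Return a bidder's net dollars under the auction rules described above.

    spent_per_team:  dict of team -> winning bid (hypothetical figures)
    payout_per_team: dict of team -> payout that team earned (hypothetical figures)
    """
    total_spent = sum(spent_per_team.values())
    # Going over the entry fee is only allowed as a single all-in bid on one team.
    if total_spent > entry_fee and len(spent_per_team) != 1:
        raise ValueError("spending above the entry fee requires one all-in team")
    # Unspent money is not refunded, so the cost is never less than the entry fee.
    cost = max(entry_fee, total_spent)
    winnings = sum(payout_per_team.get(team, 0) for team in spent_per_team)
    return winnings - cost


# Two cheap teams that earn nothing: the bidder is still out the full $50.
print(net_result({"Team A": 10, "Team B": 10}, {}))
# One team bought for the full $50 that pays out a hypothetical $120.
print(net_result({"Team C": 50}, {"Team C": 120}))
```

The `max(entry_fee, total_spent)` line is the whole point: spending $20 costs exactly as much as spending $50.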

This has normally been a pretty friendly competition (it's through his work), so he was a little surprised to show up and see someone furiously typing on a laptop. He asked the guy what was going on, and the guy told him he'd devised an algorithm based on Nate Silver's predictions. He had then assigned a relative dollar amount to each team, and was going to attempt to get bargains. My husband (who of course puts up with me on a regular basis) was pretty sure he was overthinking it.

My husband's strategy has stayed pretty simple over the years: don't bid until later in the game, pick up a team you think can go all the way, and don't leave money on the table. He won the first 3 years they did this, and has at least made his entry fee back every year, so I figure he's got a pretty good strategy.

He watched his coworker with the laptop, curious which teams he would pick. After watching him get two low-cost-but-unlikely-to-do-much teams, my husband was wondering how the guy was going to proceed. Suddenly the guy turned to him and said "hey, we get back the money we don't spend, right?" "Um....no," my husband and his other friend answered. The guy blanched a bit.

To me, this is the problem many people have with data. The best predictions in the world are useless if you forget to learn the rules of the game. A beautiful data set is useless if it doesn't actually help you solve the problem in front of you, and it's often worse if it sort of helps: it's harder to parse out the uselessness, and more tempting to apply a flawed strategy. How much better to always keep in mind some basic common sense.

I had an interesting discussion lately that led me to realize that my true interest, perhaps, is not actually data analysis. A more accurate term for what I like is research methodology...the study of how to capture what you actually want to capture. I love the analysis Nate Silver did, but I'm also impressed with my husband, who made 3 key observations about this game:

People spend more money when everyone else has lots of money.
It's hard to pick a winner, but easier to pick a team who will at least go to the Elite 8 and make you your money back.
Money you don't spend is already lost.

Three simple ideas that anyone could work out, but that have somehow still made him money. While we don't yet know the outcome, my guess is they'll work out yet again this year.

He has the Jayhawks, in case you're curious.

Saturday, March 23, 2013
