Friday, November 30, 2012

FYI, I'm done with 75% of my Christmas shopping.  Still have to get a tree though.

For those of you not done yet, I'll help you out with what to give me.   Here's a whole list!

And for my little genius baby, I'm thinking this "Outlier" bodysuit would be perfect....or perhaps a stuffed normal distribution?

Alright, enough shopping.  Need some entertainment?  Try the "thanks textbooks" tumblr, featuring the best of the worst problems/examples/etc in textbooks.  Highlights in the commentary include "I’m less concerned with the question, “What does the scale read?” and more concerned with the question, “Why the hell are we lubricating a hamster?”" and "Who has a “favorite” orange?  How long have you had this orange that you’ve bonded with it so much?  Who has an equation to calculate the weight of an orange? Is it your favorite because it happens to weigh nine pounds!?"

Thursday, November 29, 2012

A post that starts with a brain teaser, moves to a visual, and ends with a stern reminder

I wanted to put up a brain teaser yesterday, but the little one got his first cold.  Baby coughs are sad.

Anyway, one of the more famous statistical brain teasers is the birthday problem.  There are a few variations, but essentially the question goes something like this:

You're at a party with 23 guests, including you.  What are the chances that  two people there have the same birthday?

The trick of course is that no one has to have a specific birth date, so the answer is not 23/366, but instead around 50% (interestingly, if the party were 50 people, it goes up to 97%).    For a further explanation, see here.
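If you'd rather check the numbers than take my word for it, here's a minimal sketch in Python.  It assumes a 365-day year with every birthday equally likely, and the function name is just something I made up for illustration.  The idea is to compute the chance that everyone's birthday is different, then subtract from 1.

```python
def shared_birthday_probability(guests: int, days: int = 365) -> float:
    """Chance that at least two of `guests` people share a birthday.

    Assumption: every one of `days` birthdays is equally likely.
    """
    p_all_different = 1.0
    for already_taken in range(guests):
        # Each new guest must avoid every birthday already taken.
        p_all_different *= (days - already_taken) / days
    return 1.0 - p_all_different


if __name__ == "__main__":
    for party_size in (23, 50):
        chance = shared_birthday_probability(party_size)
        print(f"{party_size} guests: {chance:.1%}")
```

With 23 guests this comes out a little over 50%, and with 50 guests it's about 97%.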

What's interesting about this problem is that you have to assume every birth date is equally likely...which of course isn't true.  I've written before about uneven distribution of birthdays in the US, due in part to scheduled c-sections or induced labor.  Anyway, I saw an interesting heat map today of birthday distributions from the Daily Viz, which is what got me thinking about the brain teaser.

To note, this chart was made from a list of ranked birthdates, which is here.

I was a little struck by this, because I was thinking about how terrible I am at estimating things like this on my own.  The most common birthday in my circle of friends/family is Halloween.  The first week in April has the birth dates of my mother, sister and husband.  Neither of those time frames is overly popular within the general population, although I'd guess the difference between "most popular" and "least popular" is relatively small.  It was a good reminder that those I spend the most time with are not terribly representative of the population in general.

Tuesday, November 27, 2012

Qualitative vs Quantitative probability

Ann Althouse linked to a local news story about a hospital in Minnesota that went 62 hours and 19 deliveries without delivering a baby girl*.

The comments on the Althouse post have a lot of smart people trying to figure out the probability and arguing about how unusual it is to deliver 19 boys in a row and if we should be impressed.  The point is made repeatedly that every combination of boy/girl deliveries is equally likely, which of course is true.  As I was reading through the comments though, it occurred to me that people are getting way too hung up on the quantitative probability here.

The real question is much easier:  is there any other combination of 19 deliveries that would have been as interesting to you?  Out of 524,288 possibilities, only 19 girls would have been as interesting as 19 boys.  For some it would be equally interesting at 18, 17 or 16, some not.  It's a little like a lottery ticket coming up 1 2 3 4 5 6 or 4 8 15 16 23 42.
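To put rough numbers on that, here's a small Python sketch that counts how many of the 524,288 sequences you'd call "interesting" depending on how lopsided a run has to be.  It assumes boys and girls are equally likely (not quite true in real life), and the function name and the cutoffs are my own choices for illustration.

```python
from math import comb

DELIVERIES = 19
TOTAL_SEQUENCES = 2 ** DELIVERIES  # 524,288 boy/girl sequences


def interesting_count(max_minority: int, deliveries: int = DELIVERIES) -> int:
    """Count sequences where the rarer sex shows up at most `max_minority` times.

    Assumption: boy and girl are equally likely, so every sequence has the
    same probability.  Valid while max_minority is less than half the
    deliveries, so the "mostly boys" and "mostly girls" cases don't overlap.
    """
    one_side = sum(comb(deliveries, k) for k in range(max_minority + 1))
    return 2 * one_side  # mostly boys plus mostly girls


if __name__ == "__main__":
    for cutoff in (0, 1, 2, 3):
        count = interesting_count(cutoff)
        print(f"at most {cutoff} of the other sex: "
              f"{count} sequences, about 1 in {TOTAL_SEQUENCES / count:,.0f}")
```

The more outcomes you're willing to find interesting, the less surprising it is that one of them eventually turns up.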

The chances of something interesting happening are directly proportional to how many outcomes you find interesting. That's what I call a qualitative probability, not a quantitative one.  It's like that post from thankstextbooks.

*The Althouse post says 14 hours, but the article says 62 hours, not really sure where the discrepancy came from.

Meta on meta

The AVI has a poll up on polling, in reference to my post about polls.

Monday, November 26, 2012

I've recently been considering going more in depth with my stats education (especially the data analytics software stuff), and am checking out a few grad programs in applied statistics.

Anyone have any good suggestions?

Online and/or located in New England preferred.

Sunday, November 25, 2012

How important is important?

I saw an interesting link on Instapundit today, under the headline "men on strike".

It took me to a Fox News article entitled "The War on Men" which led with a study by the Pew Research Center that said:
According to Pew Research Center, the share of women ages eighteen to thirty-four that say having a successful marriage is one of the most important things in their lives rose nine percentage points since 1997 – from 28 percent to 37 percent. For men, the opposite occurred. The share voicing this opinion dropped, from 35 percent to 29 percent.

The article went on to elaborate that this was a huge societal change caused by feminist women being too angry and unmarriable for men to bother.

Really? Because feminism started in 1997?

Despite the hordes of internet commenters regaling everyone in the comments sections about how their own lives (and ex-wives), like, totes prove that women are awful (obvi), I felt a little dubious.  I was curious about this survey.....if we were really supposed to read this as women valuing marriage more than men do now, was that not true in 1997?  I remember 1997, and I'm pretty sure the sexual revolution (cited in the article as part of the problem) was over by then.

Anyway, since most Pew Research studies are surveys of about 1000 people, I went searching for the sample size on this one.  I was curious what those 60 or so males were answering in 1997 that was so different.  Of course I had to search the Pew website for a while to find the survey (my suspicions grow when articles don't provide a link) but I found it here.

As I scrolled down, one graph caught my eye:

Wait a minute....that graph shows men and women being pretty equal on the topic of marriage.  What gives?

Here's the graph the Fox News article was talking about:

See the difference?  It's in the notes.

Men and women differ when the response is "one of the most important things" but not when you include the next answer down...."very important".

So the big culture strike is men moving marriage from "one of the most important" things to a "very important" thing.  That's not nearly as sensational as promised.

I'm actually curious what percentage of the respondents in this survey were married when they answered this.  For an unmarried person, this could be a bit of a "how often do you beat your wife?" question.  I mean, if you're not sure if you want to get married would you answer not important?  Because then it sounds like you're saying you'd be okay with an unsuccessful marriage.  I'm not sure what I would have answered prior to getting married myself....marriage always felt pretty optional to me.  Anyway, now that I am married, I would have definitely answered "one of the most important things".  If this had been two different questions, I would feel better about extrapolating from the results.

Friday, November 23, 2012

I guess it's a day late, but here are 4 ways to cook a turkey using NASA gear.

Speaking of crazy uses for things, did you know you can cook fish in your dishwasher?

Alright, that wasn't math or science related, but this is.  Neil deGrasse Tyson is teaming up with the GZA from Wu-Tang Clan to teach kids math, and man, it ain't nothing to @#\$*& with.

Wednesday, November 21, 2012

On Marco Rubio and lines in the sand

In the post-election fallout, no story has hit me as personally as the new media kerfuffle over Marco Rubio's "age of the earth" comments.  For those of you still trying to tune out, here's the recap.  Rubio, a Republican Senator from Florida, got asked in an interview with GQ how old he thought the earth was.  His reply heard round the world:
I'm not a scientist, man. I can tell you what recorded history says, I can tell you what the Bible says, but I think that's a dispute amongst theologians and I think it has nothing to do with the gross domestic product or economic growth of the United States. I think the age of the universe has zero to do with how our economy is going to grow. I'm not a scientist. I don't think I'm qualified to answer a question like that. At the end of the day, I think there are multiple theories out there on how the universe was created and I think this is a country where people should have the opportunity to teach them all. I think parents should be able to teach their kids what their faith says, what science says. Whether the Earth was created in 7 days, or 7 actual eras, I'm not sure we'll ever be able to answer that. It's one of the great mysteries.

This immediately caused cries of how scientifically ignorant he was, as the correct answer is apparently 4.54 billion years.  Rubio has been accused of putting religion ahead of science, and this has sparked a general conversation about how religious orthodoxy and science are incompatible.  In fact, Phil Plait over at Slate put it this way:

I got a chill when I read Rubio’s statements, “I think it has nothing to do with the gross domestic product or economic growth of the United States. I think the age of the universe has zero to do with how our economy is going to grow.”
Perhaps Senator Rubio is unaware that science—and its sisters engineering and technology— are actually the very foundation of our country’s economy? All of our industry, all of our technology, everything that keeps our country functioning at all can be traced back to scientific research and a scientific understanding of the Universe....Senator Rubio is exactly and precisely wrong. Science, and how it tells us the age of the Earth, has everything to do to do with how our economy will grow. By teaching our kids actual science, we can guarantee the future of this country and its economic growth. By hiding it from them, by equivocating about it with them, by providing false balance between reality and wishful thinking, what we guarantee is a future workforce that can’t distinguish between what’s real and what isn’t.

(Highlight in the original)

Now, I agree with Plait.  Science is critical to our understanding of the world.  I started this blog in part because it makes me incredibly sad exactly how little most people know about math and science, and how malleable most people believe facts are.  I think scientific literacy is one of the biggest gifts we can give our children, and obviously I spend a decent amount of my free time trying to promote more critical interpretations of popular facts....and that is where I disagree with Phil Plait and the other Rubio critics.

I think drawing a line in the sand over one specific issue like this is wrong.

I am not a young earth creationist....but I was raised by them.  I'm not talking about my parents either, I'm talking about the 13 years of Christian school education I received, including 7 years of strict Baptist teaching in middle school and high school.  Every science or math class I took for my entire pre-college career was taught by a young earth creationist (or at least someone who had been willing to say they were one).  In this environment, any science class not taught by an avowed Christian was immediately suspect.  When I announced my intention to go to a secular university and to study engineering, I fielded question after question about how I would be able to stay true to my faith while being taught science by those terrible atheists.  I was encouraged to change my choice of school or my choice of major, to change anything, because of the constant assaults on my beliefs I was going to have to withstand.  I was told horror story after horror story of Christian kids singled out and flunked for standing up for what they believed in.  A friend's mother actually cried while talking to me about it.  For a time I reconsidered, but in the end I didn't change my mind and I went to college ready to face the fire.

Well, the fire never came.

In four years where I barely left the math/science buildings, in four years of biology, chemistry, organic chemistry, physics and engineering classes, I was never, not once, asked how old I thought the world was, how we all got here, or if I thought there was a place for God in any of it.

It's not just that no one asked, it's that it never came up.  I mean, I'm pretty sure in one of my biology classes there was a passing reference to "this is an evolutionary adaptation", but other than that, no one raised the subject.   We were too busy learning about how to multiply numbers in a matrix or how objects move on frictionless surfaces.  In fact, the only times it came up were when people found out I went to a Baptist high school ("oh, so you were taught the whole earth in six days thing? what was that like?") or when I'd run into Christians who would want to grill me on how I was being treated.  Ironically, many of these Christians were in the psychology or sociology departments, both of which had professors FAR more critical of fundamental Christian beliefs than anything I encountered.

Over time, I came to be fairly critical of the particular high school I had gone to and the attitude of religious fundamentalism.  It was science, really, that set me free.  The ability to review evidence, to think critically, to decide what is and isn't a valid source, and a healthy sense of skepticism all moved me away from those people who claim religion mostly so they will always feel sure about everything.  I like feeling unsure.  I like admitting I could be wrong.  I like saying there's some ambiguity, and that I'll look into it independently and form a conclusion. I will always love science for this.

However, when people use science to do the same thing to others that I feel religion was used to do to me, I get upset.  Science should be used to open people's minds to the idea of evidence based investigation, not to make fun of people because they repeat something they were raised with.  To set the bar at the age of the earth, to say that no one who even questions the 4.54 billion year number is allowed to come in, well I think that ensures that fewer people of faith will even bother trying to enter the sciences.  This needlessly perpetuates hostilities on both sides.  If religion throws down a gauntlet on one side, it's up to science to sit back and say "no problem, come on in, take a look around for yourself and see how you feel after".  Science is not an excuse to shove a conclusion down someone's throat.  Science is a process of teaching people how to reach a good conclusion to begin with.

Getting back to Rubio and his detractors....this is why I can't criticize the man.  Rubio's a Catholic, a lawyer, and a politician.  I doubt he's seriously sat down and studied geology, astronomy or anything else that would help him understand the age of the earth debate.  Additionally, his Catholic faith tells him that many conclusions could be valid (Catholics are not required to be young earth creationists).  So when he was asked a question outside his comfort zone, he said what he knew, admitted his limitations, and said what he didn't know.  To me, that's science.

Tuesday, November 20, 2012

If it makes you happy (it can't be that bad)

I don't have any citations to back me up, but I'm pretty sure it's a proven fact that a belly laugh from a baby is the most amazing sound on earth.

Anyway, I saw a great headline today, a classic "there's more to this story" moment:  "Sex and alcohol make you happier than kids and religion, study finds".  While I'm sure the headline made many a college student raise their hands with a "damn the man!", I was curious where this was all coming from.  What makes us happy is notoriously difficult to measure (in part because the things that make us feel the best are paired with things that make us feel bad.....water tastes better if you're thirsty, showers are amazing when you're feeling gross, absence makes the heart grow fonder, etc).

This took quite a bit of searching around, as apparently this was not actually a published study, but rather a press release for a talk a postgrad is giving (or gave, Nov 14th) at the University of Canterbury.  It's a pity because I think the demographics of the survey population might be relevant, but I'll work with what I have.

The study itself seems pretty interesting.  While the headline is true, it really sells short what the authors were trying to do.  They were actually trying to capture how different actions affected people (in the moment) on four different levels....pleasure, meaning, engagement and happiness.   Sex was rated number one in all categories, but other ratings were more divided.  Alcohol/partying was highly rated for pleasure and happiness (2nd), but lower for meaning (10th).  The "kids" part was actually the activities of childcare and/or playing with kids....which I felt covered a pretty broad range of interactions.  I mean, making my son laugh is the highlight of my day, but the time I spend changing diapers and calming fussiness?  Well, it's hard to put that all in the same category.  In the same vein "religion" was actually "religious activity"...which is slightly different IMHO.  Anyway, childcare and religious activity both rated high on meaning and happiness, and lower on pleasure and engagement.

Basically, the point of the study was not to measure pervasive life effects of individual actions, but rather to quiz people (via text message) at random points in time on what they were doing and how it made them feel. The upside of this is it doesn't rely on people's memories of what made them happy, which could be influenced by later context.  The downside of course is that it captures an action devoid of context.  Drinking alcohol could make someone happier at the moment, but the rating does not include any potential consequences that might come in later.   What was most interesting to me about this study was how many things people do regularly that they seem to know won't bring them happiness, meaning, pleasure or particularly engage them, most notably Facebook time.

In the end, the headline just neglected 3 out of 4 categories and reported the most sensational results.  That's not surprising.  I actually ended up finding the study pretty interesting, and I'd like to see it published somewhere with more details.

Update on the House races

Patrick from ballotlines has updated his predictions about who would have the House if some of the seats were decided by popular vote within the state.  This time he included the potential voters who didn't vote because the races they had available were uncontested or otherwise not representative.

I always find it interesting to ponder the effects of changes like these on results.  Most of my job for the last few years centered on a project that tried to change behavior of employees by changing the system they were working in.  It's a fascinating thought project.

I suspect that this fear of the unknown is why we stick with an election system most people barely understand.  My guess is both political parties (or at least their consultants) prefer small tweaks to the existing rules over a major overhaul to the whole operation.

Monday, November 19, 2012

Well this is awesome....

From datazoid.deviantart.com....quick, someone get a Kickstarter campaign going to make these real!

Friday, November 16, 2012

Looking for a Bad Data Bad approved Christmas present for a little girl in your life?  Reader David tipped me off to GoldieBlox....a new startup that's making toys to get girls interested in engineering early on in life.  The founder had me at "this is the toy I wish someone could have bought me when I was that age".

Also, are you traveling for Thanksgiving?  If so, you should know that surviving a plane crash is not nearly as uncommon as you've been led to believe.

Once you get to Thanksgiving dinner (safely), here are some fun tricks you could bust out.  Science.

In the post election dissection, I've seen a lot of random correlations, but this one touched my heart.  Apparently coffee won Obama the election.  Of course by this logic, Texas is one of the next states that could go Dem....so take it with a grain of salt.  Or lump of sugar.  Either way, a cute reminder that correlation is not causation.

Speaking of elections, want to remember what the internet used to look like?  Dole/Kemp '96 is still up (for educational purposes apparently).

This one's a little different.  I was raised in a home/church community where one of the most enduring traditions of the holidays was a call for giving to those in need.  This year, I decided to give some money through donorschoose.org.  If you've never been there, it's a site that lets teachers from around the country submit their "wish lists" for specific education projects they are working on, and donors can fund the project. I (naturally) went looking for stats projects and found a school in a high poverty area in North Carolina looking for some resources to make statistics more real for their students.  They only need a little over \$200 more to fund the project, and I thought I'd just put it out there to anyone who might be interested in pitching in.  I have no connection to any of these people, just a random act of kindness thing.  The project can be found here.

Thursday, November 15, 2012

Math you do as a Republican to make yourself feel better

The headline right there was my favorite quote of the whole election cycle.

I got a special request from a coworker of my father's who suggested to him that I should wade into the murky water of the gerrymandering controversy.  There's a lot of data being thrown around*, but here's the gist:

Some Democrats are claiming that the Obama victory and the victory in the Senate have given the Dems a mandate....basically claiming that the country agrees with their policies and they should push forward with them no matter how much resistance comes from the other side.  Resistance is considered irrelevant, because the people didn't vote their opposition in.  Some Republicans on the other hand point to their victory in the House to say that actually they have the mandate, or at the very least that Obama/Dems do not have any sort of consensus.  The Dems' counterpoint is that the Republicans only kept the House because of clever gerrymandering (redistricting) they orchestrated in 2010.  Thus, the Democrat mandate is even stronger than it appears because the Republicans cheated to get theirs.

So, was it all chicanery?  How do we assess election and gerrymandering data?

Well, the first step is to look at the popular vote.  I couldn't find any updates, but as of Nov 9th, the Democrats got more votes for their house candidates than the Republicans did (by a very small margin).

However, this may or may not mean much.

State politics are a funny thing.  In many states, people run unopposed, or with only token opposition.   It's hard to count popular vote when many races are foregone conclusions.  Additionally, on the state level I'd wager people are more likely to vote for incumbents, if only for the extra power they believe it gives them to have more senior congressional members representing them.  Adding to the difficulty of interpreting the numbers is California's new system of doing runoff races....so we can't presume that all House seats were decided in Rep vs Dem contests.

Alright, so where does that leave us?

Ultimately, we have to cut through the mess and ask ourselves what a fairer system would be, and what the results would have been under said fairer system.  This blog post over at ballotlines does that quite nicely.  The short version is this: even if the House seats were broken down based on popular vote by state, the Republicans would have kept the majority, though not by as wide a margin.

Another interesting take is here at the Monkey Cage blog, which revisits the 2008 district map, and shows the Republicans still winning the house, though again by a smaller margin.

So Dad, you were right, gerrymandering likely does NOT explain the house win, though it does seem to explain the magnitude.  That's just math you do as a Democrat to make yourself feel better**.

*Along with data being thrown around, there's also some FANTASTIC conspiracy theories.  The two best I've read in comments sections so far are:  (Republican) "Polls clearly show almost twice as many people self identify as conservative vs liberal.  For Obama to win raises some serious questions.  Given that Silicone Valley is in California, and Californians are liberal, I think we should check how the voting machines were programmed.  I believe Mitt Romney won 60% to 40% and the computer programmers changed millions of votes." (Democrat) "I understand that Romney does better among married women than single women.  Does anyone else think that's because so many conservative men are abusive and probably force their wives to vote Republican? At my polling place I saw people enter the voting booth together, my guess is it was men making sure their wives voted the way they wanted".  Actually, that first one is mostly just kind of tinfoil hat paranoid, the second one I found pretty disgusting.  Believing that many conservative men are capable of domestic violence is a kind of chilling way to view the world.

**All of the analysis here of course sidesteps the issue of how voter turnout would change if a new system were implemented.  We live in a country where (at last count) 42% of eligible voters didn't vote.  Since we can only guess at what those voters would have done, we can't know for sure how any new or different system would affect any of this.

Wednesday, November 14, 2012

Wednesday Brain Teaser - Driving down the highway

If the probability of observing a car in 30 minutes on a highway is 0.95, what is the probability of observing a car in 10 minutes (assuming constant default probability)?

Answer will be posted in the comment section sometime on Friday.
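Spoiler for anyone who would rather check their work than wait: here's a minimal Python sketch of one way to set the problem up.  It assumes "constant default probability" means the chance of seeing no car is the same in every 10 minute stretch and that the stretches are independent, so the three "no car" chances multiply.  The function name and parameters are my own, not part of the original puzzle.

```python
def prob_car_within(minutes: float,
                    known_prob: float = 0.95,
                    known_minutes: float = 30.0) -> float:
    """Chance of seeing at least one car within `minutes`.

    Assumption: independent, identically behaved time slices, so
    P(no car in t minutes) = P(no car in known_minutes) ** (t / known_minutes).
    """
    p_no_car_known = 1.0 - known_prob
    p_no_car = p_no_car_known ** (minutes / known_minutes)
    return 1.0 - p_no_car


if __name__ == "__main__":
    print(f"10 minutes: {prob_car_within(10):.4f}")
    print(f"30 minutes: {prob_car_within(30):.4f}")
```

The common wrong answer is 0.95 divided by 3, which ignores the fact that it's the "no car" probabilities that multiply across intervals.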

Sunday, November 11, 2012

Why you can't always rely on the experts....

In research criticism, it is not an uncommon event for someone to suggest that if something was really wrong with the research, the peer review process would have picked it up.

This is an understandable sentiment, but clearly not true.  Peer review is a good system of course, and peer reviewed papers are much more likely to be reliable than those not subject to it.  However, to imply that no one not on the review committee can or should point out errors in papers is silly.

I bring this up because there's a great article at Retraction Watch right now about a guy who was doing a little reading in the journal "Water Research" when he came across a paper that addressed one of his pet interests.  He was excited when he started reading it to find the authors seemed to share many of his opinions, and thought it was cool that they even used a lot of the same wording he would have......and then he realized the paper was his PhD thesis, with at least half of it copied word for word and attributed to another author.

Oops.

The paper ultimately got retracted, and it looks like the journal handled it well. However, it's a great example of how peer review is not a foolproof system.

The world always needs people who keep their eyes peeled for error.

Saturday, November 10, 2012

Weekend moment of zen 11-10-12

My father in law is reading Nate Silver's book.  He said he was getting bogged down in the description of Bayesian statistics.

I sent him this XKCD comic to help explain it to him:

I'm not sure it helped him, but it certainly made me giggle.

Friday, November 9, 2012

Electoral map fun

I was psyched to see a friend post this link to electoral map fun on Facebook today.  Mark Newman, a professor at the University of Michigan, has done a series on different representations of the electoral map.  You should look at the whole thing, but here's a sample.

It's always been interesting to me how misleading the regular red/blue electoral map is:
This always makes it look like the red should easily have outnumbered the blue.  The link shows different breakdowns to account for population by state:
He includes breakdowns by county, and some with shades of purple to represent splits.  Interesting stuff.

Signs signs everywhere signs

Well, it appears that either there was no systematic bias against Republicans in the polls, or Nov 6th just happened to be the wrong time of the month for the Republicans.

My mother was with me on election night, and she mentioned being quite surprised that New Hampshire wasn't a closer race (52-46 for Obama), and even more surprised that Maggie Hassan beat Ovide Lamontagne by as wide a margin as she did (55-42).  Apparently the polls had shown a closer race, and many people she knew were convinced that bias meant the Republicans were actually leading.

I ended up driving back to New Hampshire with her, and I started to see where some of the problem had come up.  At least on the route I take, the roads were COVERED in Romney/Ryan and Lamontagne signs.  They outnumbered Obama/Biden and Hassan signs by quite a bit.

I was reflecting that I've heard that's the point of signs....to give the impression that there is a majority for one candidate, and that you are going against all of your neighbors if you vote otherwise.  I wondered how many people saw those signs and had at least some of that influence their opinions of the polls.  There can't be that many people voting for the other guy....I see hundreds of signs every morning that say otherwise.

This is yet another example of where proxy markers can fail.  Political signs along major routes reflect the dedication of a few, not necessarily the opinion of the many.

Monday, November 5, 2012

Election Eve and Polling Bias

Well it's election eve and Nate Silver is still predicting an Obama win....with the caveat that if Romney wins, it will mean nearly all state polling was biased against Republicans.

I don't think he was saying this to be glib, or ruling the possibility out.  He actually goes quite in depth as to where he thinks error could occur.

To me though, this brought up an interesting point.....what do we do if it's true?  If nearly all swing state polls are saying Obama, and they break Republican, we will have to do quite a bit of reworking of our polling system.  But that's not what this post is about.

This post is actually about a rather entertaining comment I saw in a discussion about this.  Why haven't there been more concentrated efforts to skew polls?  Essentially, if you live in a swing state and hate political advertising, why not start a movement to get people in your state to all answer the same candidate to obscure the fact that it was a battleground state and reduce the number of dollars spent there?

This sounds wacky, but how many people would really have to buy in to this to make a difference?

Let's take my home state of New Hampshire.  As of January, there were about 770,000 registered voters.  As of today, polls show Obama and Romney tied.  From what I can find, even the best polls only have a 10% response rate, and many are at 2 to 5%.  The UNH Granite State Poll is widely reported and only surveys 500 people.  It seems it would not take many people making an effort to answer their phones and state they are for a particular candidate to start to skew things.  Even if word got out, it would introduce enough uncertainty into the polls to confuse the heck out of the political consultants and the media...and wouldn't that at least be entertaining for the rest of us?
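Here's a back-of-the-envelope Python sketch of how few people that might take.  Everything in it beyond the 770,000 voters and the 10% response rate is my own assumption for illustration: a truly tied electorate, pollsters who are equally likely to call anyone, and a coordinated group that always picks up and always names the same candidate.  The function name is made up too.

```python
def expected_poll_share(population: int,
                        coordinated: int,
                        response_rate: float,
                        true_share: float = 0.5) -> float:
    """Expected poll share for the candidate the coordinated group names.

    Assumptions (hypothetical): everyone is equally likely to be called,
    ordinary voters answer at `response_rate` and split by `true_share`,
    and every coordinated voter answers and names the same candidate.
    """
    ordinary = population - coordinated
    ordinary_responses = ordinary * response_rate
    favoured_responses = coordinated + ordinary_responses * true_share
    return favoured_responses / (coordinated + ordinary_responses)


if __name__ == "__main__":
    registered = 770_000
    for fraction in (0.0, 0.005, 0.01, 0.02):
        group = int(registered * fraction)
        share = expected_poll_share(registered, group, response_rate=0.10)
        print(f"{group:>6} coordinated voters ({fraction:.1%}): "
              f"poll shows {share:.1%}")
```

Under those assumptions, a group of about 1% of registered voters would be enough to move a tied race to roughly 54.6% in the polls, simply because the coordinated group responds ten times as often as everyone else.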

It's not like this is unprecedented....it was tried with Sanjaya on American Idol and there were rumors about Bristol Palin on Dancing With the Stars.  Those efforts took far more people than it would take to skew the polls in a small state like New Hampshire.  With 58% of adults using Facebook to get political information, it shouldn't be too hard to mobilize people....just like Twitter was used to start chants at the Boston Garden during the playoffs last year.

This is the danger of big data.  While data driven decision making is awesome, it's also hackable.  I'm just curious what the backup plan is if polls don't work any more.