Gone but not forgotten

It's been a long summer.  In April, I wrote to let you know my uncle had died unexpectedly.  A few weeks later, a different uncle of the same name also passed away.  On Tuesday, my grandfather died.  It's an interesting coincidence that these three men were all named James, and that despite their disparate ages (56, 60, 89) all died within such a short time period.

I've done a lot of reflecting over the past several days, and I wanted to say a few words about my grandfather, then write a few things about where I go from here.  I've subdivided this so you can skip parts you're not interested in.

James R King
My grandfather was the original stat-man in our family.  He quite literally wrote the book on it.  As we went through his stuff this weekend, I was amused to find that he had also been the original stats blogger in the family.  Apparently he had spent years running a stats newsletter where he wrote about stats topics that interested him and then sent it out to those who paid him $10 or so for the privilege of reading his thoughts. Judging by his archives, it seems to me quite a few people were interested in what he had to say.

My grandfather was truly a man of his time in many many ways.  He was hard working, hard drinking, driven by duty to God, country, family, intellectual curiosity and deep desire to see things work correctly.  He served in two wars (WWII, Korea), helped put a man on the moon, and had a deep disdain for stupidity.  As recently as a few months ago, he was grilling me about how to apply quality principles to health services environments.  He was annoyed that the administration of his assisted living facility wouldn't take him on as an operational consultant.  He wasn't trying to get money, he was just annoyed that things could be done better.  I'm not sure they ever knew how much free brain power they lost.

Since I got the news on Tuesday, I've been reflecting on what it means to watch another member of the greatest generation slip away.  For me, I have lost not only a grandfather, but someone who understood my way of viewing the world.  For all that "geek culture" has become mainstream, it's still a bit of a lonely life for those of us who prefer to view the world through numbers and systems, and my grandfather was one of the few people I could count on to always know how I felt.  I'll be raising a martini or two over a spreadsheet or three in his memory, I'm sure.

The Future of this Blog
Three deaths in 5 months is a lot, especially when the people involved were meaningful to your family structure.  I've been slow in posting this summer, and at this point, I've realized I need a complete break.  I started this blog as a fun project to work out some frustrations I had about political campaigns, and it worked well for that.  I've loved the readers I've had and the conversations that took place here.  I hope to get back to this at some point, to renew those conversations, but right now I don't have it in me.

On the other hand...
I have some projects in the works you all might be interested in.  First and foremost, this blog has helped start an ongoing conversation with my (science teacher) brother about what it would take to give kids a good sense of how to apply math and science to the media that bombards them, and give them a good sense of practical scientific literacy.  These discussions have led us to start collaborating on an e-book/curriculum guide of sorts.  The idea is it would be a bit like this blog adapted for a classroom setting...a sort of "here's how you take the dry concepts you're hearing and here's when you should use them in the real world".  I'll be posting periodic updates on this project, so you can check back for those.

Also, I know many of my readers have pretty awesome blogs of their own.  I'm always available for guest posts and/or random stats commentary if you miss me :).

Again, I want to thank everyone who has made this blog such a fun place for me to write.  The internet certainly has its ups and downs, but (in the words of the AVI) I have been happy to be part of this "small but excellent corner" of it.

Keep being 2SD above the norm, and good luck out there.

Autism and Labor

Commenter Erin brought up the recent hubbub regarding induced labor and autism, and while I'd like to comment on it, Science Based Medicine has already done a pretty thorough job.  They put the breakdown quite succinctly:
In the case of this study, either inducing/augmenting labor triggers autism in some children, children with autism are more likely to require induced labor, or some other factor(s) is a risk factor for both developing autism and needing to induce or augment labor. This current study does not contain data that can differentiate among these possibilities.
Induced labor is a hard thing to study because (unlike c-sections) induction is very rarely completely elective. It is almost always precipitated by some other complication.  It's an interesting study though, and definitely indicates a need for more research.  Anything that gets people off the vaccine thing makes me happy.

30 Days of Data Storytelling: Day 4 and 5

Two videos for today, a long-ish one that gives more details about how to do things, and a Hans Rosling video that is a great example of a story with data.  I've seen the Rosling video a few times, but it's worth a look just to see how he shows his data off.

The other video is a good primer of what to do and what not to do when presenting data.  If you have time, worth a watch.

Literally Unbelievable

In 2013, I'm pretty sure it's a pretty universal experience to have at least one Facebook friend who is a bit of a train wreck.  I have one such person on my list, and for a variety of reasons I cannot delete him.  He is quite prone to daily postings of dozens of ridiculous political comments/links/cartoons that range from condescendingly disagreeable to outright offensive.  A large part of this offensiveness, IMHO, comes from the fact that a decent amount of what he posts isn't actually true.

He seems to be a deep sucker for a story that fits his pre-existing narrative, and at least twice a day I see something out of him that doesn't even pass a basic sniff test.  To be fair, he at least occasionally gets called out on this.  Apparently this has been getting to him though (the "hey that story's not true" part), because last night he posted quite the disclaimer that let everyone know that he "thoroughly researches" every story he posts.  

A mere 10 hours later, with no irony and lots of anger, he posted this article: Lance Armstrong Fails Drug Test for Job at Target.

On the bright side, just a few posts down on my newsfeed, a different friend posted this list chronicling the 35 best times someone on Facebook thought The Onion was real.  These two friends don't know each other, so it was pretty serendipitous.

It's a great list, and apparently it's drawn from a whole website of this sort of thing called Literally Unbelievable.

Check your sources people, check your sources.  

30 Days of Data Storytelling: Day 3

Doubling up on the posts since I got behind.

Today's entry was this awesome data simulation/graph/narrative about Olympic long jumping.

I remember watching a few of these around the Olympics last year, and it was pretty cool.  It's a good overview of raw data, with visuals and comparisons to put it in context.  Context is one of the most underutilized aspects of data presentation.  Hearing "he jumped 26 feet" is impressive, but hearing "he jumped from the edge of the court past the 3 point line" gives context.

It's a short video, definitely worth a watch if you have the time.

30 Days of Storytelling: Day 2 (Pixar version)

So after posting the first two articles last week, I realized those were supposed to be a combined Day 1, making this the real day 2.

Day 2 was two interesting Pixar-related items: one a list of their rules for great storytelling and the other a short (about 3 minutes) video where they tell a story with no words.  If you've ever seen a Pixar movie, you know they can tell a fantastic story, so it was interesting to read their take on the craft.

A few of their rules particularly stood out as relevant to data stories:

#2 You gotta keep in mind what’s interesting to you as an audience, not what’s fun to do as a writer. They can be very different.
#11 Putting it on paper lets you start fixing it. If it stays in your head, a perfect idea, you’ll never share it with anyone.
#17 No work is ever wasted. If it’s not working, let go and move on – it’ll come back around to be useful later.

I'm sure there are others that could apply, but those are the 3 that really struck me.  Sometimes I find fun and funky data that no one else is interested in.  I'm always having to refocus on the question at hand.  When you analyze data a lot, the "normal stuff" can get boring, but normal is interesting to someone who's seeing it for the first time.  That bleeds in to #11: you can't always know what's interesting to people until you start to share it.  Testing reactions and assessing opinion is valuable.

When something flops, that's when #17 comes in.  I store all the data I come across for future use.  It's interesting how often something no one was interested in can later become critical.

The video's just cute.  Show it to the small child in your life.

McDonald's wages: not what they appear

Last week I saw a HuffPo headline peppered on several facebook/twitter feeds/etc that claimed that "Doubling McDonald's Salaries Would Cause Your Big Mac To Cost Just 68¢ More."  It was an interesting claim, but ultimately I didn't click on the link, as I tend to find most economic analysis pretty dubious from the get go.   Now, I know nothing about economics, but as a systems person, I generally believe you can't change things dramatically in one area  of a business (such as doubling salaries) and expect to fully know the results just by adding a few numbers (the cost of a Big Mac going up just $0.68).  

I actually almost blogged about it, when I saw a snippet on Volokh about the lack of thought about the repercussions of such a change on the type of person McDonald's hired.  Most people seemed to be assuming that all the poor folks currently working at McD's would get raises, but isn't it more likely the jobs would become more competitive and the population they hired would change?  Interesting thought.

Well, I'm now glad I didn't post on any of it, because apparently the whole analysis was crap anyway.    Apparently the guy who was looking at it left out the 80% of McDonald's that are franchises (but included the franchise fees as profits), and it excluded a bunch of other accounting issues I don't understand.  Oh, and the "study" that had shown this originally was the work of an independent undergrad and the HuffPo didn't recheck it.  

HuffPo has a retraction up in the place of the original article.

What baffles me most about their retraction is that they ask a blogger they have on their own staff to review it, and he calls BS immediately.  How did that conversation not happen before you put up the sensational claim?

30 Days of Data Storytelling: Day 2

Today's lesson is actually an academic-ish paper on the history and application of story telling to data presentations.  It's an interesting review of the literature up until now, and some ideas for future research.

It struck me while reading this that measuring story telling is a lot like measuring humor: most people know it when they hear it, but it's really hard to define what makes one person better than another at it.  For both topics you can get mired down in issues of audience and personal taste, but it's clear there are still some general rules. That visuals should be a part of storytelling if possible (and especially with data) is one of these general rules.

I certainly applaud anyone who calls for more research about how to effectively communicate data through visuals.  Translating complex concepts in to usable information is what's going to allow data nerds to increase their scope of influence.

Overall impression: A longer read, but lots of interesting citations and resources mentioned.

30 Days of Data Storytelling: Day 1

The first day of the 30 Day guide is a short article from Harvard Business Review on the real job of data scientists.

It's a solid article, covering the idea that the point of data collection is not just to throw out a whole lot of data, but to use it to tell a story.  This is done primarily through making sure your presentation is visual and accessible...both solid ideas.

This is certainly good advice, especially if the data you're pulling is data related to a problem with no solution or getting information no one knows.  When data is being pulled to support a particular agenda, the "data scientist as storyteller" idea could get dangerous, but I'm not sure most big data is being used that way at the moment (though obviously instances of this will be more high profile).

Overall impression: good for a read if you're interested in data science, short piece, easy reading

30 Day Challenge: Data Storytelling

Anyone who knows me knows I love a good list.  That's why I was so intrigued a few months ago when Juice Analytics put up this list of "30 Days to Data Storytelling"...their list of resources to watch/read/do/play in order to improve your data storytelling skills.

I've been wanting to take a look at this for a while now, and it occurred to me that August might be an excellent month for me to go through this and blog about the experience.  I'll likely still do a few regular posts, but most of them will be relating to the resources they list, adding my thoughts and commentary.

I know this is a departure from the norm, but hey, what's wrong with a little experimentation now and then?

Context? There's an app for that

Or at least a chrome extension.

The Dictionary of Numbers is a browser add-on that gives context to the numbers people throw out there.  Apparently it will add inline context to numbers it encounters (i.e. 8 million people = about the population of NYC), and also provides a search function when that doesn't work.

I'm going to try this out and see how I like it.  Intentionally or not, people are always throwing out big numbers without proper context, and something like this could really help remedy that.

I'll be reporting back.

Boy or girl...the choice is yours!

Apparently I am enjoying my summer a bit too much, or just the right amount, depending on how you're measuring.  I have an interesting backlog of articles to get to, just so you know I haven't forgotten about you all.

The one that's been bugging me the most is this article, titled "3 mammals that "choose" their babies sex".  Now to be fair, the article does pretty quickly clarify that there's likely no "choosing" going on...but there is proof that the gender ratio at birth changes based on circumstance.  I got interested in this because the last mammal on the list is human beings.

This conclusion is based on two studies.  The first one found that the 400 billionaires in the US were more likely to have sons than daughters (60% sons, 40% daughters).  This study got some press around the last presidential election, where it was noted that Romney had 5 sons, and Obama had 2 daughters.  Apparently the higher son ratio existed irrespective of whether the wealth was fully inherited, "actively grown" or earned from scratch.  This only existed for the male billionaires however...when it was the woman who had the money, she had a more even ratio of children (52% sons, 48% daughters).  One of the theories behind this is that men with lots of resources have more children, whereas it appears women with lots of resources do not.  This means there would be a genetic advantage to having more sons if you were at this level.

Now that's interesting to me, but I'm sort of curious what would have happened if you cross-referenced this with when the children were born in relation to the earning of the money.  This whole notion is predicated on the woman somehow knowing the resources were there first...and yet if you look at many billionaire bios, it seems that some children were born prior to any wealth (as was the case with Romney's first three sons).  The study authors said that because the ratio is no different for inherited vs earned wealth, it is clearly not a quality of the males that influences the ratio (such as more testosterone = more male babies AND more financial success), but I'm not so sure.  This could be easily tested by seeing if the gender ratio shifts once money is made (could be even more interesting if the men involved had second families with different women once they made their money...would the first wife or second wife be more likely to have boys?)  Also, the study authors mention they found these sex ratios by googling the billionaires...is it totally out there to think male children might be mentioned more often than female children?  Especially due to last name issues?

Now, I was going to leave it at that originally, but then I was googling a bit myself, and I found this paper.  It turns out someone had thought of my critiques already, and decided to go back and redo this research to try to amend for timing of earned wealth and also to get more meticulous about the counting.  It turns out male children ARE more likely to be mentioned in Wikipedia pages, and that the real ratio is 52% sons/48% daughters (general population is 51%/49%).  It also turns out that those with inherited wealth are more likely to have more sons than daughters (57%/43%), and those who worked for it are slightly (but not statistically significantly) more likely to have sons than daughters.

Now I don't think it's too hard to figure out why sons come up more in google searches than daughters...my guess is more sons take an active part in the family business and more daughters change their last names so they may appear to be unassociated with the father.  I think this whole thing highlights how important raw data source is when trying to study something.  I mean, the authors of the first paper did multiple regressions, but they didn't bother to spend much time making sure that Wikipedia was accurate?  My guess is far fewer people know about the refutation of this paper than the original.

Anyway, it's an interesting case of researchers confirming their own expectations.  No one went back to check the crazy ratio of sons/daughters, because they expected a skew.  It also shows how weirdly people use evolution at times...essentially the first paper argued that (for some) there was a selection effect happening before the event that should have skewed things actually occurred.  Timing is important, not just outcomes.  It's not that sex selection doesn't occur, but I would be hesitant to assign a specific mechanism without more data.

Wednesday Brain Teaser 7-3-13

Not a true brain teaser, but I figure people may be away for the looooong weekend.

I saw an article today complaining about how much new research gets left out of books about the Revolutionary War.  I'm not a history buff, but in my skimming of the article it seems his primary complaint is that books tend to go for narrative over ambiguous but accurate portrayals of events.  No kidding.

This got me thinking though, of a question for this week:

What historical fallacy, commonly taught in schools or repeated in the press, is most annoying to you?

Feel free to define "historical fallacy" as you see fit...I have no agenda here...I'm just genuinely curious.  

Happy 4th of July everyone!

A false positive nightmare

Once upon a time, I took an International Public Health class.  As part of this class, the professor was teaching us about false positives and false negatives (false positives = test results that say you have something when you do not, false negatives = test results that say you don't have something when you do), and he asked which one we'd prefer in an initial screening test for disease.  Most of the class said false positives...better to initially believe you have something and be told later you don't, right?  He agreed that we were likely right...in the US at least.  However, he explained, in other countries this may not be the case.  In some areas, even an initial suggestion that you had something like HIV could lead to some major fallout...spouse leaving, getting let go from your job, etc...that may not be easy to correct even once the final results were in. The problem is not always what a patient will do with information, but rather what others might do with the information.

I thought of this today when I read this story about a new mom in Pennsylvania who got her 3 day old baby taken away because she had eaten a bagel before going in to labor.  Yeah, you read that right.  The bagel happened to contain poppy seeds, and it turns out this caused her to test positive for opiates, which caused the hospital to report her, which caused her to have her daughter ripped out of her hands right after she got home.  Now, this story didn't make a tremendous amount of sense to me, so I read through the whole lawsuit (the hospital settled).  A few details that fill in some of the blanks:

  • This hospital has mandatory drug testing for all moms in labor.  This is actually not standard practice...my hospital for example only did this if there was cause.  No behavior on the part of the mother triggered this.
  • The cutoff used for the initial screening test is low...100 nanograms/mL.  In contrast the cutoff for say, Olympic athletes is 1000 nanograms/mL.  For federal drug testing, it's 2000 nanograms/mL for codeine, and 4000 for morphine.  The mother's levels were 300 nanograms/mL on the initial test, and 500 on the confirmatory test.
  • The doctor who saw the mom and baby thought they were fine, so didn't even tell them about the test results.  She assumed they were a false positive.
  • The hospital reported these positive tests to the state, whose policy states that two positive drug tests are all that's needed to take the child away.  They did no other investigation prior to removing the child.
Now based on the fact that the hospital and state social services have both paid money and changed their policy, I'm going to assume most of what's said above is true.  Given that, this is a scary real world case of people not understanding the ramifications of a false positive.  

Now truly, in the real world, is it better that a (known to be healthy) 3 day old baby spend two extra days in the care of a mother who uses opiates while an investigation can be done, or is it better that new parents have their baby taken away for several days for no reason?  The answer depends heavily on how often they are happening relative to one another.  Is it worth it if they happen at equal rates?  More false positives than false negatives?  More false negatives than false positives?
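To put some shape on that question, here's a minimal sketch of how the two kinds of error stack up for a screening test.  Every number in it (how many moms get tested, how many actually use opiates, how good the test is) is a made-up assumption for illustration, not a figure from the lawsuit or the hospital.

```python
def screening_errors(n_tested, prevalence, sensitivity, specificity):
    """Expected error counts for a screening test.

    All inputs are hypothetical illustration values, not data from the case.
    """
    users = n_tested * prevalence              # people who truly use opiates
    non_users = n_tested - users               # people who truly do not
    true_pos = users * sensitivity             # users correctly flagged
    false_neg = users - true_pos               # users the test misses
    false_pos = non_users * (1 - specificity)  # non-users wrongly flagged
    # Share of positive results that are real (positive predictive value).
    ppv = true_pos / (true_pos + false_pos)
    return {"false_pos": false_pos, "false_neg": false_neg, "ppv": ppv}


# Assumed scenario: 10,000 tested, 1% true use, test 95% sensitive and 95% specific.
result = screening_errors(
    n_tested=10_000, prevalence=0.01, sensitivity=0.95, specificity=0.95
)
print(result)
```

Under these assumed inputs the false positives far outnumber the false negatives, simply because so many more non-users than users are being tested.  Change the prevalence or the cutoff and the balance shifts, which is the whole point.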

This is why it's so critical that people in many professions understand statistics.  As part of the lawsuit, it was explicitly mentioned that the training of the case worker failed to properly advise them that this could happen and to conduct themselves accordingly.  The judge who granted the ex parte petition also seemed to not know/not care about the false positive issue. 

Obviously we'd love to get the right information all the time, but the false positive/false negative debate is really about choosing which type of bad information you'd rather get.  This is a difficult choice, but the way to mitigate that is to remember that numbers are harder to misinterpret when you take them in the whole context, rather than just as stand alone facts.  In this case, the numbers are fine...it's the standards set around them that cause the problems.

Single vs Married at Work

Sorry for the impromptu hiatus.  I wish I had a good reason, but it's really a few random personal issues combined with being totally obsessed with finishing the Game of Thrones books.  I decided I needed to put them down when I unironically called someone "craven".

Anyway, I've had a story up in my browser for a bit now that I've been meaning to comment on.  It's this Slate story about how "family-friendly" workplaces are discriminating against those who don't have kids, by making those without kids cover for those with them.

*Bias alert*  I have a great deal of sympathy for the argument that kids should not be the only acceptable reason for people to leave the office early on a regular basis.  If people are able to leave to get to soccer games for their kids, it should be just as valid if it's your own rec league soccer game.  Obviously people with kids will likely have more emergency calls, but I believe that too should apply to kids as well as parents needing a caretaker, aunts, uncles, nieces, nephews, pets etc.  If you're a caretaker for someone, you have my sympathy...as long as you're getting your job done or taking available leave as allowed.  OTOH, there are some highly competitive or otherwise inflexible jobs that just don't allow this sort of thing, and I know that sucks.  When I've worked in environments like that (example: where you had to work on major holidays) there were generally blanket rules for everyone to keep things fair (work 2 out of 3 of Thanksgiving/Christmas/New Years).  Either way, I'm not a fan of having two sets of rules based on personal life choices.  *End alert*

Given that I'm inherently sympathetic to the viewpoint expressed, I was interested to find that I got really annoyed at what I was perceiving as a bit of a bait and switch within the article...exemplified by this quote:

"When almost half of the people in the U.S. are single, why do companies continue to cater to their employees who are married with children?"
This quote came from an author of a book about discrimination against single people.  What irks me is that she's moving the goalposts around...are people being asked to do more work because they're single or because they're childless?  Yes, half of the population may be single, but as best I can tell nearly 80% of women have a biological child by age 44*.   That doesn't count step kids or adoptions, by the way.

Now there may be some data somewhere that shows married people with no kids get asked to do less than single people with no kids, but if it exists it was not included in the article.  At least anecdotally though, I think single people with kids actually tend to get more sympathy than married people with kids when it comes to time off.  That's absolutely fine with me...not having a back up must suck...but at least some of the single people she cited above will be singles with children getting more breaks than singles without children.

I guess it's just strange to me that we can all suffer through endless headlines about how many children are being born to unwed mothers and then turn around and imply that single = childless.  Additionally, the number of people checking "single" who are living with someone has been growing as well.  As family structures change, binary categories are less and less meaningful.  I don't doubt that some workplaces could get better at this, but we have to accurately identify the problem before we can agree on solutions.

Thursday Quickie: the Fake Blake

I've written about false literary attributions before, but I found this one particularly amusing.  Apparently a librarian in England figured out that a poem (written in the 1980s) was being falsely attributed to William Blake (a poet from the 1800s).  Worse yet, this poem was actually being taught in multiple classrooms as an example of his work.

It's one thing when students don't check their sources, but teachers?  Come on guys.

Wednesday, June 19, 2013

Wednesday Brain Teaser 6-19-13

I think my one last week was too hard.  I'll provide the answer as soon as I can find it...I'm trying to remember which book I got it out of.

Anyway, here's an easier one:

A car travels at a speed of 50 mph over a certain distance, and then goes 30 mph over the same distance on the way back.  What's the average speed for the trip?
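(Spoiler warning: skip this if you're still working on it.)  For anyone who wants to check their answer, here's a small sketch.  The function name and the 150 mile distance are my own assumptions for illustration.  The average speed for a round trip is total distance divided by total time, so the distance you pick cancels out.

```python
def average_speed(distance, speed_out, speed_back):
    """Average speed over a round trip covering `distance` each way."""
    # Time spent on each leg is distance / speed; add the two legs.
    total_time = distance / speed_out + distance / speed_back
    # Average speed is total distance over total time, not the mean of the speeds.
    return 2 * distance / total_time


# Assumed 150 miles each way; any positive distance gives the same result.
print(average_speed(distance=150, speed_out=50, speed_back=30))
```

The result lands below the simple average of the two speeds, because the car spends more time driving at the slower one.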

Self righteous hand washing

In honor of my little sister taking her nursing boards today, I thought I'd do a post about a pet peeve of mine: hand washing.  Well not hand washing exactly, but rather those who get worked up in to a foamy lather* when others don't do it.  Let me explain.

I'm a fairly avid reader of advice columns, and there is a genre of letter that pops up every few months that goes something like this "Dear _____,  my coworker doesn't wash their hands and it makes me retch and think they're a disgusting human being.  How do I confront them?"

Now these people are never hospital/patient care type employees, they tend to be just regular office workers.  What gets me so annoyed is that I have worked in patient care, and when I got my nurses aide license I even had to wash my hands in front of a state inspector.  Washing your hands is not easy and almost everyone does it wrong.  That's what annoys me about these letters.  Unless they're very meticulous, these people are likely not even being very effective themselves...and even if they are doing it effectively, nearly everyone else they work with is doing it wrong. Also, as someone who carries hand sanitizer around just to avoid having to insufficiently wash my hands in a public restroom, I get annoyed at people who think water = clean.

I thought about this today because I heard about a large scale study that vindicated my feelings:  95% of people do not wash their hands properly.  Properly means with soap, lathering for at least 20 seconds.  If you want to be good enough to get your nurses aide license, you also better use a paper towel to shut the faucet off, and angle your hands downward when you rinse to make sure you're not spreading germs up your wrists.  People are so bad at this that many hospitals now recommend that hand washing only be used to remove stuff that may have gotten on your hands, and that hand sanitizer is what should be used to disinfect.

I like studies like this because they are very useful for reminding people that our self-assessment does not always match reality.  My guess is that a very high percentage of people believe they are washing their hands correctly.  It's like how everyone believes they're an above average driver.

Anyway, best of luck to my favorite little sister, may you be several deviations from the norm (in the passing direction of course).

*See what I did there?

NSA and Father's Day

Happy Father's Day to all those in the relevant group!

I saw my father yesterday, and we, like much of the country, spent some time talking about the NSA leaks and Snowden.   My father asked how I felt about it, and I answered in a way only a daughter who's been debating her father for decades could answer:  You already know how I feel about it Dad, we debated this years ago when Bush was President.  He was testing me. Nothing makes my Dad happier than knowing he raised kids who keep their opinions consistent regardless of who's in power.

At that point my Dad mentioned that he had seen a survey that showed that Democrats and Republicans have switched places when it comes to supporting programs like this.  Under Bush, Republicans supported NSA surveillance programs, now they don't.  Vice versa for the Democrats.

I didn't have a chance until today to look up the survey my Dad was talking about, and I found a good breakdown here.

There are actually 3 different polls cited:  one from 2002, one from 2006, and one from just recently.  The numbers do, in fact, flip (and 2006 is more dramatic than either of the other two years).  Eugene Volokh however, does an interesting take on the numbers, and points out a different spin:
If the 38% of Republicans who said no still say no today, and the 45% who say yes now said yes in 2002, that amounts to 83% (out of the average of 93.5% responding) whose answers were the same. Likewise, if the 41% of Democrats who said yes still say yes today, and the 43% who say no now said no in 2002, that amounts to 84% (out of the average of 94% responding) whose answers were the same. (I oversimplify here by assuming that the same people were surveyed today as before, despite the changing composition of the public over time; but if you relax that assumption, then the consistency rate might be even higher.)
Those numbers actually sound pretty reasonable to me.  One also has to wonder how many of those 16/17% would actually admit they legitimately changed their minds.  11 years is a long time.  Even if you took the more dramatic 2006 numbers, about 75% of each party maintained their beliefs.

Now obviously it was not very likely that the same exact people were polled, so we don't actually have evidence that any individual changed their mind.  The one thing to keep in mind when you see polls like this talking about Democrat vs Republican attitudes is that the type of person who identifies themselves with either party is changing.  Here are the breakdowns of Dem vs Rep vs Independent for the 3 years listed:

        Dem    Rep    Ind
2002    31     30     30
2006    33     28     30
2012    32     24     38

Even if these surveys had polled the exact same group of people and they all had answered identically, the numbers would have changed based on changing political affiliation (or lack thereof).  Things to ponder.

Wednesday Brain Teaser 6-12-13

If you're finding my weekly brain teaser too low stakes, try this one, win a million bucks!

There are 11 ways of expressing the number 100 as a number and fraction using the nine digits once each.

91+ (5823/647) = 100

The challenge is to find some of the other 10 ways.

Hint: In 9 of them, that first number is above 80.  In one of them, it's less than 10.
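
If you'd rather let a computer do the hunting, here is a minimal brute-force sketch in Python.  It assumes "the nine digits" means 1 through 9 (no zero) and that the fraction has to work out to a whole number.  The function name is my own invention, not anything from the original puzzle.

```python
def mixed_fraction_forms(target=100):
    """Find (whole, numerator, denominator) triples with
    whole + numerator/denominator == target, using each of the
    digits 1-9 exactly once across the three numbers.
    Assumption: zero is not one of "the nine digits"."""
    found = []
    for whole in range(1, target):
        for denom in range(1, 10000):
            # The fraction must equal (target - whole) exactly.
            numer = (target - whole) * denom
            digits = f"{whole}{numer}{denom}"
            if len(digits) > 9:
                break  # larger denominators only make this longer
            if len(digits) == 9 and sorted(digits) == list("123456789"):
                found.append((whole, numer, denom))
    return found


if __name__ == "__main__":
    for whole, numer, denom in mixed_fraction_forms():
        print(f"{whole} + ({numer}/{denom}) = 100")
```

Run it only after you've given up, since it prints every answer, including the example above.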

Tuesday, June 11, 2013

I've mentioned before on this blog that I get annoyed when people link A to B and B to C and then proceed to assume that the relationship between A and C is just the average/sum/etc of the first two.  In pure mathematics, the technical term for this is transitivity and it tends to be pretty valid.

I learned recently that there is actually a term for this when applied to epidemiology research: teleoanalysis.  Developed in the realm of public health, it's defined as
the synthesis of different categories of evidence to obtain a quantitative general summary of (a) the relation between a cause of a disease and the risk of the disease and (b) the extent to which the disease can be prevented. 
It has also been criticized, in large part because it was invented to help support pre-existing assumptions.  Both papers I linked to reference the "does cutting back on saturated fat actually prevent heart disease" controversy as an example.

I was thinking of this recently when reader Dubbahdee sent me this article about bicycle helmet laws.  The issue follows the same formula as above:

A. Bicycle helmets protect cyclists
B. Mandatory helmet laws increase the number of cyclists who wear helmets


C. Bicycle helmet laws save lives

What's interesting is it appears this is not the case.  The paper's authors suggest that increased helmet laws decrease bike ridership, and apparently having lots of bicyclists in an area makes it safer for cyclists in general.  Also, helmet laws seem to potentially inoculate lawmakers against making any bigger changes...the sort that actually help cyclist safety (infrastructure building, etc).

I thought this was interesting because it's absolutely proven that you as an individual should wear a helmet, but the conclusions drawn from that weren't valid.  Someone out there guaranteed that these helmet laws would save x number of lives, and they were wrong.

Monday, June 10, 2013

Post migraine post

I had a nasty migraine last night that kept me up for most of the night, and I'm not sure I have a real post in me.

In lieu of that, I have a linguistic issue I'd like to get off my chest: Misnomer does not mean "error" or "misconception"; it refers to an error in naming.

I'm sure my very smart and wonderful readers know this, but 3 times in the past two weeks I've heard people make this error.  If you're going to try to use big/unusual words, please use them accurately.  Oh, and that also goes for phrases in Latin.  Saying part of your argument in Latin doesn't make you right.

Sunday, June 9, 2013

Sunday Fun Links 6-8-12

Good morning!  It's hard to find fun links this week.  Why?  Because George @#&$@* Martin wrecked my life.  Game of Thrones may land me in therapy.  I knew about the red wedding, but really George Martin, really?  Your life goal is to make your readers scared to turn the page?  Well you've got me.  Fine.  You win.

Alright, here we go, the New Yorker has an interactive map of the rise of the microbrews.  Little known fact:  there was an award winning microbrew in Texas named after me.  True story.

This here is possibly the most joyous/beautiful practical joke/prank I've ever seen.  It made me smile.

These are some pretty cool illustrations about how chemistry works.

Looking for some more summer reading?  How about a book that will tell you "what would Jesus drink?"

Thursday, June 6, 2013

Thursday Quickies: DNA and the legal system

In other DNA related news, Scalia's dissenting opinion regarding DNA sampling in Maryland v King was my favorite thing I read this week.

There's some interesting math behind the practice of using DNA matching as sole proof in criminal cases.  The stats are normally presented to the jury as though it was a one in five million chance the person is innocent...but if the size of DNA databases starts to grow, that could lead to several hits.  Additionally, the stats do not factor in the chance that the sample was contaminated, or the chance that your DNA ended up somewhere randomly rather than intentionally.
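
To put rough numbers on the "several hits" problem, here is a small Python sketch.  It takes the one in five million figure above as the chance that a random innocent person matches, and it assumes every profile in the database is an independent draw from innocent people.  The database sizes are hypothetical, picked only for illustration.

```python
def expected_coincidental_hits(database_size, match_probability):
    """Average number of innocent people in the database who match by chance."""
    return database_size * match_probability


def chance_of_any_coincidental_hit(database_size, match_probability):
    """Probability that at least one innocent profile matches by chance,
    assuming profiles are independent (a simplifying assumption)."""
    return 1 - (1 - match_probability) ** database_size


if __name__ == "__main__":
    p = 1 / 5_000_000  # the "one in five million" from the post
    for size in (100_000, 1_000_000, 10_000_000):  # hypothetical sizes
        hits = expected_coincidental_hits(size, p)
        chance = chance_of_any_coincidental_hit(size, p)
        print(f"{size:>10,} profiles: {hits:.2f} expected chance hits, "
              f"{chance:.1%} chance of at least one")
```

The point of the sketch is just that the per-person odds stay tiny while the odds of somebody in the database matching by accident grow with the database.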

End message: it shouldn't be treated as perfect.

Thursday Quickies: Can I be in a Geico commercial now?

A few months ago now, I decided to get my DNA sequenced through 23andMe.  I got my health results back yesterday, and while I'm still waiting on the full ancestry results, I did get one interesting piece of information:  I share an uncommonly high amount of DNA with Neanderthals.

Apparently your average person of European descent gets 2.7% of their DNA from cavemen, and I actually have 3.2%.  That puts me in the 99th percentile.  I'm happy to finally have an explanation for why I'm so short and brutish.

More on their science here.

Wednesday Brain Teaser 6-5-13

There is one four-digit whole number n, such that the last four digits of n^2 are in fact the original number n. What is n?
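
If pencil and paper fail you, a brute-force check is only a few lines of Python.  This is just a sketch (the function name is mine), and it treats "four-digit" as 1000 through 9999, so no leading zeros.

```python
def self_squaring_numbers(digits=4):
    """Return every number with the given digit count whose square
    ends in the number itself."""
    modulus = 10 ** digits
    low = 10 ** (digits - 1)  # smallest number with that many digits
    return [n for n in range(low, modulus) if (n * n) % modulus == n]


if __name__ == "__main__":
    print(self_squaring_numbers(4))
```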

Tuesday, June 4, 2013

Dating and marriage in the age of the internet

In light of rule #6 from my post on Sunday, I thought I'd take a crack at this article I got sent by my wonderful (and single!) brother.  The headline reads "marriage from online meetings is more stable, satisfying".  In case you're curious, the study was sponsored by...wait for it...an internet dating site.

This doesn't actually make the finding illegitimate however, though it does indicate we should use some scrutiny.  

First, as I'm sure many of my older readers have already wondered, this study only focused on people who have been married at most since 2005.  Given some lead time for publication and all, that means that they were studying the incidence of divorce in marriages in the first 7 years or so.  Now this isn't totally crazy...about half of all divorces occur in the first ten years of marriage (This is what I learned in school, but now I can't find a good source for this, but this article seems to back me up), so this study does likely tell us something.  It's interesting though that the abstract uses the word "slightly" to describe the lower divorce rate/marital satisfaction.  It turns out that's pretty true, as the divorce rate for those meeting online is about 6%, and for those not meeting online it's 7.7%.  This difference was smaller when they controlled for other factors, but was still statistically significant (they don't list it).

Now I don't think this is totally crazy.  It's a small difference, but I would imagine that much of that could be attributed to people who went online looking for love/relationships vs people in the offline world who just fell into relationships with people they encountered.  Actively desiring marriage would, I presume, have a protective effect on said marriage once it occurs.

Overall though, it is interesting to ponder where this might go.  Are the divorce rates going to be higher once we get more than 7 years out? Are there other changes coming due to online meetings that we haven't noticed yet?  Additionally, there's evidence that the divorce rate is not continuing to climb because many who  would have gotten divorced are simply not getting married.  As those folks continue to opt out, how will things change?  I will be anxiously awaiting the eHarmony followup.

Tales of the footnotes

I've written before about my 5 reasons you should check citations, and it occurred to me recently that I need to add a 6th.  Here's my updated list, changes in bold:

  1. Check that the source cited actually exists
  2. Check that the source cited backs up the part of the sentence that really needs backing up.
  3. Check that the source cited actually backs up the thing it's being used to back up, and doesn't just reference it obliquely.
  4. Check that the source cited states the point as strongly as the article authors state it.
  5. Check that the reference isn't so old as to be outdated, replaced, or from a paper that has been unreplicatable.
  6. Check that the reference was from an actual journal and/or otherwise reflects real scientific inquiry

I add this one on because the words "study" and "survey" get tossed around rather loosely at times.  Two examples that made me think of this:

First, from England:

Mr Gove said: “Survey after survey has revealed disturbing historical ignorance, with one teenager in five believing Winston Churchill was a fictional character while 58 per cent think Sherlock Holmes was real."
Those surveys, the Department has now revealed in response to an FOI request, included research conducted by Premier Inn, the budget hotel chain, UKTV Gold and “an article by London Mums Magazine”. None are known for their work in this field.

Mr Gove is apparently the British equivalent of the Secretary of Education.

Second was from a website with a rather interesting name (Manboobz).  The owner was apparently reading a book in which he saw the claim that schoolgirls hit schoolboys 20 times more often than schoolboys hit schoolgirls.  Upon investigating that citation, he discovered that it was not actually a formal study, but a class project a friend of his had assigned her students at his request.

These may both be small things, and the points they make may or may not be valid...but when in doubt it's always worth checking the source of the source.  The answers could be surprising.

Thursday, May 30, 2013

Thursday Quickies: Moms as breadwinners

I've seen a few headlines in the past few days about the study that showed that moms are breadwinners in 4 out of 10 households.  It's based on this Pew Research study, and I feel like there's a few nuances not made clear in the headline:
  • The denominator was not women or couples, the denominator was "households with children under 18".  Thus any women without children or whose children are over 18 were not counted.
  • 63% of the female breadwinner households were single mothers
  • Of the 37% who were not single mothers, the only requirement was that they out-earn their husband.  There is no mention of a minimum: a wife who earns $1000/year more than her husband is counted the same way a wife earning $50,000/year more is counted.
Also interesting:  married households with female breadwinners have an income roughly three and a half times that of households with a single female at the head ($80,000/year vs $23,000/year).  Households with a male breadwinner have a median income of about $78,000/year.  This shows some interesting selection effects.  My guess is that women who earn high salaries and out-earn their husbands are less likely to quit/drop to part time when kids come on the scene.  Since part of the normal debate around working/not working post-baby is "does my salary cover daycare costs", it would make sense that women who could answer a resounding "yes" would be more likely to stay on and keep the family income higher.

Thursday Quickies: Stats, law, college and sexual assault

Eugene Volokh had up an interesting article that touched on the intersection of stats and law.  It was on the topic of campus tribunals that hear sexual assault cases, and I thought it showed a fundamental principle of stats fairly nicely: when in doubt, put it in words.  He does this with 3 legal standards for evidence: beyond a reasonable doubt (95% confidence), clear and convincing evidence (75% to 80% confidence) and a preponderance of evidence (51% or more confidence).  He then says to determine the standard we should convert this into words:
  • Better that 19 students guilty of sexual assault remain at the university, with no discipline imposed, than one innocent student be expelled or otherwise disciplined
  • Better that 4 students guilty of sexual assault remain at the university, with no discipline imposed, than one innocent student be expelled
  • These outcomes are about equally bad for both students and the university
There's some other interesting legal discussion in his post, but I thought the conversion of legal standards and probabilities in to clear sentences was a particularly helpful way to frame the discussion.
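
The arithmetic behind those sentences is just odds: a confidence threshold of p corresponds to p/(1-p) guilty students per innocent one.  Here's a minimal Python sketch of that conversion.  The function name is mine, and the mapping of each legal standard to a percentage is the one quoted above, not an official definition.

```python
from fractions import Fraction


def guilty_per_innocent(confidence):
    """Convert a confidence threshold (given as a string like "0.95")
    into the number of guilty parties we accept letting go
    for each innocent one punished: p / (1 - p)."""
    p = Fraction(confidence)
    return p / (1 - p)


if __name__ == "__main__":
    standards = {
        "beyond a reasonable doubt": "0.95",
        "clear and convincing evidence": "0.80",
        "preponderance of evidence": "0.51",
    }
    for name, confidence in standards.items():
        ratio = guilty_per_innocent(confidence)
        print(f"{name}: about {float(ratio):.2f} guilty per innocent")
```

I used exact fractions rather than floats so that 95% comes out as a clean 19 to 1 instead of 19.000000000000004.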

Thursday Quickies: Patient "engagement" plus cost

First, I got into a Twitter discussion today with patient engagement advocate Dave deBronkart about his article that criticized this study.  The study was being advertised under the headline "When doctors and patients share in decision making, hospital costs go up".  The issue is the headline implies that the study looked at what happened when patients and doctors made decisions together.  In reality, the study asked admitted patients if they felt they should make decisions with the doctor, not whether or not they actually did.  It turns out that the cost of the individual hospital stay was about $860 higher for those who wanted to collaborate (median cost was $14,000 to begin with).  None of this was tied to outcome or future costs, so we have no idea if this was $800 that saved money in the long run, or $800 wasted.  It's pretty insidious, because it's being used to justify a rather paternalistic model of medicine that many people (including Dave) have been working hard to get away from.

High School Rankings

John Tierney has an interesting piece up at the Atlantic about how national high school rankings are not only meaningless, but actually harmful.

He doesn't quibble much with local rankings, and agrees that if done correctly they can provide good information for residents.  As for national level rankings though, he says this:
let's call national rankings of high schools what they are: nonsense. There is no way to say, with any degree of accuracy at all, where any given high school ranks in relation to others in terms of how good it is or how challenging it is. 
Now this seems pretty sensible to me.  Ranking all the schools in a given state against each other can be meaningful, though more so within ranges than with strict numbers (is there really a meaningful difference between #34 in the state and #35?).  But to pluck a few from around the country?  That's not even useful.  When my husband and I went to buy a house, we knew the general area we were looking at, and school rankings were one of many factors we looked at when picking a town to buy in.  I'm pretty sure most people  do something similar.  This works of course because we already had the region picked out and knew the trade offs that came with the individual regions.  You're not going to use national rankings like this.

Additionally, he notes that at least one of the national lists (the "Challenge Index" from the Washington Post) literally uses only one metric to determine a challenging high school: the number of AP (or similar) tests taken by the seniors at the high school, divided by the number of seniors:

Note that the numerator is not even the number of such exams passed, but merely the number taken. So, a given school can rise on the list by increasing the number of its students who take "advanced" classes. Conversely, schools that are more discerning and thoughtful about which students ought to be taking AP classes end up suffering in the rankings. So, the list produces nonsensical anomalies such as high schools with very low graduation rates ranking much higher on the "Challenge Index" than excellent schools that don't game the ranking system...
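
To see why that metric is so easy to game, here's a tiny Python sketch of the index as described above (tests taken divided by seniors).  The two schools and all of their numbers are hypothetical, made up purely to illustrate the point.

```python
def challenge_index(tests_taken, seniors):
    """Tests taken per senior, as the article describes the index.
    Note that pass rates never enter the calculation."""
    return tests_taken / seniors


if __name__ == "__main__":
    # Hypothetical schools: (tests taken, tests passed, seniors)
    schools = {
        "Push-everyone High": (600, 150, 200),
        "Selective-about-AP High": (300, 270, 200),
    }
    for name, (taken, passed, seniors) in schools.items():
        index = challenge_index(taken, seniors)
        print(f"{name}: index {index:.2f}, pass rate {passed / taken:.0%}")
```

In this made-up example the school with the far lower pass rate gets double the index, which is exactly the anomaly the article complains about.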
This idea of ranking interested me.  Ultimately, we actually picked the school district we did in part because of the options it holds for the not-so-academically inclined.  Don't get me wrong, it's in the top 20% of high schools in the state, but not by much.  More importantly, the regional technical high school is here, and there are opportunities to learn how to make a good living even if college isn't your thing.  I live in a state with a great educational system, and my town is no exception.  I'm less worried about AP tests, and more worried about school districts that might push kids into inappropriate classes to keep their numbers up, to the detriment of the child.  While a certain baseline level of knowledge should be mandatory, I want my son to be challenged, but not tortured.  I'm suspicious of schools that try too hard on these lists, because school ranking and the best interest of the child don't always coincide.

Looking further down the line, it's interesting to note that even more advanced methodologies almost always use the percent of kids headed to college as a judge of the high school's rigor.  As college costs continue to spiral and become a worse and worse investment, I'm curious if we're going to see a bigger and bigger divide between rich neighborhoods and poor neighborhoods in terms of rankings.  This could drive people out of the poorer neighborhoods, not because the schools were actually worse, but because the metric used to assess them is so contingent on parents having the cash to send their kids to college.  Things to ponder.

Monday, May 27, 2013

Social media growth

I saw a weird headline last week about how teens' interest in Facebook has been waning.

I was curious what constituted "waning", and it appears that most of the proof is that other social media platforms are growing quickly.  What was interesting is that buried in the report is the Facebook ubiquity level among teens: 94%.

I'm sort of curious how any software that has that level of ubiquity could be doing anything other than "waning".  It certainly can't be going up.

Also entertaining: Facebook execs called Facebook's loss of young people an "urban legend".  I'm pretty sure a company that's less than 10 years old shouldn't describe anything as an urban legend.

Just for giggles, here's the growth chart:

It's hard out there boys

Ann Althouse linked to an article about the struggle of working class male undergrads vs middle-class undergrads:

Combine the “chiselled out of rock” body of actor Ryan Reynolds, the intellectual prowess of writer Christopher Hitchens and the “funny, quirky” demeanour of film star Joseph Gordon-Levitt and you have the perfect role model for male middle-class undergraduates. 
But while bourgeois students can “seamlessly integrate” many types of masculinity, a study at two universities concludes that their working-class peers find squaring the many demands placed on the modern man more challenging.

This looked like an interesting study, and I was all ready to read up on it...but it hasn't been published yet.  It's a conference paper.  That's fine, but I was pretty interested that this article gave pretty much zero proof of the assertion that middle class males were seamlessly integrating different types of masculinity, or that working class ones were struggling.  The only piece of data reported that suggested middle class men were integrating anything was that they included "well groomed" and "metrosexual" as priorities in being good looking, whereas working class men did not.

Other than that, the article was mostly the researcher's continued assertion that this phenomenon occurred...though I question her bias a bit as she stated that working class men's way of thinking about intelligence "belies an assumption of entitlement to dominance....arguably a refashioning of traditional male hegemony”.

So how much of this is data and how much was spin?  Who knows.  Despite what the journalist is reporting, we might all just have to wait for the paper.

Anti-science is party neutral

I didn't mention it in my post yesterday, but part of the impetus to my father sending me the link about the water fluoridation was an ongoing discussion we have about the reputation of Republicans as "anti-science".  I actually get asked about this a lot, and my standard answer tends to be something along the lines of "I think almost everyone is anti-science".

If it's a topic that interests you, I suggest you check out Harriet Hall's latest post at Science Based Medicine about progressive mythology in science.  Lots of "natural is always better" type fallacies.

Some people in the comments are noting that libertarians and lefties can frequently wind up on the same side of some of these issues (like with water fluoridation), but I think it's slightly different for the libertarians.  At least the ones that I know don't so much think water fluoridation is bad, as that the government should be letting individuals choose.  That's annoying to public health people, but it's a political opinion, not a scientific one.


You may have noticed I've added my Twitter feed to the side bar.  I've just started messing with it a bit, but I'm putting up some interesting links that I don't get a chance to write about here, and it felt weird to keep things separate.

Complaints/comments/concerns welcome as always, and if you have Twitter, follow me!

Fluoride in the water (fire in the sky)

My Dad sent me an article today about Portland's ongoing debate about putting fluoride in their water.

There's a lot of interesting science around water fluoridation, but that's not what caught my eye.  What I noticed was this paragraph:

 Almost every credible national, state, and local health and science organization—private and public—gives its blessing to optimal levels of water fluoridation: The American Medical Association, the American Dental Association, the Environmental Protection Agency, the World Health Organization, American Academy of Family Physicians, and  the Center for Disease Control and Prevention, which named the measure one of the 10 greatest public health achievements of the 20th century. They all agree that fluoridated water is perfectly safe and extremely effective at preventing tooth decay.
I was intrigued by that paragraph because the link they provide for the organizations that support water fluoridation has 11 pages of organization names and their statements supporting it.

While there's many well known names on there, I was thinking about how hard it really is to know about lesser known organizations, and how easy it is to confuse various organization names.

Example: the American Medical Association is one of the biggest medical groups in the country.  The Association of American Physicians is a group dedicated to furthering biomedical research.  The Association of American Physicians and Surgeons is a group dedicated to "fighting socialized medicine and the government takeover of medicine".

Now you might recognize the difference between the first one and the other two, but my guess is most of you will not remember which one is which 20 minutes after you finish reading this blog post.

Now I'm certainly not saying that these 11 pages are crap...there are some big names on that list.  What I am saying is that a list of group names is something people need to do some due diligence on before trusting it.  I'm sure that the anti-fluoridation people could also come up with a long list of organizations that support them, even if it represented far fewer people.  In this age of propaganda, we must remember that organization names alone may not be enough to convince people.  Too much data causes overload, and we can't blame people for this.  Now go brush your teeth.

Weekend randomness

Every time I hear someone say they're random, I think of 14 year old girls' Myspace pages, and I remember why Facebook won.

I also think of computer programming classes, and how I always get way too interested in how different programming languages come up with their random numbers.  Some use a digit somewhere in the computer's time stamp.  This website uses atmospheric noise.  Now that's random.
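
For the curious, here is roughly what the time stamp approach looks like, as a toy Python sketch.  This is not how any particular language actually does it.  It's a bare-bones linear congruential generator (using the well-known constants from Numerical Recipes) that falls back to the clock for its seed, and the class name is mine.

```python
import time


class TinyLCG:
    """Toy linear congruential generator, for illustration only.
    Not suitable for anything that needs real unpredictability."""

    MODULUS = 2 ** 32
    MULTIPLIER = 1664525      # constants from Numerical Recipes
    INCREMENT = 1013904223

    def __init__(self, seed=None):
        if seed is None:
            # Seed from the clock, as described above.
            seed = time.time_ns()
        self.state = seed % self.MODULUS

    def next_int(self):
        """Advance the generator and return the new 32-bit state."""
        self.state = (self.MULTIPLIER * self.state + self.INCREMENT) % self.MODULUS
        return self.state

    def random(self):
        """Return a float in [0, 1)."""
        return self.next_int() / self.MODULUS


if __name__ == "__main__":
    clock_seeded = TinyLCG()
    print([round(clock_seeded.random(), 3) for _ in range(5)])
```

Give two of these the same seed and they will spit out the same "random" numbers forever, which is why the atmospheric noise folks can feel a little smug.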

Want to see how good you are at being random?  

Try these tests.  

Race and wealth, relative or absolute?

Recently, my brother was a contributor to an infographic his organization put together about race and the wealth gap.   Despite knowing that I am inherently biased against infographics, he called me and asked my opinion on some criticism it had received.  The whole thing's fairly large, so I'm only posting the piece that caused the controversy:
The graph at the top had caused some commenters to question the use of "average" in lieu of median, and if it was skewing the results.  

Luckily, since my brother has listened to me rant for years about people not sourcing their facts, he had made sure this graphic included the source of the numbers...a report from the Urban Institute that can be found here.

I was interested to see that they not only acknowledge that they use average over median, but also give the median numbers to show that the trend is essentially the same.  Here they are for 2010: 
              Average     Median
White         632,000     124,000
Black          98,000      16,000
Hispanic      110,000      15,000

Using average numbers, the absolute gap between incomes is larger...however I was interested to see that using median the ratio of incomes would have looked larger (8 times larger vs 6 times larger).  Honestly, there's pluses and minuses to using either angle.
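
If you want to check my math, here's a quick Python sketch that computes both the gap and the ratio from the 2010 numbers in the table above (White vs Black, which is the comparison I'm quoting).  The function and variable names are just mine.

```python
def gap_and_ratio(higher, lower):
    """Return the absolute gap (higher - lower) and the relative ratio."""
    return higher - lower, higher / lower


# 2010 figures from the table above
figures = {
    "average": {"White": 632_000, "Black": 98_000},
    "median": {"White": 124_000, "Black": 16_000},
}

if __name__ == "__main__":
    for measure, values in figures.items():
        gap, ratio = gap_and_ratio(values["White"], values["Black"])
        print(f"{measure}: gap ${gap:,}, ratio {ratio:.2f}x")
```

Same data, two honest summaries: the average makes the gap look bigger, the median makes the ratio look bigger.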

Absolute inequality generally favors the gap (higher value - lower value) as the important measure.  This can make sense in some situations, but it tends to depend on where you start.  The difference between a person who makes $20,000/year and someone who makes $90,000/year is very different from the difference between someone who makes $90,000/year and someone who makes $160,000/year.  

Relative inequality looks more at the ratio between two numbers.  It also really depends on where you start, and is skewed by small starting numbers.  If I change the price of something from 50 cents to a dollar, it's doubled, but you still can likely afford it.  If I change it from $20 to $40, I'm going to lose some customers.

So given that, did they use the right one here?  Well, I think it was probably a toss up choice.  Picking average made the graph and some numbers below look larger, but they made the 2010 ratio numbers look smaller.  If they had switched from average to median depending on what was more substantial, I would have taken issue, but as it is I don't think there was anything deceptive going on.  After all, had they used the median numbers, they would have also changed the axis and the difference would have looked just as dramatic.  

There's always the possibility that they could have put both to prove this point, but I'm pretty sure only someone like me would have enjoyed that.  

Wednesday Brain Teaser 5-15-13

What digit is the most frequent between 1 and 1,000 (inclusive)?

What digit is the least frequent?
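
If you'd like to check your answer by brute force, this short Python sketch tallies every digit written out from 1 to 1,000.  The function name is mine, and I've left the results out so as not to spoil it.

```python
from collections import Counter


def digit_counts(start=1, stop=1000):
    """Count how often each digit appears when writing out
    every whole number from start to stop (inclusive)."""
    counts = Counter()
    for number in range(start, stop + 1):
        counts.update(str(number))
    return counts


if __name__ == "__main__":
    for digit, count in sorted(digit_counts().items()):
        print(digit, count)
```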

Also, can you beat the AVI's score on GeoGuessr?  Apparently he hit 28,000.  I think I created a monster on this one.

Woe unto you, you generation of vipers

I saw an interesting study today that claimed that 51% of Christians were actually acting more like Pharisees than Christ.  It was based on a survey given to almost 800 people of a variety of Christian persuasions (practicing Catholic, practicing Protestant, notional (identifies as Christian but does not go to church), Evangelical, and born-again but non-Evangelical), and it asked them a series of 20 questions to assess their attitudes and actions, and gave them a score of "Pharisee-like" or "Christ-like".  Here's what they found:

They did some interesting breakdowns here, and had some good documentation of their methods.  My only qualm really, is how did they get the assessment questions?  

Here they are:
Actions like Jesus:
  • I listen to others to learn their story before telling them about my faith.
  • In recent years, I have influenced multiple people to consider following Christ.
  • I regularly choose to have meals with people with very different faith or morals from me.
  • I try to discover the needs of non-Christians rather than waiting for them to come to me.
  • I am personally spending time with non-believers to help them follow Jesus.
Attitudes like Jesus:
  • I see God-given value in every person, regardless of their past or present condition.
  • I believe God is for everyone.
  • I see God working in people’s lives, even when they are not following him.
  • It is more important to help people know God is for them than to make sure they know they are sinners.
  • I feel compassion for people who are not following God and doing immoral things.
Self-Righteous Actions:
  • I tell others the most important thing in my life is following God’s rules.
  • I don’t talk about my sins or struggles. That’s between me and God.
  • I try to avoid spending time with people who are openly gay or lesbian.
  • I like to point out those who do not have the right theology or doctrine.
  • I prefer to serve people who attend my church rather than those outside the church.
Self-Righteous Attitudes:
  • I find it hard to be friends with people who seem to constantly do the wrong things.
  • It’s not my responsibility to help people who won’t help themselves.
  • I feel grateful to be a Christian when I see other people’s failures and flaws.
  • I believe we should stand against those who are opposed to Christian values.
  • People who follow God’s rules are better than those who do not.
Now I don't know how many of these statements most people would or would not agree with, but I thought a more interesting list could have been generated by asking various scholars in each of the surveyed denominations what their definitions were.  Different people have different interpretations of things, and statements like "I find it hard to be friends with people who seem to constantly do the wrong things." seem pretty likely to mean different things to different people.  I mean, I'm not friends with people who steal my stuff or are continuously mean to me.  Is that self-righteous?