Monday, August 19, 2013

Autism and Labor

Commenter Erin brought up the recent hubbub regarding induced labor and autism, and while I'd like to comment on it, Science Based Medicine has already done a pretty thorough job.  They put the breakdown quite succinctly:
In the case of this study, either inducing/augmenting labor triggers autism in some children, children with autism are more likely to require induced labor, or some other factor(s) is a risk factor for both developing autism and needing to induce or augment labor. This current study does not contain data that can differentiate among these possibilities.
Induced labor is a hard thing to study because (unlike c-sections) induction is very rarely completely elective. It is almost always precipitated by some other complication.  It's an interesting study though, and definitely indicates a need for more research.  Anything that gets people off the vaccine thing makes me happy.


30 Days of Data Storytelling: Days 4 and 5

Two videos for today, a long-ish one that gives more details about how to do things, and a Hans Rosling video that is a great example of a story with data.  I've seen the Rosling video a few times, but it's worth a look just to see how he shows his data off.

The other video is a good primer of what to do and what not to do when presenting data.  If you have time, worth a watch.

Tuesday, August 13, 2013

Literally Unbelievable

In 2013, I'm fairly sure it's a pretty universal experience to have at least one Facebook friend who is a bit of a train wreck. I have one such person on my list, and for a variety of reasons I cannot delete him. He is quite prone to daily postings of dozens of ridiculous political comments/links/cartoons that range from condescendingly disagreeable to outright offensive. A large part of this offensiveness, IMHO, comes from the fact that a decent amount of what he posts isn't actually true.

He seems to be a deep sucker for a story that fits his pre-existing narrative, and at least twice a day I see something out of him that doesn't even pass a basic sniff test.  To be fair, he at least occasionally gets called out on this.  Apparently this has been getting to him though (the "hey that story's not true" part), because last night he posted quite the disclaimer that let everyone know that he "thoroughly researches" every story he posts.  

A mere 10 hours later, with no irony and lots of anger, he posted this article: Lance Armstrong Fails Drug Test for Job at Target.

On the bright side, just a few posts down on my newsfeed, a different friend posted this list chronicling the 35 best times someone on Facebook thought The Onion was real.  These two friends don't know each other, so it was pretty serendipitous.

It's a great list, and apparently it's drawn from a whole website of this sort of thing called Literally Unbelievable.

Check your sources, people, check your sources.

Monday, August 12, 2013

30 Days of Data Storytelling: Day 3

Doubling up on the posts since I got behind.

Today's entry was this awesome data simulation/graph/narrative about Olympic long jumping.

I remember watching a few of these around the Olympics last year, and it was pretty cool. It's a good overview of raw data, with visuals and comparisons to put it in context. Context is one of the most underutilized aspects of data presentation. Hearing "he jumped 26 feet" is impressive, but hearing "he jumped from the edge of the court past the 3-point line" gives context.

It's a short video, definitely worth a watch if you have the time.

30 Days of Data Storytelling: Day 2 (Pixar version)

So after posting the first two articles last week, I realized those were supposed to be a combined Day 1, making this the real Day 2.

Day 2 was two interesting Pixar-related things...one a list of their rules for great storytelling and the other a short (about 3 minutes) video where they tell a story with no words. If you've ever seen a Pixar movie, you know they can tell a fantastic story, so it was interesting to read their take on the craft.

A few of their rules particularly stood out as relevant to data stories:

#2 You gotta keep in mind what’s interesting to you as an audience, not what’s fun to do as a writer. They can be very different.
#11 Putting it on paper lets you start fixing it. If it stays in your head, a perfect idea, you’ll never share it with anyone.
#17 No work is ever wasted. If it’s not working, let go and move on – it’ll come back around to be useful later.

I'm sure there are others that could apply, but those are the 3 that really struck me. Sometimes I find fun and funky data that no one else is interested in. I'm always having to refocus on the question at hand. When you analyze data a lot, the "normal stuff" can get boring, but normal is interesting to someone who's seeing it for the first time. That bleeds into #11...you can't always know what's interesting to people until you start to share it. Testing reactions and assessing opinion is valuable.

When something flops, that's when #17 comes in.  I store all the data I come across for future use.  It's interesting how often something no one was interested in can later become critical.

The video's just cute.  Show it to the small child in your life.

Wednesday, August 7, 2013

McDonald's wages: not what they appear

Last week I saw a HuffPo headline peppered across several Facebook/Twitter feeds that claimed "Doubling McDonald's Salaries Would Cause Your Big Mac To Cost Just 68¢ More." It was an interesting claim, but ultimately I didn't click on the link, as I tend to find most economic analysis pretty dubious from the get-go. Now, I know nothing about economics, but as a systems person, I generally believe you can't change things dramatically in one area of a business (such as doubling salaries) and expect to fully know the results just by adding a few numbers (the cost of a Big Mac going up just $0.68).

I actually almost blogged about it, when I saw a snippet on Volokh about the lack of thought about the repercussions of such a change on the type of person McDonald's hired.  Most people seemed to be assuming that all the poor folks currently working at McD's would get raises, but isn't it more likely the jobs would become more competitive and the population they hired would change?  Interesting thought.

Well, I'm now glad I didn't post on any of it, because apparently the whole analysis was crap anyway. The guy who did it left out the 80% of McDonald's that are franchises (but included the franchise fees as profits), and skipped over a bunch of other accounting issues I don't understand. Oh, and the "study" that had shown this originally was the work of an independent undergrad, and the HuffPo didn't recheck it.

HuffPo has a retraction up in the place of the original article.

What baffles me most about their retraction is that they asked a blogger they have on their own staff to review it, and he called BS immediately. How did that conversation not happen before they put up the sensational claim?

Tuesday, August 6, 2013

30 Days of Data Storytelling: Day 2

Today's lesson is actually an academic-ish paper on the history and application of storytelling to data presentations. It's an interesting review of the literature up until now, and some ideas for future research.

It struck me while reading this that measuring storytelling is a lot like measuring humor: most people know it when they hear it, but it's really hard to define what makes one person better than another at it. For both topics you can get mired down in issues of audience and personal taste, but it's clear there are still some general rules. One of them is that visuals should be part of storytelling whenever possible, especially with data.

I certainly applaud anyone who calls for more research about how to effectively communicate data through visuals. Translating complex concepts into usable information is what's going to allow data nerds to increase their scope of influence.

Overall impression: A longer read, but lots of interesting citations and resources mentioned.

Monday, August 5, 2013

30 Days of Data Storytelling: Day 1

The first day of the 30 Day guide is a short article from Harvard Business Review on the real job of data scientists.

It's a solid article, covering the idea that the point of data collection is not just to throw out a whole lot of data, but to use it to tell a story. This is done primarily through making sure your presentation is visual and accessible...both solid ideas.

This is certainly good advice, especially if the data you're pulling relates to a problem with no solution, or is meant to uncover information no one knows. When data is being pulled to support a particular agenda, the "data scientist as storyteller" idea could get dangerous, but I'm not sure most big data is being used that way at the moment (though obviously instances of this will be more high profile).

Overall impression: good for a read if you're interested in data science, short piece, easy reading


30 Day Challenge: Data Storytelling

Anyone who knows me knows I love a good list. That's why I was so intrigued a few months ago when Juice Analytics put up this list of "30 Days to Data Storytelling"...their list of resources to watch/read/do/play in order to improve your data storytelling skills.

I've been wanting to take a look at this for a while now, and it occurred to me that August might be an excellent month for me to go through this and blog about the experience.  I'll likely still do a few regular posts, but most of them will be relating to the resources they list, adding my thoughts and commentary.

I know this is a departure from the norm, but hey, what's wrong with a little experimentation now and then?