Not so likely life of The Simpsons

For The Atlantic, Dani Alexis Ryskamp compares the financials of The Simpsons against present day medians, arguing that the fictional family’s lifestyle is no longer attainable:

The purchasing power of Homer’s paycheck, moreover, has shrunk dramatically. The median house costs 2.4 times what it did in the mid-’90s. Health-care expenses for one person are three times what they were 25 years ago. The median tuition for a four-year college is 1.8 times what it was then. In today’s world, Marge would have to get a job too. But even then, they would struggle. Inflation and stagnant wages have led to a rise in two-income households, but to an erosion of economic stability for the people who occupy them.

Someone should take this a step further and look at distributions and time series to show the shift, with The Simpsons as baseline.

Tags: , , , ,

Summary Statistics Tell You Little About the Big Picture

Mean, median, and mode. These are the first things you learn about in your introductory statistics course. It’s often all you hear about when you see data in the news. People form policies for populations, based on the generalized numbers.

However, these summary statistics can only tell you so much about a dataset, which means you can only learn a limited amount about what the data represents — the people, places, and things.

If you’re the one who consumes the data, you should wonder what the means and medians actually represent. If you’re the one who analyzes the data, spend time with the most granular that time and resources allow for. Something more interesting will almost always come out of it.

Tags: ,

Same summary statistics, completely different plots

Summary statistics such as mean, median, and mode can only tell you so much about a dataset. Their scope is limited because for them to be useful, you have to assume things like distribution and dependencies. Visualization helps you see what else there is.

Justin Matejka and George Fitzmaurice demonstrate in their paper for the ACM SIGCHI Conference, in which they developed a method to generate datasets that “are identical over a range of statistical properties, yet produce dissimilar graphics.

Tags: , ,