Condé Nast Traveler got 70 people from 70 different countries to count money on camera. Many times I found myself wondering, “Why would you ever do it like that?” There’s a metaphor for data and its interpretation somewhere in there.
Posted by: Nathan Yau
Sandra Rendgen describes the history of “data” the word and where it stands in present day.
All through the evolution of statistics through the 19th century, data was generated by humans, and the scientific methodology of measuring and recording data had been a constant topic of debate. This is not trivial, as the question of how data is generated also answers the question of whether and how it is capable of delivering a “true” (or at least “approximated”) representation of reality. The notion that data begins to exist when it is recorded by the machine completely obscures the role that human decisions play in its creation. Who decided which data to record, who programmed the cookie, who built the sensor? And more broadly – what is the specific relationship of any digital data set to reality?
Oh, so there’s more to it than just singular versus plural. Imagine that.
Henry Hinnefeld answers the age-old debate of which Mario Kart character is best, using data as his guide.
Some people swore by zippy Yoshi, others argued that big, heavy Bowser was the best option. Back then there were only eight options to choose from; fast forward to the current iteration of the Mario Kart franchise and the question is even more complicated because you can select different karts and tires to go with your character. My Mario Kart reflexes aren’t what they used to be, but I am better at data science than I was as a fourth grader, so in this post I’ll use data to finally answer the question “Who is the best character in Mario Kart?”
For me, it doesn’t matter. You will smoke me regardless of which character I have, because I am the world’s worst video game player.
A few years ago, Stephanie Yee and Tony Chu explained the introductory facets of machine learning. The piece stood out because it was such a good use of the scrollytelling format. Yee and Chu just published a follow-up that goes into more detail about bias, intentional or not. It’s equally worth your time.
(Seems to work best in Chrome.)
I feel like I was supposed to know what blockchain is a while ago, but I’ve only had a hand-wavy explanation on hand. And it wasn’t a very good one. Reuters provides a clear and concise visual explanation of how blockchain works. Now I can explain it to friends and family whenever there’s a Bitcoin spike or dip, or I can at least point them to this explainer.
Benjamin Schmidt, an assistant professor of history at Northeastern University, explored the space between words and drew the paths to get from one word to another. The above, for example, is the path between Seinfeld and Breaking Bad. Using Google News as the corpus, the steps:
- Take any two words. I used “duck” and “soup” for my testing.
- Find a word that is, in cosine distance, between the two words: that is, that is closer to both of them than either is to each other. Select for one as close to the midpoint as possible.* With “duck” and “soup,” that word turns out to be “chicken”: it’s a bird, but it’s also something that frequently shows up in the same context as soup.
- Repeat the process to find words between “duck” and “chicken.” That, in this corpus, turns out to be “quail.” The vector here seems to be similar to the one above–quail is food relatively more often than duck, but less overwhelmingly than chicken.
- Continue subdividing each path until no more intermediaries exist. For example, “turkey” works as a point between “quail” and “chicken”; but nothing intermediates between turkey and quail, or between turkey and chicken.
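The steps above can be sketched in a few lines of Python. This is a minimal illustration, not Schmidt's code: the two-dimensional vectors are made up so the example runs on its own (a real run would load word embeddings trained on a corpus such as Google News), the names `find_between` and `word_path` are hypothetical, and "closest to the midpoint" is interpreted here as the smallest cosine distance to the sum of the two normalized endpoint vectors, which is an assumption.

```python
import numpy as np


def unit(angle_deg):
    """Toy 2-D unit vector at the given angle (stand-in for a real embedding)."""
    t = np.radians(angle_deg)
    return np.array([np.cos(t), np.sin(t)])


# Made-up vectors, NOT real Google News embeddings. The angles are chosen
# only so that the toy example mirrors the duck/soup walk-through.
VECTORS = {
    "duck": unit(0),
    "quail": unit(20),
    "turkey": unit(30),
    "chicken": unit(40),
    "soup": unit(80),
    "boat": unit(170),  # unrelated word that should never appear on the path
}


def cosine_distance(u, v):
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def find_between(a, b, vectors):
    """Return the word closer to both a and b than they are to each other,
    preferring the candidate nearest the midpoint; None if no word qualifies."""
    va, vb = vectors[a], vectors[b]
    span = cosine_distance(va, vb)
    # Assumed definition of the midpoint: sum of the normalized endpoints.
    midpoint = va / np.linalg.norm(va) + vb / np.linalg.norm(vb)
    best, best_score = None, float("inf")
    for word, vw in vectors.items():
        if word in (a, b):
            continue
        if cosine_distance(va, vw) < span and cosine_distance(vw, vb) < span:
            score = cosine_distance(vw, midpoint)
            if score < best_score:
                best, best_score = word, score
    return best


def word_path(a, b, vectors):
    """Recursively subdivide the path from a to b until no intermediaries exist."""
    mid = find_between(a, b, vectors)
    if mid is None:
        return [a, b]
    # Drop the duplicated midpoint when joining the two halves.
    return word_path(a, mid, vectors)[:-1] + word_path(mid, b, vectors)


print(word_path("duck", "soup", VECTORS))
```

With these toy vectors the path comes out as duck, quail, turkey, chicken, soup. The recursion stops because every subdivision works on a strictly shorter distance than the one before it, and the vocabulary is finite.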
Schmidt’s results actually make a lot of sense.
See also: the Google arts experiment that motivated this one.
This is quite the scatterplot from Claire Cain Miller and Kevin Quealy for The Upshot. The vertical axis represents how much better girls or boys do on standardized tests; the horizontal axis represents wealth; each bubble represents a school district; yellow represents English test scores, and blue represents math test scores.
The result: a non-trend up top and a widening gap at the bottom.
In a spin on the view of ancient Earth and the shift of the continents, Ian Webster made a globe where you can enter a location and see what was in that spot millions of years ago. Not all addresses were working for me at the time, so you might want to try a major city if it’s doing the same for you. [via kottke]