Datopolis is a board game by Ellen Broad and Jeni Tennison from the Open Data Institute, and, as you might expect, it promotes the use of open data.
Datopolis is a board game about building things — services, websites, devices, apps and research — using closed and open data. It’s set in a town called Sheridan, which is gradually declining as shops close, teachers quit, hedgehogs go extinct and pollution rises. The tools that players build contribute to making Sheridan a healthier, wealthier, happier place to live.
Sounds good to me.
Government data is, shall we say, not the easiest to use or look at, which is why there are so many ongoing efforts to make it more accessible to both practitioners and the average citizen. There's no doubt that the data is useful. The Sunlight Foundation does fine work with various projects, Census Reporter provides data at a glance, and efforts like IPUMS make certain large datasets easier to subset and grab.
Data USA, a collaboration between Deloitte, Macro Connections at the MIT Media Lab, and Datawheel, is another hefty project that aims to make government data feel less hairy. It uses data from a number of sources — the American Community Survey, the Bureau of Economic Analysis, and the Bureau of Labor Statistics, to name a few — to create profiles for locations, industries, occupations, and education.
Just enter your interest in the search box, and you quickly get common statistical breakdowns. Seems pretty great if you want summaries in a pinch. From there, you can embed and download charts, download data, and make comparisons. There is also an API, and the project is open source.
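As a rough sketch of what using the API might look like: the endpoint and parameter names below (`drilldowns`, `measures`) are assumptions based on the public Data USA API, not something spelled out in the project announcement.

```python
# Sketch of building a Data USA API request for national population figures.
# The endpoint and query parameters are assumptions about the public API.
from urllib.parse import urlencode

BASE = "https://datausa.io/api/data"

def build_query(**params):
    """Build a Data USA API request URL from keyword query parameters."""
    return BASE + "?" + urlencode(params)

url = build_query(drilldowns="Nation", measures="Population")
print(url)  # https://datausa.io/api/data?drilldowns=Nation&measures=Population

# Fetching the JSON would then look something like this (needs network access):
# from urllib.request import urlopen
# import json
# with urlopen(url) as resp:
#     payload = json.load(resp)
#     for row in payload["data"]:
#         print(row["Year"], row["Population"])
```

The response is plain JSON, so the same pattern should work for location, industry, and occupation profiles by swapping the drilldown and measure parameters.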
It feels like a statistical atlas of the United States, with modern functionality.
Kaggle just opened up a Datasets section to download and analyze public data.
At Kaggle, we want to help the world learn from data. This sounds bold and grandiose, but the biggest barriers to this are incredibly simple. It’s tough to access data. It’s tough to understand what’s in the data once you access it. We want to change this. That’s why we’ve created a home for high quality public datasets, Kaggle Datasets.
It's still really new and only has a handful of datasets, but it looks interesting. The key is that it's not just a place to download data. Instead, they provide analysis environments and make it easy to share code that makes use of the data. They also make it easy to share results.
Oftentimes, it's the getting-started hurdle that gets in the way of working with a large-ish dataset. Maybe this will help set things on the right path.