Friday, June 29, 2012

Big Data

It's taken me a while to get into the idea of "big data" as something special. I've been working on largish structured data sets for a while, and I think getting valuable and reliable insights out of that data is challenging enough. My first thought was "now I'm supposed to try to extract something meaningful out of a mishmash of inconsistent, sparsely populated, questionably reliable data?"

Some things I've read over the past few months have helped me get over that hump, so I wanted to share them.

First was "The Information" by James Gleick. He's my new favorite author. This was a great history of information theory.

Second was "Thinking, Fast and Slow" by Daniel Kahneman. He's a Nobel Prize winner and proponent of behavioral economics. Fascinating read.

Last was "Chaos", also by James Gleick, about the origins and founding of chaos theory as a mathematical science. Another amazing read.

Reading these three books so close together gave me a much stronger intuition for the power and meaning of the insights that can come from big data and machine learning techniques. If you're on the fence about the usefulness of big data, start reading.