December 04, 2012

Filtering Signal From Noise: Big Data

In a "Big Data" world, we're told that we need to collect all sorts of data from all sorts of sources, yielding an "omni-channel" view of the world.

That may be true.

Now, I want you to watch this time-lapse video.  Essentially, one image is taken every twenty seconds during the course of the day ... in other words, the majority of the data has been stripped out of this video.

Watch.


Artistically, the video is interesting, right?  But more important, look at what happens to our understanding of that day when we strip out the vast majority of information.  By removing data (not by adding data), we are left with a unique story to convey.
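For the curious, here is a rough sketch of how much gets thrown away. It assumes a hypothetical source that records one sample per second for a full day (a real camera at a normal frame rate would discard far more), and the `keep_every` helper is just an illustrative name, not anything from the video's actual workflow.

```python
def keep_every(samples, step):
    """Return every `step`-th sample, starting with the first one."""
    return samples[::step]


# Assumption: one timestamp (in seconds) per second, for 24 hours.
seconds_in_day = 24 * 60 * 60
timestamps = list(range(seconds_in_day))

# Keep one sample every twenty seconds, as in the time lapse.
kept = keep_every(timestamps, 20)

fraction_kept = len(kept) / len(timestamps)
print(f"kept {len(kept)} of {len(timestamps)} samples ({fraction_kept:.0%})")
```

Under that one-sample-per-second assumption, only one sample in twenty survives, and the story of the day is still there.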

The same thing happens with that hyped-up fad known as "Big Data", doesn't it?  We spend all of our time trying to combine data from different sources, so that we can find nuggets of actionable insights.

Now, sure, a Big Data advocate would say that you could collect all of the data, and then just speed up the playback so that the end result looks the same as the video above.  Have at it.

But what would happen if we do the opposite?  What happens if we strip out all of the junk, the noise, so that we're left with only the good stuff, the "signal"?

Food for thought.

3 comments:

  1. Your end users should never be exposed to the raw data, only the pieces that help them in their work. Of course, the hard part is finding that signal.

    That's a major part of what Machine Learning and other analytics techniques are for: you process the data to figure out what's actually relevant.

    If there's one thing I've learned, it's that most people's intuition about what attributes actually matter is *completely* wrong. And that's part of what makes "big data" so fun. Done right, you find tons of counterintuitive, but useful, information!

  2. This comment has been removed by the author.

  3. Satvik, tell the computer what to do, don't let the computer tell you what to do. You are much smarter than the machine!
