Showing posts with label Big Data. Show all posts

March 04, 2013

Big Data

Maybe a quarter of the questions I get these days are about what some call "Big Data".  For a brief primer, please refer to Wikipedia's definition of Big Data.

There are two distinct aspects to what the pundits call "Big Data".
  • Technology.
  • Applications.
I won't focus on Technology.  Rest assured that large vendors will develop solutions that promise to save the world.  You'll purchase the solutions, and you'll achieve varied levels of success ... just like you've been doing since we moved from mainframes to PCs in the late 80s.

I focus on Applications.

There are at least four key Application concepts to pay attention to.  They are:
  1. Complex Adaptive Systems.
  2. Dirty Algorithms.
  3. Hyper-Optimization.
  4. Brand Interaction.
Complex Adaptive Systems:  This is what we fail to understand about our world.  Things connect, and they interact with each other, often yielding unpredictable outcomes.  In the catalog world, cataloger interaction with co-ops is representative of a Complex Adaptive System.  Catalogers volunteered customers to the co-ops, co-ops used algorithms to redefine the names, and then resold the names back to catalogers.  There are many participants in this system, dependent upon each other.  Their interactions yield unpredictable and unusual results (e.g. co-ops spinning 55+ customers to catalogers, accelerating the evolution of catalogers).

Dirty Algorithms:  This is my term, and it will be the bane of our existence!  Dirty algorithms seek to maximize the profitability of a portion of a Complex Adaptive System, without understanding how the Dirty Algorithm soils the entire Complex Adaptive System.  Example?  Easy!  Credit Default Swaps and their role in the meltdown of the global economy in 2008.  When a financial institution buys insurance to "spread the risk" of an investment, the financial institution is inserting a Dirty Algorithm into the Complex Adaptive System.  In the "Big Data" world, companies will routinely insert Dirty Algorithms into Complex Adaptive Systems.  9 times out of 10, this will not be done with malice, but rather out of ignorance of how Complex Adaptive Systems work.  1 time in 10, this will be an act of pure evil.  We won't know the difference; we'll just be cleaning up messes all the time.

Hyper-Optimization:  We're at least a decade into the era of Hyper-Optimization, and thus far, the results have not been pretty.  The best example of Hyper-Optimization happens in web analytics - earnest, honest, and well-intentioned analysts seek to increase conversion rates.  They take friction out of the system, spending time, resources, and money improving conversion rates, not realizing that the actual behavior exhibited by customers does not change ... that in reality, the web analyst caused a customer who visited the website 4 times before a purchase to visit 3 times before a purchase.  When the underlying behavior does not change, we are Hyper-Optimizing ... changing an outcome without fundamentally changing the behavior.  This happens when we measure the wrong attribute.  If the web analyst measured annual frequency and annual repurchase rates, the web analyst would not Hyper-Optimize a meaningless outcome.  Email subject lines also fall under Hyper-Optimization ... here, marketers realize that conversion rates won't increase unless 20% off plus free shipping offers are provided.  The problem in this form of Hyper-Optimization is that the email marketer only attracts discount buyers, further fueling the need for future discounts.  If this behavior continues, the email marketer is no longer engaged in Hyper-Optimization, but rather, has introduced a Dirty Algorithm into the Complex Adaptive System.  Along these lines, Cyber Monday is the most disappointing version of Hyper-Optimization, whereby online brands now offer 30% off plus free shipping to yield the best final Monday of November in history, never minding that sales are depressed in the three weeks prior to Cyber Monday as customers wait for the discount.  Hyper-Optimization is a direct outcome of terrible measurement practices.
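The 4-visits-to-3-visits example above can be sketched with a little arithmetic.  The customer count and the purchases per customer below are hypothetical numbers chosen only for illustration; only the visits-per-purchase figures come from the example.  The point of the sketch is that the conversion rate improves while annual orders stay exactly the same.

```python
def funnel_metrics(customers, purchases_per_customer, visits_per_purchase):
    """Return (annual_orders, annual_visits, conversion_rate)."""
    annual_orders = customers * purchases_per_customer
    annual_visits = annual_orders * visits_per_purchase
    return annual_orders, annual_visits, annual_orders / annual_visits


# Hypothetical file: 1,000 customers who each buy twice a year.
CUSTOMERS = 1_000
PURCHASES_PER_CUSTOMER = 2

# Before the "optimization": 4 visits per purchase.  After: 3 visits.
before = funnel_metrics(CUSTOMERS, PURCHASES_PER_CUSTOMER, 4)
after = funnel_metrics(CUSTOMERS, PURCHASES_PER_CUSTOMER, 3)

for label, (orders, visits, rate) in (("before", before), ("after", after)):
    print(f"{label}: orders={orders}, visits={visits}, conversion={rate:.1%}")
```

Measured by conversion rate, the "after" scenario looks like a win.  Measured by annual orders (a stand-in here for annual frequency and repurchase rate), nothing changed.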

Brand Interaction:  Here's where most of us enter into our relationship with Big Data.  Most of us will treat Big Data as a glorified form of Campaign Management.  In Campaign Management, actions were linear and additive.  We mail 100,000 catalogs, we get $500,000 in demand ... we send 1,000,000 email messages, we get $200,000 in demand ... we buy 10,000 clicks for $0.50 each and we get 300 orders ... Cause and effect.  This is the world most of us honed our marketing skills in, in the 1980s and 1990s, a pre-Google world.  The reality is that we've always operated in a Complex Adaptive Ecosystem (called "the economy"), but we didn't have the data to help us understand the truth.  Most of the Big Data hucksters will operate in this realm, promising real-time decisions that dramatically boost profitability.  What they'll be doing, however, is a simple transfer of demand, from one party to another.  Yes, on a macro-economic level, growth can happen.  But by and large, on the level we deal with, we're trading demand among players.  Big Data solutions providers will simply push demand back and forth between those buying (or not buying) solutions ... and in some cases, will, by accident, interject a Dirty Algorithm that will cause all sorts of problems, or will Hyper-Optimize (pushing demand out of certain windows, into others).
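The linear, additive Campaign Management view described above can be written out directly from the figures in the paragraph.  This is only a sketch of that worldview, not a recommended measurement method; the variable names are mine, and the additive total deliberately ignores the transfer of demand between channels that the paragraph warns about.

```python
# Figures taken from the paragraph above.
campaigns = {
    "catalog": {"units": 100_000, "demand": 500_000.0},
    "email": {"units": 1_000_000, "demand": 200_000.0},
}

# Linear view: each piece mailed or sent is worth a fixed amount of demand.
demand_per_unit = {
    name: c["demand"] / c["units"] for name, c in campaigns.items()
}

# Additive view: total demand is simply the sum of the campaigns.
total_demand = sum(c["demand"] for c in campaigns.values())

# Paid clicks: 10,000 clicks at $0.50 each, yielding 300 orders.
clicks = 10_000
cost_per_click = 0.50
orders = 300
click_cost = clicks * cost_per_click
click_conversion = orders / clicks
cost_per_order = click_cost / orders

print(f"catalog: ${demand_per_unit['catalog']:.2f} demand per catalog")
print(f"email:   ${demand_per_unit['email']:.2f} demand per message")
print(f"total:   ${total_demand:,.0f} (assumes no overlap between channels)")
print(f"clicks:  ${click_cost:,.0f} spent, {click_conversion:.1%} conversion, "
      f"${cost_per_order:.2f} per order")
```

Every number here is a clean cause-and-effect ratio, which is exactly why this view is seductive.  None of it says whose demand was moved, or from where.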

Your Job?  Be smart, and I mean that with all honesty.  Most Big Data solutions will sound very seductive, on a Campaign Management / Brand Interaction level.  Your job is to ask solid questions, as you try to understand how a Big Data solution interacts on a Complex Adaptive System basis.  Are you being sold a Dirty Algorithm?  Who is demand being transferred from?  Are you simply Hyper-Optimizing a situation without yielding long-term growth?  As an example, remember that every time you use Dictionary.com, more than 200 cookies are placed on your computer.  Your simple level of inquisition at Dictionary.com results in hundreds of businesses harvesting information, pushing your inferences into the Complex Adaptive System called "Marketing", with Dictionary.com obtaining profit.  Those companies will attempt to influence you via Brand Interaction, in the form of Campaign Management.  You need to learn how this impacts you as a customer, and how it impacts the company you work for.

Go beyond the hype.  Study Big Data within the context of Complex Adaptive Systems, Dirty Algorithms, Hyper-Optimization, and Brand Interactions.  You'll find that Big Data is far more interesting at this level than what you read about in trade journals.

December 04, 2012

Filtering Signal From Noise: Big Data

In a "Big Data" world, we're told that we need to collect all sorts of data from all sorts of sources, yielding an "omni-channel" view of the world.

That may be true.

Now, I want you to watch this time lapse video.  Essentially, one image is taken every twenty seconds during the course of the day ... in other words, the majority of the data has been stripped out of this video.

Watch.


Artistically, the video is interesting, right?  But more important, look at what happens to our understanding of that day when we strip out the vast majority of information.  By removing data (not by adding data), we are left with a unique story to convey.
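To put a rough number on how much was stripped out, here is a minimal sketch.  The twenty-second interval comes from the description above; the 30 frames per second figure is my assumption about what ordinary video would have captured, and the helper name is hypothetical.

```python
ASSUMED_FPS = 30             # assumption: frame rate of ordinary video
SECONDS_BETWEEN_IMAGES = 20  # from the description of the time lapse

# How many ordinary frames pass for every image the time lapse keeps.
frames_per_kept_image = ASSUMED_FPS * SECONDS_BETWEEN_IMAGES
fraction_kept = 1 / frames_per_kept_image


def time_lapse(frames, step):
    """Keep every step-th frame and drop everything in between."""
    return frames[::step]


# One hour of ordinary video, with frames represented by their index.
one_hour = list(range(ASSUMED_FPS * 3600))
kept = time_lapse(one_hour, frames_per_kept_image)

print(f"kept {len(kept)} of {len(one_hour)} frames ({fraction_kept:.2%})")
```

Under that assumption, well over 99% of the frames are gone, and the story of the day is still there.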

The same thing happens with that hyped-up fad known as "Big Data", doesn't it?  We spend all of our time trying to combine data from different sources, so that we can find nuggets of actionable insights.

Now, sure, a Big Data advocate would say that you could collect all of the data, and then just accelerate the data so that the end result is the same as the video above.  Have at it.

But what would happen if we do the opposite?  What happens if we strip out all of the junk, the noise, so that we're left with only the good stuff, the "signal"?

Food for thought.

September 19, 2012

#bigdata and #smalldata

Maybe you've heard ...

... "Big Data" is going to save the world.  And generate profit for your business, the kind of profit that only omnichannel solutions could theoretically generate.  "Big Data" will decide who our President will be, it will decide the future of health care, and it will play a crucial role in protecting and/or violating our privacy.

There are at least three components to big data these days.
  1. Vendors who are offering solutions not all that dissimilar to the solutions offered over the past twenty-five years.
  2. Pundits who talk about big data, hoping to garner page views or followers on Twitter.
  3. People actually doing amazing work, not talking publicly about "Big Data" (hint, there are many people in this category).
If you're in the catalog world, then you've been dealing with "Big Data" for at least fifteen years ... you happily volunteered your most valuable asset (your customer list) to the co-ops.  Co-ops are the very definition of big data, they just never called themselves "Big Data".

If you're in the online world, then you've been dealing with "Big Data" for more than a decade.  Do you have any idea where your retargeting campaigns are being deployed?  Big Data is deciding that ... and has been for the past four thousand days.

Here are seven "Big Data" themes that are actually recycled and repackaged concepts from the past twenty-five years.
  1. You need a new database infrastructure to handle the volume of data.
  2. You need to combine data across all channels to obtain a "360 degree" view of the customer.
  3. Your database infrastructure needs to be "fast" ... in the parlance of the day, providing results in "real time".
  4. You need sophisticated data mining algorithms to find "nuggets of gold" in the data that humans cannot detect.
  5. You need ad-hoc query tools that allow all employees to query the database on their own, obtaining their own answers.
  6. You will need a testing platform, so that you can "optimize" your business results and become "data driven".
  7. You will self-actualize, with business performance at the top of Maslow's Hierarchy of Needs.
Now, let's be honest.  There is something to "Big Data".  It's not a fad; it's the hype that is out of control.

But it doesn't change where we spend the majority of our time, and it doesn't change these fundamental concepts:
  1. You will always need a new database infrastructure, as the volume of data you'll manage will always increase.  This hasn't changed in 25 years.
  2. We've been combining data from all channels since the advent of e-commerce.  The 360 degree view of customers didn't fundamentally change anything, it simply revealed that business is complicated.
  3. Database infrastructures have always needed to be fast, going back to SB37 out-of-space abends on IBM mainframe computers running Easytrieve Plus.
  4. We've been told for decades that data mining algorithms would find "nuggets of gold" in the data that humans couldn't detect.  Remember the Neural Network craze of the early 1990s?  How many "nuggets of gold" do you recall unearthing from your exploration of Neural Networks and Genetic Algorithms?  A half-dozen in twenty years, if you're lucky?  Certainly not a half-dozen a week.
  5. The business intelligence phase of vendor hype gave us ad-hoc query tools (Business Objects, MicroStrategy).  How many garden-variety, non-data employees used these tools for anything other than producing simple reports?  Only a small fraction of employees using web analytics tools ever pushed this front forward.  Now, these poor web analytics folks are being told they are "outdated" in a "Big Data" world, so they are rebranding themselves as "digital" analysts.  These folks will always be valuable, regardless of the evolution of Big Data.  Their smarts matter, folks.
  6. Folks have been testing/optimizing for thousands of years.  Complex multivariate testing has been documented as far back as the 1930s - 1940s ... Google "Snedecor and Cochran" for details.
  7. You will not self-actualize.  If anything, you'll implode.  "Big Data" will likely give you insights that are worthwhile; however, you have to pull the rest of your business with you, in the direction you want them to go.  Have you ever tried to align the thoughts of 50,000 employees at Nordstrom, for instance?  Hint - it's not easy!
I'd like to introduce a new concept to you.

Let's call it "small data", or as they say on Twitter, #smalldata.

"Big Data" seems to be about tools and techniques and data integration and hardware and software and automation and post-CRM theory and real-time analytics and KPIs and reporting.

"small data" is all about teaching people what you learn, evangelizing ideas, encouraging employees to be great via information.

#smalldata
  1. Is all about analyzing the data you have to make decisions today.
  2. Is about the "99%".  99% of the decisions you make on a daily basis have nothing to do with "Big Data".
  3. Doesn't require integrated data across all sources compiled in real-time.
  4. Can easily be accomplished in Excel.  Or Google Analytics.
  5. Requires the analyst to be a great communicator.
  6. Is about the message, not about the hardware, software, or database platforms.
  7. Is actionable.
  8. Is not geeky.
  9. Is about teaching.
  10. Is about profit.
  11. Values and honors "what came before" Big Data.
  12. Gladly leverages "Big Data" when appropriate.
#smalldata

Go evangelize it ... you use #smalldata every single day ... always have.