Kevin Hillstrom: MineThatData: False Metrics

January 25, 2009

False Metrics

In our world, we've heard the phrase "Multichannel customers are the best customers" for about a dozen years.

This phrase is used by retail and catalog CEOs, by consultants, and especially by the vendor community. It is frequently used as the reason for implementing various marketing strategies, opening stores, or for purchasing software systems. It is sort of like telling somebody to buy an oven because really good food comes out of the oven --- not understanding all of the dynamics that go into making and baking good food.

The multichannel metric is an easy one to calculate. Folks filter down to the twelve-month buyer file and query it, summing past demand by the number of physical channels (phone, web, stores) each customer purchased from. Others calculate the metric in a "forward looking" way, measuring twelve-month future spend as a function of past channels.

Here's an example, using aggregated and dummied-up multichannel data.

1 Channel In The Past = $169 in the future.
2 Channels In The Past = $226 in the future.
3 Channels In The Past = $362 in the future.
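
For readers who want to reproduce the "forward looking" version of the metric, here is a minimal sketch in Python. The customer records and the function name `future_spend_by_channel_count` are invented for illustration; they are not the data behind the numbers above.

```python
from collections import defaultdict

# Hypothetical records: (physical channels purchased from in the past,
# demand in the following twelve months). Values are made up.
customers = [
    (1, 120.0), (1, 0.0), (1, 250.0),
    (2, 300.0), (2, 90.0),
    (3, 410.0), (3, 280.0),
]

def future_spend_by_channel_count(records):
    """Average twelve-month future spend for each past channel count."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for channels, future_spend in records:
        totals[channels] += future_spend
        counts[channels] += 1
    return {k: totals[k] / counts[k] for k in sorted(totals)}

naive_metric = future_spend_by_channel_count(customers)
for channels, avg in naive_metric.items():
    print(f"{channels} channel(s) in the past = ${avg:,.0f} in the future")
```

Notice that nothing in this calculation controls for how many orders a customer placed, which is exactly the gap the models below try to close.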

So, we've proven the statement, right?

Well, there's another way to validate the information. We can build models --- Logistic Regression for the repurchase rate, Ordinary Least Squares for spend per repurchaser.

In this example, I built models using numerous variables for a traditional business that has three physical channels.

Square Root Of Months Since Last Purchase.
Orders In The Past 12 Months.
Orders 13+ Months Ago.
Average Order Value.
Is Customer An E-Mail Subscriber? 1 = Yes, 0 = No.
Number Of Physical Channels Customer Purchased From.
Number Of Merchandise Divisions Purchased From.
Number Of Online Channels Purchased From (Paid Search, Affiliates, Natural Search, Blogs, Shopping Comparison Sites, Portal Advertising, etc.).
Square Root Of Catalogs Mailed In Subsequent Twelve Months.

Here are the Logistic Regression Coefficients and Wald Statistics.

Constant = -2.386.
Orders Past 12 Months = 0.324, Wald Statistic = 2800.
Square Root Of Subsequent Catalogs Mailed = 0.293, Wald Statistic = 1800.
Orders 13+ Months Ago = 0.084, Wald Statistic = 1300.
Square Root Of Recency = -0.187, Wald Statistic = 720.
Number Of Merchandise Divisions Purchased From = 0.049, Wald Statistic = 200.
Number Of Web Advertising Channels Purchased From = -0.073, Wald Statistic = 100.
E-Mail Subscriber = 0.110, Wald Statistic = 65.
Physical Channels Purchased From = -0.070, Wald Statistic = 40.
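
To make the coefficients concrete, the sketch below scores one customer with the logistic model. The coefficients are the ones listed above; the customer profile, the dictionary keys, and the function name `repurchase_probability` are assumptions made for illustration.

```python
import math

# Coefficients as published in the list above.
LOGIT_CONSTANT = -2.386
LOGIT_COEFS = {
    "orders_past_12_months": 0.324,
    "sqrt_catalogs_mailed": 0.293,
    "orders_13_plus_months": 0.084,
    "sqrt_recency_months": -0.187,
    "merchandise_divisions": 0.049,
    "web_ad_channels": -0.073,
    "email_subscriber": 0.110,
    "physical_channels": -0.070,
}

def repurchase_probability(customer):
    """Logistic score: 1 / (1 + exp(-(constant + sum of coef * value)))."""
    z = LOGIT_CONSTANT + sum(
        coef * customer[name] for name, coef in LOGIT_COEFS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical customer: 2 recent orders, 1 older order, last purchase
# 4 months ago, 16 catalogs mailed, 2 divisions, 1 physical channel.
single_channel = {
    "orders_past_12_months": 2,
    "sqrt_catalogs_mailed": math.sqrt(16),
    "orders_13_plus_months": 1,
    "sqrt_recency_months": math.sqrt(4),
    "merchandise_divisions": 2,
    "web_ad_channels": 0,
    "email_subscriber": 1,
    "physical_channels": 1,
}
# Same customer, but with purchases spread across 3 physical channels.
multichannel = dict(single_channel, physical_channels=3)

print(f"1 channel:  {repurchase_probability(single_channel):.3f}")
print(f"3 channels: {repurchase_probability(multichannel):.3f}")
```

Holding everything else constant, the three-channel version of this customer scores slightly lower than the one-channel version, because the physical channel coefficient is negative.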

And here are the Ordinary Least Squares Regression Coefficients and T-Statistics.

Constant = -105.943
Average Order Value = 1.343, t = 87.
Number Of Twelve Month Orders = 73.483, t = 77.
Number Of 13+ Month Orders = 6.384, t = 25.
Number Of Web Advertising Channels Purchased From = -18.483, t = -15.
Square Root Of Subsequent Catalogs Mailed = 14.395, t = 13.
E-Mail Subscriber = 28.548, t = 8.
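
The spending model can be scored the same way. This is a sketch using the coefficients listed above; the customer values, key names, and the function name `predicted_spend` are hypothetical.

```python
import math

# Coefficients as published in the list above.
OLS_CONSTANT = -105.943
OLS_COEFS = {
    "average_order_value": 1.343,
    "orders_past_12_months": 73.483,
    "orders_13_plus_months": 6.384,
    "web_ad_channels": -18.483,
    "sqrt_catalogs_mailed": 14.395,
    "email_subscriber": 28.548,
}

def predicted_spend(customer):
    """Predicted twelve-month spend, given that the customer repurchases."""
    return OLS_CONSTANT + sum(
        coef * customer[name] for name, coef in OLS_COEFS.items()
    )

# Hypothetical customer: $100 average order value, 2 recent orders,
# 1 older order, 16 catalogs mailed, e-mail subscriber.
subscriber = {
    "average_order_value": 100.0,
    "orders_past_12_months": 2,
    "orders_13_plus_months": 1,
    "web_ad_channels": 0,
    "sqrt_catalogs_mailed": math.sqrt(16),
    "email_subscriber": 1,
}
non_subscriber = dict(subscriber, email_subscriber=0)

print(f"Subscriber:     ${predicted_spend(subscriber):,.2f}")
print(f"Non-subscriber: ${predicted_spend(non_subscriber):,.2f}")
```

Because the model is linear, the gap between the two predictions is simply the e-mail coefficient, regardless of the other values plugged in.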

Now, the statisticians out there will start "nibbling on the cookie", trying to pick apart all of the problems they see with the modeling strategy and variable definition. Go ahead, pick away.

The CEO will want to understand the strategic implications of the model results. What did we learn?

Traditional RFM variables are the most important variables. We want recent customers who order frequently and spend a lot each time they order. Traditional catalogers would say "duh", but this is news for many web analysts and e-mail gurus and e-commerce leaders. An entire set of KPIs can be developed to understand if the business is evolving in a favorable manner.

Catalog mailings are important, but with diminishing returns. Because the model uses the square root of catalogs mailed, each additional catalog adds less than the one before it.
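
The diminishing return falls straight out of the square-root transformation. As a rough sketch, using the repurchase-model coefficient of 0.293 and a helper name (`marginal_logit_lift`) invented here, the contribution of the nth catalog looks like this:

```python
import math

SQRT_CATALOG_COEF = 0.293  # from the logistic regression above

def marginal_logit_lift(n):
    """Added logit contribution of the nth catalog over the first n - 1."""
    return SQRT_CATALOG_COEF * (math.sqrt(n) - math.sqrt(n - 1))

for n in (1, 2, 5, 10, 20):
    print(f"Catalog #{n}: +{marginal_logit_lift(n):.3f} on the logit scale")
```

These are lifts on the logit scale, not percentage points of response; the actual change in repurchase rate depends on the rest of the customer's profile.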

Merchandise divisions are somewhat important. You want a customer who purchases DVD players and LCD televisions --- that is more important than having a customer who buys just iPods. Notice that the coefficient is "small", meaning it isn't nearly as important as having customers who purchase multiple times.

Physical channels have a negative coefficient. In other words, after controlling for RFM factors, being a multichannel customer adds nothing. In fact, the effect is slightly negative!

E-mail is interesting --- in the repurchase model, being an e-mail subscriber (0.110) is worth about 1/3 of an additional twelve-month order (0.324) --- so e-mail doesn't have a huge amount of value. In the spending model, e-mail adds $29 of value, so that is good. Here's the thing, folks. Take your twelve-month audience, and for each subscriber, plug the customer data into the model. Sum, across all customers, the total twelve-month future value due to e-mail marketing, and compare that with the demand your e-mail reporting system says you generated over the past twelve months from customers who were twelve-month buyers at this time last year. Do the numbers tie out? They shouldn't. The model coefficients, if derived properly, are going to be more accurate at communicating true e-mail value than the open-rate, click-through-rate, and conversion-rate metrics everybody is taught to look at.
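
One way to run that comparison is sketched below. It assumes two things the post does not spell out: that expected future value is the repurchase probability multiplied by the predicted spend, and that the value "due to e-mail" is the difference between scoring a subscriber with the e-mail flag on and with it off. The customer records and function names are hypothetical; the coefficients are the published ones.

```python
import math

LOGIT_CONSTANT = -2.386
LOGIT_COEFS = {
    "orders_past_12_months": 0.324, "sqrt_catalogs_mailed": 0.293,
    "orders_13_plus_months": 0.084, "sqrt_recency_months": -0.187,
    "merchandise_divisions": 0.049, "web_ad_channels": -0.073,
    "email_subscriber": 0.110, "physical_channels": -0.070,
}
OLS_CONSTANT = -105.943
OLS_COEFS = {
    "average_order_value": 1.343, "orders_past_12_months": 73.483,
    "orders_13_plus_months": 6.384, "web_ad_channels": -18.483,
    "sqrt_catalogs_mailed": 14.395, "email_subscriber": 28.548,
}

def linear(constant, coefs, customer):
    return constant + sum(coef * customer[name] for name, coef in coefs.items())

def expected_value(customer):
    """Assumed combination: P(repurchase) * spend per repurchaser."""
    p = 1.0 / (1.0 + math.exp(-linear(LOGIT_CONSTANT, LOGIT_COEFS, customer)))
    return p * linear(OLS_CONSTANT, OLS_COEFS, customer)

def email_value(subscribers):
    """Sum of (value with e-mail flag on) minus (value with flag off)."""
    total = 0.0
    for customer in subscribers:
        without_email = dict(customer, email_subscriber=0)
        total += expected_value(customer) - expected_value(without_email)
    return total

# Two hypothetical twelve-month buyers who subscribe to e-mail.
subscribers = [
    {"orders_past_12_months": 2, "sqrt_catalogs_mailed": 4.0,
     "orders_13_plus_months": 1, "sqrt_recency_months": 2.0,
     "merchandise_divisions": 2, "web_ad_channels": 0,
     "email_subscriber": 1, "physical_channels": 1,
     "average_order_value": 100.0},
    {"orders_past_12_months": 1, "sqrt_catalogs_mailed": 3.0,
     "orders_13_plus_months": 0, "sqrt_recency_months": 3.0,
     "merchandise_divisions": 1, "web_ad_channels": 1,
     "email_subscriber": 1, "physical_channels": 2,
     "average_order_value": 60.0},
]

print(f"Modeled e-mail value: ${email_value(subscribers):,.2f}")
```

The total from a calculation like this is the number to set beside what the e-mail reporting system claims for the same audience.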

Web advertising channels are negative --- in this case, you don't want a customer buying because of paid search and affiliates and portal advertising. These customers might be doing too much comparison shopping and won't build long-term loyalty the same way as other customers do.

Similar trends happen in the spending model.

You see, we don't always need to build models for targeting purposes. We build models for strategic purposes, for communicating to CEOs, for determining a marketing strategy.

And the models frequently debunk established best practices, don't they?

This data suggests it is more important to get customers to buy from multiple merchandise divisions than from multiple channels. So why not focus on revamping an e-mail marketing program by offering multiple merchandise divisions in the creative template you are using? Why not focus on landing pages that offer multiple merchandise divisions?

This data suggests that catalog marketing has a point of diminishing returns. Why not test the appropriate number of catalogs to send to different customer segments?

This data suggests that customers age rapidly --- orders 13+ months ago are worth maybe 15% or 20% as much as recent orders --- and response decreases as months since last purchase increase. So when you have a customer who hasn't purchased in six months, think strategically about how you re-engage that customer --- or let the customer go and acquire a new one.
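
The aging effect can be checked directly against the published coefficients. This small sketch divides the 13+ month order coefficient by the twelve-month order coefficient in each model, and shows how the recency term grows with months since last purchase; the variable and function names are invented for illustration.

```python
import math

# Coefficients as published above.
LOGIT_RECENT, LOGIT_OLD = 0.324, 0.084   # orders past 12 months vs. 13+ months ago
OLS_RECENT, OLS_OLD = 73.483, 6.384
SQRT_RECENCY_COEF = -0.187

repurchase_ratio = LOGIT_OLD / LOGIT_RECENT
spend_ratio = OLS_OLD / OLS_RECENT
print(f"Older vs. recent order, repurchase model: {repurchase_ratio:.0%}")
print(f"Older vs. recent order, spending model:   {spend_ratio:.0%}")

def recency_penalty(months_since_last_purchase):
    """Logit contribution of the square-root recency term."""
    return SQRT_RECENCY_COEF * math.sqrt(months_since_last_purchase)

for months in (1, 6, 12):
    print(f"{months:>2} months since last purchase: {recency_penalty(months):+.3f}")
```

The two ratios, roughly a quarter in the repurchase model and under a tenth in the spending model, sit on either side of the 15% to 20% rule of thumb.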

This data suggests that physical channels ultimately have little meaning --- you don't care if the customer buys online and in stores --- you need to care about what the customer is buying, not the channel they are buying it from. Now channels may have importance if you cannot acquire customers in the catalog/phone channel that you currently acquire online. It's important to understand these dynamics --- and these are dynamics we seldom talk about.

Strategy, as outlined in this article, is sorely missing from multichannel marketing. We don't instruct our statisticians, if we're blessed enough to be able to afford one, to create strategic models. And we sure don't lead our co-ops, who have these answers and many, many more embedded in their databases. We don't demand answers from them; we just pay them for access to inexpensive prospects. The large web analytics vendors can help us with this as well.

The title of this post is "False Metrics". We are surrounded by false metrics.

"The open rate of the e-mail was 19%, therefore, it failed".
"Our website conversion rate increased to 4.04383%, so we're doing better".
"Multichannel customers are the best customers".
"The matchback algorithm says catalogs drive 80% of web sales, so catalogs matter."
"Paid search isn't working, but a lot of catalog buyers seem to be using it."

We'll move our industry forward when we start focusing on strategic analyses that parse customer data into understandable and actionable components.

4 comments:

Anonymous, 9:06 AM
Good post Kevin.

I find the biggest problem is debunking a lot of these "myths". I think one of the worst roadblocks in doing so is working against the measurement systems that email/search marketers use. These tools (typically web analytics tools like Omniture), provide no visibility into the true (read incremental) impact of these programs/campaigns.

Before even thinking about trying to make these programs effective there is so much education that needs to take place that it makes you wonder if anyone truly understands how to measure.

Thanks for doing your part in moving things on the right path.
MineThatData, 9:53 AM
We understand what our software tools tell us to understand.

In many ways, software is at step 5 of a 100 step process to get us to truly understanding customer behavior.
Mike, 2:46 PM
In my experience, multichannel customers were found to have spent the most because they (by default) had made more than 1 purchase! While a silly mistake, I observed some high ranking professionals make this error.

I am not surprised by your findings Kevin, however I'd add that there are possible exceptions. Certain brands convey their message/products/image better in specific channels and I'd expect customer behavior to change after a purchase. (For example, a high end apparel retailer may be able to display superior service and "convert" you to a more loyal customer after a retail transaction.)
MineThatData, 2:56 PM
Oh sure, there's always exceptions. No two companies have customers that behave the same way.

Kevin Hillstrom, President, MineThatData

Kevin is President of MineThatData, a consultancy that helps CEOs understand the complex relationship between Customers, Advertising, Products, Brands, and Channels. Kevin supports a diverse set of clients, including internet startups, thirty million dollar catalog merchants, international brands, and billion dollar multichannel retailers. Kevin is frequently quoted in the mainstream media, including the New York Times, Boston Globe, and Forbes Magazine.

Prior to founding MineThatData, Kevin held various roles at leading multichannel brands, including Vice President of Database Marketing at Nordstrom, Director of Circulation at Eddie Bauer, and Manager of Analytical Services at Lands' End.

You may contact Kevin at kevinh@minethatdata.com.

