Kevin Hillstrom: MineThatData: A/B Testing: Here's An Example

November 09, 2010

A/B Testing: Here's An Example

Here's an example of what I see, over and over and over again, when evaluating A/B tests within the e-mail marketing genre.

Say you have a list of 500,000 e-mail addresses. You send your standard campaign on a Monday. Later in the week, you tabulate your results:

500,000 recipients.
20% open rate = 100,000.
Of the opens, 20% click through to the website = 20,000 visit website.
Of the clicks, 5% convert and buy something = 1,000 orders.
Average Order Value = $100.
Total Demand = 1,000 * $100 = $100,000.
Demand per Recipient = $100,000 / 500,000 = $0.20.
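The funnel arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation; the function name `funnel_demand` and its parameter names are made up for this example, and the inputs are the figures from the campaign described above.

```python
def funnel_demand(recipients, open_rate, click_rate, conversion_rate, avg_order_value):
    """Walk the open/click/conversion funnel and return each stage.

    Hypothetical helper for illustration; the names are not from any library.
    """
    opens = recipients * open_rate          # recipients who open the e-mail
    clicks = opens * click_rate             # openers who click through
    orders = clicks * conversion_rate       # clickers who buy something
    demand = orders * avg_order_value       # total dollars attributed
    return {
        "opens": opens,
        "clicks": clicks,
        "orders": orders,
        "demand": demand,
        "demand_per_recipient": demand / recipients,
    }


# Figures from the example: 500,000 recipients, 20% open, 20% click,
# 5% convert, $100 average order value.
result = funnel_demand(500_000, 0.20, 0.20, 0.05, 100.0)
print(f"Orders: {result['orders']:,.0f}")
print(f"Total demand: ${result['demand']:,.0f}")
print(f"Demand per recipient: ${result['demand_per_recipient']:.2f}")
```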

Here's one of the usual outcomes when you measure e-mail marketing campaigns via A/B tests. You hold out a large group of customers so that you can measure spend with confidence.

Mailed Group = 400,000 Recipients, $300,000 spent = $0.75 per customer.
Holdout Group = 100,000 Held Out, $45,000 spent = $0.45 per customer.
Incremental Lift = $0.75 - $0.45 = $0.30 per customer.
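As a rough sketch, the mail/holdout comparison above boils down to two divisions and a subtraction. The function names here are hypothetical and chosen only for this illustration; the inputs are the mailed and holdout figures from the example.

```python
def demand_per_customer(total_demand, customers):
    """Average dollars spent per customer in a group."""
    return total_demand / customers


def incremental_lift(mailed_demand, mailed_customers, holdout_demand, holdout_customers):
    """Per-customer spend in the mailed group minus per-customer spend in the holdout group."""
    mailed = demand_per_customer(mailed_demand, mailed_customers)
    holdout = demand_per_customer(holdout_demand, holdout_customers)
    return mailed - holdout


# Figures from the example: 400,000 mailed customers spent $300,000,
# 100,000 held-out customers spent $45,000.
lift = incremental_lift(300_000, 400_000, 45_000, 100_000)
print(f"Incremental lift: ${lift:.2f} per customer")
```

Note that the holdout group is compared on a per-customer basis, so the two groups do not need to be the same size.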

By the way, yes, I realize many of you want to apply significance tests and confidence intervals and all that stuff. Go ahead and do so.

This is why I'm not a fan of open/click/conversion. A mail/holdout test proves the actual value of an e-mail marketing campaign. In this case, we observe $0.30 of lift per customer, whereas open/click/conversion credits the campaign with only $0.20 per recipient.

E-mail marketers, why would you not want to know that your campaigns are working 50% better than when measured via open/click/conversion?

Just as often, the results aren't optimistic.

Mailed Group = 400,000 Recipients, $300,000 spent = $0.75 per customer.
Holdout Group = 100,000 Held Out, $75,000 spent = $0.75 per customer.
Incremental Lift = $0.75 - $0.75 = $0.00 per customer.

So often, open/click/conversion takes credit for orders that would have happened anyway. This is a very difficult concept for the non-testing audience to grasp. You see, customers will order regardless of whether you market to them or not. In some companies, more than 80% of orders will happen without marketing. In other companies, less than 20% of orders will happen without marketing. I've measured both instances. Strategically, you end up taking very different marketing approaches with outcomes on either end.
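One way to put a number on "would have happened anyway" is to divide holdout spend per customer by mailed spend per customer. This is a sketch under an assumption: it uses demand per customer as a stand-in for orders, and the function name `organic_share` is invented for this example.

```python
def organic_share(holdout_per_customer, mailed_per_customer):
    """Fraction of mailed-group spend that also shows up without the e-mail.

    Assumption: demand per customer is used as a proxy for orders.
    """
    return holdout_per_customer / mailed_per_customer


# The two outcomes from the examples in this post.
optimistic = organic_share(0.45, 0.75)   # holdout spent $0.45, mailed spent $0.75
pessimistic = organic_share(0.75, 0.75)  # holdout spent as much as mailed

print(f"Optimistic outcome: {optimistic:.0%} of demand happens without the e-mail")
print(f"Pessimistic outcome: {pessimistic:.0%} of demand happens without the e-mail")
```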

Here's another tidbit. You usually see the $0.30 outcome, or you see the $0.00 outcome ... you seldom see the numbers tie out with opens/clicks/converts. Furthermore, there isn't a ton of variability ... so if you start to see the $0.30 outcome, you're likely to see a result that is consistently better than opens/clicks/converts, or vice versa. Consistency of results will happen if you pick a control group that is large enough to be stable. You don't want a control group of 5,000 customers. You need big numbers in order to get "big reliability"!
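To see why group size matters, here is a rough sketch of how noisy the holdout's spend per customer would be at two sizes. It rests on simplifying assumptions that are not from the tests above: every order is worth exactly $100 (the average order value from the first example), each customer either places one order or none, and the holdout averages $0.45 per customer. Real order values vary, so treat the output as a ballpark only.

```python
import math


def demand_standard_error(purchase_rate, order_value, group_size):
    """Standard error of average spend per customer.

    Assumes each customer independently buys once (with probability
    purchase_rate) or not at all, and every order has the same value.
    """
    per_customer_sd = order_value * math.sqrt(purchase_rate * (1.0 - purchase_rate))
    return per_customer_sd / math.sqrt(group_size)


ORDER_VALUE = 100.0      # assumed fixed order value
HOLDOUT_DEMAND = 0.45    # holdout spend per customer from the example
purchase_rate = HOLDOUT_DEMAND / ORDER_VALUE  # implied share of customers who buy

se_small = demand_standard_error(purchase_rate, ORDER_VALUE, 5_000)
se_large = demand_standard_error(purchase_rate, ORDER_VALUE, 100_000)

for size, se in ((5_000, se_small), (100_000, se_large)):
    margin = 1.96 * se   # approximate 95% margin of error
    print(f"holdout of {size:,}: ${HOLDOUT_DEMAND:.2f} +/- ${margin:.2f}")
```

Under these assumptions the margin of error is roughly $0.19 per customer with 5,000 customers and roughly $0.04 with 100,000. A margin of $0.19 is of the same order as the $0.30 lift being measured, which is the kind of instability a small control group invites.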

This is why you have to execute A/B or multivariate or factorial tests. You need to measure how much of your business will happen without marketing. Classic open/click/conversion metrics really struggle with this topic.

4 comments:

Mark Price, 8:19 AM
Kevin -- thanks for clearly pointing out the benefits of holdouts or control groups. In my experience, control groups are even more important when the marketing mix is more complex, which it usually is. Customers receiving emails also receive direct mail pieces, and potentially forms of mass media as well.

A rigorous control group methodology is critical to identifying the effects of a particular communication program. If not, then marketers are left at the end of the year "holding the bag" when the CMO asks for incrementality of each of their efforts.
MineThatData, 8:23 AM
Yup, you get good results when you use control groups across disciplines, good point!
Mark Price, 6:16 PM
Kevin -- how successful have you been at teasing out the benefits of different types of direct-to-customer communications that may be occurring simultaneously? For example 2-3 emails that land at the same time as a direct mail piece. Do you set up staggered control groups to determine a lift, or do you have another approach?
MineThatData, 7:26 PM
I have several catalog clients who execute 3-month long factorial designs ... 4 groups ... one group gets catalog+email, one gets catalog, one gets email, one gets nothing.

You do those kinds of tests, and you quickly learn what impact catalogs and email have on search, and what impact catalogs and email have on each other. All of it is fascinating!

Kevin Hillstrom, President, MineThatData

Kevin is President of MineThatData, a consultancy that helps CEOs understand the complex relationship between Customers, Advertising, Products, Brands, and Channels. Kevin supports a diverse set of clients, including internet startups, thirty million dollar catalog merchants, international brands, and billion dollar multichannel retailers. Kevin is frequently quoted in the mainstream media, including the New York Times, Boston Globe, and Forbes Magazine.

Prior to founding MineThatData, Kevin held various roles at leading multichannel brands, including Vice President of Database Marketing at Nordstrom, Director of Circulation at Eddie Bauer, and Manager of Analytical Services at Lands' End.

You may contact Kevin at kevinh@minethatdata.com.

