## November 09, 2010

### A/B Testing: Here's An Example

Here's an example of what I see, over and over and over again, when evaluating A/B tests within the e-mail marketing genre.

Say you have a list of 500,000 e-mail addresses.  You send your standard campaign on a Monday.  Later in the week, you tabulate your results:
• 500,000 recipients.
• 20% open rate = 100,000.
• Of the opens, 20% click through to the website = 20,000 visit website.
• Of the clicks, 5% convert and buy something = 1,000 orders.
• Average Order Value = \$100.
• Total Demand = 1,000 * \$100 = \$100,000.
• Demand per Recipient = \$100,000 / 500,000 = \$0.20.
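The funnel arithmetic above can be sketched in a few lines of Python. The function and variable names are illustrative assumptions, not part of any particular e-mail platform; the inputs are the figures from the list.

```python
def funnel_metrics(recipients, open_rate, click_rate, conversion_rate, average_order_value):
    """Walk the open/click/conversion funnel and return each stage.

    All names here are hypothetical; rates are fractions (0.20 means 20%).
    """
    opens = recipients * open_rate               # recipients who open the e-mail
    clicks = opens * click_rate                  # openers who click through
    orders = clicks * conversion_rate            # clickers who buy something
    demand = orders * average_order_value        # total demand credited to the campaign
    return {
        "opens": opens,
        "clicks": clicks,
        "orders": orders,
        "demand": demand,
        "demand_per_recipient": demand / recipients,
    }


# Figures from the example campaign above.
funnel = funnel_metrics(
    recipients=500_000,
    open_rate=0.20,
    click_rate=0.20,
    conversion_rate=0.05,
    average_order_value=100.0,
)

for name, value in funnel.items():
    print(f"{name}: {value:,.2f}")
```

Note that this number is only what the funnel attributes to the campaign; it says nothing about what those customers would have spent without the e-mail.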
Here's one of the usual outcomes when measuring e-mail marketing campaigns via A/B tests.  You hold out a large group, here 100,000 of the 500,000 addresses, so that you can measure spend in each group with confidence.
• Mailed Group = 400,000 Recipients, \$300,000 spent = \$0.75 per customer.
• Holdout Group = 100,000 Held Out, \$45,000 spent = \$0.45 per customer.
• Incremental Lift = \$0.75 - \$0.45 = \$0.30 per customer.
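The mail/holdout comparison above is just a difference of two per-customer averages. Here is a minimal sketch, assuming you already have total spend and customer counts for each group; the function names are made up for illustration.

```python
def spend_per_customer(total_spend, customers):
    """Average spend per customer in one test group."""
    return total_spend / customers


def incremental_lift(mailed_spend, mailed_customers, holdout_spend, holdout_customers):
    """Mailed spend per customer minus holdout spend per customer."""
    mailed = spend_per_customer(mailed_spend, mailed_customers)
    holdout = spend_per_customer(holdout_spend, holdout_customers)
    return mailed - holdout


# Figures from the mail/holdout test above.
lift = incremental_lift(
    mailed_spend=300_000, mailed_customers=400_000,
    holdout_spend=45_000, holdout_customers=100_000,
)

# Incremental demand generated by the campaign across the mailed group.
incremental_demand = lift * 400_000

print(f"Lift per customer: {lift:.2f}")
print(f"Incremental demand: {incremental_demand:,.0f}")
```

Multiplying the per-customer lift by the 400,000 mailed customers gives the demand the campaign actually added, as opposed to the demand it merely touched.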
By the way, yes, I realize many of you want to apply significance tests and confidence intervals and all that stuff.  Go ahead and do so.

This is why I'm not a fan of open/click/conversion.  A mail/holdout test proves the actual value of an e-mail marketing campaign.  In this case, we observe \$0.30 of lift per customer, whereas open/click/conversion credits the campaign with only \$0.20 per recipient.

E-mail marketers, why would you not want to know that your campaigns are working 50% better (\$0.30 vs. \$0.20) than when measured via open/click/conversion?

Just as often, the results aren't so encouraging.
• Mailed Group = 400,000 Recipients, \$300,000 spent = \$0.75 per customer.
• Holdout Group = 100,000 Held Out, \$75,000 spent = \$0.75 per customer.
• Incremental Lift = \$0.75 - \$0.75 = \$0.00 per customer.
So often, open/click/conversion takes credit for orders that would have happened anyway.  This is a very difficult concept for the non-testing audience to grasp.  You see, customers will order regardless of whether you market to them or not.  In some companies, more than 80% of orders will happen without marketing.  In other companies, less than 20% of orders will happen without marketing.  I've measured both instances.  Strategically, you end up taking very different marketing approaches depending on which end of that range you land on.
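One way to put a number on "orders that would have happened anyway" is to divide holdout spend per customer by mailed spend per customer. This is a rough sketch, and it assumes the holdout group is a fair stand-in for what mailed customers would have spent with no e-mail; the function name is hypothetical.

```python
def organic_share(mailed_per_customer, holdout_per_customer):
    """Fraction of mailed-group spend that would have happened without the e-mail.

    Assumes the holdout group represents the no-marketing baseline.
    """
    return holdout_per_customer / mailed_per_customer


# The two scenarios from this post: (mailed, holdout) spend per customer.
scenarios = {
    "lift of 0.30": (0.75, 0.45),
    "lift of 0.00": (0.75, 0.75),
}

shares = {
    label: organic_share(mailed, holdout)
    for label, (mailed, holdout) in scenarios.items()
}

for label, share in shares.items():
    print(f"{label}: {share:.0%} of spend happens without the e-mail")
```

In the first scenario roughly 60% of the mailed group's spend is organic; in the second, all of it is.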

Here's another tidbit.  You usually see the \$0.30 outcome, or you see the \$0.00 outcome ... you seldom see the numbers tie out with opens/clicks/converts.  Furthermore, there isn't a ton of variability ... so if you start to see the \$0.30 outcome, you're likely to keep seeing results that are consistently better than opens/clicks/converts, and if you start to see the \$0.00 outcome, you're likely to keep seeing results that are consistently worse.  Consistency of results will happen if you pick a control group that is large enough to be stable.  You don't want a control group of 5,000 customers.  You need big numbers in order to get "big reliability"!
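To see why a 5,000-customer control group is shaky, here is a rough sketch of the standard error of the lift at two holdout sizes. The \$10 standard deviation of per-customer spend is a made-up assumption for illustration (plug in your own), and the formula assumes independent groups with the same spread.

```python
import math


def lift_standard_error(spend_std_dev, mailed_customers, holdout_customers):
    """Standard error of (mailed mean - holdout mean) for independent groups.

    Assumes both groups share the same per-customer standard deviation.
    """
    return spend_std_dev * math.sqrt(1 / mailed_customers + 1 / holdout_customers)


ASSUMED_STD_DEV = 10.0   # hypothetical per-customer spend std dev, not from the post
LIST_SIZE = 500_000

standard_errors = {}
for holdout in (5_000, 100_000):
    mailed = LIST_SIZE - holdout
    standard_errors[holdout] = lift_standard_error(ASSUMED_STD_DEV, mailed, holdout)

for holdout, se in standard_errors.items():
    # 1.96 standard errors gives an approximate 95% margin of error.
    print(f"holdout {holdout:>7,}: lift is +/- {1.96 * se:.2f} per customer")
```

Under these assumptions, the margin of error with a 5,000-customer holdout is about as large as the \$0.30 lift itself, while a 100,000-customer holdout shrinks it to a few cents.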

This is why you have to execute A/B or multivariate or factorial tests.  You need to measure how much of your business will happen without marketing.  Classic open/click/conversion metrics really struggle with this topic.