Kevin Hillstrom: MineThatData: Test Outliers

May 24, 2017

Test Outliers

I've been recording my blood pressure readings. Let me show you the last seven systolic readings.

135
162
120
110
127
117
124

One number sticks out, right? It's that 162 number.

Here is the average including the 162 number: 895 / 7 ≈ 128.

Here is the average excluding the 162 number: 733 / 6 ≈ 122.

Which number is "right"?

128 or 122

If you have a credible reason for throwing out the 162 figure, then 122 is right. If you have a credible reason for keeping it, then 128 is right.
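If you want to check the arithmetic yourself, here is a minimal Python sketch using the seven readings above. The variable names are mine, and dropping the reading by its value is only the simplest way to show the comparison, not a recommended rule for deciding what counts as an outlier.

```python
from statistics import mean

# The seven systolic readings from the post.
readings = [135, 162, 120, 110, 127, 117, 124]
suspect = 162  # the reading that sticks out

# Average with every reading kept.
mean_with = mean(readings)

# Average after dropping the suspect reading.
mean_without = mean([r for r in readings if r != suspect])

print(f"Including {suspect}: {mean_with:.1f}")
print(f"Excluding {suspect}: {mean_without:.1f}")
```

Rounded to whole numbers, those are the 128 and 122 figures.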

The same logic applies to the tests you perform.

We all see our tests ruined by outliers. Here's a common one. Here are average order values for customers who purchased in a test.

$119.
$84.
$99.
$79.
$143.
$21,477.

Why in the name of Snedecor and Cochran would you include the $21,477 order in your results?

Well, you'd keep it in there if 15% of all orders were $21,477 or greater.

But if 0.1% of all orders are $21,477 or greater? You adjust it down ... change it to $150 or whatever the 99th percentile is for average order values.
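Here is one way the adjustment could look in Python, as a rough sketch. The six test orders come from the list above, but `all_orders` is made-up reference data that I built so its 99th percentile lands on $150, and the function names and the nearest-rank percentile method are my own assumptions. Swap in your own order history and whichever percentile definition your shop already uses.

```python
from statistics import mean


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile (integer pct) using the nearest-rank method."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, so there is no floating-point drift.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def cap_orders(test_orders, reference_orders, pct=99):
    """Cap each test order at the pct-th percentile of the reference orders."""
    cap = percentile_nearest_rank(reference_orders, pct)
    return [min(value, cap) for value in test_orders], cap


# Order values from the test in the post.
test_orders = [119, 84, 99, 79, 143, 21477]

# HYPOTHETICAL order history: 985 ordinary orders, 14 at $150, one huge order.
all_orders = [60 + (i % 81) for i in range(985)] + [150] * 14 + [21477]

capped, cap = cap_orders(test_orders, all_orders)

print(f"Cap (99th percentile): ${cap}")
print(f"Average before capping: ${mean(test_orders):,.2f}")
print(f"Average after capping:  ${mean(capped):,.2f}")
```

The big order stays in the results as a sale. It just stops counting for more than the 99th percentile.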

I'm confident few of you are adjusting for outliers.

And then you wonder why your test results are all over the board?

I know, I know, you don't have the coding chops to remove outliers, and you don't want to invest a half-year learning how to code, so you want a rule of thumb that you can apply. OK, try this one on for size. If you are concerned about large orders influencing your test results, analyze response rate / conversion rate. If response/conversion results are significantly different from spending results, you have an outlier problem. If you have an outlier problem?

Measure the difference in response/conversion between the test and control groups. Say it is 6%.
Average the average order values of the test and control groups.
Apply that "average" average order value to both groups.
This leaves you with a 6% difference in spend between the two groups.
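The steps above can be sketched in a few lines of Python. Every number here is hypothetical (a 5.0% control response rate, a 5.3% test response rate, and a test average order value inflated by one huge order), and the function name is mine. I also assume "average your average order values" means a simple average of the two group figures; weighting by order count would be a reasonable alternative.

```python
def adjusted_spend(test_rate, control_rate, test_aov, control_aov):
    """Apply one shared average order value to both groups.

    Returns (test spend per customer, control spend per customer, relative lift).
    """
    # Simple average of the two group AOVs (an assumption; see the note above).
    pooled_aov = (test_aov + control_aov) / 2

    # Spend per customer = response rate * average order value.
    test_spend = test_rate * pooled_aov
    control_spend = control_rate * pooled_aov

    # With a shared AOV, the spend lift equals the response lift.
    lift = test_spend / control_spend - 1
    return test_spend, control_spend, lift


# HYPOTHETICAL inputs: test responds 6% better; its AOV is distorted by an outlier.
test_spend, control_spend, lift = adjusted_spend(
    test_rate=0.053, control_rate=0.050, test_aov=310.0, control_aov=110.0
)

# Unadjusted comparison, for contrast: the outlier dominates the result.
raw_lift = (0.053 * 310.0) / (0.050 * 110.0) - 1

print(f"Raw spend lift:      {raw_lift:.0%}")
print(f"Adjusted spend lift: {lift:.0%}")
```

The raw lift is driven almost entirely by the outlier. The adjusted lift equals the response difference by construction, which is the point of the rule of thumb.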

Thoughts?


Kevin Hillstrom, President, MineThatData

Kevin is President of MineThatData, a consultancy that helps CEOs understand the complex relationship between Customers, Advertising, Products, Brands, and Channels. Kevin supports a diverse set of clients, including internet startups, thirty million dollar catalog merchants, international brands, and billion dollar multichannel retailers. Kevin is frequently quoted in the mainstream media, including the New York Times, Boston Globe, and Forbes Magazine.

Prior to founding MineThatData, Kevin held various roles at leading multichannel brands, including Vice President of Database Marketing at Nordstrom, Director of Circulation at Eddie Bauer, and Manager of Analytical Services at Lands' End.

You may contact Kevin at kevinh@minethatdata.com.

