November 11, 2020

-188 + 192*x

It was 1991 at Lands' End. We were greatly ramping up our testing work. And when we wanted to execute a test, we needed to understand how the results might "vary".

Back then, a test was sampled from the population who would receive a catalog. Maybe that audience was 4,000,000 customers. If the catalog was a productive catalog, it might generate $10.00 per catalog mailed. If the catalog wasn't productive, circulation would be reduced and the catalog might generate $4.00 per catalog mailed.

If you want to measure a 10% difference in sales for two groups performing around $4.00 per book, you need fewer customers than if you are trying to measure a 10% difference in sales for two groups performing around $10.00 per book. This is an issue called "heteroscedasticity".

So I built an equation that measured variability around different dollar-per-book estimates. The equation was a simple one:

  • -188 + 192*(Expected Dollar per Book).
If we expected one group to generate $4.00 per book and the control group to generate $3.60 per book, we'd calculate the variability at each point estimate:
  • $4.00: -188 + 192*4.00 = 580.
  • $3.60: -188 + 192*3.60 = 503.
Then we'd enter the data into our statistical equation, here with a sample size of 25,000.
  • (4.00 - 3.60) / SQRT(580/25000 + 503/25000).
  • T = 1.92.
As long as T > 2.00, we would execute the test with the sample size entered into the equation.

In this case, T fell short of 2.00, so the sample size was too small and we had to increase it, here to 30,000.

  • (4.00 - 3.60) / SQRT(580/30000 + 503/30000).
  • T = 2.11.
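
If you'd rather script the arithmetic above than run it by hand, here is a minimal Python sketch. The function names are mine, not anything we used in 1991, and I'm assuming the 25,000 and 30,000 figures are customers per group. The sketch also keeps the unrounded variance (503.2 instead of 503), which doesn't change the rounded T values.

```python
import math


def variance_per_book(dollars_per_book: float) -> float:
    """Variability estimate from the post's equation: -188 + 192 * x."""
    return -188.0 + 192.0 * dollars_per_book


def t_statistic(test_dpb: float, control_dpb: float, n_per_group: int) -> float:
    """T for the gap between two groups of equal size (assumed per-group n)."""
    var_test = variance_per_book(test_dpb)
    var_control = variance_per_book(control_dpb)
    standard_error = math.sqrt(var_test / n_per_group + var_control / n_per_group)
    return (test_dpb - control_dpb) / standard_error


def min_sample_size(test_dpb: float, control_dpb: float,
                    t_threshold: float = 2.0, step: int = 1000) -> int:
    """Smallest per-group size, in multiples of `step`, with T above the threshold."""
    if test_dpb <= control_dpb:
        raise ValueError("test group must be expected to outperform control")
    n = step
    while t_statistic(test_dpb, control_dpb, n) <= t_threshold:
        n += step
    return n


print(round(t_statistic(4.00, 3.60, 25000), 2))  # 1.92
print(round(t_statistic(4.00, 3.60, 30000), 2))  # 2.11
print(min_sample_size(4.00, 3.60))               # 28000
```

Searching in steps of 1,000 is just a convenience choice for this sketch. Use a smaller step if you want a tighter answer.
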
You probably already have a calculator that you enjoy using. If not, contact me and we'll get something set up for your data at minimal cost (kevinh@minethatdata.com).
