November 11, 2020

-188 + 192*x

It was 1991 at Lands' End. We were greatly ramping up our testing work. And when we wanted to execute a test, we needed to understand how the results might "vary".

Back then, a test was sampled from the population who would receive a catalog. Maybe that audience was 4,000,000 customers. If the catalog was a productive catalog, it might generate $10.00 per catalog mailed. If the catalog wasn't productive, circulation would be reduced and the catalog might generate $4.00 per catalog mailed.

If you want to measure a 10% difference in sales for two groups performing around $4.00 per book, you need a different number of customers than if you are trying to measure a 10% difference in sales for two groups performing around $10.00 per book, because the variability of sales changes with the level of dollars per book. This is an issue called "heteroscedasticity".

So I built an equation that measured variability around different dollar-per-book estimates. The equation was a simple one:

  • -188 + 192*(Expected Dollar per Book).
If we expected one group to generate $4.00 per book and the control group to generate $3.60 per book, we'd calculate the variability at each point estimate:
  • At $4.00: -188 + 192*4.00 = 580.
  • At $3.60: -188 + 192*3.60 = 503.
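For anyone who would rather script this step, here is a minimal Python sketch of the variability equation. The function name `variance_at` is my own label for illustration, not something from the original 1991 work, and the coefficients are simply the ones quoted above.

```python
def variance_at(expected_dollars_per_book: float) -> float:
    """Variability estimate from the post's equation: -188 + 192 * x."""
    return -188 + 192 * expected_dollars_per_book


# Reproduce the two point estimates from the example.
for dollars in (4.00, 3.60):
    estimate = variance_at(dollars)
    print(f"${dollars:.2f} per book -> variability of about {estimate:.0f}")
```

One caution if you reuse it: the line turns negative below roughly $0.98 per book (188 / 192), so it only makes sense for estimates above that level.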
Then we'd enter the data into our statistical equation, here with 25,000 customers in each group.
  • (4.00 - 3.60) / SQRT(580/25000 + 503/25000).
  • T = 1.92.
As long as T > 2.00, we would execute the test with the sample size used in the equation.

In this case, T = 1.92 fell short of 2.00, so the sample size was too small and we had to increase it.

  • (4.00 - 3.60) / SQRT(580/30000 + 503/30000).
  • T = 2.11.
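The two calculations above can be sketched in Python as follows. This is an illustration under stated assumptions, not the original Lands' End tooling: the function names are hypothetical, both groups are assumed to be the same size, and the 1,000-customer step used in the search is my own choice.

```python
import math


def t_statistic(mean_a, mean_b, var_a, var_b, n_per_group):
    """T = (mean_a - mean_b) / SQRT(var_a/n + var_b/n), equal group sizes."""
    standard_error = math.sqrt(var_a / n_per_group + var_b / n_per_group)
    return (mean_a - mean_b) / standard_error


def smallest_passing_sample(mean_a, mean_b, var_a, var_b,
                            start=25_000, step=1_000, threshold=2.00):
    """Grow the per-group sample by `step` until T exceeds `threshold`."""
    if mean_a <= mean_b:
        raise ValueError("mean_a must exceed mean_b for T to reach the threshold")
    n = start
    while t_statistic(mean_a, mean_b, var_a, var_b, n) <= threshold:
        n += step
    return n


# The two cases from the example: 25,000 and 30,000 customers per group.
for n in (25_000, 30_000):
    t = t_statistic(4.00, 3.60, 580, 503, n)
    print(f"n = {n:,} per group -> T = {t:.2f}")

# First per-group sample size, in steps of 1,000, that clears T > 2.00.
print(smallest_passing_sample(4.00, 3.60, 580, 503))
```

Stepping up from 25,000 in increments of 1,000, the search stops at 28,000 per group, so the 30,000 used above clears the T > 2.00 rule with some room to spare.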
You probably already have a calculator that you enjoy using. If not, contact me and we'll get something set up for you for your data at minimal cost (kevinh@minethatdata.com).
