## November 11, 2020

### -188 + 192*x

It was 1991 at Lands' End. We were greatly ramping up our testing work, and when we wanted to execute a test, we needed to understand how the results might "vary".

Back then, a test was sampled from the population who would receive a catalog. Maybe that audience was 4,000,000 customers. If the catalog was a productive catalog, it might generate \$10.00 per catalog mailed. If the catalog wasn't productive, circulation would be reduced and the catalog might generate \$4.00 per catalog mailed.

If you want to measure a 10% difference in sales for two groups performing around \$4.00 per book, you need fewer customers than if you are trying to measure a 10% difference in sales for two groups performing around \$10.00 per book. The variability of the results changes with the level of sales, an issue called "heteroscedasticity".

So I built an equation that measured variability around different dollar-per-book estimates. The equation was a simple one:

• -188 + 192*(Expected Dollar per Book).

If we expected one group to generate \$4.00 per book and the control group to generate \$3.60 per book, we'd calculate the variability at each point estimate:

• \$4.00: -188 + 192*4.00 = 580.
• \$3.60: -188 + 192*3.60 = 503.

Then we'd enter the data into our statistical equation, in this case with 25,000 customers per group.

• (4.00 - 3.60) / SQRT(580/25000 + 503/25000).
• T = 1.92.

As long as T > 2.00, we would execute the test with the sample size used in the equation.
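
If you would rather script the arithmetic than punch it into a calculator, here is a minimal Python sketch of the steps above. The function names (`variance_at`, `t_statistic`) are hypothetical and chosen only for illustration. The coefficients come straight from the equation above, and they were fit to one company's data, so treat them as an assumption for anybody else's business.

```python
from math import sqrt


def variance_at(dollars_per_book):
    """Variability estimate from the equation in this post: -188 + 192 * x."""
    return -188 + 192 * dollars_per_book


def t_statistic(test_dpb, control_dpb, n_per_group):
    """Difference in dollars per book divided by its standard error.

    Assumes both groups have the same number of customers (n_per_group).
    """
    se_squared = (variance_at(test_dpb) / n_per_group
                  + variance_at(control_dpb) / n_per_group)
    return (test_dpb - control_dpb) / sqrt(se_squared)


print(variance_at(4.00))                            # 580.0
print(round(t_statistic(4.00, 3.60, 25_000), 2))    # 1.92
```

The sketch does not round the \$3.60 variance down to 503 the way the hand calculation does, but the result still rounds to the same T of 1.92.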

In this case T fell short of 2.00 with 25,000 customers per group, so the sample size was too small and we had to increase it to 30,000 per group.

• (4.00 - 3.60) / SQRT(580/30000 + 503/30000).
• T = 2.11.
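
The trial-and-error step of bumping the sample size until T clears 2.00 can also be sketched in Python. This is only an illustration under stated assumptions: the helper name `smallest_passing_size`, the starting point of 25,000, and the step of 1,000 customers per group are all hypothetical choices of mine, not part of the original process. The variance equation is the one from this post.

```python
from math import sqrt


def variance_at(dollars_per_book):
    """Variability estimate from the equation in this post: -188 + 192 * x."""
    return -188 + 192 * dollars_per_book


def t_statistic(test_dpb, control_dpb, n_per_group):
    """Difference in dollars per book divided by its standard error."""
    se_squared = (variance_at(test_dpb) + variance_at(control_dpb)) / n_per_group
    return (test_dpb - control_dpb) / sqrt(se_squared)


def smallest_passing_size(test_dpb, control_dpb,
                          start=25_000, step=1_000, threshold=2.00):
    """Grow the per-group sample in fixed steps until T exceeds the threshold.

    start, step, and threshold are illustrative defaults (assumptions).
    """
    if test_dpb <= control_dpb:
        raise ValueError("test_dpb must exceed control_dpb for T to reach the threshold")
    n = start
    while t_statistic(test_dpb, control_dpb, n) <= threshold:
        n += step
    return n


print(round(t_statistic(4.00, 3.60, 30_000), 2))   # 2.11
print(smallest_passing_size(4.00, 3.60))           # 28000
```

With these assumptions the search stops at 28,000 per group, so the 30,000 used above clears the T > 2.00 bar with some room to spare.
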
You probably already have a calculator that you enjoy using. If not, contact me and we'll get something set up for you for your data at minimal cost (kevinh@minethatdata.com).