June 24, 2008

Geek Alert!! Channel Preference And The Hyperbolic Tangent Function

The final step of most of my Multichannel Forensics projects involves predicting channel preference.

This is an important step, because customers who are likely to purchase from self-serve channels in the future (online, stores) may require less advertising than customers who purchase from full-service channels (catalog ordering over the telephone).

A handy mathematical transformation for estimating channel preference in a two-channel situation is the "Hyperbolic Tangent Function" (this was used extensively at Lands' End in the early 1990s to isolate customers likely to return most of their merchandise, allowing us to suppress mailings to these folks).

In the modeling process, you assign your dependent variable a value of -0.999 for one channel (telephone) and +0.999 for the other channel (online). If a customer splits dollars evenly across both channels, the value is 0. If the customer splits dollars 2/3 phone, 1/3 online, you take a weighted average, yielding -0.333.
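Here is a minimal sketch of that coding step in Python. The function name and the dollar amounts are made up for illustration; the only things taken from the description above are the -0.999 / +0.999 endpoints and the dollar-weighted average.

```python
def channel_preference(phone_dollars, online_dollars, cap=0.999):
    """Dollar-weighted average of -cap (phone) and +cap (online)."""
    total = phone_dollars + online_dollars
    if total <= 0:
        raise ValueError("customer has no spend to split across channels")
    phone_share = phone_dollars / total
    online_share = online_dollars / total
    return phone_share * -cap + online_share * cap

# Hypothetical customers: all phone, an even split, and 2/3 phone with 1/3 online.
for phone, online in [(300.0, 0.0), (150.0, 150.0), (200.0, 100.0)]:
    print(phone, online, round(channel_preference(phone, online), 3))
```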

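Then you transform the dependent variable ... (0.5)*LN((1+x)/(1-x)), where x is the value listed above. Strictly speaking, this formula is the inverse of the hyperbolic tangent (atanh); the hyperbolic tangent itself is what converts a model score back to a preference value after scoring.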
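As a rough illustration, the transformation can be written as a small Python function. The function name is mine, not part of any library. The comparison against `math.atanh` simply confirms that the formula matches the standard inverse hyperbolic tangent. The range check also shows why the endpoints are coded as 0.999 rather than 1: at exactly 1 the formula divides by zero.

```python
import math

def transform_preference(x):
    """Apply 0.5 * ln((1 + x) / (1 - x)) to a coded channel value."""
    if not -1.0 < x < 1.0:
        raise ValueError("x must lie strictly between -1 and 1")
    return 0.5 * math.log((1.0 + x) / (1.0 - x))

# Print the formula next to Python's built-in atanh for a few coded values.
for x in (-0.999, -0.333, 0.0, 0.999):
    print(x, round(transform_preference(x), 3), round(math.atanh(x), 3))
```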

Now you run your ordinary least squares regression against the transformed dependent variable, predicting the channel customers will purchase from in the future.

The current customer file is scored using this model. Once each customer has a score, you transform the score back to a value between -1 and +1 ... (EXP(2*s)-1) / (EXP(2*s)+1), where "s" equals your score.
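A short Python sketch of the back-transformation follows. The function name is hypothetical. The formula is the hyperbolic tangent, so `math.tanh(s)` gives the same answer and is the safer choice in practice, because `math.exp` overflows once the score reaches a few hundred.

```python
import math

def score_to_preference(s):
    """Apply (exp(2s) - 1) / (exp(2s) + 1) to a model score."""
    e = math.exp(2.0 * s)
    return (e - 1.0) / (e + 1.0)

# Round trip: start from a coded value, transform it, then transform it back.
coded = -0.333
score = 0.5 * math.log((1.0 + coded) / (1.0 - coded))
print(round(score_to_preference(score), 3))

# The formula agrees with the built-in hyperbolic tangent.
print(abs(score_to_preference(1.5) - math.tanh(1.5)) < 1e-12)
```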

Customers with a highly positive prediction are likely to buy online (in this example), and therefore may not need catalog advertising.

This works for e-mail marketing as well. If you are an online pureplay, -0.999 represents customers who never buy from e-mail marketing, and +0.999 represents customers who always buy because of e-mail marketing. Score the file, identify those likely to require e-mail marketing to purchase, and market accordingly.

The typical process employed by many online and catalog marketers these days involves the following steps.
1. Run a Multichannel Forensics analysis on the customer file to determine channel migration patterns.
2. Predict the probability of purchasing in the future using Logistic Regression.
3. If the customer purchases, predict future spend per purchaser using OLS Regression.
4. Multiply Step 2 by Step 3, yielding future spend.
5. Calculate future ad spend per customer, or model the relationship ... the relationship is built on the incremental value generated by advertising, not all demand spent by the customer.
6. Calculate future profitability by individual customer.
7. Use OLS Regression and the Hyperbolic Tangent Function to calculate channel preference.
8. Given profitability and channel preference, create a contact strategy for each customer.
When done well, the online or catalog brand can identify ad savings that can be re-allocated to customer acquisition activities.
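Steps 2 through 4 above boil down to one multiplication per customer. The sketch below uses made-up probabilities and spend figures, and the function and field names are mine; in practice the two inputs would come from your logistic and OLS models.

```python
def expected_future_spend(purchase_probability, spend_if_purchase):
    """Step 4: probability of purchasing (Step 2) times spend per purchaser (Step 3)."""
    if not 0.0 <= purchase_probability <= 1.0:
        raise ValueError("purchase_probability must be between 0 and 1")
    return purchase_probability * spend_if_purchase

# Hypothetical model outputs for two customers.
customers = [
    {"id": "A", "p_buy": 0.40, "spend": 120.0},
    {"id": "B", "p_buy": 0.10, "spend": 300.0},
]

for c in customers:
    c["future_spend"] = expected_future_spend(c["p_buy"], c["spend"])
    print(c["id"], round(c["future_spend"], 2))
```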

1. Anonymous 9:04 AM

Very interesting. I'm going to try this, but I need a little help.

Would you mind describing the complete dataset that would be required to run this regression? Also, what are some of the vexing data quality issues to watch out for?

While we're geeking out... what is it about this particular transformation that makes it work for estimating channel preference? If you'd rather not clutter your blog, perhaps you can offer a non-wiki reference for those of us who are interested in more details?

2. You are going to have a dataset where the dependent variable is the channel purchased from: -0.999 for phone and 0.999 for online, or -0.999 for online and 0.999 for stores.

Your independent variables are likely to be comparable to the RFM-based variables you use to do your typical lifetime value or campaign-based regression analyses.

This transformation keeps your predictions bounded by -1 and +1. Using OLS directly with -1/1 as the dependent variable causes problems, because the predicted value can fall outside the -1 to +1 range.
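To make the bounding point concrete, here is a small sketch with made-up data and a single hypothetical predictor (past share of orders placed online). The `fit_line` helper is a hand-rolled one-variable least squares written for this example, not a library routine; a real model would use several RFM-style predictors and a proper regression package.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up data: past online share of orders, and coded future channel preference.
past_online_share = [0.0, 0.1, 0.5, 0.8, 1.0]
future_preference = [-0.999, -0.999, 0.0, 0.999, 0.999]

# OLS on the raw coded values: predictions can escape the -1 to +1 range.
a_raw, b_raw = fit_line(past_online_share, future_preference)
raw_predictions = [a_raw + b_raw * x for x in past_online_share]

# OLS on the transformed values, then the hyperbolic tangent to come back.
transformed = [math.atanh(y) for y in future_preference]
a_t, b_t = fit_line(past_online_share, transformed)
bounded_predictions = [math.tanh(a_t + b_t * x) for x in past_online_share]

print([round(p, 3) for p in raw_predictions])
print([round(p, 3) for p in bounded_predictions])
```

With this toy data the raw predictions at both ends of the predictor fall outside -1 to +1, while the back-transformed predictions cannot, whatever the score.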