June 15, 2020

Really, Really Bad Models

First, industry vendors attack me (I know this because you tell me they do this when you meet with them) by saying that my models are too simple ... there are only a handful of variables and the techniques (logistic regression, ordinary least squares regression) are "old school". They attack, of course, because they're trying to sell something complicated. When they attack me, ask the attacker what s/he thinks about the term "parsimony", because a credible stat expert knows the importance of building a model with as few variables as possible. If they understand the term and the meaning of parsimony, ask them why they are trying to sell you something that is more complex than necessary.

Second, just because a person is building you a model doesn't mean that the person has any clue whatsoever what they are doing. I don't care that they've been employed by "Vendor X" for the past four years and have worked with all of the "Leading Brands".

Why do I say this? I've told you the story ... sitting in the Executive Conference Room at a "major brand". On one side of the table was the vendor, saying that their model, with 1,033 variables (it was more than a thousand, yes ... more than a thousand), was the best option for the brand. On the other side of the table was a PhD researcher hired by the "brand" to "in-house" math-related stuff. His model was reasonable ... maybe 10-15 independent variables ... but his dependent variable was complete nonsense. He was predicting who was going to buy from the brand, not who was going to buy from the catalog.

I asked the researcher why he didn't calibrate the model toward A/B style mail/holdout tests, tests that clearly showed that retail buyers had NO INTEREST in catalogs whatsoever and therefore shouldn't be included in any circulation plan. The conversation went something like this:
  • Researcher: Are you actually questioning me?
  • Kevin: What?
  • Researcher: What gives you the right to even question me or my credentials?
  • Kevin: Because you don't know what you are doing. You have mail/holdout tests that clearly tell you that 80% of your customer base couldn't care less about catalogs and shouldn't be mailed. Why are you building models that will prioritize those customers?
  • Researcher: You clearly know nothing about building a brand.
  • Kevin: What you are doing will cause you to generate less profit, thereby harming your brand.
  • Researcher: I'm mailing who I want to mail, and those will be customers who are loyal to the brand.
  • Kevin: Do you agree that if a customer won't spend incremental dollars because of catalogs, the customer shouldn't receive a catalog?
  • Researcher: No.
  • Kevin: Why not?
  • Researcher: Just because I don't. This conversation is over. I swear, you don't know anything about math, and "Vendor X" really doesn't know anything about math.
When the CFO asked me who to believe, I told the CFO to believe me.

The CFO again asked me which party (vendor or in-house employee) to believe. I said "neither".

I was not asked back for a few years.
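
For readers who want to see the mail/holdout arithmetic behind that argument, here is a minimal sketch. Every number in it is made up for illustration, and the names (SegmentTest, incremental_per_customer, should_mail, profit_factor, cost_per_catalog) are hypothetical. It is not a QuickScore and not the model from the story; it only assumes that a test records total demand for a mailed group and a holdout group, and that a catalog is worth mailing when the profit on the incremental demand exceeds the cost of the catalog.

```python
from dataclasses import dataclass


@dataclass
class SegmentTest:
    """Results of one mail/holdout test for one customer segment (hypothetical)."""
    name: str
    mailed_customers: int
    mailed_demand: float      # total dollars spent by customers who got the catalog
    holdout_customers: int
    holdout_demand: float     # total dollars spent by customers who did not


def incremental_per_customer(test: SegmentTest) -> float:
    """Demand per mailed customer minus demand per holdout customer."""
    mailed_avg = test.mailed_demand / test.mailed_customers
    holdout_avg = test.holdout_demand / test.holdout_customers
    return mailed_avg - holdout_avg


def should_mail(test: SegmentTest, cost_per_catalog: float, profit_factor: float) -> bool:
    """Mail only if profit on incremental demand covers the cost of the catalog."""
    return incremental_per_customer(test) * profit_factor > cost_per_catalog


# Made-up test results: catalog buyers respond to the mailing, retail buyers barely move.
tests = [
    SegmentTest("catalog buyers", 10_000, 60_000.0, 10_000, 20_000.0),
    SegmentTest("retail buyers", 10_000, 31_000.0, 10_000, 30_000.0),
]

for t in tests:
    lift = incremental_per_customer(t)
    decision = should_mail(t, cost_per_catalog=0.75, profit_factor=0.35)
    print(f"{t.name}: incremental ${lift:.2f} per customer, mail = {decision}")
```

Notice that in these made-up numbers the retail buyers spend plenty in both groups. A model that predicts "who will buy from the brand" would rank them highly, even though the catalog barely changes what they spend.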

Here's the problem ... if you aren't trained in statistics ... and you don't need a ton of training ... you don't know ... you COULDN'T know ... that you are being bamboozled by an "expert". There are times the expert doesn't know that the expert is clueless.

Our industry uses a lot of really, really bad models. The bad models cost us sales, and cost us profit.

Give a QuickScore a try. And if that's not the direction you want to go in, no worries. But then please figure out how the heck you are going to vet the experts when you don't have the skills to vet the experts.

