There are many variables that I like to analyze, on an annual basis, in an Online / Retail Dynamics project.
Merchandise Categories: I sum annual demand by merchandise category, then divide each category total by how much the customer spent in the past twelve months. This gives me the fraction (0.00 to 1.00) of spend in each category. Some clients want two years or five years or all history included. I find this is not an optimal way to analyze what customers purchase. Who cares that you purchased a love seat in 2004? In these cases, I weight historical spend ... maybe 100% for 12-month purchases, 50% for 13-24 month, 30% for 25-36 month, 20% for 37-48 month, 14% for 49-60 month, and 10% for 61+ month purchases. This greatly reduces the influence of old purchases, especially older high-dollar purchases.
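Here is a minimal Python sketch of that weighting, using the percentages listed above. The function names, the tuple layout, and the sample customer are all hypothetical, made up for illustration; treat it as one way to do the arithmetic, not as the exact code behind any project.

```python
from collections import defaultdict

# Age-band weights from the text: (upper bound in months, weight).
AGE_WEIGHTS = [(12, 1.00), (24, 0.50), (36, 0.30), (48, 0.20), (60, 0.14)]
OLDEST_WEIGHT = 0.10  # 61+ months


def age_weight(months_ago):
    """Return the weight for a purchase made `months_ago` months back."""
    for upper, weight in AGE_WEIGHTS:
        if months_ago <= upper:
            return weight
    return OLDEST_WEIGHT


def category_shares(purchases):
    """purchases: iterable of (category, dollars, months_ago) for one customer.

    Returns each category's fraction (0.00 to 1.00) of weighted spend.
    """
    weighted = defaultdict(float)
    for category, dollars, months_ago in purchases:
        weighted[category] += dollars * age_weight(months_ago)
    total = sum(weighted.values())
    if total == 0:
        return {}
    return {category: amount / total for category, amount in weighted.items()}


# Hypothetical customer: one old high-dollar purchase, two recent ones.
purchases = [
    ("furniture", 1000.0, 70),  # 61+ months     -> weight 0.10 -> 100
    ("bedding", 150.0, 3),      # past 12 months -> weight 1.00 -> 150
    ("kitchen", 50.0, 18),      # 13-24 months   -> weight 0.50 -> 25
]
shares = category_shares(purchases)
```

Unweighted, the old furniture purchase would be 1000/1200 = 83% of this customer's history. Weighted, it drops to 100/275, or about 36%, and the recent bedding purchase leads at about 55%.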
Website Visits: Here's a little secret not many folks want you to know: in most of my projects, I'm asked to analyze twelve-month website visitation behavior. Having said that, on average, only website visits in the past 15-30 days have any influence on future behavior. Often, I'll create three variables ... website / mobile app interactions in the past 30 days, then from 31-90 days ago, and finally, from 91-365 days ago. But again, only the most recent website / mobile app interaction matters. Recency is critically important online, folks. Heck, sometimes I'm asked to group website visits into buckets ... 1 visit last year, 2 visits last year, 3-5 visits last year, 6-10 visits last year, 11-50 visits last year, 51-100 visits last year, 101+, that kind of thing. Whatever works for you is fine, just make sure you have a defensible point of view.
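A rough Python sketch of both ideas, the three recency windows and the annual visit buckets, might look like this. The function names and dictionary keys are hypothetical, and I start the top bucket at 101 so it does not overlap the 51-100 bucket.

```python
def recency_counts(days_ago_list):
    """Count interactions in three windows: 0-30, 31-90, and 91-365 days ago."""
    counts = {"d0_30": 0, "d31_90": 0, "d91_365": 0}
    for days in days_ago_list:
        if days < 0 or days > 365:
            continue  # outside the twelve-month window, ignore
        if days <= 30:
            counts["d0_30"] += 1
        elif days <= 90:
            counts["d31_90"] += 1
        else:
            counts["d91_365"] += 1
    return counts


# Annual visit buckets from the text: (upper bound, label).
VISIT_BUCKETS = [
    (0, "0"), (1, "1"), (2, "2"), (5, "3-5"),
    (10, "6-10"), (50, "11-50"), (100, "51-100"),
]


def visit_bucket(visits_last_year):
    """Map an annual visit count to its bucket label."""
    for upper, label in VISIT_BUCKETS:
        if visits_last_year <= upper:
            return label
    return "101+"


# Hypothetical customer with visits 2, 45, 200, and 400 days ago.
counts = recency_counts([2, 45, 200, 400])
bucket = visit_bucket(4)
```

The visit from 400 days ago is dropped because it falls outside the twelve-month window, so each of the three variables ends up with a count of one.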
Purchases: I like to sum twelve-month (or historical weighted) dollars by channel ... retail, smartphone, tablet, desktop/laptop, call center, that kind of thing. Then I'll divide each channel's total by total annual (or historical weighted) dollars.
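The channel version of that division is short enough to sketch in a few lines of Python. The channel keys and dollar amounts below are hypothetical; the input could just as easily hold historical weighted dollars instead of twelve-month dollars.

```python
def channel_shares(dollars_by_channel):
    """Divide each channel's dollars by the customer's total dollars."""
    total = sum(dollars_by_channel.values())
    if total == 0:
        # No spend at all: every channel gets a share of zero.
        return {channel: 0.0 for channel in dollars_by_channel}
    return {
        channel: dollars / total
        for channel, dollars in dollars_by_channel.items()
    }


# Hypothetical twelve-month dollars for one customer.
annual_dollars = {
    "retail": 300.0,
    "smartphone": 50.0,
    "tablet": 0.0,
    "desktop_laptop": 100.0,
    "call_center": 50.0,
}
channel_mix = channel_shares(annual_dollars)
```

This customer spent 500 in total, so retail comes out at 300/500 = 60% of the mix and tablet at 0%.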
Website Characteristics: Here, I like to categorize activity on a weighted basis ... 100% for 0-30 day activity, 20% for 31-90 day activity, 5% for 91-365 day activity. I'll create 1/0 indicators for all key characteristics (cart, email click-through, referral from Bing, that kind of thing), then I will weight each characteristic by time (100%, 20%, 5% as mentioned above), and create a percentage. The weighting becomes important ... if a customer visited via Bing 60 days ago and Google yesterday, the Google visit is weighted at 100%, the Bing visit at 20%, meaning that the customer has a Google preference at a rate of 100/120 = 83%, while the customer prefers Bing at 20/120 = 17%. On an annual basis, the weightings really help us understand how the customer behaves.
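The referral arithmetic can be sketched in Python as follows. This is one reading of the approach, assuming each event carries a characteristic name and a days-ago value; the function names and event layout are hypothetical. The example uses a Bing referral 60 days ago, which sits in the 31-90 day band and so earns the 20% weight.

```python
def recency_weight(days_ago):
    """Weights from the text: 100% for 0-30 days, 20% for 31-90, 5% for 91-365."""
    if days_ago <= 30:
        return 1.00
    if days_ago <= 90:
        return 0.20
    if days_ago <= 365:
        return 0.05
    return 0.0  # older than twelve months: ignored


def weighted_preference(events):
    """events: iterable of (characteristic, days_ago) pairs for one customer.

    Each event counts as a 1/0 indicator scaled by its recency weight;
    the result is each characteristic's share of total weight.
    """
    totals = {}
    for name, days_ago in events:
        totals[name] = totals.get(name, 0.0) + recency_weight(days_ago)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {name: weight / grand_total for name, weight in totals.items()}


# Google referral yesterday, Bing referral 60 days ago.
prefs = weighted_preference([("google", 1), ("bing", 60)])
```

Here Google gets 1.00 / 1.20, about 83%, and Bing gets 0.20 / 1.20, about 17%. Move the Bing visit past 90 days and its weight falls to 5%, which pushes the Google preference up to 100/105, roughly 95%.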
Store Distance: I'll plug 1/0 indicators into my analysis for distance bands ... 0-5 miles, 6-10 miles, 11-25 miles, 26-50 miles, and 51+ miles. You will learn that visitation behavior changes as customers get farther and farther away from a store.
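A small Python sketch of those indicators is below. The labels and function name are hypothetical, and I'm assuming each band's upper bound is inclusive, so a customer 5.5 miles away lands in the 6-10 mile band. Adjust the cutoffs if your distances are rounded differently.

```python
# Distance bands from the text: (label, inclusive upper bound in miles).
DISTANCE_BANDS = [
    ("miles_0_5", 5.0),
    ("miles_6_10", 10.0),
    ("miles_11_25", 25.0),
    ("miles_26_50", 50.0),
]
FARTHEST_LABEL = "miles_51_plus"


def distance_indicators(miles_to_store):
    """Return 1/0 indicators with exactly one band set to 1."""
    labels = [label for label, _ in DISTANCE_BANDS] + [FARTHEST_LABEL]
    indicators = dict.fromkeys(labels, 0)
    for label, upper in DISTANCE_BANDS:
        if miles_to_store <= upper:
            indicators[label] = 1
            return indicators
    indicators[FARTHEST_LABEL] = 1
    return indicators


# Hypothetical customer living 7.2 miles from the nearest store.
near = distance_indicators(7.2)
```

Exactly one indicator is set per customer, so if you feed these into a regression, remember to leave one band out as the reference group.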
Zip Codes: I categorize zip codes by Catalog-Centric, Online-Centric, and Retail-Centric. Behavior in each classification is simply different, and quite interesting! You probably have your own algorithm for categorizing each zip code, so use that.
Tomorrow, I'll show you how I cook this information up - the discussion might get a bit geeky, but that's the nature of the work I'm doing when analyzing Online / Retail Dynamics.