Snakes on Ladders: Encouraging People to Climb the Engagement Ladder

3 Jun

Marketers love engagement ladders. To increase engagement with a product, many companies segment their users by usage, for instance, into heavy (super), medium (average), and light, and then prod users to climb the ladder by suggesting they do things that people in the segment above them are doing and which they aren’t doing (as frequently).
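
For concreteness, here is a minimal sketch of this kind of usage-based segmentation; the `usage` dataframe and its columns are made up:

```python
import pandas as pd

# Hypothetical data: one row per user with last month's session count.
usage = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5, 6],
    "sessions": [2, 30, 11, 5, 48, 19],
})

# Split users into light / medium / heavy by usage terciles.
usage["segment"] = pd.qcut(usage["sessions"], q=3,
                           labels=["light", "medium", "heavy"])
print(usage)
```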

At first blush, it sounds reasonable, even obvious. The trouble with the seemingly obvious, however, is that a) it gives the illusion of understanding, which prevents us from thinking carefully (because there is nothing more to understand!), and b) it doesn’t always make sense.

Let’s start by assuming that the ladder metaphor makes sense. Then the only thing we need to do is implement it correctly.

The ladder metaphor is built on the idea of stable rungs. If the classification into “light”, “medium”, and “heavy” is not durable—for instance, if someone classified as “heavy” can move to “light” next month of their own accord—what we learn by comparing “heavy” users to “medium” users may prove deleterious for the “medium” users.

Thus, it is useful to have stable rungs. To check whether your rungs are stable, build transition matrices that track how users move between rungs over time. If the rungs are not durable over the time frame in which you want to see an effect, bolster them by extending the window over which usage is measured or by using multiple measures. For instance, if usage over the last month does not produce durable rungs, the culprit may be heavy seasonality. To fix that, switch to usage over multiple months or to a seasonally adjusted number.
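
Here is a minimal sketch of the transition-matrix check, assuming a hypothetical table with each user’s segment in two consecutive months (all names are illustrative):

```python
import pandas as pd

# Hypothetical data: each user's segment in two consecutive months.
segments = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5, 6],
    "may":  ["heavy", "light", "medium", "heavy", "light", "medium"],
    "june": ["heavy", "medium", "light", "medium", "light", "medium"],
})

# Row-normalized transition matrix: P(June segment | May segment).
transitions = pd.crosstab(segments["may"], segments["june"],
                          normalize="index")
print(transitions)

# Durable rungs put most of the mass on the diagonal. If they don't,
# lengthen the usage window or seasonally adjust, as described above.
print("share staying put:", (segments["may"] == segments["june"]).mean())
```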

Once you have stable rungs, the next task is to come up with a set of actions that marketers can encourage users to take. The popular way to arbitrate between potential actions is to regress membership in the higher of two adjacent rungs on the set of potential actions and pick the ones that are most highly correlated or have the highest beta. The popular method may seem reasonable, but it isn’t. Assume away causality and you still care about how useful, actionable, and easy a recommended action is. The highest beta doesn’t mean the lowest cost per incremental improvement (again, assuming away causal concerns and taking betas at face value). And there is no way to address such concerns without experimenting and finding out what works best. (The message that works best is a combination of the action being recommended and how that action is encouraged.)
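
To make the popular method concrete (not to endorse it), here is a sketch on simulated data; the feature names are invented:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Simulated data for users on two adjacent rungs: 1 = heavy, 0 = medium,
# plus indicators for whether each user has used candidate features.
rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame({
    "used_search":  rng.integers(0, 2, n),
    "used_export":  rng.integers(0, 2, n),
    "used_sharing": rng.integers(0, 2, n),
})
y = rng.integers(0, 2, n)  # stand-in for heavy (1) vs. medium (0)

# The 'popular method': regress rung membership on candidate actions
# and rank the actions by the size of their coefficients.
model = LogisticRegression().fit(X, y)
betas = pd.Series(model.coef_[0], index=X.columns)
print(betas.sort_values(ascending=False))

# As argued above, the largest beta says nothing about causality, cost,
# or actionability; only an experiment can arbitrate among candidates.
```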

There is one minor nuance to the above. It pays to have ‘no action’ as an action if ‘no action’ isn’t your control group. Usage-based sorting merely sorts users into kinds of people: those who don’t need to use the product more than thrice a month and those who do. Who are we to say that they need to use the product more? The fact is that, often enough, the correlation between usage and retention is small. And doing nothing may prove better than annoying people with unwanted emails.

Lastly, the ladder metaphor leads some to believe that we need to stand up the same ladder for everyone. Using the highest beta or the most effective treatment means recommending the same (best) action to everyone. This is what I call the ‘mail merge’ heuristic. Mail merge is plausibly very highly correlated with heavy usage of MS-Word. But it would be an utter disaster if MSFT recommended it to me—I plan to quit the MSFT ecosystem if it comes to pass. Ideally, we want to encourage people to cross rungs by using more things in the software that are useful to them. (In fact, it isn’t clear how else we can induce a user to use the software more.) You can learn different ladders by modeling heterogeneity in treatment effects and then use simple algebra to find the best one for each person.
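
One way to do this, sketched below on simulated data, is a simple T-learner: fit one outcome model per experimental arm and recommend the arm with the largest estimated uplift. The arms, covariates, and outcome are all hypothetical:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Simulated experiment: users randomly assigned a nudge (or control),
# with covariates X and an engagement outcome y.
rng = np.random.default_rng(1)
n = 2000
X = pd.DataFrame({"tenure": rng.uniform(0, 36, n),
                  "sessions": rng.poisson(8, n)})
arm = rng.choice(["control", "nudge_a", "nudge_b"], n)
y = rng.normal(10, 2, n)  # stand-in engagement outcome

# T-learner sketch: fit one outcome model per arm; a user's estimated
# uplift from a nudge is its prediction minus the control prediction.
models = {a: GradientBoostingRegressor().fit(X[arm == a], y[arm == a])
          for a in np.unique(arm)}
base = models["control"].predict(X)
uplift = pd.DataFrame({a: models[a].predict(X) - base
                       for a in ["nudge_a", "nudge_b"]})

# The 'simple algebra': recommend each user the arm with the largest
# estimated uplift, keeping 'no action' as an arm (per the nuance above).
uplift["no_action"] = 0.0
print(uplift.idxmax(axis=1).value_counts())
```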

5 is smaller than 1.9!

10 Feb

“In the late 1990s, the leading methods caught about 80 percent of fraudulent transactions. These rates improved to 90–95 percent in 2000 and to 98–99.9 percent today. That last jump is a result of machine learning; the change from 98 percent to 99.9 percent has been transformational.

An improvement from 85 percent to 90 percent accuracy means that mistakes fall by one-third. An improvement from 98 percent to 99.9 percent means mistakes fall by a factor of twenty. An improvement of twenty no longer seems incremental.”


From Prediction Machines by Agrawal, Gans, and Goldfarb.

One way to compare the improvements is to compare differences in percentage points: 5 and 1.9. That is what I would have done. Relative change is a cheap way of making small improvements look big: conditional on the same difference in percentage points, the lower the base, the greater the multiplicative factor. Even then, for consistency, the comparison should have been between percentage increases in accuracy, (90 − 85)/85 versus (99.9 − 98)/98. But AGG had to flip the estimand to error rates to make the latter relative change look better.
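
For concreteness, here is the arithmetic behind both framings (numbers from the quote):

```python
# Accuracy went from 85% to 90% and from 98% to 99.9%.
for before, after in [(85, 90), (98, 99.9)]:
    pp = after - before                        # percentage-point gain
    rel_acc = pp / before                      # relative gain in accuracy
    err_fall = (100 - before) / (100 - after)  # factor by which errors fall
    print(f"{before} -> {after}: +{pp:.1f} pp, "
          f"+{rel_acc:.1%} accuracy, errors fall {err_fall:.1f}x")
```

On both accuracy-based yardsticks the first improvement is bigger (5 pp versus 1.9 pp; 5.9% versus 1.9%); only the error-rate framing flips the ranking (1.5x versus 20x).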

Operating Efficiently: Thumb Rules for Increasing Operational Efficiency

5 Aug

ABC ships cereal to people. ABC has a large operations team that handles customer complaints, e.g., “I got the wrong kind of cereal,” “the cereal was too old,” “the cereal arrived too late,” etc., and custom requests, e.g., “I would like seventy custom boxes shipped to a company retreat,” “I would like the delivery date to be changed,” etc. ABC is interested in providing customer service at a lower cost. What are its options? Here are some thumb rules:

  1. Prevent Work: 
    1. Prevent complaints from arising. Prevention costs money, so it is tempting to think of it as a trade-off, but in the long term prevention is generally financially beneficial.
    2. Self-Serve: Build tools that allow customers to self-serve. It can be a win-win.
  2. Convert Externalities to Internalities: What special favors are customers asking for that are not part of the price? For instance, are customers contacting you to change delivery dates? Are you charging them for such changes? Bottom line: do not provide services that people are not willing to pay for.
  3. Staff Appropriately
    1. Forecast different kinds of work (by different work we mean work for which you pay different amounts of money and need to hire different people or train them differently), come up with ideal shifts, and offer incentives for staying longer or going home sooner when reality doesn’t match the forecast (see the sketch after this list). If you can forecast months in advance, the forecasts can inform your hiring or ‘right-sizing’ plans.
    2. Reduce Specialization: One thing that gets in the way of reducing staffing is having a lot of specialization.
    3. Smooth Work by Separating Urgent from Non-Urgent Work: Say that a lot of work arrives in a narrow window. Not all of it is urgent. Build tools like ‘call me back’ to defer the non-urgent work.
    4. Simplify Work: Make sure that you don’t need to train people a lot to do the work.
  4. Make People More Efficient
    1. Train: Train people so that they can get more done per unit of time.
    2. Incentivize: Make sure workers and managers are optimally incentivized.
    3. Better Tools and Processes: Invest in tools and processes that help people do the job quicker. For instance, build tools that seamlessly transfer work between shifts by conveying all the relevant info.
    4. Prioritize Work: For the same resources, one way to provide better quality is to prioritize work correctly.
  5. Hire more efficient people and fire inefficient people.
  6. Reduce Work: Automate work that can be automated. It includes semi-automation: automating portions of work.
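
For the staffing item above (3.1), here is a minimal sketch of turning a volume forecast into headcount. The numbers are made up, and a real system would forecast volumes from history and use proper queueing math (e.g., Erlang C) rather than a flat utilization target:

```python
import math

# Hypothetical inputs: forecast contacts per hour by type of work and
# average handle time per contact (minutes).
forecast_per_hour = {"complaints": 120, "custom_requests": 30}
handle_minutes = {"complaints": 6, "custom_requests": 15}
utilization_target = 0.85  # don't plan for agents being 100% busy

# Workload in agent-hours per hour, summed across kinds of work.
workload = sum(forecast_per_hour[k] * handle_minutes[k] / 60
               for k in forecast_per_hour)
agents = math.ceil(workload / utilization_target)
print(f"workload: {workload:.1f} agent-hours/hour -> staff {agents} agents")
```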