Against Complacency

19 Nov

Even the best placed among us are to be pitied. Human lives today are blighted by five things:

  1. Limited time. While we have made impressive gains in longevity over the last 100 years, our lives are still too short. 
  2. Less than excellent health. Our already limited lifespans are further blighted by ill health. 
  3. Underinvestment. Think about Carl Sagan as your physics teacher, a full-time personal trainer to help you excel physically, a chef, abundant access to nutritious food, a mental health coach, and more. Or an even more effective digital or robotic analog.
  4. Limited opportunity to work on impactful things. Most economic production happens in areas where we are not (directly) working to dramatically enhance human welfare. Opportunities to work on meaningful things are further limited by economic constraints.
  5. Crude tools. The tools we work with are much too crude, which means that many of us are stuck executing on a low plane.

Deductions

  1. Given where we are in terms of human development, innovations in health and education are likely the most impactful, though innovations in foundational technologies like AI and computation, which increase our ability to innovate, are probably still more important.
  2. Given that in many countries at least a third of the economy is government money, the government can dramatically affect what is produced, e.g., the pace at which we increase longevity, whether we prevent really bad outcomes like an uninhabitable planet, etc.

Pastations: Is it Time to Move Beyond Presentations?

26 Dec

In an influential essay, The Cognitive Style of PowerPoint, Tufte argues that (PowerPoint) presentations are unsuitable for serious problems. The essay is largely polemical, with Tufte freely mixing points about affordances of the medium with criticisms of bad presentations and lazy broadsides.

Hilarious stuff first:

  1. “All 3 reports have standard PP format problems: elaborate bullet outlines; segregation of words and numbers (12 of 14 slides with quantitative data have no accompanying analysis); atrocious typography; data imprisoned in tables by thick nets of spreadsheet grids; only 10 to 20 short lines of text per slide.”
  2. “On this single Columbia slide, in a PowerPoint festival of bureaucratic hyper-rationalism, 6 different levels of hierarchy are used to classify, prioritize, and display 11 simple sentences”
  3. “In 28 books on PP presentations, the 217 data graphics depict an average of 12 numbers each. Compared to the worldwide publications shown in the table at right, the statistical graphics based on PP templates are the thinnest of all, except for those in Pravda back in 1982, when that newspaper operated as the major propaganda instrument of the Soviet communist party and a totalitarian government.”

From the essay, I could rescue only two points about affordances (that I buy):

  1. “When information is stacked in time, it is difficult to understand context and evaluate relationships.”
  2. Inefficiency: “A talk, which proceeds at a pace of 100 to 160 spoken words per minute, is not an especially high resolution method of data transmission. Rates of transmitting visual evidence can be far higher. … People read 300 to 1,000 printed words a minute, and find their way around a printed map or a 35mm slide displaying 5 to 40 MB in the visual field. Yet, in a strange reversal, nearly all PowerPoint slides that accompany talks have much lower rates of information transmission than the talk itself. As shown in this table, the PowerPoint slide typically shows 40 words, which is about 8 seconds worth of silent reading material. The slides in PP textbooks are particularly disturbing: in 28 textbooks, which should use only first-rate examples, the median number of words per slide is 15, worthy of billboards, about 3 or 4 seconds of silent reading material. This poverty of content has several sources. First, the PP design style, which typically uses only about 30% to 40% of the space available on a slide to show unique content, with all remaining space devoted to Phluff, bullets, frames, and branding. Second, the slide projection of text, which requires very large type so the audience can read the words.”

Working Backwards, which cites the essay as the reason Amazon pivoted from presentations to 6-pagers for its S-team meetings, adds one more reasonable point about presentations more generally:

“…the public speaking skills of the presenter, and the graphics arts expertise behind their slide deck, have an undue—and highly variable—effect on how well their ideas are understood.”

Working Backwards

The points about graphics arts expertise, etc., apply to all documents but are likely less true for reports than presentations. (It would be great to test the effect of the prettiness of graphics on their persuasiveness.)

Reading the essay made me think harder about why we use presentations in meetings about complex topics more generally. For instance, academics frequently present to other academics. Replacing presentations with 6-pagers that people quietly read and comment on at the start of the meeting and then discuss may yield higher-quality comments, better discussion, and a better evaluation of the scholar (and the scholarship).

p.s. If you haven’t seen Norvig’s Gettysburg Address in PowerPoint, you must.

p.p.s. Ed Haertel forwarded me this piece by Sam Wineburg on why asking students to create PowerPoint presentations is worse than asking them to write an essay.

p.p.p.s. Here’s how Amazon runs its S-team meetings (via Working Backwards):

1. 6-pager (can have appendices) distributed at the start of the discussion.

2. People read in silence and comment for the first 20 min.

3. The remaining 40 minutes are devoted to discussion, organized, for instance, by big issues vs. small issues or by going around the room.

4. One dedicated person takes notes.

Snakes on Ladders: Encouraging People to Climb the Engagement Ladder

3 Jun

Marketers love engagement ladders. To increase engagement with a product, many companies segment their users based on usage, for instance, into heavy (super), medium (average), and light, and prod their users to climb the ladder by suggesting they do things that people in the segment above them are doing and which they aren’t doing (as frequently).

At first blush, it sounds reasonable, even obvious. The trouble with the seemingly obvious, however, is that a) it gives the illusion of understanding, which prevents us from thinking carefully (because there is nothing more to understand!), and b) it doesn’t always make sense.

Let’s start by assuming that the ladder metaphor makes sense. The only thing that we need to do is to implement it correctly.

The ladder metaphor is built on the idea of stable rungs. If the classification into “light”, “medium”, and “heavy” is not durable—for instance, if someone classified as “heavy” can move to “light” next month of their own accord—what we learn by comparing “heavy” users to “medium” users may prove deleterious for the “medium” users.

Thus, it is useful to have stable rungs. Start by assessing stability: build transition matrices over time. If the rungs are not durable over the time frame over which you want to see an effect, bolster them by extending the observation window over which usage is measured or by using multiple measures. For instance, if usage over the last month does not produce durable rungs, it may be because usage is heavily seasonal. To fix that, switch to usage over multiple months or to a seasonally adjusted number.
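
To make this concrete, here is a minimal sketch of assessing rung stability, assuming a hypothetical table with one row per user per month and a `segment` column (all names and numbers below are made up); pandas does the rest:

```python
import pandas as pd

# Hypothetical data: one row per user per month, with the segment
# ("light", "medium", "heavy") assigned from that month's usage.
usage = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3, 3, 4, 4],
    "month":   ["2024-01", "2024-02"] * 4,
    "segment": ["heavy", "light", "medium", "medium",
                "light", "light", "heavy", "heavy"],
})

# Line up each user's segment in consecutive months and cross-tabulate.
wide = usage.pivot(index="user_id", columns="month", values="segment")
transition_matrix = pd.crosstab(wide["2024-01"], wide["2024-02"], normalize="index")
print(transition_matrix)  # rows: Jan segment; columns: share landing in each Feb segment
```

If a lot of mass sits off the diagonal over the horizon you care about, the rungs are not stable enough to build a ladder on.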

Once you have stable rungs, the next task is to come up with a set of actions that marketers can encourage users to take. The popular way to arbitrate between potential actions is to regress transitions between adjacent rungs on the set of potential actions and pick the actions that are most highly correlated or have the highest betas. This may seem reasonable, but it isn’t. Assume away causality and you still care about how useful, actionable, and easy a recommended action is. The highest beta doesn’t mean the lowest cost per incremental improvement (again, assuming away causal concerns and taking betas at face value). And there is no way to address such concerns without experimenting and finding out what works best. (The message that works best is a combination of the action being recommended and how that action is encouraged.)
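
For concreteness, the popular method amounts to something like the sketch below: a linear probability model of moving up a rung regressed on indicators for candidate actions (the column names and numbers are made up):

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical data for users who started the period as "medium":
# indicators for candidate actions and whether they moved up to "heavy".
df = pd.DataFrame({
    "used_mail_merge": [1, 0, 1, 0, 1, 0, 0, 1],
    "used_templates":  [0, 1, 1, 0, 0, 1, 0, 1],
    "moved_up":        [1, 0, 0, 0, 1, 1, 0, 1],
})

# Linear probability model: regress moving up a rung on the candidate actions.
X = sm.add_constant(df[["used_mail_merge", "used_templates"]])
betas = sm.OLS(df["moved_up"], X).fit().params
print(betas.sort_values(ascending=False))  # the "highest beta" the method chases
```

Nothing in those betas speaks to cost, ease, or causality, which is the point of the criticism above.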

There is one minor nuance to the above. It pays to have ‘no action’ as an action if ‘no action’ isn’t your control group. Usage-based sorting merely sorts users by kinds of people: those who don’t need to use the product more often than thrice a month versus those who do. Who are we to say that they need to use the product more? The fact is that, often enough, the correlation between usage and retention is small. And doing nothing may prove better than annoying people with unwanted emails.

Lastly, the ladder metaphor leads some to believe that we need to stand up the same ladder for everyone. Using the highest beta or the most effective treatment means recommending the same (best) action to everyone. This is what I call the ‘mail merge’ heuristic. Mail merge is plausibly very highly correlated with heavy usage of MS-Word. But it would be an utter disaster if MSFT recommended it to me—I plan to quit the MSFT ecosystem if it comes to pass. Ideally, we want to encourage people to climb rungs by using more of the things in the software that are useful to them. (In fact, it isn’t clear how else we can induce a user to use the software more.) You can learn different ladders by modeling heterogeneity in treatment effects and then using simple algebra to find the best action for each person.
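
One way to cash out that last sentence is sketched below, assuming you have already run an experiment that randomly assigned each user a nudge ("none", "mail_merge", or "templates"; all names and data here are simulated). A simple T-learner fits one outcome model per arm, scores every user under every arm, and recommends the nudge with the largest predicted lift over doing nothing:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Hypothetical experiment: user features, the randomly assigned nudge,
# and subsequent usage of the product.
n = 3000
X = rng.normal(size=(n, 3))
nudge = rng.choice(["none", "mail_merge", "templates"], size=n)
# Simulated outcome: the "templates" nudge helps only users with a high first feature.
usage = X[:, 0] + (nudge == "templates") * np.clip(X[:, 0], 0, None) + rng.normal(size=n)

# T-learner: fit one outcome model per arm, then score every user under every arm.
models = {
    arm: GradientBoostingRegressor().fit(X[nudge == arm], usage[nudge == arm])
    for arm in ["none", "mail_merge", "templates"]
}
preds = {arm: model.predict(X) for arm, model in models.items()}

# Per-user lift of each nudge over doing nothing; recommend the best one, or nothing.
lifts = {arm: preds[arm] - preds["none"] for arm in ["mail_merge", "templates"]}
best = np.where(
    np.maximum(lifts["mail_merge"], lifts["templates"]) <= 0,
    "none",
    np.where(lifts["mail_merge"] > lifts["templates"], "mail_merge", "templates"),
)
print(dict(zip(*np.unique(best, return_counts=True))))
```

Whether the gains from this kind of personalization are worth the added complexity is, again, something to settle with an experiment.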

5 is smaller than 1.9!

10 Feb

“In the late 1990s, the leading methods caught about 80 percent of fraudulent transactions. These rates improved to 90–95 percent in 2000 and to 98–99.9 percent today. That last jump is a result of machine learning; the change from 98 percent to 99.9 percent has been transformational.

An improvement from 85 percent to 90 percent accuracy means that mistakes fall by one-third. An improvement from 98 percent to 99.9 percent means mistakes fall by a factor of twenty. An improvement of twenty no longer seems incremental.”


From Prediction Machines by Agrawal, Gans, and Goldfarb.

One way to compare the improvements is to compare the differences in percentage points: 5 and 1.9. That is what I would have done. That is because, conditional on the same difference in percentage points, the lower the base, the greater the multiplicative factor, which makes multiplicative comparisons a cheap way of making small improvements look better. Even then, for consistency, the comparison should have been between percentage increases in accuracy: (90 – 85)/85 versus (99.9 – 98)/98. But AGG had to flip the estimand to percentage errors to make the latter relative change look better.
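
To make the arithmetic explicit, here is a tiny script (numbers straight from the quote) that prints all three comparisons side by side: percentage-point gains, relative gains in accuracy, and the factor by which errors shrink:

```python
# Accuracy improves from 85% to 90% in one case and from 98% to 99.9% in the other.
improvements = [(85.0, 90.0), (98.0, 99.9)]

for old, new in improvements:
    point_gain = new - old                    # difference in percentage points
    relative_gain = (new - old) / old         # relative increase in accuracy
    error_factor = (100 - old) / (100 - new)  # factor by which errors shrink
    print(f"{old}% -> {new}%: +{point_gain:.1f} points, "
          f"+{relative_gain:.1%} accuracy, errors fall {error_factor:.1f}x")

# 85.0% -> 90.0%: +5.0 points, +5.9% accuracy, errors fall 1.5x
# 98.0% -> 99.9%: +1.9 points, +1.9% accuracy, errors fall 20.0x
```

The first improvement wins on the first two comparisons; only the error-ratio framing makes the second look bigger.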

Operating Efficiently: Thumb Rules for Increasing Operational Efficiency

5 Aug

ABC ships cereal to people. ABC has a large operations team that handles customer complaints, e.g., “I got the wrong kind of cereal,” “the cereal was too old,” “the cereal arrived too late,” etc., and custom requests, e.g., “I would like seventy custom boxes shipped to a company retreat”, “I would like the delivery date to be changed,” etc. ABC is interested in providing customer service at a lower cost. What are its options? Here are some thumb rules:

  1. Prevent Work: 
    1. Prevent complaints from arising. Prevention will cost money, so it is tempting to think of it as a trade-off; in the long term, however, prevention is generally financially beneficial.  
    2. Self-Serve: Build tools that allow customers to self-serve. It can be a win-win.
  2. Convert Externalities to Internalities: What special favors are customers asking for that are not part of the price? For instance, are customers contacting you to change delivery dates? Are you charging them for such changes? Bottom line: do not provide services that people are not willing to pay for.
  3. Staff Appropriately
    1. Forecast different kinds of work (by different work, we mean work for which you pay different amounts of money and need to hire or train people differently), come up with ideal shifts, and create incentives for staying longer or going home sooner when reality doesn’t match the forecast (see the sketch after this list). If you can forecast months in advance, the forecast can inform your hiring or ‘right-sizing’ plans.
    2. Reduce Specialization: One thing that gets in the way of reducing staffing is having a lot of specialization. 
    3. Smooth Work by Separating Urgent from Non-Urgent Work: Say that a lot of work arrives in a narrow window. Not all of it is urgent. Build tools like ‘call me back’ to deal with the non-urgent work.  
    4. Simplify Work: Make sure that you don’t need to train people a lot to do the work.
  4. Make People More Efficient
    1. Train: Train people so that they can get more done per unit of time.
    2. Incentivize: Make sure workers and managers are optimally incentivized.
    3. Better Tools and Processes: Invest in tools and processes that help people do the job quicker. For instance, build tools that allow you to seamlessly transfer work between shifts by conveying all the relevant info. 
    4. Prioritize Work: For the same resources, one way to provide better quality is to prioritize work correctly.
  5. Hire more efficient people and fire inefficient people.
  6. Reduce Work: Automate work that can be automated. It includes semi-automation: automating portions of work.
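
As a toy illustration of the staffing point (item 3.1 above), with a made-up forecast and an assumed per-agent handling rate, the required headcount per shift can be backed out as follows:

```python
import math

# Hypothetical forecast of customer contacts per shift and an assumed
# number of contacts one agent can resolve per shift.
forecast = {"morning": 420, "afternoon": 610, "evening": 280}
contacts_per_agent_per_shift = 45
buffer = 1.15  # assumed headroom for breaks, no-shows, and forecast error

for shift, volume in forecast.items():
    agents = math.ceil(volume * buffer / contacts_per_agent_per_shift)
    print(f"{shift}: forecast {volume} contacts -> schedule {agents} agents")
```

Done months ahead, the same arithmetic can feed hiring or ‘right-sizing’ plans; done days ahead, it tells you whom to ask to stay longer or go home sooner.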