Recommendation systems are ubiquitous. They determine what videos and news you see, what books and products are ‘suggested’ to you, and much more. If asked about the origins of personalization, my hunch is that some of us will pin it to the advent of the Netflix Prize. Wikipedia goes further back—it puts the first use of the term ‘recommender system’ in 1990. But the history of personalization is much older. It is at least as old as heterogeneous treatment effects (though latent variable models might be a yet more apt starting point). I don’t know for how long we have known about heterogeneous treatment effects but it can be no later than 1957 (Cronbach and Goldine Gleser, 1957).
Here’s Ed Haertel:
“I remember some years ago when NetFlix founder Reed Hastings sponsored a contest (with a cash prize) for data analysts to come up with improvements to their algorithm for suggesting movies subscribers might like, based on prior viewings. (I don’t remember the details.) A primitive version of the same problem, maybe just a seed of the idea, might be discerned in the old push in educational research to identify “aptitude-treatment interactions” (ATIs). ATI research was predicated on the notion that to make further progress in educational improvement, we needed to stop looking for uniformly better ways to teach, and instead focus on the question of what worked for whom (and under what conditions). Aptitudes were conceived as individual differences in preparation to profit from future learning (of a given sort). The largely debunked notion of “learning styles” like a visual learner, auditory learner, etc., was a naïve example. Treatments referred to alternative ways of delivering instruction. If one could find a disordinal interaction, such that one treatment was optimum for learners in one part of an aptitude continuum and a different treatment was optimum in another region of that continuum, then one would have a basis for differentiating instruction. There are risks with this logic, and there were missteps and misapplications of the idea, of course. Prescribing different courses of instruction for different students based on test scores can easily lead to a tracking system where high performing students are exposed to more content and simply get further and further ahead, for example, leading to a pernicious, self-fulfilling prophecy of failure for those starting out behind. There’s a lot of history behind these ideas. Lee Cronbach proposed the ATI research paradigm in a (to my mind) brilliant presidential address to the American Psychological Association, in 1957. In 1974, he once again addressed the American Psychological Association, on the occasion of receiving a Distinguished Contributions Award, and in effect said the ATI paradigm was worth a try but didn’t work as it had been conceived. (That address was published in 1975.)
This episode reminded me of the “longstanding principle in statistics, which is that, whatever you do, somebody in psychometrics already did it long before. I’ve noticed this a few times.”
Reading Cronbach today is also sobering in a way. It shows how ad hoc the investigation of theories and coming up with the right policy interventions was.