STEMing the Rot: Does Relative Deprivation Explain Low STEM Graduation Rates at Top Schools?

26 Sep

The following few paragraphs are from Sociation Today:


Using the work of Elliot (et al. 1996), Gladwell compares the proportion of each class which gets a STEM degree compared to the math SAT at Hartwick College and Harvard University.  Here is what he presents for Hartwick:

Students at Hartwick College

STEM Majors     Top Third   Middle Third   Bottom Third
Math SAT        569         472            407
STEM degrees    55.0%       27.1%          17.8%

So the top third of students with the Math SAT as the measure earn over half the science degrees. 

    What about Harvard?   It would be expected that Harvard students would have much higher Math SAT scores and thus the distribution would be quite different.  Here are the data for Harvard:

Students at Harvard University

STEM Majors     Top Third   Middle Third   Bottom Third
Math SAT        753         674            581
STEM degrees    53.4%       31.2%          15.4%

     Gladwell states the obvious, in italics, “Harvard has the same distribution of science degrees as Hartwick,” p. 83. 

    Using his reference theory of being a big fish in a small pond, Gladwell asked Ms. Sacks what would have happened if she had gone to the University of Maryland and not Brown. She replied, “I’d still be in science,” p. 94.


Gladwell focuses on the fact that the bottom third at Harvard has about the same Math SAT scores as the top third at Hartwick (581 versus 569), and points out that the two groups earn STEM degrees at very different rates. It is a fine point. But there is more to the data. The top third at Harvard has much higher SAT scores than the top third at Hartwick (753 versus 569). Why, then, does it earn STEM degrees at about the same rate as the top third at Hartwick? One answer is that STEM degrees at Harvard are harder. Harder coursework at Harvard (vis-a-vis Hartwick) is thus another explanation for the pattern we see in the data and, in fact, fits the data better, as it also explains the performance of the top third at Harvard.

Here’s another way to put the point: if preferences for graduating in STEM are solely and almost deterministically explained by Math SAT scores, as Gladwell implicitly assumes, and the major headwinds come from relative standing, then we should see a much higher STEM graduation rate for the top third at Harvard. Ideally, we should see an intercept shift across schools along with a common differential between the top and the bottom third. We see the common differential but not the intercept shift.
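The comparisons above can be laid out as a few lines of arithmetic. The following is a minimal sketch in Python that uses only the figures from the two tables quoted earlier; the names `DATA`, `top_bottom_gap`, and `sat_gap` are made up for illustration and are not from Gladwell or Sociation Today.

```python
# Figures copied from the two tables quoted above, ordered (top, middle, bottom) third.
DATA = {
    "Hartwick": {"math_sat": (569, 472, 407), "stem_pct": (55.0, 27.1, 17.8)},
    "Harvard": {"math_sat": (753, 674, 581), "stem_pct": (53.4, 31.2, 15.4)},
}

TOP, BOTTOM = 0, 2


def top_bottom_gap(school: str) -> float:
    """Top-third minus bottom-third value of the 'STEM degrees' row."""
    pct = DATA[school]["stem_pct"]
    return round(pct[TOP] - pct[BOTTOM], 1)


def sat_gap(school_a: str, third_a: int, school_b: str, third_b: int) -> int:
    """Math SAT of one school's third minus that of another school's third."""
    return DATA[school_a]["math_sat"][third_a] - DATA[school_b]["math_sat"][third_b]


if __name__ == "__main__":
    for school in DATA:
        print(school, "top-bottom differential:", top_bottom_gap(school))
    print("Harvard bottom vs. Hartwick top, Math SAT:",
          sat_gap("Harvard", BOTTOM, "Hartwick", TOP))
    print("Harvard top vs. Hartwick top, Math SAT:",
          sat_gap("Harvard", TOP, "Hartwick", TOP))
```

The top-bottom differential is nearly identical at the two schools, and the top third at Harvard sits far above the top third at Hartwick on the Math SAT without a correspondingly higher figure in the STEM degrees row.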

Self-Recommending: The Origins of Personalization

6 Jul

Recommendation systems are ubiquitous. They determine what videos and news you see, what books and products are ‘suggested’ to you, and much more. If asked about the origins of personalization, my hunch is that some of us will pin it to the advent of the Netflix Prize. Wikipedia goes further back: it puts the first use of the term ‘recommender system’ in 1990. But the history of personalization is much older. It is at least as old as heterogeneous treatment effects (though latent variable models might be a yet more apt starting point). I don’t know how long we have known about heterogeneous treatment effects, but we have known about them since at least 1957 (Cronbach and Gleser, 1957).

Here’s Ed Haertel:

“I remember some years ago when NetFlix founder Reed Hastings sponsored a contest (with a cash prize) for data analysts to come up with improvements to their algorithm for suggesting movies subscribers might like, based on prior viewings. (I don’t remember the details.) A primitive version of the same problem, maybe just a seed of the idea, might be discerned in the old push in educational research to identify “aptitude-treatment interactions” (ATIs). ATI research was predicated on the notion that to make further progress in educational improvement, we needed to stop looking for uniformly better ways to teach, and instead focus on the question of what worked for whom (and under what conditions). Aptitudes were conceived as individual differences in preparation to profit from future learning (of a given sort). The largely debunked notion of “learning styles” like a visual learner, auditory learner, etc., was a naïve example. Treatments referred to alternative ways of delivering instruction. If one could find a disordinal interaction, such that one treatment was optimum for learners in one part of an aptitude continuum and a different treatment was optimum in another region of that continuum, then one would have a basis for differentiating instruction. There are risks with this logic, and there were missteps and misapplications of the idea, of course. Prescribing different courses of instruction for different students based on test scores can easily lead to a tracking system where high performing students are exposed to more content and simply get further and further ahead, for example, leading to a pernicious, self-fulfilling prophecy of failure for those starting out behind. There’s a lot of history behind these ideas. Lee Cronbach proposed the ATI research paradigm in a (to my mind) brilliant presidential address to the American Psychological Association, in 1957. 
In 1974, he once again addressed the American Psychological Association, on the occasion of receiving a Distinguished Contributions Award, and in effect said the ATI paradigm was worth a try but didn’t work as it had been conceived. (That address was published in 1975.)”

This episode reminded me of the “longstanding principle in statistics, which is that, whatever you do, somebody in psychometrics already did it long before. I’ve noticed this a few times.”

Reading Cronbach today is also sobering in a way. It shows how ad hoc both the investigation of theories and the search for the right policy interventions were.

Teaching Social Science

12 Sep

Three goals: impart information, spur deeper thinking about the topic and the social world more generally, and inculcate care in thinking. As is perhaps clear, working toward achieving any one of these goals creates positive externalities that help achieve other goals. For instance, care in exposition, which is a necessary though insufficient condition for imparting correct information, is liable to produce, either through mimesis or further thought, care in how students think about questions.

Supplement such synergies by actively seeking and using pertinent opportunities, during both class-wide discussions about the materials and one-to-one discussions about research projects, to raise (and clarify) relevant points. During discussions, encourage students to seriously consider questions about epistemology, fundamental to science but also more generally to reasoning and discourse, by weaving in questions such as “What is the claim that we are making?” and “When can we make this claim and why?”

Some of the epistemological questions are most naturally (and perhaps best) handled when students are engaged in working on their own research projects. Guiding students as they collect and analyze their own data provides unique opportunities to discuss issues related to research design, and logic. And it is my hunch that students are more engaged with the material (and hence learn more of it, and think more about it) when they work on their own projects than when asked to learn the materials through lectures alone. For instance, undergraduates at Stanford often excel at knowing the points made in the text, but often have yet to spend time thinking about the topic itself. My sense is (and some experience corroborates it) that thinking broadly about an issue allows students to gain new insights, and helps them contextualize their findings better. It also spurs curiosity about the social world and the broader set of questions about society. Hence, in addition to the above, ask students to discuss the topics that they are working on more generally, and think carefully and deeply about what else could be going on.

That’s Smart! What We Mean by Smartness and What We Should

16 Aug

Many people conceive of intelligence as a CPU. To them, being more intelligent means having a faster CPU. And this leads to despair, as the clock speed is largely fixed. (Prenatal and childhood nutrition can make a sizable difference, however. For instance, iodine deficiency in children can cause intellectual disability.)

But people misconceive intelligence. Intelligence is not just the clock speed of the CPU. It is also the short-term cache and the OS.

Clock speed doesn’t matter much if there isn’t a good-sized cache. The size of short-term memory matters enormously. And the good news is that we can expand it with effort.

A super fast CPU with a large cache is still only as good as the operating system. If people know little or don’t know how to reason well, they generally won’t be smart. Think of the billions of people who came before we knew how to know (science). Some of those people had really fast CPUs. But many of them weren’t able to make much progress on anything.

An ignoramus who doesn’t know that correlation isn’t causation is also likely to come across as stupid. In fact, we often mistake being knowledgeable and possessing rules of how to reason well for being intelligent. Though it goes without saying, it is good to say it: how much we know and how well we know how to reason are in our control.

Lastly, people despair because they mistake skew for variance. People believe there is a lot of variance in processing capacity. My sense is that variance in processing capacity is low, and the skew high. In layman’s terms, most people are about as smart as one another, with a very few very bright people. None of this is to say that the little variance that exists is not consequential.
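To make the distinction between variance and skew concrete, here is a toy sketch in Python. The scores are hypothetical, invented only to show the shape of a low-variance, high-skew distribution; they are not measurements of anything, and `skewness` is a helper written for this example.

```python
from statistics import mean, pstdev

# Hypothetical scores, invented purely for illustration: nearly everyone sits
# within a point of 100, and three people are far above the rest.
scores = [99] * 30 + [100] * 40 + [101] * 27 + [130] * 3


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# Spread relative to the mean (coefficient of variation): small here.
spread = pstdev(scores) / mean(scores)
# Skewness: large and positive, driven by the three high scorers.
skew = skewness(scores)

if __name__ == "__main__":
    print(f"relative spread: {spread:.3f}")
    print(f"skewness: {skew:.2f}")
```

In this made-up sample, the relative spread is small while the skewness is large and positive: a handful of outliers at the top, with everyone else bunched together.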

Toward a Better OS

Ignorance of relevant facts limits how well we can reason. Thus, increasing the repository of relevant facts at hand will help you reason smarter. If you don’t know about something, that is ok. Today, the world’s information is at your fingertips. It will take you some time to go through things but you can become informed about more things than you think possible.

Besides knowledge, there are some ‘frameworks’ of how to approach a problem. For instance, management consultants have something called MECE (mutually exclusive, collectively exhaustive). This ‘framework’ can help you reason better about a whole slew of problems. There are likely others.

Besides reasoning frameworks, there are simple rules that can help you reason better. You can look up books devoted to common errors in thinking, and use those to derive the rules. The rules can look as follows:

  1. Correlation is not the same as causation
  2. Don’t select on the dependent variable. What I call the ‘7 Habits of Successful People’ rule.
  3. Replace categorical thinking with continuous where possible, and be precise. For example, rather than claim that ‘there is a risk’, quantify the risk. Or replace the word possibility with probability where applicable.
  4. Have a better grasp of your own ignorance using some of the tricks described here.
  5. The tree of inference starts with the question. Think hard about what data you would need to answer the question well. And then what data you have. And then calibrate your assessments about the answer based on the difference between the data you would have liked to have and the data you have.

How Are Academic Disciplines Divided?

18 Jul

The social sciences are split into disciplines like Psychology, Political Science, Sociology, Anthropology, Economics, etc. There is a certain anarchy to the way they are split. For example, while Psychology is devoted to understanding how the individual mind works, and Sociology to the study of groups, Political Science is devoted merely to an aspect of groups: group decision making.

One of the primary reasons the social sciences are divided this way is the history of how they developed. As major figures postulated important variables that constrain the social world, fields took shape around them. The other pertinent factors that explain some of the newer disciplines in the social sciences are changes in technology and, more broadly, changing social problems. For example, the discipline of Communication took shape around the time mass media became popular.

The way the social sciences are currently divided has left them with a host of inefficiencies, which leave them largely ineffective in a variety of scenarios where they could offer substantive help. First, the containerized way of understanding the social world provides inadequate ways of understanding complex social systems that are shaped by variables ranging from the individual to the institutional. Second, the largely discipline-specific theoretical motivations lead academics to concoct elaborate theories that often misstate their applicability in complex ecosystems. We all know how economics never met common sense until recently. It isn’t that disciplines haven’t tried to bridge the interdisciplinary divide. They certainly have, by creating sub-disciplines ranging from social psychology (in Psychology) to political psychology (in Political Science), and in fact that is exactly where some of the most exciting research is taking place right now. The problem is that we have been slow to take up the larger question of restructuring the social sciences. The question then arises as to what we should put at the center of our disciplines. The answer is by no means clear to me, though I think it would be useful to develop competencies around primary organizing social structures and institutions.

Role of Social Science

Let me assume away the fact that most social science knowledge will end up in society either through Capitalism or through selective uptake by policymakers. Next, we need to evaluate how social science can meaningfully contribute to society. One intuitive way would be to create social engineering departments that are focused on specific social problems. The advice is by no means radical; certainly Education as a discipline has been around for some time, and relatively recently departments (or schools) devoted to Public Health and Environmental Policy have opened up across college campuses. Secondly, these departments should help offer solutions for real-life problems, much the same way engineering departments affiliated with the natural sciences do, and try experimenting with how, for example, different institutional structures would affect decision making. Lastly, social scientists have a lot more to offer to third world countries, which have yet to be overrun by brute Capitalism. What social science departments need to do is lead more data collection efforts in third world countries and offer solutions.