Limits of Harms From Affirmative Action

17 Nov

Stories abound about unqualified people getting admitted to highly selective places because of quotas. But chances are that these are merely stories with no basis in fact. If an institution is highly selective and the number of applicants is sufficiently large, quotas are unlikely to lead to people with dramatically lower abilities being admitted, even when there are dramatic differences across groups. Relatedly, quotas are unlikely to have much of an impact on the average ability of the admitted cohort. A simple simulation makes the point concrete. Say the mean IQ of two groups differs by 1 s.d. (which is roughly the difference between Black and White IQ in the US), and measure ability in s.d. units relative to the Group 1 mean. Say that each group has one million applicants and that the admitting institution only takes 1000 people. In the no-quota regime, the top 1000 people overall get admitted. In the quota regime, 20% of the seats are reserved for the second group and the remaining 80% go to the top of the first group. With this framework, we can compare the IQ of the last admittee and the mean ability of the admitted cohort across the two conditions.

# Set seed for reproducibility
set.seed(123)

# Simulate two normal distributions with sd = 1 (only Group 1 is standard normal)
group1 <- rnorm(1000000, mean = 0, sd = 1)  # Group 1
group2 <- rnorm(1000000, mean = -1, sd = 1)  # Group 2, mean 1 sd lower than Group 1

# Combine into a dataframe with a column identifying the groups
data <- data.frame(
  value = c(group1, group2),
  group = rep(c("Group 1", "Group 2"), each = 1000000)
)

# Pick top 800 values from Group 1 and top 200 values from Group 2
top_800_group1 <- head(sort(data$value[data$group == "Group 1"], decreasing = TRUE), 800)
top_200_group2 <- head(sort(data$value[data$group == "Group 2"], decreasing = TRUE), 200)

# Combine the selected values into the quota cohort (the mean is computed below)
combined_top_1000 <- c(top_800_group1, top_200_group2)

# IQ of the last six admittees (tail() returns six values by default)
round(tail(head(sort(data$value, decreasing = TRUE), 1000)), 2)
[1] 3.11 3.11 3.10 3.10 3.10 3.10

round(tail(combined_top_1000), 2)
[1] 2.57 2.57 2.57 2.57 2.56 2.56

# Means
round(mean(head(sort(data$value, decreasing = TRUE), 1000)), 2)
[1] 3.37

round(mean(combined_top_1000), 2)
[1] 3.31

# How many people in top 1000 from Group 2 in no-quota?
sorted_data <- data[order(data$value, decreasing = TRUE), ]
top_1000 <- head(sorted_data, 1000)
sum(top_1000$group == "Group 2")
[1] 22
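
The simulated cutoffs can also be cross-checked against normal quantiles, without drawing any samples. The sketch below is not part of the original analysis: the variable names are illustrative, and it assumes that the k-th largest of n draws sits near the 1 - k/n quantile, which is only an approximation. Under those assumptions, the values it prints should land close to the simulated ones above.

```r
# Analytical cross-check of the simulated cutoffs.
# Assumption: all names below are illustrative and not from the original code.
n_per_group <- 1e6   # applicants per group, as in the simulation
seats       <- 1000  # total seats
quota_share <- 0.2   # share of seats reserved for Group 2
gap         <- 1     # Group 2 mean is `gap` s.d. below the Group 1 mean

# Quota regime: each group fills its own seats, so each cutoff is a
# within-group quantile (k-th largest of n is approximately the 1 - k/n quantile).
cut_g1_quota <- qnorm(1 - seats * (1 - quota_share) / n_per_group)
cut_g2_quota <- qnorm(1 - seats * quota_share / n_per_group) - gap

# No-quota regime: a single cutoff at which the expected number of applicants
# above it, summed over both groups, equals the number of seats.
expected_admits <- function(cutoff) {
  n_per_group * (pnorm(cutoff, mean = 0, lower.tail = FALSE) +
                 pnorm(cutoff, mean = -gap, lower.tail = FALSE))
}
cut_no_quota <- uniroot(function(cutoff) expected_admits(cutoff) - seats,
                        interval = c(0, 6), tol = 1e-9)$root

# Expected number of Group 2 admits at the common cutoff
g2_no_quota <- n_per_group * pnorm(cut_no_quota, mean = -gap, lower.tail = FALSE)

round(c(no_quota = cut_no_quota, quota_g1 = cut_g1_quota, quota_g2 = cut_g2_quota), 2)
round(g2_no_quota)
```

Because the lowest cutoff under the quota is set by Group 2 alone, the ability of the last admittee depends on how deep the quota reaches into Group 2's own distribution, not on the combined pool.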

Under no-quota, the least able person admitted is 3.1 s.d. above the Group 1 mean, while under quota, the least able person admitted is 2.56 s.d. above it. The mean ability of the admitted cohort is virtually indistinguishable: 3.37 and 3.31 for the no-quota and quota conditions, respectively. Not to put too fine a point on it, the claim that quotas lead to gross misallocation of limited resources is likely grossly wrong.

This isn’t to say there isn’t a rub. With a 1 s.d. difference, the representation in the tails is grossly skewed. Without a quota, there would be just 22 people from Group 2 in the top 1000, so the quota bumps 178 people from Group 1. This point about fairness is perhaps best thought of in the context of how much harm comes to those denied admission. Assuming enough supply across the range of selectivity (approximately true for higher education in the U.S., which has colleges at many levels of selectivity), those denied admission at more exclusive institutions likely get admitted at slightly lower-ranked institutions and do nearly as well as they would have had they been admitted to the more exclusive ones. (See Dale and Krueger, etc.)

p.s. In countries like India, 25 years ago, supply at the top was fairly limited, and there were large discontinuous jumps between tiers of institutions. Since the liberalization of the education sector, this is likely no longer true.

p.p.s. What explains the large racial gap in SAT scores of the admittees to Harvard? It likely stems from Harvard weighing factors such as athletic performance in admission decisions.