## Limits of Harms From Affirmative Action

17 Nov

Stories abound about unqualified people getting admitted to highly selective places because of quotas. But the chances are that these are merely stories with no basis in fact. If an institution is highly selective and if the number of applicants is sufficiently large, quotas are unlikely to lead to people with dramatically lower abilities being admitted even when there are dramatic differences across groups. Relatedly, it is unlikely to have much of an impact on the average ability of the admitted cohort. If the point wasn’t obvious enough, it would be after the following simulation. Say the mean IQ of the groups differs by 1 s.d. (which is the difference between Black and White IQ in the US). Say that the admitting institution only takes 1000 people. In the no-quota regime, the top 1000 people get admitted. In the quota regime, 20% of the seats are reserved for the second group. With this framework, we can compare the IQ of the last admitee across the conditions. And the mean ability.

# Set seed for reproducibility
set.seed(123)

# Simulate two standard normal distributions
group1 <- rnorm(1000000, mean = 0, sd = 1)  # Group 1
group2 <- rnorm(1000000, mean = -1, sd = 1)  # Group 2, mean 1 sd lower than Group 1

# Combine into a dataframe with a column identifying the groups
data <- data.frame(
value = c(group1, group2),
group = rep(c("Group 1", "Group 2"), each = 1000000)
)

# Pick top 800 values from Group 1 and top 200 values from Group 2
top_800_group1 <- head(sort(data$value[data$group == "Group 1"], decreasing = TRUE), 800)
top_200_group2 <- head(sort(data$value[data$group == "Group 2"], decreasing = TRUE), 200)

# Combine the selected values and estimate the mean
combined_top_1000 <- c(top_800_group1, top_200_group2)

# IQ of the last five admitees
round(tail(head(sort(data$value, decreasing = TRUE), 1000)), 2) [1] 3.11 3.11 3.10 3.10 3.10 3.10 round(tail(combined_top_1000), 2) [1] 2.57 2.57 2.57 2.57 2.56 2.56 # Means round(mean(head(sort(data$value, decreasing = TRUE), 1000)), 2)
[1] 3.37

round(mean(combined_top_1000), 2)
[1] 3.31

# How many people in top 1000 from Group 2 in no-quota?
sorted_data <- data[order(data$value, decreasing = TRUE), ] top_1000 <- head(sorted_data, 1000) sum(top_1000$group == "Group 2")
[1] 22

Under no-quota, the person with the least ability who is admitted is 3.1 s.d. above the mean while under quota, the person with the least ability who is admitted is 2.56 s.d. above the mean. The mean ability of the admitted cohort is virtually indistinguishable—3.37 and 3.31 for the no-quota and quota conditions respectively. Not to put too fine a point—the claim that quotas lead to gross misallocation of limited resources is likely grossly wrong. This isn’t to say there isn’t a rub. With a 1 s.d. difference, the representation in the tails is grossly skewed. Without quota, there would be just 22 people from Group 2 in the top 1000. So 178 people from Group 1 get bumped. This point about fairness is perhaps best thought of in context of how much harm comes to those denied admission. Assuming enough supply across the range of selectivity—this is approximately true for the U.S. for higher education with a range of colleges at various levels of selectivity—it is likely the case that those denied admission at more exclusive institutions get admitted at slightly lower ranked institutions and do nearly as well as they would have had they been admitted to more exclusive institutions. (See Dale and Kreuger, etc.).

p.s. In countries like India, 25 years ago, there was fairly limited supply at the top and large discontinuous jumps. Post liberalization of the education sector, this is likely no longer true.

p.p.s. What explains the large racial gap in SAT scores of the admittees to Harvard? It is likely that it is founded in Harvard weighing factors such as athletic performance in admission decisions.

16 Nov

There are a few different options for buying time with industry experts, e.g., https://officehours.com/, https://intro.co/, etc. However, there is no marketplace for buying academics’ time. Some surplus is likely lost as a result. For one, some academics want advice on what they write. To get advice, they have three choices—academic friends, reviewers, or interested academics at conferences or talks. All three have their problems. Or they have to resort to informal markets like Kahneman.

## Tipping Lemons: Market Failures in Tips

4 Oct

Say that people can be easily identified by characteristic C. Say that the average tip left by people of group C_A is smaller than !C_A with a wide variance in tipped amounts within each group. Let’s assume that the quality of service (two levels: high or low) is pro-rated by the expected tip amount. Let’s assume that the tip left by a customer is explained by the quality of service. And let’s also assume that the expected tip amount from C_A is low enough to motivate low-quality service. The tip is provided after the service. Assume no-repeat visitation. The optimal strategy for the customer is to not tip. But the service provider notices the departure from rationality from customers and serves accordingly. If the server had complete information about what each person would tip, then the service would be perfectly calibrated by the tipped amount. However, the server can only rely on crude surface cues, like C, and estimate the expected value of the tip. Given that, the optimal strategy for the server is to provide low-quality service to C_A, which would lead to a negative spiral.

## Is The Price Right? Dramatic Variability in Prices of Generic Goods Over Short Periods on Major Web Retailers

31 Aug

It used to be that retail prices of generic products like coffee mugs, soap, etc., moved slowly. Not anymore. On major web retailers like Amazon, for a range of generic household products, the variation in prices over short periods of time is immense. For instance, on 12-Piece Porcelain, 12 Oz. Coffee Mug Set, the price ranged between $20.50 and$35.71 over the last year or so, with a hefty day-to-day variation.

On PCPartPicker, the variation in prices for Samsung SSD is equally impressive. Prices zig-zag on multiple sites (e.g., Dell, Adorama) by $100 over a matter of days multiple times over the last six months. (The cross-site variation—price dispersion—at a particular point in time is also impressive.) Take another example. Softsoap Liquid Hand Soap, Fresh Breeze – 7.5 Fl Oz (Pack of 6) shows a very high-frequency change between$7.44 and $11. (See also Irish Spring Men’s Deodorant Bar Soap, Original Scent – 3.7 Ounce.) What explains the within-site over-time variation? One reason could be supply and demand. There are three reasons I am skeptical of the explanation. First, on Amazon, the third-party new item price time series and Amazon price time series do not appear to be correlated (statistics by informal inspection or as one of my statistics professors used to call it—the ocular distortion test—so caveat emptor). On PCPartPicker, you see much the same thing: the cross-retailer price time series frequently crossover. Second, related to the first point, we should see a strong correlation in overtime price curves across substitutes. We do not. Third, the demand for generic household products should be readily forecastable, and the optimal dry good storage strategy is likely not storing just enough. Further, I am skeptical of strong non-linearities in the marginal cost of furnishing an item that is not in the inventory—much of it should be easily replenishable. The other explanation is price exploration, with Amazon continuously exploring the profit-maximizing price. But this is also unpersuasive. The range over which the prices vary over short periods of time is too large, especially given substitutes and absent collusion. Presumably, companies have thought about the negative consequences of such wide price exploration bands. For instance, you cannot build a reputation as the ‘cheapest’ (unless there is coordination or structural reason for prices to move together.) So I come empty when it comes to explanations. There is the crazy algorithm theory—as inventory dwindles, Amazon really hikes the price, and when it sees no sales, it brings the price right back down. It may explain the frequent sharp movements over a fixed band that you see in some places but plausibly doesn’t explain a lot of the other patterns we see. Forget the explanations and let’s engage with the empirical fact. My hunch is that customers are unaware of the striking variation in the prices of many goods. Second, if customers become aware of this, their optimal strategy would be to use sites like CamelCamelCamel or PCPartPicker to pick the optimal time for purchasing a good. If retailers are somehow varying prices to explore profit-maximizing pricing (minus price discrimination based on location, etc.), and if all customers adopt the strategy of timing the purchase, then, in equilibrium, the retailer strategy would reduce to constant pricing. p.s. I found it funny that there are ‘used product’ listings for soap. p.p.s. I wrote about the puzzle of price dispersion on Amazon here. ## Out of Network: The Tradeoffs in Using Network Based Targeting 1 Aug In particular, in 521 villages in Haryana, we provided information on monthly immunization camps to either randomly selected individuals (in some villages) or to individuals nominated by villagers as people who would be good at transmitting information (in other villages). We find that the number of children vaccinated every month is 22% higher in villages in which nominees received the information. From Banerjee et al. 2019 The buildings, which are social units, were randomized to (1) targeting 20% of the women at random, (2) targeting friends of such randomly chosen women, (3) targeting pairs of people composed of randomly chosen women and a friend, or (4) no targeting. Both targeting algorithms, friendship nomination and pair targeting, enhanced adoption of a public health intervention related to the use of iron-fortified salt for anemia. Coupon redemption reports showed that unadjusted adoption rates were 13.6% (SE = 1.5%) in the friend-targeted clusters, 11.2% (SE = 1.4%) in pair-targeted clusters, 9.1% (SE = 1.3%) in the randomly targeted clusters, and 0% in the control clusters receiving no intervention. From Alexander et al. 2022 Here’s a Twitter thread on the topic by Nicholas Christakis. Targeting “structurally influential individuals,” e.g., people with lots of friends, people who are well regarded, etc., can lead to larger returns per ‘contact.’ This can be a useful thing. And as the studies demonstrate, finding these influential people is not hard—just ask a few people. There are, however, a few concerns: 1. One of the concerns with any targeting strategy is that it can change who is treated. When you use network-based targeting, it biases the treated sample toward those who are more connected. That could be a good thing, especially if returns are the highest on those with the most friends, like in the case of curbing contagious diseases, or it could be a bad thing if the returns are the greatest on the least connected people. The more general point here is that most ROI calculations for network targeting have only accounted for costs of contact and assumed the benefits to be either constant or increasing in network size. One can easily rectify this by specifying the ROI function more fully or adding “fairness” or some kind of balance as a constraint. 2. There is some stochasticity that stems from which person is targeted, and their idiosyncratic impact needs to be baked into standard error calculations for the ‘treatment,’ which is the joint of whatever the experimenters are doing and what the individual chooses to do with the experimenter’s directions (compliance needs a more careful definition). Interventions with targeting are liable to have thus more variable effects than without targeting and plausibly need to be reproduced more often before they used as policy. ## Noise: A Flaw in Book Writing 10 Jul This is a review of Noise, A Flaw in Human Judgment by Kahneman, Sibony, and Sunstein. The phrase “noise in decision making” brings to mind “random” error. Scientists, however, shy away from random error. Science is mostly about systematic error, except, perhaps, quantum physics. So Kahneman et al. conceive of noise as seemingly random error that is a result of unmeasured biases. For instance, research suggests that heat causes bad mood. And bad mood may, in turn, cause people to judge more harshly. If this were to hold, the variability in judging stemming from the weather can end up being interpreted as noise. But, as is clear, there is no “random” error, merely bias. Kahneman et al. make a hash of this point. Early on, they give the conventional formula of total expected error as the sum of bias and variance (they don’t further decompose variance into irreducible error and ‘random’ error) with the aim of talking about the two separately, and naturally, never succeed in doing that. The conceptual issues ought not detract us from the important point of the book. It is useful to think about human judgment systems as mathematical functions. We should expect the same inputs to map to the same output. It turns out that it isn’t even remotely true in most human decision-making systems. Take insurance underwriting, for instance. Given the same data (realistic but made-up information about cases), the median percentage difference between quotes between any pair of underwriters is an eye-watering 55% (which means that for half of the cases, it is worse than 55%), about five times as large as expected by the executives. There are a few interesting points that flow from this data. First, if you are a customer, your optimal strategy is to get multiple quotes. Second, what explains ignorance about the disagreement? There could be a few reasons. First, when people come across a quote from another underwriter, they may ‘anchor’ their estimate on the number they see, reducing the gap between the number and the counterfactual. Second, colleagues plausibly read to agree—less effort and optimizing for collegiality, asking, “Could this make sense?”, than read to evaluate, “Does this make sense?” (see my notes for a fuller set of potential explanations.) Data from asylum reviews is yet starker. “A study of cases that were randomly allotted to different judges found that one judge admitted 5% of applicants, while another admitted 88%.” (Paper.) Variability can stem from only two things. It could be that the data doesn’t allow for a unique judgment (irreducible error). (But even here, the final judgment should reflect the uncertainty in the data.) Or that at least one person is ‘wrong’ (has a different answer than others). Among other things, this can stem from: 1. variation in skill, e.g., how to assess patent applications 2. variation in effort, e.g., some people put more effort than others 3. agency and preferences, e.g., I am a conservative judge, and I can deny an asylum application because I have the power to do so 4. biases like using irrelevant information, e.g., weather, hypoglycemia, etc. (Note: a lack of variability doesn’t mean we are on to the right answer.) The list of proposed solutions is extensive—from selecting better judges to the wisdom of the crowds to using models to training people better to more elaborate schemes like dividing the decision task and asking people to make relative than absolute judgments. The evidence backing the solutions is not always hefty, which meshes with the ideolog-like approach to evidence present everywhere in the book. When I did a small audit of the citations, three things stood out (the overarching theme is adherence to the “No Congenial Result Scrutinized or Left Uncited Act”): 1. Extremely small n studies cited without qualification. Software engineers. Quote from the book: “when the same software developers were asked on two separate days to estimate the completion time for the same task, the hours they projected differed by 71%, on average.” The underlying paper: “In this paper, we report from an experiment where seven experienced software professionals estimated the same sixty software development tasks over a period of three months. Six of the sixty tasks were estimated twice. 2. Extremely small n studies cited without qualification. Israeli Judges. Hypoglycemia and judgment: “Our data consist of 1,112 judicial rulings, collected over 50 d in a 10-mo period, by eight Jewish-Israeli judges (two females) who preside over two different parole boards that serve four major prisons in Israel.” 3. Surprising but likely unreplicable results. “When calories are on the left, consumers receive that information first and evidently think “a lot of calories!” or “not so many calories!” before they see the item. Their initial positive or negative reaction greatly affects their choices. By contrast, when people see the food item first, they apparently think “delicious!” or “not so great!” before they see the calorie label. Here again, their initial reaction greatly affects their choices. This hypothesis is supported by the authors’ finding that for Hebrew speakers, who read right to left, the calorie label has a significantly larger impact..” (Paper.) “We show that if the effect sizes in Dallas et al. (2019) are representative of the populations, a replication of the six studies (with the same sample sizes) has a probability of only 0.014 of producing uniformly significant outcomes.” (Paper.) 4. Citations to HBR. Citations to think pieces in Harvard Business Review (10 citations in total based on a keyword search) and books like ‘Work Rules!’ for a fair many claims. Here are my notes for the book. ## Profit Maximizing Staffing 12 Oct Say that there is a donation solicitation company. Say that there are 100M potential donors they can reach out to eachyear. Let’s also assume that the company gets paid on a contingency fee basis, getting a fixed percentage of all donations. The company currently follows the following process: it selects 10M potential donors from the list using some rules and reaches out to them. The company gets donations from 2M donors. Also, assume that agents earn a fixed percentage of the dough they bring in. What’s profit-maximizing staffing? The company’s optimal strategy for staffing (depending on the risk preference) is: $p_i*\alpha*v_i - c_i > 0$ where $p_i$ reflects the probability of donation from potential donor $I$, $v_i$ is the value of the donation from the ith customer, $\alpha$ is the contingency fee, and $c_i$ is the cost of reaching out to the potential donor. Modeling $c_i$ can be challenging because the cost may be a function of donor attributes but also the granularity at which you can purchase labor, the need for specialists for soliciting donations from different potential donors, e.g., language, etc. For instance, classically, it may well be that you can only buy labor in chunks, e.g., full-time workers for some time. We leave these considerations out for now. We also take as fixed the optimal strategy to reach out to each donor.) The data we have the greatest confidence in pertains to cases where we tried and observed an outcome. The data for the 10M can look like this: cost_of_contact, donation 10, 0 15, 1 20, 100 25, 0 30, 1000 .., .. We can use this data to learn a regression within the 10M and then use the model to predict the rank. If you use the model to rank the 10M you get next year, you can get greater profits from not pursuing the 8M. If you use it to rank the remaining 90M, you are making the assumption that donors who were not selected but are otherwise similar to those who were chosen, are similar in their returns. It is likely not the case. To get better traction on the 90M, you need to get new data, starting with a random sample, and using deep reinforcement learning to figure out the kind of donors who are profitable to reach out to. ## Fairly Certain: Using Uncertainty in Predictions to Diagnose Roots of Unfairness 8 Jul One conventional definition of group fairness is that the ML algorithms produce predictions where the FPR (or FNR or both) is the same across groups. Fixating on equating FPR etc. can harm the very groups we are trying to help. So it may be useful to rethink how to solve the problem of reducing unfairness. One big reason why the FPR may vary across groups is that, given the data, some groups’ outcomes are less predictable than others. This may be because of the limitations of the data itself or because of the limitations of algorithms. For instance, Kearns and Roth in their book bring up the example of college admissions. The training data for college admissions is the decisions made by college counselors. College counselors may well be worse at predicting the success of minority students because they are less familiar with their schools, groups, etc., and this, in turn, may lead to algorithms performing worse on minority students. (Assume the algorithm to be human decision-makers and the point becomes immediately clear.) One way to address worse performance may be to estimate the uncertainty of the prediction. This allows us to deal with people with wider confidence bounds separately from people with narrower confidence bounds. The optimal strategy for people with wider confidence bounds people may be to collect additional data to become more confident in those predictions. For instance, Komiyama and Noda propose something similar (pdf) to help overcome a lack of information during hiring. Or we may need to figure out a way to compensate people based on their uncertainty interval. The average width of the uncertainty interval across groups may also serve as a reasonable way to diagnose this particular problem. ## Equilibrium Fairness: How “Fair” Algorithms Can Hurt Those They Purport to Help 7 Jul One definition of a fair algorithm is an algorithm that yields the same FPR across groups (an example of classification parity). To achieve that, we often have to trade in some accuracy. The final model is thus less accurate but fair. There are two concerns with such models: 1. Net Harm Over Relative Harm: Because of lower accuracy, the number of people from a minority group that are unfairly rejected (say for a loan application) may be a lot higher. (This is ignoring the harm done to other groups.) 2. Mismeasuring Harm? Consider an algorithm used to approve or deny loans. Say we get the same FPR across groups but lower accuracy for loans with a fair algorithm. Using this algorithm, however, means that credit is more expensive for everyone. This, in turn, may cause fewer people of the vulnerable group to get loans as the bank factors in the cost of mistakes. Another way to think about the point is that using such an algorithm causes net interest paid per borrowed dollar to increase by some number. It seems this common scenario is not discussed in many of the papers on fair ML. One reason for that may be that people are fixated on who gets approved and not the interest rate or total approvals. ## Market Welfare: Why Are Covid-19 Vaccines Still Underfunded? 11 Jun “To get roughly 70% of the planet’s population inoculated by April, the IMF calculates, would cost just$50bn. The cumulative economic benefit by 2025, in terms of increased global output, would be $9trn, to say nothing of the many lives that would be saved.” https://www.economist.com/leaders/2021/06/09/the-west-is-passing-up-the-opportunity-of-the-century The Economist frames this as an opportunity for G7. And it is. But it is also an opportunity for third-world countries, which plausibly can borrow$50bn given the return on investment. The fact that money hasn’t already been allocated poses a puzzle. Is it because governments think about borrowing decisions based on whether or not a policy is tax revenue positive (which a 180x return ought to be even with low tax collection and assessment rates)? Or is it because we don’t have a marketplace where we can transact on this information? If so, it seems like an important hole.

Here’s another way to look at this point. Among countries where the profits mostly go to a few, why do the people at the top not come to invest together so that they can harvest profits later? Brunei is probably an ok example.

## Liberalizing Daughters: Do Daughters Cause MCs to be Slightly More Liberal on Women’s Issues?

25 Dec

Two papers estimate the impact of having a daughter on Members of Congress’ (MC’s) position on women’s issues. Washington (2008) finds that each additional daughter (conditional on the number of children) causes about a 2 point increase in liberalism on women’s issues using data from the 105th to 108th Congress. Costa et. al 2019 use data from 110th to 114th Congress to find there is a noisily estimated small effect that cannot be distinguished from zero.

### Same Number, Different Interpretation

Washington (2008) argues that a 2 point effect is substantive. But Costa et al. argue that a 2–3 point change is not substantively meaningful.

“In all five specifications, the score increases by about two points with each additional daughter parented. For all but the 106th Congress, the number of female children coefficient is significantly different from zero at conventional levels. While that two point increase may seem small relative to the standard deviations of these scores, note that the female legislators, on average, score a significant seven to ten points higher on these rating scores. In other words, an additional daughter has about 25% of the impact on women’s issues that one’s own gender has.”

From Washington 2008

“The lower bound of the confidence interval for the first coefficient in Model
1, the effect of having a daughter on AAUW rating, is −3.07 and the upper
bound is 2.01, meaning that the increase on the 100-point AAUW scale for
fathers of daughters could be as high as 2.01 at the 90% level, but that AAUW
score could also decrease by as much as 3.07 points for fathers of daughters,
which is in the opposite direction than previous literature and theory would
have us expect. In both directions, neither the increase nor the decrease is
substantively very meaningful.

From Costa et. al 2019

### Different Numbers

The two papers—Washington’s and Costa et al.—come to different conclusions. But why? Besides different data, there are fair many other differences in modeling choices including (p.s. this is not a comprehensive list):

1. How the number of children are controlled for. Washington uses fixed effects for the number of children. This makes sense if you conceive the number of daughters as a random variable within people with the same number of children. Another way to think of it is as a block randomized experiment. Costa et al. write, “Following Washington (2008), we also include a control variable for the total number of children a legislator has.” But control for it linearly.
2. Dummy Vs. Number of Daughters. Costa et al. have a ‘has daughter’ dummy that codes as 1 any MC with 1 or more daughter while Washington uses the number of daughters as the ‘treatment’ variable.

## It Depends! Effect of Quotas On Women’s Representation

25 Dec

“[Q]uotas are often thought of as temporary measures, used to improve the lot of particular groups of people until they can take care of themselves.”

Bhavnani 2011

So how quickly can be withdraw the quota? The answer depends—plausibly on space, office, and time.

“In West Bengal …[i]n 1998, every third G[ram] P[anchayat] starting with number 1 on each list was reserved for a woman, and in 2003 every third GP starting with number 2 on each list was reserved” (Beaman et al. 2012). Beaman et al. exploit this random variation to estimate the effect of reservation in prior election cycles on women being elected in the subsequent elections. They find that 1. just 4.8% of the elected ward councillors in non-reserved wards, 2. this number doesn’t change if a GP has been reserved once before, and 3. shoots up to a still-low 10.1% if the GP has been reserved twice before (see the last column of Table 11 below).

In a 2009 article, Bhavnani, however, finds a much larger impact of reservation in Mumbai ward elections. He finds that a ward being reserved just once before causes a nearly 18 point jump (see the table below) starting from a lower base than above (3.7%).

p.s. Despite the differences, Beaman et al. footnote Bhavnani’s findings as: “Bhavnani (2008) reports similar findings for urban wards of Mumbai, where previous reservation for women improved future representation of women on unreserved seats.”

Beaman et al. also find that reservations reduce men’s biases. However, a 2018 article by Amanda Clayton finds that this doesn’t hold true (though the CI are fairly wide) in Lesotho, Kenya.

## Political Macroeconomics

25 Dec

### Look Ma, I Connected Some Dots!

In late 2019, in a lecture at the Watson Center at Brown University, Raghuram Rajan spoke about the challenges facing the Indian economy. While discussing the trends in growth in the Indian economy (I have linked to the relevant section in the video. see below for the relevant slide), Mr. Rajan notes:

“We were growing really fast before the great recession, and then 2009 was a year of very poor growth. We started climbing a little bit after it, but since then, since about 2012, we have had a steady upward movement in growth going back to the pre-2000, pre-financial crisis growth rates. And then since about mid-2016 (GS: a couple of years after Mr. Modi became the PM), we have seen a steady deceleration.”

Raghuram Rajan at the Watson Center at Brown in 2019 explaining the graph below

The statement is supported by the red lines that connect the deepest valleys with the highest peak, eagerly eliding over the enormous variation in between (see below).

### See Something, Say Some Other Thing

Not to be left behind, Mr. Rajan’s interlocutor Mr. Subramanian shares the following slide about investment collapse. Note the title of the slide and then look at the actual slide. The title says that the investment (tallied by the black line) collapses in 2010 (before Mr. Modi became PM).

### Epilogue

If you are looking to learn more about some of the common techniques people use to lie with charts, you can read How Charts Lie. (You can read my notes on the book here.)

## No Props for Prop 13

14 Dec

Proposition 13 enacted two key changes: 1. it limited property tax to 1% of the cash value, and 2. limited annual increase of assessed value to 2%. The only way the assessed value can change by more than 2% is if the property changes hands (a loophole allows you to change hands without officially changing hands).

One impressive result of the tax is the inequality in taxes. Sample this neighborhood in San Mateo where taxes range from $67 to nearly$300k.

Take out the extremes, and the variation is still hefty. Property taxes of neighboring lots often vary by well over $20k. ) My back-of-the-envelope estimate of standard deviation based on ten properties chosen at random is$23k.)

Sample another from Stanford where the range is from ~$2k to nearly$59k.

Prop. 13 has a variety of more material perverse consequences. Property taxes are one reason by people move from their suburban houses near the city to other more remote, cheaper places. But Prop. 13 reduces the need to move out. This likely increases property prices, which in turn likely lowers economic growth as employers choose other places. And as Chaste, a long-time contributor to the blog points out, it also means that the currently employed often have to commute longer distances, which harms the environment in addition to harming the families of those who commute.

p.s. Looking at the property tax data, you see some very small amounts. For instance, $19 property tax. When Chaste dug in, he found that the property was last sold in 1990 for$220K but was assessed at $0 in 2009 when it passed on to the government. The property tax on government-owned properties and affordable housing in California is zero. And Chaste draws out the implication: “poor cities like Richmond, which are packed with affordable housing, not only are disproportionately burdened because these populations require more services, they also receive 0 in property taxes from which to provide those services.” p.p.s. My hunch is that a political campaign that uses property taxes in CA as a targeting variable will be very successful. p.p.p.s. Chaste adds: “Prop 13 also applies to commercial properties. Thus, big corps also get their property tax increases capped at 2%. As a result, the sales are often structured in ways that nominally preserve existing ownership. There was a ballot proposition on the November 2020 ballot, which would have removed Prop 13 protections for commercial properties worth more than$3M. Residential properties over $3M would continue to enjoy the protection. Even this prop failed 52%-48%. People were perhaps scared that this would be the first step in removing Prop 13 protections for their own homes.” ## Too Much Churn: Estimating Customer Churn 18 Nov A new paper uses financial transaction data to estimate customer churn in consumer-facing companies. The paper defines churn as follows: There are three concerns with the definition: 1. The definition doesn’t make clear what is the normalizing constant for calculating the share. Given that the value “can vary between zero and one,” presumably the normalizing constant is either a) total revenue in the same year in which customer buys products, b) total revenue in the year in which the firm revenue was greater. 2. If the denominator when calculating s_fit is the total revenue in the same year in which the customer buys products from the company, it can create a problem. Consider a case where there is a customer that spends$10 in both year t and year t-k. And assume that the firm’s revenue in the same years is $10 and$20 respectively. In this case, the customer hasn’t changed his/her behavior but their share has gone from 1 to .5.
3. Beyond this, there is a semantic point. Churn is generally used to refer to attrition. In this case, it covers both customer acquisition and attrition. It also covers both a reduction and an increase in customer spending.

A Fun Aside

“Netflix similarly was not in one of our focused consumer-facing industries according to our SIC classification (it is found with two-digit SIC of 78, which mostly contains movie producers)” — this tracks with my judgment of Netflix.