Partisan Morality

11 Jun

Sinn Féin and Fianna Fáil have said that activists posed as members of a polling company and went door-to-door to canvass the opinions of voters.

https://amp.rte.ie/amp/1227134/

The rationale is simple. If you canvass openly as an SF worker, you are likely to be met with shut doors or with pro-SF opinions offered under slight duress. Is it a bridge too far or is it a harmless lie? More generally, do we use the same moral reasoning paradigm for violations by co-partisans and opposing partisans? My hunch is that for such violations we use a deontological framework for opposing partisans and a consequentialist one for co-partisans. The framework we use may switch depending on the circumstance. One way to test it would be to do a survey experiment with the above news article, switching parties. To get a better baseline, it may be useful to have three conditions: party_a, party_b, and consumer_brand, e.g., Coke.

The Hateful ATE: The Effect of Affective Polarization

7 Jun

In a new paper, Broockman et al. use a clever manipulation to induce “three decades of change in affective polarization”:

In typical trust games, there are two players. Player 1 receives a cash allocation and is instructed to give “some, all, or none” of the money to Player 2. The player is also told that the researchers will triple any amount Player [1] gives to Player 2 and that Player 2 can return some, all, or none of the money back to Player 1. Therefore, the more Player 1 expects reciprocity from Player 2, the more money they should allocate to Player 2 in anticipation they will receive a larger sum in return, and the better off Player 2 will be. For example, if Player 1 gives all her money to Player 2, this sum would be tripled, and Player 2 could return half of the tripled amount to Player 1—leaving both players with 50% more than Player 1’s initial allocation. But if Player 1 gives no money to Player 2, Player 1 leaves with only her initial allocation and Player 2 leaves with nothing.

First, we always make participants take the role of Player 2. This means they always first observe an allocation another player makes to them. Second, across three consecutive rounds of game play, participants are told they are interacting with three other respondents of the opposite political party who have each been allocated $10. However, they are in fact are interacting with computerized opponents who offer allocations based on a pre-determined script. Participants randomized to the Positive Experience condition receive allocations from Player 1 of $8, $7 and $8 (tripled to $24, $21 and $24) respectively across the three rounds of the game. However, those in the Negative Experience condition receive $0 allocations in all three rounds.

Broockman et al. 2021
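To make the payoffs in the quoted description concrete, here is a minimal Python sketch of one round of the trust game. The function name and the `return_share` parameter are my own labels, not the paper's; the tripling rule, the $10 endowment, and the scripted $8, $7, $8 allocations come from the quote.

```python
def trust_game_payoffs(endowment, sent, return_share, multiplier=3):
    """Return (player1_payoff, player2_payoff) for one round of a trust game.

    endowment:    cash Player 1 starts with
    sent:         amount Player 1 gives to Player 2 (0 <= sent <= endowment)
    return_share: fraction of the multiplied amount Player 2 sends back
                  (hypothetical parameter name, used only for illustration)
    multiplier:   factor the researchers apply to the amount sent (3 in the quote)
    """
    if not 0 <= sent <= endowment:
        raise ValueError("sent must be between 0 and the endowment")
    if not 0 <= return_share <= 1:
        raise ValueError("return_share must be between 0 and 1")
    received = multiplier * sent           # what Player 2 gets after tripling
    returned = return_share * received     # what Player 2 chooses to send back
    player1 = endowment - sent + returned
    player2 = received - returned
    return player1, player2


# Example from the quote: Player 1 gives everything, Player 2 returns half.
full_trust = trust_game_payoffs(endowment=10, sent=10, return_share=0.5)

# No trust: Player 1 keeps the endowment and Player 2 leaves with nothing.
no_trust = trust_game_payoffs(endowment=10, sent=0, return_share=0.5)

# Scripted Positive Experience allocations of $8, $7, and $8, after tripling.
positive_script = [3 * amount for amount in (8, 7, 8)]
```

With full trust and an even split, both players end at $15, which is the "50% more than Player 1's initial allocation" in the quote.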

Next comes the punchline: “Player 1’s reason for their allocation to you: your partisanship (all rounds), your income (Round 2)”. See Page 65.

Being told that a co- or opposing- partisan gave $0 versus being told that they gave $8, $7, and $8 because of your partisanship across three rounds has a dramatic effect on partisans’ feelings: partisans’ feelings toward opposing partisans become ‘cooler,’ it doesn’t affect their feelings towards co-partisans (impressive), and (strangely) polarizes their feelings toward elites (see the figure below).

Three comments are in order.

First, the manipulation is unrealistic given previous effect sizes (see here): “The average amount allocated to copartisans in the trust game was $4.58 (95% confidence interval [4.33, 4.83]), representing a “bonus” of some 10% over the average allocation of $4.17.”

Second, the manipulation principally ought to change perceptions of how trusting people are and not how trustworthy they are. We don’t manipulate how deceitful the other person is but how fearful they are of not having their actions reciprocated. Disliking less trusting people is slightly weird and plausibly points to how the underlying antipathy can be exacerbated by treatments that do not present a clear reason for judging another person more harshly. Or it could be that not being seen as being trustworthy and losing out on money as a result of it is insulting and aggravating.

Whatever the reason, generalizing from a bad personal interaction to all other members of a group is disturbing. (The fact that treatment cools people’s feelings toward opposing partisans suggests people expect better from them, which is interesting.) Ascribing feelings from a bad personal experience to elites seems odder (and more disturbing) still.

The absence of commensurate co- and opposing- partisan feeling panels for elites feels odd.

The paper finds that having a “bad” personal experience (vis-a-vis a better one) with an opposing partisan increases interpersonal animus (plus polarization of feelings toward partisan elites) but doesn’t cause partisans to like opposing partisan MCs less or co-partisan MCs more (though see above; note that the pooled estimate for the opposing party is 1.5% or so, which is about what I would expect, so it likely deserves another run at the bank). (I didn’t understand the change from co-partisan and opposing-partisan MCs to “own MCs” in the next analysis, so I am omitting that.) The paper discusses other DVs:

  1. Interest in expressing party-consistent issue preferences (no effect)
  2. Support for bi-partisan legislation (~ more in favor)
  3. Opposition to democratic norms (pooled index seems to move by d = .09 and is nearly sig. at conventional levels). (I make a special reference to the index because presumably it has the least measurement error and is least likely to show an idiosyncratic pattern given sample size. There is also a small point about how multiple comparison adjustments are made—plausibly they should account for measurement error.)
  4. Endorsement of partisan-congenial claims (Ds yes; Rs no)

The theorized path from bad personal experience with a co- (or opposing) partisan to opposition to democratic norms, etc., seems convoluted to me. So let’s unpack the theoretical underpinnings of the expectations. Interpersonal animus among partisans is an indicator of affective polarization. And the experiment successfully manipulates interpersonal animus. So what’s the issue? One escape hatch is that the concept is not uni-dimensional. Another is that any increase in interpersonal affect manifests in political consequences only over long periods as it causes people to watch different media, trust different things, etc.

This Time It’s Different: Polarization of the American Polity

10 Jan

In a new paper, Pierson and Schickler contend that this era of polarization is different. They fear that polarization this time will continue to intensify because the three “meso-institutions” (interest groups, state parties, and the media) that were the bulwark against polarization in earlier eras are themselves polarized or have changed in ways that make them offer much less resistance:

  1. State Parties
    • State Parties Have Polarized: “state party platforms are more similar across states and more distinctive across parties than in earlier eras (Paddock 2005, 2014; Hopkins & Schickler 2016).”
    • Federal Government is Much Bigger. This means state concerns matter less — which brought cross-cutting cleavages into play. “Although it has received less discussion in the analysis of polarization, a second development in the 1960s and early 1970s—what Skocpol (2003, p. 135) has termed the “long 1960s”—was also critical: a dramatic expansion and centralization of public policy (Melnick 1994, Pierson 2007, Jones et al. 2019). Civil rights legislation was only the entering wedge. During the long 1960s, liberal Congresses enacted, often on a bipartisan basis, major new domestic spending programs (especially Medicaid and Medicare, which now account for roughly a quarter of federal spending as well as, in the case of Medicaid, a big share of state spending). They greatly enlarged the regulatory state, creating powerful new federal agencies (such as the Environmental Protection Agency) and enacting extensive rules covering environmental and consumer protection as well as workplace safety.”
  2. Interest Groups Have Polarized
    • “The powerful US Chamber of Commerce provides a striking illustration of the broader trend. Traditionally conservative but studiously nonaligned, it now carefully coordinates its extensive electoral activities with the Republican Party, and its political director (a former GOP operative) can refer unselfconsciously to Republican Senate candidates as “our ticket” (Hacker & Pierson 2016).”
  3. Media: the usual story

Why This Time is Different

  • “The Civil War era represents an obvious extreme point in the intensity of divisions, yet the period of partisan polarization was remarkably brief: The major American parties featured deep internal divisions on slavery up until the mid-to-late 1850s, and the new Republican majority became deeply divided over Reconstruction and key economic questions soon after the war ended.”

Questions and Notes

  • Why are business interest groups not more bipartisan? For instance, if the US Chamber of Commerce is going hard R, is it a sign that it represents businesses of a particular sector/region? Is the consolidation of the economy (GDP) in cities causing this? If so, then how does the oncoming shift to working from home (WFH) affect these things?
  • Given that wide swings in policy regimes are expensive for business (for one, businesses cannot plan), what kinds of plays will big businesses eventually come up with? In some ways, for instance, Twitter banning Trump is predictable. Businesses will opt for stability where they can.
  • The more frightening turn in American politics is toward populism and identity politics—so much for the end of politics.
  • The party coalitions keep evolving. For instance, in 2020, poor White people were firmly in the Republican column, whereas as late as 2004, as Bartels pointed out, they were not.

No Props for Prop 13

14 Dec

Proposition 13 enacted two key changes: 1. it limited property tax to 1% of the cash value, and 2. limited annual increase of assessed value to 2%. The only way the assessed value can change by more than 2% is if the property changes hands (a loophole allows you to change hands without officially changing hands). 
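A rough sketch of how the two rules produce very different bills for similar neighboring houses. The 1% rate and the 2% cap come from the description above; the $200,000 purchase price, the 30-year holding period, and the 7% annual market appreciation are hypothetical numbers chosen only for illustration, and the sketch ignores local add-on levies and exemptions.

```python
def prop13_tax(purchase_price, years_held, tax_rate=0.01, cap=0.02):
    """Annual tax for a long-time owner, assuming the assessed value
    rises by the full 2% cap every year since purchase."""
    assessed = purchase_price * (1 + cap) ** years_held
    return tax_rate * assessed


def market_value_tax(original_price, years_elapsed, appreciation, tax_rate=0.01):
    """Annual tax for a new buyer of a comparable house, which is
    reassessed at the current market value when it changes hands.
    `appreciation` is an assumed annual market growth rate."""
    market_value = original_price * (1 + appreciation) ** years_elapsed
    return tax_rate * market_value


# Two comparable houses that were both worth $200,000 thirty years ago.
long_time_owner = prop13_tax(200_000, years_held=30)
new_neighbor = market_value_tax(200_000, years_elapsed=30, appreciation=0.07)
tax_ratio = new_neighbor / long_time_owner
```

Under these assumed numbers the new neighbor pays roughly four times what the long-time owner pays for a comparable house, and the gap widens the longer the first owner stays put.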

One impressive result of the tax is the inequality in taxes. Sample this neighborhood in San Mateo where taxes range from $67 to nearly $300k.

Take out the extremes, and the variation is still hefty. Property taxes of neighboring lots often vary by well over $20k. (My back-of-the-envelope estimate of the standard deviation, based on ten properties chosen at random, is $23k.)

Sample another from Stanford where the range is from -$2k (not clear what causes negative taxes) to nearly $59k.

Prop. 13 has a variety of more material perverse consequences. Property taxes are one reason why people move from their suburban houses near the city to other more remote, cheaper places. But Prop. 13 reduces the need to move out. This likely increases property prices, which in turn likely lowers economic growth as employers choose other places. And as Chaste, a long-time contributor to the blog, points out, it also means that the currently employed often have to commute longer distances, which harms the environment in addition to harming the families of those who commute.

p.s. Looking at the property tax data, you see some very small amounts. For instance, $19 property tax. When Chaste dug in, he found that the property was last sold in 1990 for $220K but was assessed at $0 in 2009 when it passed on to the government. The property tax on government-owned properties and affordable housing in California is zero. And Chaste draws out the implication: “poor cities like Richmond, which are packed with affordable housing, not only are disproportionately burdened because these populations require more services, they also receive 0 in property taxes from which to provide those services.”

p.p.s. My hunch is that a political campaign that uses property taxes in CA as a targeting variable will be very successful.

p.p.p.s. Chaste adds: “Prop 13 also applies to commercial properties. Thus, big corps also get their property tax increases capped at 2%. As a result, the sales are often structured in ways that nominally preserves existing ownership.

There was a ballot proposition on the November 2020 ballot, which would have removed Prop 13 protections for commercial properties worth more than $3M. Residential properties over $3M would continue to enjoy the protection. Even this prop failed 52%-48%. People were perhaps scared that this would be the first step in removing Prop 13 protections for their own homes.”

Dismissed Without Prejudice: Evaluating Prejudice Reduction Research

25 Sep

Prejudice is a blight on humanity. How to reduce prejudice, thus, is among the most important social scientific questions. In the latest assessment of research in the area, a follow-up to the 2009 Annual Review article, Betsy Paluck et al., however, paint a dim picture. In particular, they note three dismaying things:

Publication Bias

Table 1 (see below) makes for grim reading. While one could argue that the pattern is explained by the fact that lab research tends to have smaller samples and especially powerful treatments, the numbers (see the average s.e. of the first two rows; it may have been useful to produce a $\sqrt{1/n}$ adjusted s.e.) suggest that publication bias very likely plays a large role. It is also shocking to learn that just a fifth of the studies have treatment groups with 78 or more people.

Light Touch Interventions

The article is remarkably measured when talking about the rise of ‘light touch’ interventions, i.e., short exposure treatments. I would have described them as ‘magical thinking,’ for they seem to be founded in the belief that we can make profound changes in people’s thinking on the cheap. This isn’t to say light-touch interventions can’t be worked into a regime that effects profound change; repeated light touches may work. However, as far as I could tell, no study tried multiple touches to see how the effect cumulates.

Near Contemporaneous Measurement of Dependent Variables

Very few papers judged the efficacy of the intervention a day or more after the intervention. Given the primary estimate of interest is longer-term effects, it is hard to judge the efficacy of the treatments in moving the needle on the actual quantity of interest.   

Beyond what the paper notes, here are a couple more things to consider:

  1. Perspective-getting works better than perspective-taking. It would be good to explore this further in inter-group settings.
  2. One way to categorize ‘basic research interventions’ is by decomposing the treatment into its primary aspects and then slowly building back up bundles based on data:
    1. channel: f2f, audio (radio, etc.), visual (photos, etc.), audio-visual (tv, web, etc.), VR, etc.
    2. respondent action: talk, listen, see, imagine, reflect, play with a computer program, work together with someone, play together with someone, receive a public scolding, etc.
    3. source: peers, strangers, family, people who look like you, attractive people, researchers, authorities, etc.
    4. message type: parable, allegory, story, graph, table, drama, etc.
    5. message content: facts, personal stories, examples, Jonathan Haidt style studies that show some of the roots of our morality are based on poor logic, etc.

The (Mis)Information Age: Provenance is Not Enough

31 Aug

The information age has brought both bounty and pestilence. Today, we are deluged with both correct and incorrect information. If we knew how to tell apart correct claims from incorrect, we would have inched that much closer to utopia. But the lack of nous in telling apart generally ‘obvious’ incorrect claims from correct claims has brought us close to the precipice of disarray. Thus, improving people’s ability to identify untrustworthy claims as such takes on urgency.

http://gojiberries.io/2020/08/31/the-misinformation-age-measuring-and-improving-digital-literacy/

Inferring the Quality of Evidence Behind the Claims: Fact Check and Beyond

One way around misinformation is to rely on an expert army that assesses the truth value of claims. However, assessing the truth value of a claim is hard. It needs expert knowledge and careful research. When validating, we have to identify which parts are wrong, which parts are right but misleading, and which parts are debatable. All in all, it is a noisy and time-consuming process to vet a few claims. Fact check operations, hence, cull a small number of claims and try to validate those claims. As the rate of production of information increases, thwarting misinformation by checking all the claims seems implausibly expensive.

Rather than assess the claims directly, we can assess the process. Or, in particular, the residue of one part of the process for making the claim: sources. Except for claims based on private experience, e.g., religious experience, claims are based on sources. We can use the features of these sources to infer credibility. The first feature is the number of sources cited to make a claim. All else equal, the more sources saying the same thing, the greater the chances that the claim is true. None of this is to undercut a common observation: lots of people can be wrong about something. A harder test for veracity is whether a diverse set of people say the same thing. The third test is checking the credibility of the sources.

Relying on the residue is not a panacea. People can simply lie about the source. We want the source to verify what they have been quoted as saying. And in the era of cheap data, this can be easily enabled. Quotes can be linked to video interviews or automatic transcriptions electronically signed by the interviewee. The same system can be scaled to institutions. The downside is that the system may prove onerous. On the other hand, commonly, the same source is cited by many people so a public repository of verified claims and evidence can mitigate much of the burden.

But will this solve the problem? Likely not. For one, people can still commit sins of omission. For two, they can still draft things in misleading ways. For three, trust in sources may not be tied to correctness. All we have done is build a system for establishing provenance. And establishing the provenance is not enough. Instead, we need a system that incentivizes both correctness and presentation that makes correct interpretation highly likely. It is a high bar. But it is the bar: correct and liable to be correctly interpreted.

To create incentives for publishing correct claims, we need to either 1. educate the population, which brings me to the previous post, or 2. find ways to build products and recommendations that incentivize correct claims. We likely need both.

The (Mis)Information Age: Measuring and Improving ‘Digital Literacy’

31 Aug

The information age has brought both bounty and pestilence. Today, we are deluged with both correct and incorrect information. If we knew how to tell apart correct claims from incorrect, we would have inched that much closer to utopia. But the lack of nous in telling apart generally ‘obvious’ incorrect claims from correct claims has brought us close to the precipice of disarray. Thus, improving people’s ability to identify untrustworthy claims as such takes on urgency.

Before we find fixes, it is good to measure how bad things are and what things are bad. This is the task the following paper sets itself by creating a ‘digital literacy’ scale. (Digital literacy is an overloaded term. It means many different things, from the ability to find useful information, e.g., information about schools or government programs, to the ability to protect yourself against harm online (see here and here for how frequently people’s accounts are breached and how often they put themselves at risk of malware or phishing), to the ability to identify incorrect claims as such, which is how the paper uses it.)

Rather than build a skill assessment kind of a scale, the paper measures (really predicts) skills indirectly using some other digital literacy scales, whose primary purpose is likely broader. The paper validates the importance of various constituent items using variable importance and model fit kinds of measures. There are a few dangers of doing that:

  1. Inference using surrogates is dangerous as the weakness of surrogates cannot be fully explored with one dataset. And they are liable not to generalize as underlying conditions change. We ideally want measures that directly measure the construct.
  2. Variable importance is not the same as important variables. For instance, it isn’t clear why “recognition of the term RSS,” the “highest-performing item by far” has much to do with skill in identifying untrustworthy claims.

Some other work builds uncalibrated measures of digital literacy (conceived as in the previous paper). As part of an effort to judge the efficacy of a particular way of educating people about how to judge untrustworthy claims, the paper provides measures of trust in claims. The topline is that educating people is not hard (see the appendix for the description of the treatment). A minor treatment (see below) is able to improve “discernment between mainstream and false news headlines.”

Understandably, the effects of this short treatment are ‘small.’ The ITT short-term effect in the US is: “a decrease of nearly 0.2 points on a 4-point scale.” Later in the manuscript, the authors provide the substantive magnitude of the .2 pt net swing using a binary indicator of perceived headline accuracy: “The proportion of respondents rating a false headline as “very accurate” or “somewhat accurate” decreased from 32% in the control condition to 24% among respondents who were assigned to the media literacy intervention in wave 1, a decrease of 7 percentage points.” A .2 pt. net swing on a 4-point scale leading to a 7 percentage point difference is quite remarkable and generally suggests that there is a lot of ‘reverse’ intra-category movement that the crude dichotomization elides. But even if we take the crude categories as the quantity of interest, a month later in the US, the 7-point swing is down to 4 points:

“…the intervention reduced the proportion of people endorsing false headlines as accurate from 33 to 29%, a 4-percentage-point effect. By contrast, the proportion of respondents who classified mainstream news as not very accurate or not at all accurate rather than somewhat or very accurate decreased only from 57 to 55% in wave 1 and 59 to 57% in wave 2.”

Guess et al. 2020

The opportunity to mount more ambitious treatments remains sizable. So does the opportunity to more precisely understand what aspects of the quality of evidence people find hard to discern. And how we could release products that make their job easier.

Another ANES Goof-em-up: VCF0731

30 Aug

By Rob Lytle

At this point, it’s well established that the ANES CDF’s codebook is not to be trusted (I’m repeating “not to be trusted” to include a second link!). Recently, I stumbled across another example of incorrect coding in the cumulative data file, this time in VCF0731 – Do you ever discuss politics with your family or friends?

The codebook reports 5 levels:

Do you ever discuss politics with your family or friends?

1. Yes
5. No

8. DK
9. NA

INAP. question not used

However, when we load the variable and examine the unique values:

# pulling anes-cdf from a GitHub repository
cdf <- rio::import("https://github.com/RobLytle/intra-party-affect/raw/master/data/raw/cdf-raw-trim.rds")


unique(cdf$VCF0731)
## [1] NA  5  1  6  7

We see a completely different coding scheme. We are left adrift, wondering “What is 6? What is 7?” Do 1 and 5 really mean “yes” and “no”?

We may never know.

For a survey that costs several million dollars to conduct, you’d think we could expect a double-checked codebook (or at least some kind of version control to easily fix these things as they’re identified).

Survey Experiments With Truth: Learning From Survey Experiments

27 Aug

Tools define science. Not only do they determine how science is practiced but also what questions are asked. Take survey experiments, for example. Since the advent of online survey platforms, which made conducting survey experiments trivial, the lure of convenience and internal validity has persuaded legions of researchers to use survey experiments to understand the world.

Conventional survey experiments are modest tools. Paul Sniderman writes,

“These three limitations of survey experiments—modesty of treatment, modesty of scale, and modesty of measurement—need constantly to be borne in mind when brandishing term experiment as a prestige enhancer.”

Paul Sniderman

Note: We can collapse these three concerns into two: treatment (which includes ‘scale’ as Paul defines it, i.e., the amount of time) and measurement.

But skillful artisans have used this modest tool to great effect. Famously, Kahneman and Tversky used survey experiments, e.g., the Asian Disease Problem, to shed light on how people decide. More recently, Paul Sniderman and Tom Piazza have used survey experiments to shed light on an unsavory aspect of human decision making: discrimination. Aside from shedding light on human decision making, researchers have also used survey experiments to understand what survey measures mean, e.g., Ahler and Sood.

The good, however, has come with the bad; insight has often come with irreflection. In particular, Paul Sniderman implicitly points to two common mistakes that people make:

  1. Not Learning From the Control Group. The focus on differences in means means that we sometimes fail to reflect on what the data in the Control Group tells us about the world. Take the paper on partisan expressive responding, for instance. The topline from the paper is that expressive responding explains half of the partisan gap. But it misses the bigger story—the partisan differences in the Control Group are much smaller than what people expect, just about 6.5% (see here). (Here’s what I wrote in 2016.)
  2. Not Putting the Effect Size in Context. A focus on significance testing means that we sometimes fail to reflect on the modesty of effect sizes. For instance, providing people $1 for a correct answer within the context of an online survey interview is a large premium. And if providing a dollar each on 12 (included) questions nudges people from an average of 4.5 correct responses to 5, it suggests that people are resistant to learning or impressively confident that what they know is right. Leaving $7 on the table tells us more than the .5, around which the paper is written. 

    More broadly, researchers are obtuse to the point that sometimes what the results show is how impressively modest the movement is when you ratchet up the dosage. For instance, if an overwhelming number of African Americans favor Whites who have scored just a few points more than a Black student, it is a telling testament to their endorsement of meritocracy.

Nothing to See Here: Statistical Power and “Oversight”

13 Aug

“Thus, when we calculate the net degree of expressive responding by subtracting the acceptance effect from the rejection effect—essentially differencing off the baseline effect of the incentive from the reduction in rumor acceptance with payment—we find that the net expressive effect is negative 0.5%—the opposite sign of what we would expect if there was expressive responding. However, the substantive size of the estimate of the expressive effect is trivial. Moreover, the standard error on this estimate is 10.6, meaning the estimate of expressive responding is essentially zero.

https://journals.uchicago.edu/doi/abs/10.1086/694258

(Note: This is not a full review of all the claims in the paper. There is more data in the paper than in the quote above. I am merely using the quote to clarify a couple of statistical points.)

There are two main points:

  1. That the estimate is close to zero and that the s.e. is super fat are technically unrelated facts. The last line of the quote, however, seems to draw a relationship between the two.
  2. The estimated effect sizes of expressive responding in the literature are much smaller than the s.e. Bullock et al. (Table 2) estimate the effect of expressive responding at about 4% and Prior et al. (Figure 1) at about 5.5% (“Figure 1(a) shows, the model recovers the raw means from Table 1, indicating a drop in bias from 11.8 to 6.3.”). Thus, one reasonable inference is that the study is underpowered to reasonably detect expected effect sizes.
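A back-of-the-envelope check of the second point, as a minimal sketch. It assumes a two-sided z-test at the 5% level and that the reported s.e. of 10.6 is on the same percentage-point scale as the 4 and 5.5 point estimates from the literature; the function names are mine.

```python
from statistics import NormalDist


def power_two_sided(effect, se, alpha=0.05):
    """Approximate power of a two-sided z-test to detect `effect`
    when the estimate has standard error `se`."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    shift = effect / se               # true effect in standard-error units
    # Probability that the estimate lands in either rejection region.
    return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)


def minimum_detectable_effect(se, alpha=0.05, power=0.80):
    """Smallest true effect detectable with the given power."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * se


SE = 10.6  # standard error reported in the quote (assumed percentage points)

power_bullock = power_two_sided(4.0, SE)   # effect size from Bullock et al.
power_prior = power_two_sided(5.5, SE)     # effect size from Prior et al.
mde = minimum_detectable_effect(SE)
```

Under these assumptions, the chance of detecting either literature-sized effect is below 10%, and the study would need a true effect of nearly 30 points to reach 80% power.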

Trump Trumps All: Coverage of Presidents on Network Television News

4 May

With Daniel Weitzel.

The US government is a federal system, with substantial domains reserved for local and state governments. For instance, education, most parts of the criminal justice system, and a large chunk of regulation are under the purview of the states. Further, the national government has three co-equal branches: legislative, executive, and judicial. Given these facts, you would expect news coverage to be broad in its coverage of branches and the levels of government. But there is a sharp skew in news coverage of politicians, with members of the executive branch, especially national politicians (and especially the president), covered far more often than other politicians (see here). Exploiting data from the Vanderbilt Television News Archive (VTNA), the largest publicly available database of TV news (over 1M broadcast abstracts spanning 1968 to 2019), we add body to the observation. We searched for references to the president during their presidency and coded each hit as 1. As the figure below shows, references to the president are common. Excluding Trump, on average, a sixth of all abstracts contain a reference to the sitting president. But Trump is different: 60%(!) of abstracts refer to Trump.

Data and scripts can be found here.
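For readers who want a feel for the coding step, here is a minimal sketch of how one might flag abstracts that mention the sitting president. This is not the script linked above: the toy abstracts are invented, and matching on surname alone is a simplifying assumption that would need refinement for presidents with common surnames.

```python
import re


def mentions(abstract, surname):
    """Return 1 if the abstract contains the surname as a whole word, else 0."""
    pattern = r"\b" + re.escape(surname) + r"\b"
    return int(re.search(pattern, abstract, flags=re.IGNORECASE) is not None)


def share_mentioning(abstracts, surname):
    """Share of abstracts that mention the surname at least once."""
    if not abstracts:
        return 0.0
    return sum(mentions(a, surname) for a in abstracts) / len(abstracts)


# Invented abstracts, for illustration only.
toy_abstracts = [
    "President Trump comments on trade talks.",
    "Flooding continues across the Midwest.",
    "Senate votes on budget; Trump's response noted.",
    "Stock markets close higher.",
    "Trumpet player honored at jazz festival.",
]

share = share_mentioning(toy_abstracts, "Trump")
```

The word-boundary match keeps "Trumpet" from counting as a hit while still catching the possessive "Trump's".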

Making an Impression: Learning from Google Ads

31 Oct

Broadly, Google Ads works as follows: 1. advertisers create an ad, choose keywords, and make a bid on cost-per-click or CPC (you can also bid on cost-per-view and cost-per-impression, but we limit our discussion to CPC), 2. the Google Ads account team vets whether the keywords are related to the product being advertised, and 3. people see the ad from the winning bid when they search for a term that includes the keyword or when they browse content that is related to the keyword (some Google Ads are shown on sites that use Google AdSense).

There is a further nuance to the last step. Generally, on popular keywords, Google has thousands of candidate ads to choose from. And Google doesn’t simply choose the ad from the winning bid. Instead, it uses data to choose an ad (or a few ads) that yield the most profit (Click Through Rate (CTR)*bid). (Google probably has a more complex user utility function and doesn’t show ads below a low predicted CTR*bid.) In all, who Google shows ads to depends on the predicted CTR and the money it will make per click.
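The selection rule described above can be sketched as follows. This is a toy model, not Google's actual auction: the `Ad` class, the `min_revenue` threshold, and the example bids and CTRs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    name: str
    bid: float            # cost-per-click bid in dollars
    predicted_ctr: float  # predicted probability of a click per impression


def expected_revenue(ad):
    """Expected revenue per impression: predicted CTR times the CPC bid."""
    return ad.predicted_ctr * ad.bid


def choose_ad(candidates, min_revenue=0.0):
    """Pick the ad with the highest predicted CTR * bid, or None if no
    candidate clears the (hypothetical) minimum threshold."""
    eligible = [ad for ad in candidates if expected_revenue(ad) >= min_revenue]
    if not eligible:
        return None
    return max(eligible, key=expected_revenue)


candidates = [
    Ad("highest_bid", bid=2.00, predicted_ctr=0.01),
    Ad("more_relevant", bid=1.00, predicted_ctr=0.05),
]

winner = choose_ad(candidates, min_revenue=0.005)
nobody = choose_ad(candidates, min_revenue=1.0)
```

Here the lower bid wins because its higher predicted CTR more than makes up for the smaller payment per click, and when the threshold is set high enough no ad is shown at all.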

Given this setup, we can reason about the audience for an ad. First, the higher the bid, the broader the audience. Second, it is not clear how well Google can predict CTR per ad conditional on keyword bid, especially when the ad run is small. And if that is so, we expect Google to show the ad with the highest bid to a random subset of people searching for the keyword or browsing content related to the keyword. Under such conditions, you can use the total number of impressions per demographic group as an indicator of interest in the keyword. For instance, suppose you make the highest bid on the keyword ‘election’ and find that the total number of impressions your ad makes among people 65+ is 10x that among people aged 18-24. Under some assumptions, e.g., similar use of ad blockers, similar rates of clicking ads conditional on relevance (which would become the same as predicted relevance), and similar utility functions (that is, younger people are not more sensitive to irritation from irrelevant ads than older people), you can infer the relative interest of 18-24 versus 65+ in elections.

The other case where you can infer relative interest in a keyword (topic) from impressions is when ad markets are thin. For common keywords like ‘elections,’ Google generally has thousands of candidate ads for national campaigns. But if you only want to show your ad in a small geographic area or for an infrequently searched term, the candidate set can be pretty small. If your ad is the only one, then your ad will be shown wherever it exceeds some minimum threshold of predicted CTR*bid. Assuming a high enough bid, you can take the total number of impressions of an ad as a proxy for total searches for the term and how often people browsed related content.

With all of this in mind, I discuss results from a Google Ads campaign. More here.

The Other Side

23 Oct

Samantha Laine Perfas of the Christian Science Monitor interviewed me about the gap between perceptions and reality for her podcast ‘perception gaps’ over a month ago. You can listen to the episode here (Episode 2).

The Monitor has also made the transcript of the podcast available here. Some excerpts:

“Differences need not be, and we don’t expect them to be, reasons why people dislike each other. We are all different from each other, right. …. Each person is unique, but we somehow seem to make a big fuss about certain differences and make less of a fuss about certain other differences.”

One way to fix it:

If you know so little and assume so much, … the answer is [to] simply stop doing that. Learn a little bit, assume a little less, and see where the conversation goes.

The interview is based on the following research:

  1. Partisan Composition (pdf) and Measuring Shares of Partisan Composition (pdf)
  2. Affect Not Ideology (pdf)
  3. Coming to Dislike (pdf)
  4. All in the Eye of the Beholder (pdf)

Related blog posts and think pieces:

  1. Party Time
  2. Pride and Prejudice
  3. Loss of Confidence
  4. How to read Ahler and Sood

Don’t Expose Yourself! Discretionary Exposure to Political Information

10 Oct

As the options have grown, so have the fears. Are the politically disinterested taking advantage of the nearly limitless options to opt out of news entirely? Are the politically interested siloing themselves into “echo chambers”? In an eponymous Oxford Research Encyclopedia article, I discuss what we think we know, and some concerns about how we can know. Some key points:

  • Is the gap between how much the politically interested and politically disinterested know about politics increasing, as Post-broadcast Democracy posits? Figure 1 suggests not.

  • Quantity rather than ratio: “If the dependent variable is partisan affect, how ‘selective’ one is may not matter as much as the net imbalance in consumption—the difference between the number of congenial and uncongenial bits consumed…”

  • To measure how much political information a person is consuming, you must be able to distinguish political information from its complement. But what isn’t political information? “In this chapter, our focus is on consumption of varieties of political information. The genus is political information. And the species of this genus differ in congeniality, among other things. But what is political information? All information that influences people’s political attitudes or behaviors? If so, then limiting ourselves to news is likely too constraining. Popular television shows like The Handmaid’s Tale, Narcos, and Law and Order have clear political themes. … Shows like Will and Grace and The Cosby Show may be less clearly political, but they also have a political subtext.” (see Figure 4) … “Even if we limit ourselves to news, the domain is still not clear. Is news about a bank robbery relevant political information? What about Hillary Clinton’s haircut? To the extent that each of these affect people’s attitudes, they are arguably pertinent.”

  • One of the challenges with inferring consumption based on domain level data is that domain level data are crude. Going to http://nytimes.com is not the same as reading political news. And measurement error may vary by the kind of person. For instance, say we label http://nytimes.com as political news. For the political junkie, the measurement error may be close to zero. For teetotalers, it may be close to 100% (see more).

  • Show people a few news headlines along with the news source (you can randomize the source). What can you learn from a few such ‘trials’? You cannot learn what proportion of news they get from a particular source. You can learn their preferences, but not reliably. More from the paper: “Given the problems with self-reports, survey instruments that rely on behavioral measures are plausibly better. … We coded congeniality trichotomously: congenial, neutral, or uncongenial. The correlations between trials are alarmingly low. The polychoric correlation between any two trials range between .06 to .20. And the correlation between choosing political news in any two trials is between -.01 and .05.”

  • Following up on the previous point: preference for a source which has a mean slant != preference for slanted news. “Current measures of [selective exposure] are beset with five broad problems. First is conceptual errors. For instance, people frequently equate preference for information from partisan sources with a preference for congenial information.”
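The ‘quantity rather than ratio’ point in the list above can be made concrete with a small sketch. The two measures below are one plausible way to operationalize ‘how selective one is’ versus ‘net imbalance’; the function names and the consumption counts are hypothetical.

```python
def selectivity_ratio(congenial, uncongenial):
    """Share of consumed partisan items that are congenial (None if none consumed)."""
    total = congenial + uncongenial
    if total == 0:
        return None
    return congenial / total


def net_imbalance(congenial, uncongenial):
    """Number of congenial items consumed minus number of uncongenial items."""
    return congenial - uncongenial


# Made-up counts of (congenial, uncongenial) items consumed.
light_consumer = (3, 1)
heavy_consumer = (60, 40)

for label, (c, u) in [("light", light_consumer), ("heavy", heavy_consumer)]:
    print(label, selectivity_ratio(c, u), net_imbalance(c, u))
```

Here the light consumer is more ‘selective’ (0.75 versus 0.6) but has a much smaller net imbalance (2 versus 20), so the two measures can rank the same people in opposite orders.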

Code 44: How to Read Ahler and Sood

27 Jun

This is a follow-up to the hilarious Twitter thread about the sequence of 44s. Numbers in Perry’s 538 piece come from this paper.

First, yes, the 44s are indeed correct. (Better yet, look for yourself.) But what do the 44s refer to? 44 is the average of all the responses. When Perry writes “Republicans estimated the share at 46 percent,” (we have similar language in the paper, which is regrettable as it can be easily misunderstood), it doesn’t mean that every Republican thinks so. It may not even mean that the median Republican thinks so. See OA 1.7 for medians, OA 1.8 for distributions, but see also OA 2.8.1, Table OA 2.18, OA 2.8.2, OA 2.11 and Table OA 2.23.

Key points:

1. Large majorities overestimate the share of party-stereotypical groups in the party, except for Evangelicals and Southerners.

2. Compared to what people think is the share of a group in the population, people still think the share of the group in the stereotyped party is greater. (But how much more varies a fair bit.)

3. People also generally underestimate the share of counter-stereotypical groups in the party.

Bad Hombres: Bad People on the Other Side

8 Dec

Why do many people think that people on the other side are not well motivated? It could be because they think that the other side is less moral than them. And since opprobrium toward the morally defective is the bedrock of society, thinking that the people in the other group are less moral naturally leads people to censure the other group.

But it can’t be that two groups simultaneously have better morals than the other. It can only be that people in the groups think they are better. This much logic dictates. So, there has to be a self-serving aspect to moral standards. And this is what often leads people to think that the other side is less moral. Accepting this is not the same as accepting moral relativism. For even if we accept that some things are objectively more moral—not being sexist or racist say—some groups—those that espouse that a certain sex is superior or certain races are better—will still think that they are better.

But how do people come to know of other people’s morals? Some people infer morals from political aims. And that is a perfectly reasonable thing to do as political aims reflect what we value. For instance, a Republican who values ‘life’ may think that Democrats are morally inferior because they support the right to abortion. But the inference is fraught with error. As matters stand, Democrats would also like women to not go through the painful decision of aborting a fetus. They just want there to be an easy and safe way for women to have an abortion should they need one.

Sometimes people infer morals from policies. But support for different policies can stem from having different information or beliefs about causal claims. For instance, Democrats may support a carbon tax because they believe (correctly) the world is warming and because they think that the carbon tax is what will help reduce global warming the best and protect American interests. Republicans may dispute any part of that chain of logic. The point isn’t what is being disputed per se, but what people will infer about others if they just had information about the policies they support. Hanlon’s razor is often a good rule.

Why Do People (Re)-Elect Bad Leaders?

7 Dec

‘Why do people (re)-elect bad leaders?’ used to be a question that people only asked of third-world countries. No more. The recent election of unfit people to prominent positions in the U.S. and elsewhere has finally woken some American political scientists from their mildly racist reverie—the dream that they are somehow different.

So why do people (re)-elect bad leaders? One explanation that is often given is that people prefer leaders that share their ethnicity. The conventional explanation for preferring co-ethnics is that people expect co-ethnics (everyone) to do better under a co-ethnic leader. But often enough, the expectation seems more like wishful thinking than anything else. After all, the unsuitability of some leaders is pretty clear.

If it is wishful thinking, then how do we expose it? More importantly, how do we fix it? Let’s for the moment assume that people care about everyone. And if they were to learn that the co-ethnic leader is much worse than someone else, they may switch votes. But what if people care about the welfare of co-ethnics more than others? The ‘good’ thing about bad leaders is that they are generally bad for everyone. So, if they knew better, they would still switch their vote.

You can verify these points using a behavioral trust game where people observe allocators of different ethnicities and different competence, and also observe welfare of both co-ethnics and others. You can also use the game to study some of the deepest concerns about ‘negative party ID’—that people will harm themselves to spite others.

Party Time

2 Dec

It has been nearly five years since the publication of Affect, Not Ideology: A Social Identity Perspective on Polarization. In that time, the paper has accumulated over 450 citations according to Google Scholar. (Citation counts on Google Scholar tend to be a bit optimistic.) So how does the paper hold up? Some reflections:

  • Disagreement over policy conditional on aims should not mean that you think that people you disagree with are not well motivated. But regrettably, it often does.
  • Lack of real differences doesn’t mean a lack of perceived differences. See here, here, here, and here.
  • The presence of real differences is no bar to liking another person or group. Nor does a lack of real differences come in the way of disliking another person or group. The history of racial and ethnic hatred attests to the point. In fact, why small differences often serve as durable justifications for hatred is one of the oldest and deepest questions in all of social science. (Paraphrasing from Affectively Polarized?.) Evidence on the point:
    1. Sort of sorted but definitely polarized
    2. Assume partisan identity is slow moving as Green, Palmquist, and Schickler (2002) among others show. And then add to it the fact people still like their ‘own’ party a fair bit—thermometer ratings are a toasty 80 and haven’t budged. See the original paper.
    3. People like ideologically extreme elites of the party they identify with a fair bit (see here).
  • It may seem surprising to some that people can be so angry when they spend so little time on politics and know next to nothing about it. But it shouldn’t be. Information generally gets in the way of anger. Again, the history of racial bigotry is a good example.
  • The title of the paper is off in two ways. First, partisan affect can be caused by ideology. Not much of partisan affect may be founded in ideological differences, but at least some of it is. (I always thought so.) Secondly, the paper does not offer a social identity perspective on polarization.
  • The effect that campaigns have on increasing partisan animus is still to be studied carefully. Certainly, ads play but a small role in it.
  • Evidence on the key take-home point—that partisans dislike each other a fair bit—continues to mount. The great thing is that people have measured partisan affect in many different ways, including using IAT and trust games. Evidence that IAT is pretty unreliable is reasonably strong, but trust games seem reasonable. Also see my 2011 note on measuring partisan affect coldly.
  • Interpreting over-time changes is hard. That was always clear to us. But see Figure 1 here that controls for a bunch of socio-demographic variables, and note that the paper also has over-time cross-country data to clarify inferences further.
  • If you assume that people learn about partisans from elites, reasoning what kinds of people would support this ideological extremist or another, it is easy to understand why people may like the opposing party less over time (though trends among independents should be parallel). The more curious thing is that people still like the party they identify with and approve of ideologically extreme elites of their party (see here).

Learning About [the] Loss (Function)

7 Nov

One of the things we often want to learn is the actual loss function people use for discounting ideological distance between self and a legislator. Often people try to learn the loss function using actual distances. But if the aim is to learn the loss function, perceived distance rather than actual distance is better. It is so because perceived distance is what the voter believes to be true. People can then use the function to simulate scenarios in which perceptions match the facts.
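One way to make this concrete is a minimal sketch that assumes evaluations of a legislator fall off as a power of perceived distance, evaluation = a + b * |distance|^p, and picks the exponent p from a small grid by least squares. The functional form, the grid, the function names, and the data are all assumptions for illustration; the data are synthetic and generated from a quadratic loss.

```python
def fit_linear(x, y):
    """Ordinary least squares for y = a + b * x. Returns (a, b, sse)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, sse


def fit_loss_exponent(perceived_distance, evaluation, exponents):
    """Grid search over p in evaluation = a + b * |distance|**p.

    Returns (p, a, b, sse) for the exponent with the smallest squared error.
    """
    best = None
    for p in exponents:
        transformed = [abs(d) ** p for d in perceived_distance]
        a, b, sse = fit_linear(transformed, evaluation)
        if best is None or sse < best[3]:
            best = (p, a, b, sse)
    return best


# Synthetic data: perceived distances and ratings from a quadratic loss.
distances = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ratings = [80 - 5 * d ** 2 for d in distances]
exponents = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

p, a, b, sse = fit_loss_exponent(distances, ratings, exponents)
print(p, round(a, 3), round(b, 3))
```

With the fitted function in hand, you can plug in actual distances in place of perceived ones to simulate how evaluations would change if perceptions matched the facts.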