Liberal Politicians are Referred to More Often in News

8 Jul

The median Democrat referred to in television news is to the left of the House Democratic Median, and the median Republican politician referred to is to the left of the House Republican Median.

Click here for the aggregate distribution.

And here’s a plot of the top 50 politicians cited in the news. The plot shows a strongly right-skewed distribution, with a bias towards executives.

News data: UCLA Television News Archive, which includes closed-caption transcripts of all national, cable and local (Los Angeles) news from 2006 to early 2013. In all, there are 155,814 transcripts of news shows.

Politician data: Database on Ideology, Money in Politics, and Elections (see Bonica 2012).

Taking out data from local news channels or removing Obama does little to change the pattern in the aggregate distribution.

Error Free Multi-dimensional Thinking

1 May

Some recent research suggests that Americans’ policy preferences are highly constrained, with a single dimension able to correctly predict over 80% of the responses (see Jessee 2009, Tausanovitch and Warshaw 2013). Not only that, adding a new (orthogonal) dimension doesn’t improve prediction success by more than a couple of percentage points.

All this flies in the face of conventional wisdom in American Politics, which is roughly antipodal to the new view: most people’s policy preferences are unstructured. In fact, many people don’t have any real preferences on many of the issues (‘non-preferences’). The evidence most often cited in support of this view comes from Converse: weak correlations between preferences across measurement waves spanning two years (r ~ .4 to .5), and even lower within-wave cross-issue correlations (r ~ .2).

What explains this double disagreement — over the authenticity of preferences, and over the structuration of preferences?

First, the authenticity of preferences. When reports of preferences change across waves, is it a consequence of attitude change or non-preferences or measurement error? In response to concerns about long periods between test-retest – which allowed for opinions to genuinely change – researchers tried shorter time periods. Correlations were notably stronger (r ~ .6 to .9)(see Brown 1970). But the sheen of these healthy correlations was worn off by concerns that stability was merely an artifact of people remembering and reproducing what they put down last time.

Redemption of correlations over longer time periods came from Achen (1975). While few of the assumptions behind the redemption are correct – notably uncorrelated errors (across individuals, waves, etc.) – for inferences to be seriously wrong, much has to go wrong. More recently, and, subject to validation, perhaps more convincingly, work by Dean Lacy suggests that once you take out the small number of implausible transitions between waves – those from one end of the scale to another – cross-wave correlations are fairly healthy. (This is exactly opposite to the conclusion Converse came to based on a Markov model; he argued that aside from a few consistent responses, rest of the responses were mostly noise.) Much simpler but informative tests are still missing. For instance, it seems implausible that lots of people who hold well-defined preferences on an issue would struggle to pick even the right side of the scale when surveyed. Tallying stability of dichotomized preferences would be useful.
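The dichotomized tally suggested above could be sketched as follows. This is a minimal illustration, not an analysis of any real panel: the function name `same_side_share`, the 1–7 scale with midpoint 4, and the six responses are all assumptions made up for the example.

```python
def same_side_share(wave1, wave2, midpoint=4):
    """Share of respondents on the same side of the midpoint in both waves.

    Respondents sitting exactly at the midpoint in either wave are dropped,
    since they are on neither side of the scale.
    """
    kept = [(a, b) for a, b in zip(wave1, wave2)
            if a != midpoint and b != midpoint]
    if not kept:
        raise ValueError("no respondents off the midpoint in both waves")
    stable = sum((a > midpoint) == (b > midpoint) for a, b in kept)
    return stable / len(kept)


# Made-up responses on a 1-7 scale for six respondents (illustration only).
wave1 = [1, 2, 6, 7, 3, 5]
wave2 = [2, 1, 7, 5, 6, 4]
share = same_side_share(wave1, wave2)
print(f"Share staying on the same side: {share:.2f}")
```

Dropping midpoint responses is one choice among several; counting them as a third category would be another.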

Some other purported evidence for the authenticity of preferences has come from measurement error models that rest upon more sizable assumptions. These models assume an underlying trait (or traits) and pool preferences over disparate policy positions (see, for instance, Ansolabehere, Rodden, and Snyder 2006 but also Tausanovitch and Warshaw 2013). How do we know there is an underlying trait? That isn’t clear. Generally, it is perfectly okay to ask whether preferences are correlated, less so to simply assume that preferences are structured by an unobserved underlying mental construct.

With the caveat that dimensions may not reflect mental constructs, we next move to assessing claims about the dimensionality of preferences. Differences between recent results and conventional wisdom about “constraint” may be simply due to an increase in the structuration of preferences over time. However, research suggests that constraint hasn’t increased over time (Baldassarri and Gelman 2008). Perhaps more plausibly, dichotomization, which presumably reduces measurement error, is behind some of the differences. There are of course less ham-handed ways of reducing measurement error, for instance, using multiple items to measure preferences on a single policy, as psychologists often do. Since it cannot be emphasized enough, the lesson of the past two paragraphs is: keep adjustments for measurement error and measurement of constraint separate.

Analysis suggesting higher constraint may also be an artifact of analysts’ choices. Dimension reduction techniques are naturally sensitive to the pool of items. If a large majority of the items solicit preferences on economic issues (as in Tausanovitch and Warshaw 2013), the first principal component will naturally pick up preferences on that dimension. Since the majority of the gains would come from correctly predicting a large majority of the items, gains in percentage correctly predicted would be poor at judging whether there is another dimension, say preferences on cultural issues. Cross-validation across selected large item groups (large enough to overcome idiosyncratic error) would be a useful strategy. And then again, gains in percentage correctly predicted over the entire population may miss subgroups with very different preference structures, for instance, Blacks and Catholics, who tend to be more socially conservative but economically liberal. Lastly, it is possible that preferences on some current issues (such as those used by Jessee 2009) may be more structured (by political conflict) than preferences on some long-standing issues.

Randomly Redistricting More Efficiently

25 Sep

In a forthcoming article, Chen and Rodden estimate the effect of ‘unintentional gerrymandering’ on the number of seats that go to a particular party. To do so, they pick a precinct at random, and then add (randomly chosen) adjacent precincts to it until the district is of a certain size (decided by the total number of districts one wants to create). Then they go about creating a new district in the same manner, randomly selecting a precinct bordering the first district. This goes on until all the precincts are assigned to a district. There are some additional details, but they are immaterial to the point of this note. A smarter way to do the same thing would be to just create one district over and over again (each time starting with a randomly chosen precinct). This would reduce the computational burden (memory for storing edges, differencing shapefiles, etc.) while leaving estimates unchanged.
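A minimal sketch of the ‘one district over and over’ idea follows. It is not Chen and Rodden’s code: the toy 4 x 4 grid, the equal precinct populations, the made-up vote shares, and the names `grid_adjacency` and `grow_one_district` are all assumptions for illustration. The sketch also does not test the claim that estimates are unchanged; it only shows that each draw needs nothing beyond an adjacency list.

```python
import random


def grid_adjacency(rows, cols):
    """Adjacency lists for a toy rows x cols grid of precincts (shared edges)."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            candidates = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
            adj[(r, c)] = [(i, j) for i, j in candidates
                           if 0 <= i < rows and 0 <= j < cols]
    return adj


def grow_one_district(adjacency, population, target_pop, rng):
    """Grow one contiguous district from a randomly chosen seed precinct."""
    seed = rng.choice(sorted(adjacency))
    district = {seed}
    total = population[seed]
    frontier = set(adjacency[seed])
    while total < target_pop and frontier:
        nxt = rng.choice(sorted(frontier))  # random precinct bordering the district
        district.add(nxt)
        total += population[nxt]
        frontier = (frontier | set(adjacency[nxt])) - district
    return district, total


rng = random.Random(0)
adjacency = grid_adjacency(4, 4)
population = {p: 100 for p in adjacency}                      # made-up populations
dem_share = {p: 0.3 + 0.4 * rng.random() for p in adjacency}  # made-up vote shares

# Draw a single district many times instead of partitioning the whole map.
draws = [grow_one_district(adjacency, population, 400, rng)[0]
         for _ in range(1000)]
dem_wins = sum(sum(dem_share[p] for p in d) / len(d) > 0.5 for d in draws)
share_dem_districts = dem_wins / len(draws)
print(f"Share of simulated districts carried by Democrats: {share_dem_districts:.3f}")
```

Because precinct populations are equal in this toy example, the simple mean of precinct vote shares equals the district vote share; with real data the shares would need to be population-weighted.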

A Potential Source of Bias in Estimating the Impact of Televised Campaign Ads

16 Aug

Or When Treatment is Strategic, No-Intent-to-Treat Intent-to-Treat Effects can be biased

One popular strategy for estimating the impact of televised campaign ads is by exploiting ‘accidental spillover’ (see Huber and Arceneaux 2007). The identification strategy builds on the following facts: Ads on local television can only be targeted at the DMA level. DMAs sometimes span multiple states. Where DMAs span battleground and non-battleground states, ads targeted for residents of battleground states are seen by those in non-battleground states. In short, people in non-battleground states are ‘inadvertently’ exposed to the ‘treatment’. Behavior/Attitudes etc. of the residents who were inadvertently exposed are then compared to those of other (unexposed) residents in those states. The benefit of this identification strategy is that it allows television ads to be decoupled from the ground campaign and other campaign activities, such as presidential visits (though people in the spillover region are exposed to television coverage of the visits). It also decouples ad exposure etc. from strategic targeting of the people based on characteristics of the battleground DMA etc. There is evidence that content, style, the volume, etc. of television ads is ‘context aware’ – varies depending on what ‘DMA’ they run in etc. (After accounting for cost of running ads in the DMA, some variation in volume/content etc. across DMAs within states can be explained by partisan profile of the DMA, etc.)

By decoupling strategic targeting from message volume and content, we only get an estimate of the ‘treatment’ targeted dumbly. If one wants an estimate of ‘strategic treatment’, such quasi-experimental designs relying on accidental spillover may be inappropriate. How, then, to estimate the impact of strategically targeted televised campaign ads? First, estimate how ads are targeted depending on the characteristics of the area and the people (political interest, for example, moderates the impact of political ads; see, e.g., Ansolabehere and Iyengar 1995). Next, estimate the effect of messages using the H/A strategy. Finally, re-weight the effect using the estimates of how the ad is targeted.
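One way to read the re-weighting step is as a weighted average of subgroup effects, with weights taken from how exposure is distributed under dumb versus strategic targeting. The sketch below assumes that reading; the two groups, the effect sizes, and the exposure shares are all made-up numbers, and `weighted_effect` is a hypothetical helper.

```python
# Hypothetical subgroup effects of ad exposure (e.g., from a spillover design).
effect_by_group = {"low_interest": 0.04, "high_interest": 0.01}

# Hypothetical shares of ad exposure falling on each group.
spillover_exposure = {"low_interest": 0.5, "high_interest": 0.5}  # untargeted
targeted_exposure = {"low_interest": 0.7, "high_interest": 0.3}   # strategic


def weighted_effect(effects, weights):
    """Average the subgroup effects using the given exposure weights."""
    total = sum(weights.values())
    return sum(effects[g] * weights[g] for g in effects) / total


dumb = weighted_effect(effect_by_group, spillover_exposure)
strategic = weighted_effect(effect_by_group, targeted_exposure)
print(f"Effect under spillover exposure: {dumb:.3f}")
print(f"Effect re-weighted to targeted exposure: {strategic:.3f}")
```

The gap between the two averages is what a spillover-only estimate would miss if campaigns do direct ads towards more responsive audiences.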

One can also try to estimate the effect of ‘strategy’ by comparing adjusted treatment effect estimates (adjusted by regressing out other campaign activity) in DMAs where treatment was targeted vis-à-vis those where it wasn’t.

Moving Away From the Main Opposing Party

1 Jun

Two things are often stated about American politics: political elites are increasingly polarized, and that the issue positions of the masses haven’t budged much. Assuming such to be the case, one expects the average distance between where partisans place themselves and where they place the ‘in-party’ (or the ‘out-party’) to increase. However, it appears that the distance to the in-party has remained roughly constant, while the distance to the out-party has grown, in line with what one expects from the theory of ‘affective polarization’ and group-based perception. (Read More: Still Close: Perceived Ideological Distance to Own and Main Opposing Party)


By Party:

Interviewer Assessed Political Information

15 Mar

In the National Election Studies (NES), interviewers have been asked to rate the respondent’s level of political information: “Respondent’s general level of information about politics and public affairs seemed — Very high, Fairly high, Average, Fairly low, Very low.” John Zaller, among others, has argued that these ratings measure political knowledge reasonably well. However, there is some evidence that challenges the claim. For instance, there is considerable unexplained inter- and intra-interviewer heterogeneity in ratings: people with similar levels of knowledge (as measured via closed-ended items) are rated very differently (Levendusky and Jackman 2003). It also appears that mean interviewer ratings have been rising over the years, compared to the relatively flat trend observed in more traditional measures (see Delli Carpini and Keeter 1996, Gilens, Vavreck, and Cohen 2004, etc.).

Part of the increase is explained by higher ratings of respondents with less than a college degree; ratings of respondents with a Bachelor’s degree or more have remained flatter. As a result, the difference in ratings between people with a Bachelor’s degree or more and those with less than a college degree is decreasing over time. Correlations between interviewer ratings and other criteria like political interest are also trending downward (though the decline is less sharp). This conflicts with evidence for an increasing ‘knowledge gap’ (Prior 2005).

The other notable trend is the strong negative correlation (over .85 in magnitude) between the intercepts and slopes of within-year regressions of interviewer ratings on political interest, education, etc. This strong negative correlation hints at possible ceiling effects. And indeed there is some evidence for that.

Interviewer Measure – The measure is sometimes from the pre-election wave only, other times from the post-election wave only, and still other times from both waves. Where both pre and post measures were available, they were averaged. The correlation between pre-election and post-election ratings was .69. The average post-election ratings are lower than the pre-election ratings.

Impact of Menu on Choices: Choosing What You Want Or Deciding What You Should Want

24 Sep

In Predictably Irrational, Dan Ariely discusses the clever (ex)-subscription menu of The Economist that purportedly manipulates people into subscribing to a pricier plan. In an experiment based on the menu, Ariely shows that the addition of an item to the menu (one that very few choose) can cause preference reversal over other items in the menu.

Let’s consider a minor variation of Ariely’s experiment. Assume there are two different menus that look as follows:
1. 400 cal, 500 cal.
2. 400 cal, 500 cal, 800 cal.

Assume that all items cost and taste the same. When given the first menu, say 20% choose the 500 calorie item. When selecting from the second menu, percent of respondents selecting the 500 calorie choice is likely to be significantly greater.

Now, why may that be? One reason may be that people do not have absolute preferences, here for a specific number of calories, and that they make judgments about what is a reasonable number of calories based on the menu. For instance, they decide that they do not want the item with the maximum calorie count. And when presented with a menu with more than two distinct calorie choices, another consideration comes to mind: they do not want too little food either. More generally, they may let the options on the menu anchor for them what is ‘too much’ and what is ‘too little.’

If this is true, it can have potentially negative consequences. For instance, McDonald’s has on the menu a Bacon Angus Burger that is about 1360 calories (calories are now being displayed on McDonald’s menus courtesy Richard Thaler). It is possible that people choose higher calorie items when they see this menu option, than when they do not.

More generally, people’s reliance on the menu to discover their own preferences means that marketers can manipulate what is seen as the middle (and hence ‘reasonable’). This also translates to some degree to politics where what is considered the middle (in both social and economic policy) is sometimes exogenously shifted by the elites.

That is but one way a choice on the menu can impact preference order over other choices. Separately, sometimes a choice can prime people about how to judge other choices. For instance, in a paper exploring effect of Nader on preferences over Bush and Kerry, researchers find that “[W]hen Nader is in the choice set all voters’ choices are more sharply aligned with their spatial placements of the candidates.”

All this means that assumptions of IIA (independence of irrelevant alternatives) need to be rethought. Adverse conclusions about human rationality are best withheld (see Sen).

Further Reading:

1. R. Duncan Luce and Howard Raiffa. Games and Decisions. John Wiley and Sons, Inc., 1957.
2. Amartya Sen. Internal consistency of choice. Econometrica, 61(3):495–521, May 1993.
3. Amartya Sen. Is the idea of purely internal consistency of choice bizarre? In J.E.J. Altham and Ross Harrison, editors, World, Mind, and Ethics. Essays on the ethical philosophy of Bernard Williams. Cambridge University Press, 1995.

Elite Lawyers!

11 Jul

(Based on data from the 111th Congress)

Law is the most popular degree on Capitol Hill (as has been the case for a long time). Nearly 52% of senators and 36% of representatives have a degree in law. There are some differences across parties and across chambers, with Republicans likelier to have a law degree than Democrats in the Senate (58% to 48%), and the reverse holding true in the House, where a greater share of Democrats than Republicans hold law degrees (40% to 32%). Less than 10% of members of Congress have a degree in the natural sciences or engineering. Nearly 8% have a degree from Harvard, making Harvard’s the largest alumni contingent at the Capitol. Yale is a distant second, with less than half the number that went to Harvard.

Data and Script

Does Children’s Sex Cause Partisanship?

26 May

More women identify themselves as Democrats than as Republicans. The disparity is yet greater among single women. It is possible (perhaps even likely) that this difference in partisan identification is due to (perceived) policy positions of Republicans and Democrats.

Now let’s do a thought experiment: Imagine a couple about to have a kid. Also, assume that the couple doesn’t engage in sex-selection. Two things can happen – the couple can have a son or a daughter. It is possible that having a daughter persuades the parent to change his or her policy preferences towards a direction that is perceived as more congenial to women. It is also possible that having a son has the opposite impact — persuading parents to adopt more male congenial political preferences. Overall, it is possible that gender of the child makes a difference to parents’ policy preferences. With panel data, one can identify both movements. With cross-sectional data, one can only identify the difference between those who had a son, and those who had a daughter.

Let’s test this using cross-sectional data from Jennings and Stoker’s “Study of Political Socialization: Parent-Child Pairs Based on Survey of Youth Panel and Their Offspring, 1997.”

Let’s assume that a couple’s partisan affiliation doesn’t impact the gender of their kid.

The number of kids, however, is determined by personal choice, which in turn may be impacted by ideology, income, etc. For example, it is likely that conservatives have more kids as they are less likely to believe in contraception, etc. This is also supported by the data. (Ideology is a post-treatment variable. This may not matter if the impact of having a daughter is same in magnitude as the impact of having a son, and if there are similar numbers of each across people.)

Hence, one may conceptualize “treatment” as the gender of the kids, conditional on the number of kids.

Understandably, we only study people who have one or more kids.

Conditional on the number of kids, the more daughters a respondent has, the less likely the respondent is to identify as a Republican (b = -.342, p < .01) when the dependent variable is restricted to a dichotomous Republican/Democrat variable. The relationship holds, and indeed becomes stronger, if the dependent variable is coded as an ordinal trichotomous variable (Republican, Independent, and Democrat) and an ordered multinomial model is estimated.


If what we observe is true then we should also see that as party stances evolve, the impact of gender on policy preference of a parent should vary. One should also be able to do this cross-nationally.

Some other findings:

  1. The probability of having a son (limiting to live births in the U.S.) is about .51. This natural rate varies slightly by income. Daughters are more likely to be born among people with lower incomes. However, the effect of income is extremely modest in the U.S. The live birth ratio is marginally rebalanced by the higher child mortality rate among males. As a result, among those aged 0–21, the ratio between men and women is about equal in the U.S.

    In the sample, there are significantly more daughters than sons. The female/male ratio is 1.16. This is ‘significantly’ unusual.

  2. If families are less likely to have more kids after the birth of a boy, the number of kids will be negatively correlated with the proportion of sons. Among people with just one kid, the number of sons is indeed greater than the number of daughters, though the difference is insignificant. The overall correlation between the proportion of sons and the number of kids is also very low (corr. = -.041).
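The stopping-rule logic in the second finding can be illustrated with a small simulation. This is a sketch under invented assumptions, not an analysis of the Jennings and Stoker data: the continuation probabilities (.4 after a son, .8 after a daughter), the cap of four kids, and the function names are all hypothetical.

```python
import random


def simulate_families(n, p_son=0.51, cont_after_son=0.4,
                      cont_after_daughter=0.8, max_kids=4, seed=0):
    """Simulate families that are less likely to continue after a son."""
    rng = random.Random(seed)
    families = []
    for _ in range(n):
        kids = sons = 0
        while True:
            is_son = rng.random() < p_son
            kids += 1
            sons += is_son
            p_continue = cont_after_son if is_son else cont_after_daughter
            if kids >= max_kids or rng.random() >= p_continue:
                break
        families.append((kids, sons / kids))
    return families


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


families = simulate_families(20000)
n_kids = [k for k, _ in families]
prop_sons = [s for _, s in families]
corr = pearson(n_kids, prop_sons)
print(f"Correlation between number of kids and proportion sons: {corr:.3f}")
```

Setting the two continuation probabilities equal removes the stopping rule, which gives a baseline to compare against.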

Sort of Sorted but Definitely Cold

18 May

By now students of American Politics have all become accustomed to seeing graphs of DW-NOMINATE scores showing ideological polarization in Congress. Here are the equivalent graphs (we assume two dimensions) at the mass-level.

Data are from the 2004 ANES. Social and Cultural Preferences are from Confirmatory Factor Analysis over relevant items.





Here’s how to interpret the graphs:

1) There is a large overlap in preference profiles of Rs and Ds.

2) Conditional on the same preferences, there is a large gap in thermometer ratings. Without partisan bias, the same preferences should yield about the same R-D thermometer ratings. And this gap is not particularly responsive to changes in preferences within parties.

On (Modest) Differences In Racial Distribution of Voting Eligible Population and Registered Voters in California

13 Apr

Each election cycle, many hands are waved and spit is launched into the air when the topic of registration rates of Latinos (and other minorities) comes up. And indeed registration rates of Latinos substantially lag those of Whites. In California, the percent of eligible Latinos who are registered is 62.8%, whereas the percent of eligible Whites registered to vote is approximately 72.9%.

This somewhat large difference in registration rates doesn’t automatically translate to (equally) wide distortions in racial distribution of the eligible population and the registered voter population. For example, while self-identified Whites constitute 62.8% of the VEP, they constitute marginally more – 64.2% of the voting eligible respondents who self-identify as having registered to vote.

Here’s the math:

Assume VEP Pop. = 100
Whites = 63/100; of these 72% register = 45
Latinos = 23/100; of these 62% register = 14
Rest = 14/100; of these 62% register = 9
New Registered Population = 45 + 14 + 9 = 68
Registered: Whites = 66.2; Latinos = 20.6

Source: PPIC Survey (September 2010).
Note: CPS 2008, Secretary of State data confirm this. Voting day population estimates from Exit Poll also show no large distortions.
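The arithmetic above can be reproduced in a few lines. The sketch uses the rounded inputs from the worked example (63/23/14 and 72%/62%/62%); the variable names are assumptions. It does not round the intermediate counts to whole people, so the resulting shares differ from the hand-rounded ones by a few tenths of a point.

```python
# Inputs from the worked example: VEP shares (out of 100) and registration rates.
vep_share = {"White": 63, "Latino": 23, "Rest": 14}
reg_rate = {"White": 0.72, "Latino": 0.62, "Rest": 0.62}

# Registered voters per 100 members of the VEP, by group.
registered = {g: vep_share[g] * reg_rate[g] for g in vep_share}
total_registered = sum(registered.values())

# Each group's share of the registered population, in percent.
reg_share = {g: 100 * registered[g] / total_registered for g in registered}

for g in vep_share:
    print(f"{g}: {vep_share[g]}% of VEP -> {reg_share[g]:.1f}% of registered")
```

A 10-point gap in registration rates moves the White share of the electorate by only about 3 points, which is the point of the worked example.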

Some simple math:
For a two category case, say proportion category a = pa
Proportion category b = 1 - pa

Assume response rates for category a = qa, and for category b = qb = c*qa

Initial Ratio = pa/(1-pa)
Final Ratio = (pa*qa)/[(1-pa)*qb]

Or between time 1 and 2, the ratio changes by a factor of qa/qb, or 1/c

T1 Diff. = pa - (1- pa) = 2pa - 1
T2 Diff. = (pa*qa - qb + pa*qb)/(pa*qa + (1-pa)*qb)
= (pa(qa + qb) - qb)/(pa(qa - qb) + qb)
= [pa*qa (1 + c) - c*qa]/[pa*qa(1-c) + c*qa]

T2 Diff. - T1 Diff. = [pa*qa (1 + c) - c*qa]/[pa*qa(1-c) + c*qa] - (2pa -1)
= [pa*qa (1 + c) - c*qa + pa*qa(1-c) + c*qa - 2pa (pa*qa(1-c) + c*qa)]/[pa*qa(1-c) + c*qa]
= [pa*qa + pa*qa*c - c*qa + pa*qa - pa*qa*c + c*qa - 2pa*pa*qa + 2pa*pa*qa*c - 2pa*c*qa]/[pa*qa(1-c) + c*qa]
= [2pa*qa - 2pa*pa*qa + 2pa*pa*qa*c - 2pa*c*qa]/[pa*qa(1-c) + c*qa]
= [2pa*qa(1- pa + pa*c -c)]/[pa*qa(1-c) + c*qa]
= [2pa*qa((1- c) - pa(1-c))]/[pa*qa(1-c) + c*qa]
= [2pa*qa(1-pa)(1-c)]/[pa*qa(1-c) + c*qa]
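A quick numerical check of the closed form just derived is given below. It compares T2 Diff. - T1 Diff. computed directly from the definitions with the final expression; the function names and the example values (pa = .63, qa = .72, qb = .62) are chosen here only for illustration.

```python
def t1_diff(pa):
    """Initial difference in population shares: pa - (1 - pa)."""
    return 2 * pa - 1


def t2_diff(pa, qa, c):
    """Difference in shares among responders, with qb = c * qa."""
    qb = c * qa
    return (pa * qa - (1 - pa) * qb) / (pa * qa + (1 - pa) * qb)


def closed_form_change(pa, qa, c):
    """The last line of the derivation: 2pa*qa(1-pa)(1-c) / [pa*qa(1-c) + c*qa]."""
    return 2 * pa * qa * (1 - pa) * (1 - c) / (pa * qa * (1 - c) + c * qa)


pa, qa, qb = 0.63, 0.72, 0.62
c = qb / qa
direct = t2_diff(pa, qa, c) - t1_diff(pa)
closed = closed_form_change(pa, qa, c)
print(f"Direct: {direct:.4f}  Closed form: {closed:.4f}  qa - qb: {qa - qb:.4f}")
```

Note that qa cancels from the closed form, so the change in the difference depends only on pa and c.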

Diff. in response rates = qa - qb

When will diff. in response rates be greater than T2 Diff. - T1 Diff.? (Assume c < 1, so that qa > qb.)
qa - qb > [2pa*qa(1-pa)(1-c)]/[pa*qa(1-c) + c*qa]
qa(1-c)[pa*qa(1-c) + c*qa] > 2pa*qa(1-pa)(1-c)
qa(1-c)[pa*qa(1-c) + c*qa] - 2pa*qa(1-pa)(1-c) > 0
(1-c)qa[pa*qa - pa*qa*c + c*qa - 2pa(1-pa)] > 0
(1-c) and qa are always greater than 0. Let's take them out.
pa*qa - pa*qa*c + c*qa - 2pa(1-pa) > 0
c*qa(1-pa) > 2pa(1-pa) - pa*qa
c > pa[2(1-pa) - qa]/[qa(1-pa)]
The condition holds for every c whenever qa >= 2(1-pa), since the right-hand side is then zero or negative.

When will diff. in response rates + initial diff. > T2 diff.? This is the same inequality with T1 Diff. moved to the left-hand side, so it should give the same condition. Checking:
qa - qa*c + 2pa - 1 > [pa*qa (1 + c) - c*qa]/[pa*qa(1-c) + c*qa]
[pa*qa(1-c) + c*qa][qa - qa*c + 2pa - 1] - [pa*qa (1 + c) - c*qa] > 0
- pa*qa + pa*qa*c - c*qa + [pa*qa(1-c) + c*qa][qa - qa*c + 2pa] - pa*qa - pa*qa*c + c*qa > 0
-2pa*qa + [pa*qa(1-c) + c*qa][qa - qa*c + 2pa] > 0
-2pa*qa + pa*qa[qa - qa*c + 2pa] - pa*qa*c[qa - qa*c + 2pa] + c*qa[qa - qa*c + 2pa] > 0
-2pa*qa + pa*qa^2 - 2c*pa*qa^2 + 2qa*pa^2 + pa*c^2*qa^2 - 2pa^2*c*qa + c*qa^2 - c^2*qa^2 + 2pa*c*qa > 0
2qa*pa(-1 + c + pa - pa*c) + pa*qa^2(1 - 2c + c^2) + c*qa^2(1 - c) > 0
-2qa*pa(1-pa)(1-c) + pa*qa^2(1-c)^2 + c*qa^2(1-c) > 0
Dividing by qa(1-c), which is greater than 0:
pa*qa(1-c) + c*qa - 2pa(1-pa) > 0
c > pa[2(1-pa) - qa]/[qa(1-pa)]

Measuring Partisan Affect Coldly

24 Mar

Beyond the variety of ways of explicitly asking people how they feel about another group (feeling thermometers, like/dislike scales, favorability ratings), explicit measures asked using mechanisms designed to overcome or attenuate social desirability concerns (bogus pipeline, ACASI), and a plethora of implicit measures (affect misattribution, IAT), there exist a few other interesting ways of measuring affect:

  • Games as measures – Jeremy Weinstein uses games like the dictator game to measure (inter-ethnic) affect. One can use prisoner’s dilemma, among other games, to do the same.
  • Systematic bias in responding to factual questions when ignorant about the correct answer. For example, in most presidential election years since 1988, the ANES has posed a variety of retrospective evaluative and factual questions, including assessments of the state of the economy and whether inflation/unemployment/crime rose, remained the same, or declined in the past year (or some other time frame). Analyses of these questions have revealed significant ‘partisan bias’, but these questions have yet to be used as a measure of the ‘partisan affect’ that is the likely cause of the observed ‘bias’.