On (Modest) Differences In Racial Distribution of Voting Eligible Population and Registered Voters in California

13 Apr

Each election cycle, many hands are waved and much spit is launched into the air when the topic of registration rates of Latinos (and other minorities) comes up. And indeed, registration rates of Latinos substantially lag those of Whites. In California, the percentage of eligible Latinos who are registered is 62.8%, whereas the percentage of eligible Whites who are registered is approximately 72.9%.

This somewhat large difference in registration rates doesn’t automatically translate to (equally) wide distortions between the racial distribution of the eligible population and that of the registered voter population. For example, while self-identified Whites constitute 62.8% of the voting eligible population (VEP), they constitute only marginally more, 64.2%, of the voting eligible respondents who self-identify as having registered to vote.

Here’s the math:

Assume VEP Pop. = 100
Whites = 63/100; of these 72% register = 45
Latinos = 23/100; of these 62% register = 14
Rest = 14/100; of these 62% register = 9
New Registered Population = 45 + 14 + 9 = 68
Share of Registered: Whites = 45/68 = 66.2%; Latinos = 14/68 = 20.6%

Source: PPIC Survey (September 2010).
Note: CPS 2008 and Secretary of State data confirm this. Voting day population estimates from exit polls also show no large distortions.
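
The arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch: the dictionary and function names are made up for illustration, and the inputs are the rounded shares and rates from the worked example, not the raw survey data. Because the code does not round the intermediate counts to whole people, its shares come out a few tenths of a point different from the hand calculation.

```python
# Shares of the voting eligible population (VEP) and registration rates,
# rounded as in the worked example above (illustrative inputs only).
vep_share = {"Whites": 0.63, "Latinos": 0.23, "Rest": 0.14}
reg_rate = {"Whites": 0.72, "Latinos": 0.62, "Rest": 0.62}


def registered_shares(vep_share, reg_rate):
    """Return each group's share of the registered population."""
    # Registered "head count" per group, out of a VEP normalized to 1.
    registered = {g: vep_share[g] * reg_rate[g] for g in vep_share}
    total = sum(registered.values())
    return {g: registered[g] / total for g in registered}


shares = registered_shares(vep_share, reg_rate)
for group, share in shares.items():
    print(f"{group}: VEP {vep_share[group]:.1%} -> registered {share:.1%}")
```

A ten-point gap in registration rates moves the White share by only a few points, which is the point of the example.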

Some simple math:
For a two category case, say proportion category a = pa
Proportion category b = 1 - pa

Assume response rates for category a = qa, and for category b = qb = c*qa


Initial Ratio = pa/(1 -pa)
Final Ratio = (pa*qa)/((1-pa)*qb)

Or between time 1 and 2, the ratio changes by a factor of qa/qb, or 1/c


T1 Diff. = pa - (1- pa) = 2pa - 1
T2 Diff. = (pa*qa - qb + pa*qb)/(pa*qa + (1-pa)*qb)
= (pa(qa + qb) - qb)/(pa(qa - qb) + qb)
= [pa*qa (1 + c) - c*qa]/[pa*qa(1-c) + c*qa]

T2 Diff. - T1 Diff. = [pa*qa (1 + c) - c*qa]/[pa*qa(1-c) + c*qa] - (2pa -1)
= [pa*qa (1 + c) - c*qa + pa*qa(1-c) + c*qa - 2pa (pa*qa(1-c) + c*qa)]/[pa*qa(1-c) + c*qa]
= [pa*qa + pa*qa*c - c*qa + pa*qa - pa*qa*c + c*qa - 2pa*pa*qa + 2pa*pa*qa*c - 2pa*c*qa]/[pa*qa(1-c) + c*qa]
= [2pa*qa - 2pa*pa*qa + 2pa*pa*qa*c - 2pa*c*qa]/[pa*qa(1-c) + c*qa]
= [2pa*qa(1- pa + pa*c -c)]/[pa*qa(1-c) + c*qa]
= [2pa*qa((1- c) - pa(1-c))]/[pa*qa(1-c) + c*qa]
= [2pa*qa(1-pa)(1-c)]/[pa*qa(1-c) + c*qa]
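
A quick numerical check of the closed form above, as a sketch. The function names are hypothetical, and the example values treat everyone other than Whites as a single category registering at 62%, which is an assumption made only to fit the two category setup.

```python
def t1_diff(pa):
    """Initial difference in proportions: pa - (1 - pa)."""
    return 2 * pa - 1


def t2_diff(pa, qa, c):
    """Difference in proportions after response, computed directly."""
    qb = c * qa
    a = pa * qa          # responding mass from category a
    b = (1 - pa) * qb    # responding mass from category b
    return (a - b) / (a + b)


def change_closed_form(pa, qa, c):
    """T2 Diff. - T1 Diff. from the last line of the derivation."""
    return 2 * pa * qa * (1 - pa) * (1 - c) / (pa * qa * (1 - c) + c * qa)


# Assumed example: Whites vs. everyone else pooled at a 62% rate.
pa, qa, c = 0.63, 0.72, 0.62 / 0.72
direct = t2_diff(pa, qa, c) - t1_diff(pa)
closed = change_closed_form(pa, qa, c)
print(f"direct: {direct:.4f}, closed form: {closed:.4f}")
```

The two numbers agree (roughly 0.068 for these inputs). Note that qa cancels out of the closed form, so the change depends only on pa and c.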

Diff. in response rates = qa - qb

When will diff. in response rates be greater than T2 Diff. - T1 Diff.:
qa - qb > [2pa*qa(1-pa)(1-c)]/(pa*qa - pa*qa*c + c*qa)
qa(1-c)(pa*qa - pa*qa*c + c*qa) > 2pa*qa(1-pa)(1-c)
qa(1-c)(pa*qa - pa*qa*c + c*qa) - 2pa*qa(1-pa)(1-c) > 0
(1-c)qa [pa*qa - pa*qa*c + c*qa - 2pa(1 -pa)] > 0
(1-c)qa[pa*qa - pa*qa*c + c*qa - 2pa + 2pa*pa] > 0
qa is always greater than 0, and (1-c) is greater than 0 whenever category b responds at a lower rate than category a (c < 1). Let's take them out.

pa*qa - pa*qa*c + c*qa - 2pa + 2pa*pa > 0
pa*qa + c*qa(1 - pa) - 2pa(1 - pa) > 0
c*qa(1 - pa) > 2pa(1 - pa) - pa*qa
c > 2pa/qa - pa/(1 - pa) [dividing by qa(1 - pa)]
c > [pa*(2(1 - pa) - qa)]/[qa(1 - pa)]
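
To sanity check the algebra, the sketch below compares the inequality as first stated (diff. in response rates > T2 Diff. - T1 Diff.) against the sign of the bracketed term pa*qa - pa*qa*c + c*qa - 2pa(1 - pa) over a grid of values with c < 1. The function names and the grid are illustrative choices, and exact fractions are used so that ties are compared without floating point error.

```python
from fractions import Fraction
from itertools import product


def response_gap(qa, c):
    """Difference in response rates, qa - qb, with qb = c*qa."""
    return qa - c * qa


def composition_shift(pa, qa, c):
    """T2 Diff. - T1 Diff., using the closed form derived earlier."""
    return 2 * pa * qa * (1 - pa) * (1 - c) / (pa * qa * (1 - c) + c * qa)


def bracket(pa, qa, c):
    """The bracketed term: pa*qa - pa*qa*c + c*qa - 2pa(1 - pa)."""
    return pa * qa - pa * qa * c + c * qa - 2 * pa * (1 - pa)


# Exact rational grid 1/20, 2/20, ..., 19/20 for pa, qa, and c.
grid = [Fraction(i, 20) for i in range(1, 20)]

mismatches = 0
holds = 0
for pa, qa, c in product(grid, repeat=3):
    direct = response_gap(qa, c) > composition_shift(pa, qa, c)
    via_bracket = bracket(pa, qa, c) > 0
    mismatches += direct != via_bracket
    holds += direct

print(f"mismatches: {mismatches} of {len(grid) ** 3}")
print(f"gap exceeds shift in {holds} cases")
```

On this grid the two tests never disagree, and the inequality holds for some combinations but not all, so it is a real condition on pa, qa, and c rather than something that is always true.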

When will diff. in response rates + initial diff. > T2 diff.
qa - qa*c + 2pa - 1 > [pa*qa (1 + c) - c*qa]/[pa*qa(1-c) + c*qa]
[pa*qa(1-c) + c*qa][qa - qa*c + 2pa - 1] - [pa*qa (1 + c) - c*qa] > 0
- pa*qa + pa*qa*c - c*qa + [pa*qa(1-c) + c*qa][qa - qa*c + 2pa] - pa*qa - pa*qa*c + c*qa > 0
-2pa*qa + [pa*qa(1-c) + c*qa][qa - qa*c + 2pa] > 0
-2pa*qa + [pa*qa - pa*qa*c + c*qa][qa - qa*c + 2pa] > 0
-2pa*qa + pa*qa[qa - qa*c + 2pa] - pa*qa*c[qa - qa*c + 2pa] + c*qa[qa - qa*c + 2pa] > 0
-2pa*qa + pa*qa*qa - pa*qa*qa*c + 2pa*qa*pa - pa*qa*c*qa + pa*qa*c*qa*c - 2pa*qa*c*pa + c*qa*qa - c*qa*qa*c + 2pa*c*qa > 0
-2pa*qa + pa*qa^2 - 2c*pa*qa^2 + 2qa*pa^2 + pa*c^2*qa^2 - 2pa^2*c*qa + c*qa^2 - c^2*qa^2 + 2pa*c*qa > 0
-2pa*qa + 2qa*pa^2 + 2pa*c*qa - 2pa^2*c*qa + pa*qa^2 - 2c*pa*qa^2 + pa*c^2*qa^2 + c*qa^2 - c^2*qa^2 > 0
2qa*pa(-1 + pa + c - pa*c) + pa*qa^2(1 - 2c + c^2) + c*qa^2(1 - c) > 0
-2qa*pa(1 - pa)(1 - c) + pa*qa^2(1 - c)^2 + c*qa^2(1 - c) > 0
Dividing by qa(1 - c), which is greater than 0 when c < 1:
-2pa(1 - pa) + pa*qa(1 - c) + c*qa > 0
pa*qa - pa*qa*c + c*qa - 2pa(1 - pa) > 0
This is the same bracketed condition as in the previous question, as it should be: adding the initial diff. to both sides of an inequality doesn't change it.