Thoughts and questions on primary pollster variances and their underlying reasons. Complete with nifty little (slightly crooked) tables!
After reading a not particularly interesting New York Times article about the Bradley Effect, I decided to take another look at an older FiveThirtyEight article on the same subject. It uses Democratic primary data to conclude that the Bradley Effect may now actually be the Yeldarb Effect. The FiveThirtyEight article draws on "31 states in which at least three separate polls were released within 14 days of that state's primary or caucus", comparing the final Pollster.com trend line to the actual voting results. They conclude (well, the numbers conclude) that Obama outperformed the trend line by an average of 3.3 points.
Since it's a beautiful windy day outside, I decided to sit at my computer and further manipulate the data, and I found a couple of things that probably don't mean much but that I think are interesting nonetheless. PPB = percent of a state's population that is black, per 2006 Census figures (or, for a category of states, the average of those percentages). All variance numbers are the number of points by which Obama outperformed the Pollster.com trend line.
PPB (%)    Variance (points)
19.9+      11.2
10-19.8    1.19
0-9.9      0.5
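For anyone who wants to redo this kind of grouping, here is a minimal sketch of the arithmetic behind a table like the one above, assuming each state is a row of PPB, actual Obama vote share, and final trend-line share. The rows in SAMPLE are made-up placeholders, not the FiveThirtyEight or Pollster.com numbers, and the names ppb_band and variance_by_band are my own.

```python
from statistics import mean

# HYPOTHETICAL rows for illustration only, not the real primary data:
# (state, PPB, actual Obama %, final trend-line Obama %)
SAMPLE = [
    ("State A", 25.0, 55.0, 45.0),
    ("State B", 21.0, 50.0, 44.0),
    ("State C", 12.0, 48.0, 47.0),
    ("State D", 3.0, 40.0, 41.0),
    ("State E", 8.0, 52.0, 50.0),
]


def ppb_band(ppb):
    """Assign a PPB value to one of the three bands used in the table."""
    if ppb >= 19.9:
        return "19.9+"
    if ppb >= 10.0:
        return "10-19.8"
    return "0-9.9"


def variance_by_band(rows):
    """Average (actual - trend line) within each PPB band."""
    groups = {}
    for _state, ppb, actual, trend in rows:
        groups.setdefault(ppb_band(ppb), []).append(actual - trend)
    return {band: mean(values) for band, values in groups.items()}


if __name__ == "__main__":
    for band, avg in variance_by_band(SAMPLE).items():
        print(band, avg)
```

Note that each state counts equally here regardless of population, which is one reasonable reading of "average" but not the only one.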
Now the confounding issue with the above little table is that out of the 7 states with a PPB of 19.9% or more, 4 have open primaries (i.e., voters of any party affiliation can vote in either party's primary). And if we look at this little table:
Primary Type   Variance (points)   PPB (%)
open           8.27                17.8
closed         2.01                10.6
closed/semi    0.68                10.6
We see that the open primaries (10 states) in general had a much higher favorable Obama variance than the closed (10 states) or closed/semi (semi-open/semi-closed – 30 states) primaries. But at the same time, we have the confounding issue of the states with open primaries, on average, having a much larger PPB.
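One rough way to poke at that confound is to split the states by both primary type and PPB at once and see which gap survives. The sketch below does only that. It assumes rows of (primary type, PPB, variance); the values in SAMPLE are invented placeholders rather than the real state data, and high_ppb and cross_tab are names I made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# HYPOTHETICAL rows for illustration only, not the real primary data:
# (primary type, PPB, variance in points)
SAMPLE = [
    ("open", 25.0, 10.0),
    ("open", 5.0, 4.0),
    ("closed", 22.0, 6.0),
    ("closed", 4.0, 0.0),
    ("closed", 6.0, 1.0),
]


def high_ppb(ppb, cutoff=19.9):
    """True when a state falls in the top PPB band."""
    return ppb >= cutoff


def cross_tab(rows):
    """Map (primary type, is high PPB) to (state count, mean variance)."""
    cells = defaultdict(list)
    for primary_type, ppb, variance in rows:
        cells[(primary_type, high_ppb(ppb))].append(variance)
    return {key: (len(values), mean(values)) for key, values in cells.items()}


if __name__ == "__main__":
    for (primary_type, is_high), (count, avg) in cross_tab(SAMPLE).items():
        print(primary_type, "high PPB" if is_high else "low PPB", count, avg)
```

With only 30 or so states, each cell ends up with a handful of entries at best, so a breakdown like this can hint at which factor matters but can't settle it.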
I like just looking at the numbers best: numbers are nice and clean and simple. But if you want to try to actually interpret the numbers, things can get a bit hairy:
Why is the +Obama variance so high in states with high PPB?
Pollsters are not reaching enough black voters...
The open primary effect is skewing the results...
Black voters are turning out in unexpectedly high numbers...
Black voters in these states lie to pollsters about voting for the black candidate...
White voters in these states lie to pollsters about voting for the white candidate...
Why is the +Obama variance so high in states with open primaries?
Same reasons as above...
Republican meddling...
Unsure Republicans having a preference for Obama...
Unsure Republicans having a dislike for Clinton...
What effect did having a female opponent have on the Bradley/Yeldarb Effect?
And finally, what does any of this mean for 11/4? Will we actually see a Bradley Effect in the general once more Republicans are thrown into the fray? Or will we simply see a continuation of the Democratic primary patterns? What effect will voter intimidation/suppression/confusion have on the effects that may already be affecting things? So many questions. So many permutations. And really, who the hell knows.
Misc notes: I used 19.9 to start the highest PPB category because Virginia was on the cusp with 19.9 as of 2006, and I decided that drawing the line at 19.9 is about as meaningful as drawing it at 20. I did not use Iowa in any "Primary Type" data because I'm not really sure where it belongs. I did use the other 30 states that FiveThirtyEight had data for.