So as we have found out over the past few days, PPP's crosstabs aren't very helpful when it comes to the actual opinions of youth and minorities, despite accurate topline numbers. Is there a way to tell if any other demographics are messed up? Can we still get some interesting information from the massive amount of data available? Yes, and yes.
You will recall this all started with respondents who incorrectly reported their geographic location. This is one thing we actually are fairly certain we know about respondents, because the phone numbers are supposed to be landlines and we know the area code.
If you delve into the demographics of those who misreported their geography, you'll see they are very different from the sample as a whole. For example, about 60% say they're women versus 40% who say they're men. So, either women are more likely to enter an incorrect geography... or people who enter an incorrect geography are far more likely to respond incorrectly to other demographic questions as well.
Are we sure people who enter errors in geography also make errors on other questions? Let's look at one demographic group which is almost defined by its political behavior: Tea Party supporters.
To be a Tea Party supporter is to be against Obama. Yet 7% of Tea Party supporters say they will vote for Obama. Among Tea Party supporters who also chose the wrong geography, that figure is 19%. Setting those respondents aside reduces Obama's support among the remaining Tea Party supporters to 5% - closer to what we would expect, though probably not all the pretenders have been removed. Conversely, with only 19% supporting Obama, a large portion of those who entered an incorrect geography probably are genuine Tea Party supporters.
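As a rough check on those three figures, here is a minimal sketch that treats the 7% overall number as a weighted average of the 19% and 5% groups and backs out the implied share of Tea Party supporters who entered the wrong geography. The function name is hypothetical, and the inputs are the rounded percentages quoted above, so the result is only a ballpark figure.

```python
def implied_share(overall, in_group, out_group):
    """Solve overall = w * in_group + (1 - w) * out_group for w."""
    if in_group == out_group:
        raise ValueError("the two group rates must differ to solve for the share")
    return (overall - out_group) / (in_group - out_group)


# Rounded percentages quoted in the text: 7% overall, 19% among
# wrong-geography respondents, 5% among everyone else.
share = implied_share(overall=7, in_group=19, out_group=5)
print(f"Implied wrong-geography share of Tea Party supporters: {share:.1%}")
```

Taken at face value, that works out to roughly one in seven Tea Party supporters, though rounding in the inputs leaves a wide band around that number.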
Still, we should be able to identify demographics with large proportions of corrupted data from the proportion of respondents who misidentify their geography. Let's take a look below the fold.
Sure enough, those demographics we have previously identified as containing high numbers of respondents who don't actually belong to said demographic group have fairly large geography error rates. Additionally, two sub-categories flagged as incorrect, African-American and Hispanic voters age 18-29, have geography error rates of 22% and 25%, respectively.
All values above 10% are bolded in red. I would recommend using the reported crosstab values for these demographic groups with extreme caution, as a large portion of the respondents likely do not actually belong to that demographic group. Changes in the numbers may still be meaningful, as we have seen with African-American support of gay marriage. However, the absolute values will almost certainly be incorrect.
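The 10% rule of thumb is easy to apply to any set of crosstabs. The following is a minimal sketch, not the method used to build the table: the function name and data layout are assumptions, and the only rates included are the two sub-category figures quoted above.

```python
THRESHOLD = 0.10  # groups strictly above 10% are treated as suspect


def flag_groups(error_rates, threshold=THRESHOLD):
    """Return groups whose geography error rate exceeds the threshold,
    sorted from highest error rate to lowest."""
    suspect = [group for group, rate in error_rates.items() if rate > threshold]
    return sorted(suspect, key=lambda group: error_rates[group], reverse=True)


# Geography error rates quoted in the text for two sub-categories.
rates = {
    "African-American, age 18-29": 0.22,
    "Hispanic, age 18-29": 0.25,
}

flagged = flag_groups(rates)
for group in flagged:
    print(f"{group}: {rates[group]:.0%} geography error rate - use with caution")
```

Any other demographic group can be added to the dictionary with its own geography error rate to see whether it clears the same bar.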
Thus ends this exploration of polling errors in the DailyKos/SEIU/PPP polls, brought to you by a liberal data release policy. Sadly, the numbers for several important demographic groups are compromised, and are likely to be compromised amongst other automated pollsters as well. However, we do now have a rough measure of how compromised a given demographic may be, which will be useful in future analyses.
___________
Beyond the Margin of Error is a series exploring problems in polling other than random error, which is the only type of error the margin of error deals with.
Previously:
How to Stop Worrying and Love the Toplines. Even though youth and minority subsamples are incorrect, so are majority subsamples, so that the toplines in PPP polls come out about right.
The Curious Incident of the Young Republican Minorities. Only a little over half of respondents in the category of African-Americans age 18-29 said they approved of Obama - but only because many of those respondents weren't actually African-American or age 18-29. The numbers for the 18-29 age group are inaccurate as well.
This Is Why We Can't Have Nice Things. A small number of respondents press the wrong button when answering the DailyKos poll question on race, leading to inaccurate numbers for racial minorities in the crosstabs.
Why Don't People Know Where They Live in the DKos Poll? A small number of respondents - around 5-9% - press the wrong button when answering the geography question on the Daily Kos poll. This is far greater than can be explained by observed rates of misunderstandings or data entry errors.
Why State Polls Look More Favorable For Obama than National Polls. In the spring and summer, lack of support in Blue States was bringing down Obama's performance in national polls, while Swing States and Red States were polling about the same as 2008.
Presidential Polls Are Almost Always Right, Even When They're Wrong. How the presidential polls in red and blue states are off, sometimes way off, and how to predict how far off they'll be.
When Polls Fail, or Why Elizabeth Warren Will Dash GOP Hopes. Why polls for close races for Governor and Senate are sometimes way off, and how to predict how far off they will be.