Beyond the Margin of Error: The Virginia polling errors in context

by Daniel Donner for Daily Kos Elections

Sunday, Dec. 01, 2013 at 10:25:14am PST

A few weeks ago there was an election in Virginia where the polls were off by a fair amount—the margin of victory for the Democrat in the governor's race was about five points less than the polling average had predicted.

The question I wanted to answer is, just how unusual is it for such a polling miss to occur?

The simple answer: it's pretty much normal when there's a third candidate polling more than 5 percent.

We already know that a substantial portion of polling error is related to Obama's 2008 vote share in the state where the election is held, so we need to take that into account when considering what the typical polling error would be.

Here's a graph of the error in the margin predicted by polling plotted against Obama's 2008 vote share, for elections from 2003 to 2013 with more than five polls in the polling average.

In this graph, a polling average that gets the margin of victory perfectly right would result in a point right on the horizontal green line. A polling average that is wrong by the amount we would expect based on the partisan tendency of the state would fall on the diagonal black regression line.


The purple circles show elections where a third candidate was polling at more than 5 percent. As you can see, they fall mainly on the outside edges of the cloud of data. The 2013 VA-Gov election looks pretty typical for the purple circles.

If we measure how far off the data points are from the diagonal regression line, we see that the median absolute residual for the purple circles is 5.2, much greater than the 1.8 for the grey circles. Against that 5.2, the 2013 Virginia governor's race has a very reasonable absolute residual of 6.1.
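For anyone who wants to try this kind of comparison on their own numbers, here is a minimal Python sketch of the calculation described above: fit a least-squares line of polling error against Obama's 2008 vote share, then take the median absolute residual separately for each group of races. The rows are invented placeholders, not the data behind the graph, and the field names (`obama08`, `error`, `third`), the function names, and the choice to fit one line to all races pooled together are assumptions made for illustration.

```python
from statistics import median


def fit_line(xs, ys):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual(race, slope, intercept):
    """Observed polling error minus the error the regression line predicts."""
    return race["error"] - (slope * race["obama08"] + intercept)


def median_abs_residual(races, slope, intercept):
    """Median distance from the regression line, ignoring direction."""
    return median(abs(residual(r, slope, intercept)) for r in races)


# Placeholder rows for illustration only, NOT real election data.
# obama08: Obama's 2008 vote share (percent); error: polling error in the margin (points);
# third: whether a third candidate was polling above 5 percent.
races = [
    {"obama08": 40.0, "error": -2.0, "third": False},
    {"obama08": 45.0, "error": -1.0, "third": False},
    {"obama08": 50.0, "error": 1.0, "third": False},
    {"obama08": 55.0, "error": 1.5, "third": False},
    {"obama08": 60.0, "error": 3.0, "third": False},
    {"obama08": 48.0, "error": 6.0, "third": True},
    {"obama08": 53.0, "error": -5.0, "third": True},
]

# Assumption: a single line fitted to all races pooled together.
slope, intercept = fit_line(
    [r["obama08"] for r in races], [r["error"] for r in races]
)

for label, flag in (("two-way races", False), ("third candidate > 5%", True)):
    group = [r for r in races if r["third"] == flag]
    print(label, round(median_abs_residual(group, slope, intercept), 2))
```

The median is used rather than the mean so that one wildly missed race does not dominate the comparison between the two groups.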

At this point I would have liked to tell you I analyzed all these races with semi-competitive third (and fourth) candidates and came up with some whizbang rules to tell you which way the polls will err and when, but I can't. There are simply too many variations in the ideology and campaigns of the third (or fourth) candidate. It needs a race-by-race analysis.

The only "big picture" lesson I can come up with is that a third-party, third-place candidate on average underperforms the polling average by about half a point, which is not news. The underperformance was more than one point in 7 of 20 cases, with one case of overperformance by more than one point. Major-party third-place candidates, however, overperformed in four out of six races.

And as far as Virginia goes? The third-place, third-party candidate underperformed by 3.0 points, the second-worst of the 20 cases. So in that regard, it was indeed unusual.

More details for the curious

Why is the cutoff for third-place candidates 5 percent?
When third-place candidates are polling less than that, pollsters typically don't include them in the polls, so there's a lack of data. But, yes, it is an arbitrary cutoff.

Why are you still using percentage Obama 2008 for your regression?
Because the data are from 2003-2013, and 2008 is right in the middle. But I'm open to suggestions. I haven't found anything that works better yet.

How do you calculate your polling averages?
For a race with obvious trends, I use polls from within the last 10 days; otherwise, I use polls since October 1st or 1 month prior to election day. Research 2000, Strategic Vision, and 2006 Zogby Interactive polls are excluded.
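As a rough illustration of those rules, here is a minimal Python sketch of the poll selection. Several things in it are assumptions rather than the method actually used: that the average is a simple unweighted mean of margins, that "October 1st" applies to November elections while "1 month prior" (approximated here as 30 days) applies to elections at other times of year, and that whether a race has "obvious trends" is a judgment call passed in by hand. The function names and the poll record layout are hypothetical.

```python
from datetime import date, timedelta

# Pollsters excluded regardless of year.
EXCLUDED_ALWAYS = {"Research 2000", "Strategic Vision"}


def is_excluded(pollster, poll_date):
    """Apply the exclusion list; Zogby Interactive is dropped for 2006 only."""
    if pollster in EXCLUDED_ALWAYS:
        return True
    return pollster == "Zogby Interactive" and poll_date.year == 2006


def window_start(election_day, trending):
    """First date a poll may carry and still count toward the average."""
    if trending:
        # Race with obvious trends: last 10 days only.
        return election_day - timedelta(days=10)
    if election_day.month == 11:
        # Assumption: October 1st is the cutoff for November elections.
        return date(election_day.year, 10, 1)
    # Assumption: "1 month prior" approximated as 30 days.
    return election_day - timedelta(days=30)


def polling_average(polls, election_day, trending=False):
    """Unweighted mean margin of eligible polls, or None if none qualify."""
    start = window_start(election_day, trending)
    margins = [
        p["margin"]
        for p in polls
        if start <= p["date"] <= election_day
        and not is_excluded(p["pollster"], p["date"])
    ]
    if not margins:
        return None
    return sum(margins) / len(margins)


# Hypothetical polls for illustration; pollster names and margins are made up.
example_polls = [
    {"pollster": "Pollster A", "date": date(2013, 10, 30), "margin": 6.0},
    {"pollster": "Pollster B", "date": date(2013, 10, 5), "margin": 8.0},
    {"pollster": "Pollster C", "date": date(2013, 9, 15), "margin": 2.0},
]
print(polling_average(example_polls, date(2013, 11, 5)))
print(polling_average(example_polls, date(2013, 11, 5), trending=True))
```

Keeping the window logic in its own function makes it easy to swap in a different cutoff rule without touching the averaging step.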

Why do you only use polling averages with five or more polls?
Polling averages with fewer polls have more error. The amount of error seems to level out at about five polls.
___________
Beyond the Margin of Error is an occasional series exploring problems in polling other than random error, which is the only type of error the margin of error deals with.