The polls will be wrong - and not necessarily the way you want.

by Daniel Donner for Daily Kos Elections

Friday, Oct. 31, 2014 at 11:19:56am PDT

There is one thing I can be sure will happen on Tuesday: A lot of polls will be wrong. Some will be very wrong. Of course, we know better than to pin hopes on an individual poll; but as it turns out, the margins of polling averages are also more than a couple of points off most of the time. Considering the Daily Kos Election Outlook has, as of this writing, 10 races within two points, we could have a wild ride this year.

Now here's the part you really wanted to hear: more often than not, the polls have underestimated Democratic performance. Don't get too excited though—there's a major exception, for conservative states. And that's where we happen to have some important Senate contests this year.

Here's how the polling averages have done in governor and Senate races, compared to actual results, for relatively close races between 2004 and 2013:

The diagonal line shows where, ideally, we would like to see the polling average: making a prediction that is exactly correct. The blue circles show where Democrats did better than the polls predicted (above the diagonal line). The red circles show where Republicans did better than the polls. Points above the red horizontal line are races that Democrats won; points to the right of the red vertical line are races that had Democrats ahead in the polling average. Points in the upper left and lower right quadrants show where the polling average incorrectly predicted the winner.
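For readers who prefer arithmetic to geometry, here is a minimal Python sketch of how one point on the chart can be read. The function name and the sign convention (margins are Democratic minus Republican, in points) are assumptions made for illustration, not something taken from the data behind the chart.

```python
def describe_race(poll_margin, actual_margin):
    """Read one race the way the chart does.

    Both margins are Democratic minus Republican, in percentage points.
    This convention and the function name are illustrative assumptions.
    """
    # Positive error: the Democrat beat the polls (point above the diagonal).
    error = actual_margin - poll_margin
    if error > 0:
        side = "Democrat outperformed"
    elif error < 0:
        side = "Republican outperformed"
    else:
        side = "exact"
    # Opposite signs place the point in the upper left or lower right
    # quadrant, where the polling average picked the wrong winner.
    wrong_winner = poll_margin * actual_margin < 0
    return error, side, wrong_winner
```

With made-up numbers, `describe_race(-1.0, 2.0)` describes a race where the Democrat trailed by one point in the average but won by two: a three-point miss and a wrong call.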

However, even when polls predict the correct winner, they're off by a fair amount. Below, we'll look at some of the reasons why. We'll see that errors increase if there aren't very many polls in the polling average or if there are third party candidates (and we have a lot of these this year). Democrats also tend to outperform the polls more in blue states, while Republicans are more likely to outperform the polls in deep red states.

Please read below for a more detailed explanation.

As mentioned above, we have a huge number of races with third party candidates this year, and they don't seem to be fading away as election day approaches. In the past, this has led to large errors in the polling averages: two-thirds of the time, the margin was off by more than four points. Here is what that looks like for the closest races:

The green circles show races where third party candidates ended up with more than 5 percent of the vote, combined. (This is roughly equivalent to 6 percent in polls.) It's clear that most of them are quite far from the diagonal line where an ideal polling average should lie; indeed, among the farthest. The gray points show all the rest of the data.

Another source of error in polling averages is too few polls. This isn't much of an issue in close races, which are usually polled frequently, unless the race is changing faster than the pollsters are polling. Below, the yellow circles have 10 or more polls in the average, while the green circles have fewer than five.

There aren't many green circles; however, half of them are a fair distance away from the perfection of the diagonal line. When you look at all races, not just close ones, it's quite clear that errors are related to the number of polls in the polling average.

Once we narrow down the data set to just those polling averages with five or more polls, and no races with third party candidates getting more than 5 percent, a familiar pattern shows up, based on the partisan lean of the state.

Below, the blue circles show only the elections in states where President Obama won more than 55 percent of the vote in 2008:

In every case but two, polls underestimated the Democratic candidate. The dashed line shows a regression through the blue data points, including points beyond the range of the graph. The polling average underestimated the Democratic margin by about three points on average for these data points.

Below, the mirror image: elections in states where Obama won less than 45 percent of the vote in 2008:

The most striking difference is how few data points there are. We've essentially been playing the political game on our home turf for the past ten years. That's not true this year, with close Senate races in Alaska, Arkansas, Louisiana, Kentucky, Kansas and, ever so briefly, South Dakota. We have also had close races for governor in Alaska, Arizona, Arkansas and Kansas.

As you can see from the graph, these races are more likely than not to have polls that are too friendly toward the Democratic candidates. On average, however, it is only a one-point error. And ancestrally Democratic states such as Arkansas, Kentucky, and Louisiana might not consistently show this effect.

A practical guide:

Here's a summary of what the last ten years of polling errors can tell us.

1. Look at the number of polls you have. If you have only a handful of polls, your polling average will likely have much greater error.

2. Look for third party candidates. If you have a third party candidate polling around 6 percent or so, the polling average is likely to be off by five or more points on the margin. This is true for many races this year.

3. If you have plenty of polls and no or low-polling third party candidates, look at the state. Deep blue states will, on average, have polls that are too Republican by a few points, while deep red states, on average, will have polls that are too Democratic by a point.

4. Finally, remember that past performance is no guarantee of future results.
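The checklist above can be sketched as a small Python function. This is only a rough illustration: the function and parameter names are hypothetical, the cutoffs (fewer than five polls, 6 percent for third parties in polls, 55 and 45 percent for Obama's 2008 share) are borrowed from earlier in this post, and treating them as hard thresholds is an assumption.

```python
def polling_caveats(n_polls, third_party_poll_pct, obama_2008_pct):
    """Return rule-of-thumb warnings for one race's polling average.

    All names and hard cutoffs here are illustrative assumptions.
    """
    notes = []
    # Step 1: too few polls.
    if n_polls < 5:
        notes.append("few polls: expect a larger error")
    # Step 2: a notable third party candidate.
    if third_party_poll_pct >= 6:
        notes.append("third party: margin may be off by five or more points")
    # Step 3: the state's lean matters only if steps 1 and 2 raised nothing.
    if not notes:
        if obama_2008_pct > 55:
            notes.append("deep blue: polls tend to be a few points too Republican")
        elif obama_2008_pct < 45:
            notes.append("deep red: polls tend to be about a point too Democratic")
    # Step 4 always applies.
    notes.append("past performance is no guarantee of future results")
    return notes
```

For a made-up race with 12 polls, a third party candidate polling at 2 percent, and a state Obama won with 58 percent in 2008, `polling_caveats(12, 2, 58)` returns the deep blue note followed by the standing reminder from step 4.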

Note: The polling averages used in this post are the averages of the margins of all polls from October 1 to Election Day, unless a trend was observed, in which case only polls from the final 10 days of the election were used.
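As a rough sketch of the averaging rule in the note above, here is one way it could be written in Python. The function name, the `(end_date, margin)` tuple layout, and the inclusive date boundaries are assumptions. How a trend was detected is not specified in this post, so the sketch leaves that judgment to the caller as a flag.

```python
from datetime import date, timedelta

def polling_average(polls, election_day, trend_observed=False):
    """Average the margins of polls inside the chosen window.

    polls: list of (end_date, margin) tuples; this layout is hypothetical.
    trend_observed: supplied by the caller, since the post does not say
    how a trend was identified.
    """
    if trend_observed:
        # Only the final 10 days before the election.
        start = election_day - timedelta(days=10)
    else:
        # Everything from October 1 to Election Day.
        start = date(election_day.year, 10, 1)
    margins = [m for d, m in polls if start <= d <= election_day]
    if not margins:
        return None  # no polls in the window
    return sum(margins) / len(margins)
```

With three made-up polls dated Sept. 28, Oct. 5 and Oct. 30, the default window averages the last two, while `trend_observed=True` keeps only the Oct. 30 poll.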

Update: The title was changed to make it clear that the polls can be wrong either way.