Election Predictions Based on Past Polling Errors

by Daniel Donner

Tuesday, Nov. 06, 2012 at 9:10:26am PST

Recently I introduced a simple model for predicting election results from current polling averages and past polling errors. This post presents final predictions for all statewide Presidential, Senate, and Governor's races. This is the first test of this experimental model.

A simple polling average is shown for each race. It includes polls from the entire month of October for races with little apparent change, and only polls from the past 10 days for races that appear to be trending during October (including all Presidential races).

The regression prediction is based on a regression of data from 2004-2010, which shows that the error of the polling averages in a given state is correlated with Obama's 2008 performance in that state. See here for details.
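To make the idea concrete, here is a minimal sketch of how such a regression adjustment could be computed. The data points are made-up placeholders, not the 2004-2010 dataset, and the function names, the use of Obama's 2008 vote share as the predictor, and the sign convention (error = polling average minus actual Democratic margin) are all assumptions for illustration.

```python
# Placeholder data, NOT the real 2004-2010 numbers.
# Each pair: (Obama 2008 vote share in a state, polling error in points),
# where error = polling-average Dem margin minus actual Dem margin (assumed).
history = [(40.0, 2.0), (50.0, 0.0), (60.0, -2.0), (70.0, -4.0)]


def fit_line(points):
    """Ordinary least squares fit of error = intercept + slope * share."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def regression_prediction(poll_avg, obama_2008, intercept, slope):
    """Subtract the expected polling error from the current polling average."""
    expected_error = intercept + slope * obama_2008
    return poll_avg - expected_error


intercept, slope = fit_line(history)
# Hypothetical race: Democrat +3.0 in the polls, state where Obama got 60% in 2008.
print(regression_prediction(3.0, 60.0, intercept, slope))
```

With these placeholder numbers the fitted line says polls understate the Democrat in more Democratic states, so the prediction moves above the polling average. The real fit may look quite different.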

The state-based prediction adjusts for polling errors specific to that state in races from 2004-2010. For example, if in a given state, polls overestimated the Democratic margin by 3 points in a 2008 Governor's race, 2 points in a 2006 Senate race, and 1 point in a 2010 Senate race, the current polling average for a Senate race would be adjusted downward for the Democrat by 2 points. State-based predictions use past Senate and Governor's errors for Senate and Governor's races, and past Presidential errors for Presidential races, because of state/federal party misalignment in some states.
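The example above can be sketched in a few lines. The past errors (3, 2, and 1 points) come from the example itself. The current polling average of +4.0 and the function name are hypothetical, and the sign convention (positive error means polls overestimated the Democrat) is an assumption.

```python
def state_based_prediction(poll_avg, past_errors):
    """Shift the polling average by the mean past error in that state.

    past_errors: points by which polls overestimated the Democratic
    margin in earlier races (negative values mean underestimated).
    """
    mean_error = sum(past_errors) / len(past_errors)
    return poll_avg - mean_error


# Errors from the example: 2008 Governor, 2006 Senate, 2010 Senate.
past_errors = [3.0, 2.0, 1.0]

# Hypothetical current Senate polling average: Democrat +4.0.
print(state_based_prediction(4.0, past_errors))
```

The mean error is 2 points, so the hypothetical +4.0 polling average becomes a +2.0 prediction.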

Clarification: positive numbers mean the Democrat wins.

The Governors

The polling average in Montana is very close, but in the 2006 Senate race polling showed Tester ahead by about 4-5 points, and the final result was only 49-48. (The state-based prediction takes this information into account.) Mason-Dixon, however, showed that race tied, and they also correctly predicted the 2004 Governor's race. They currently show the Republican ahead by 3.

The Senate

The Senate is more interesting. Both the regression and state-based predictions put the competitive Massachusetts and Connecticut races into the Safe Dem category (+5 or more). The close race in Montana, again, is nudged into negative territory. Meanwhile, the Nevada race is moved from favoring the Republican to favoring the Democrat, based on the lousy performance of Nevada polling in recent elections. Note also that Hawai'i gets a huge bump in both predictions. Finally, I believe there is a significant chance of poll failure in Arizona, where most of the polls have not included enough Latino voters. I would not be surprised to see a very close race there.

The President

These polling averages and predictions are all likely to be off, given the movement in the Presidential race we've seen in the last few days. Nevertheless, rules are rules, so I kept a 10-day average. Looking at the numbers, despite the hullabaloo from the Republicans, PA, NV, and WI are bumped into safe territory by both prediction methods. New Hampshire and Iowa are both nudged downward in the state-based prediction because local pollsters there (UNH and the Des Moines Register) have, in the past, vastly overestimated Democratic performance.

Potential Pitfalls
The biggest problem that could cause the models to fail is failure of the polls themselves. With response rates below 10%, do polls work anymore? We will see. More specifically, the regression model should perform worse when local pollsters are active in a state. They may be more accurate or less accurate, but either way they differ from the typical mix of national pollsters. In theory the state-based prediction should catch this, but the mix of pollsters may change from one year to the next in any given state. The state-based prediction could also fail where it is based on a limited number of polls. Finally, campaign strategy can alter the relationship between polls and outcomes, which would introduce error into this method as well.

Evaluating the Results
I will consider either model a success if the average absolute error is lower for the predictions than for polling averages alone. Only races with three or more polls will be included in the evaluation.
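As a rough sketch of that success test, the comparison could look like the following. The dictionary keys, function names, and every number in the sample data are hypothetical placeholders, not results.

```python
def mean_abs_error(predicted, actual):
    """Average absolute difference between predicted and actual margins."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def evaluate(races, min_polls=3):
    """Compare model predictions to raw polling averages.

    Only races with at least `min_polls` polls are included.
    """
    eligible = [r for r in races if r["n_polls"] >= min_polls]
    actual = [r["actual"] for r in eligible]
    baseline = mean_abs_error([r["poll_avg"] for r in eligible], actual)
    model = mean_abs_error([r["prediction"] for r in eligible], actual)
    return {"baseline": baseline, "model": model, "success": model < baseline}


# Placeholder races (Democratic margins in points), for illustration only.
races = [
    {"n_polls": 5, "poll_avg": 2.0, "prediction": 4.0, "actual": 5.0},
    {"n_polls": 4, "poll_avg": -1.0, "prediction": -2.0, "actual": -2.0},
    {"n_polls": 2, "poll_avg": 10.0, "prediction": 0.0, "actual": 0.0},  # too few polls
]
print(evaluate(races))
```

The third race has only two polls, so it is left out of both averages.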

More fine print.
The state-based model was built by averaging the difference between the polling average and the actual outcome in elections held between 2004 and 2010. All polls from October were included unless a clear trend was apparent, in which case only polls from the final 10 days were included. A poll with the majority of its field dates within the final 10 days is counted as being taken in the final 10 days. Elections with third-party candidates above 5% and recall elections were excluded, as were polls from Research 2000 and Strategic Vision. In cases where four or five polls were within a reasonable range and one poll had a margin more than 20 points different, the outlier was removed. Presidential elections were treated separately from Senate and Governor's races because of misalignment problems in some states (such as West Virginia); however, in most states, the error for the Presidential polling numbers is similar to the errors in Senate and Governor's races. Previous races are included in the state-based averages when they have more than five polls. If the state has no races with more than five polls, races with fewer than five polls are used in the average.
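Two of these rules, the field-date rule and the pollster exclusions, can be sketched as follows. This is only an illustration: the function names are hypothetical, and treating "the final 10 days" as the 10 calendar days immediately before election day is an assumption.

```python
from datetime import date, timedelta

# Pollsters excluded from the historical averages, per the text above.
EXCLUDED_POLLSTERS = {"Research 2000", "Strategic Vision"}


def in_final_window(start, end, election_day, window_days=10):
    """True if the majority of a poll's field dates fall in the final window.

    Assumes the window is the `window_days` days just before election day.
    """
    window_start = election_day - timedelta(days=window_days)
    n_days = (end - start).days + 1
    field_dates = [start + timedelta(days=i) for i in range(n_days)]
    inside = sum(1 for d in field_dates if window_start <= d < election_day)
    return inside * 2 > n_days


def keep_poll(pollster, start, end, election_day):
    """Apply the pollster exclusion and the final-window rule together."""
    if pollster in EXCLUDED_POLLSTERS:
        return False
    return in_final_window(start, end, election_day)


election_day = date(2012, 11, 6)
# A hypothetical poll in the field Oct. 26-29: three of its four days fall in the window.
print(keep_poll("Example Pollster", date(2012, 10, 26), date(2012, 10, 29), election_day))
```

A poll split evenly across the window boundary does not have a majority of its field dates inside, so this sketch leaves it out.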

Addendum: The House
While I'm at it, I figured I may as well make a prediction for the House too. Based on recent elections, I will guesstimate the generic House vote margin will be +0.5 points more favorable for the President's party than the Presidential margin. But what will the President's margin be? Again, based on recent elections, I guesstimate the polling average plus 1.0 points for the winner. That's Obama +2.4. I'll average that with Nate (+2.6) and Sam (+2.2) - which gives me 2.4 for the President. Which leads to 2.9 for the House. Which leads to a prediction of a 28-seat gain based on my model after adjusting for redistricting. But don't get excited - this prediction has a range, too, of approximately 14-35 seats gained. I wouldn't be surprised if the final result is more towards 14 - but then, with House polling rather sparse this year, nothing would be too surprising.
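The arithmetic behind the margins above can be checked with a short sketch. The function and variable names are hypothetical. The seat-gain step is omitted because it depends on a model and a redistricting adjustment that are not spelled out here.

```python
def house_margin_estimate(own_estimate, other_estimates, house_bonus=0.5):
    """Average the Presidential margin estimates, then add the House bonus.

    Returns (presidential_margin, house_margin), rounded to one decimal.
    """
    estimates = [own_estimate] + list(other_estimates)
    presidential = round(sum(estimates) / len(estimates), 1)
    house = round(presidential + house_bonus, 1)
    return presidential, house


# Own estimate: Obama +2.4 (polling average plus 1.0 for the winner).
# Other estimates: Nate (+2.6) and Sam (+2.2).
presidential, house = house_margin_estimate(2.4, [2.6, 2.2])
print(presidential, house)
```

The three estimates average to 2.4, and adding the 0.5-point bonus for the President's party gives the 2.9 generic House margin.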

It's been a wild ride this cycle - we shall see what we shall see!