One thing we've seen emerge in the results of both Florida and South Carolina is that Barack Obama has materially outperformed his polling averages when the actual votes were counted. Although this trend was contradicted in a big way in New Hampshire, it was also the case in Iowa, as well as in the "uncommitted" primary in Michigan.
A summary of the primary schedule to date is included below, including final polling averages from Real Clear Politics and Pollster.com. Because some people prefer RCP's method and others prefer Pollster, I have simply combined the two estimates -- that is, I've taken an average of averages.
On average, the Obama-Clinton margin has been 3.7 points more favorable to Obama than was predicted by the final polling averages. If you include Michigan in the tally -- where Uncommitted overperformed expectations -- the average increases to a 4.4 point swing toward Obama.
We cannot say with much certainty that the trend is statistically significant -- there simply isn't enough data to do that. However, if there are systematic reasons why Obama is likely to outperform his polls by several points on Super Tuesday, that is obviously something we'd want to consider. So far as I can tell, there are four or five reasons why such a phenomenon might be real:
#1. The incumbent rule. The incumbent rule simply states that in electoral races containing an incumbent, undecideds break to the challenger most of the time. It was first discovered in 1989 by Nick Panagakis, who studied a large number of races and found that undecideds broke toward the challenger in more than 80% of them (note: the incumbent rule is NOT suggesting that 80% of undecideds break to the challenger; it's saying that in more than 80% of races, the plurality of undecided votes breaks to the challenger).
Since then, and particularly after the disappointing election outcome in 2004 against incumbent George W. Bush, the incumbent rule has been the subject of much discussion. The closest thing to a consensus that has emerged is that the incumbent rule still applies, but matters less in today's high-information universe than it used to.
The real question, of course, is whether Hillary Clinton can be considered the functional equivalent of an incumbent. I think that she probably can be. Her name recognition is near 100%, she has substantial fundraising and institutional support, her positions are well-known, and she's running on a platform of experience. These are the traditional advantages of an incumbent -- not to mention the special circumstance that her husband is the former POTUS. So the idea is that voters default toward Hillary -- and if they haven't picked Hillary by a certain point in time, more likely than not they never will.
In fact, there is some evidence that undecided voters are more likely to break toward Obama. If we look at CNN exit polls, we find that among voters who made their decision within three days of the election, or on the day of the election, Obama has had roughly a 4:3 advantage over Hillary Clinton (I use this time period as the cut-off because these are the voters who will often be excluded from the last round of polls).
Late-Deciding Voters (within Three Days of Election), % of vote
State     Clinton   Obama   Edwards
FL             36      37        23
IA             21      32        28
NH             36      37        16
NV             42      37        14
SC             21      52        27
Average        31      39        22
Generally speaking, about 20% of voters make their decision within three days of the election. If this pattern is real, it is worth roughly 2 points to Obama.
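The arithmetic behind that 2-point figure can be sketched in a few lines of Python. The exit-poll numbers and the 20% share come from the text above; the variable and function names are my own, and the calculation assumes (as a simplification) that the late deciders' margin counts toward the overall result in direct proportion to their share of the electorate.

```python
# Exit-poll shares (%) among voters who decided within three days of the
# election, copied from the table above.
late_deciders = {
    "FL": {"Clinton": 36, "Obama": 37, "Edwards": 23},
    "IA": {"Clinton": 21, "Obama": 32, "Edwards": 28},
    "NH": {"Clinton": 36, "Obama": 37, "Edwards": 16},
    "NV": {"Clinton": 42, "Obama": 37, "Edwards": 14},
    "SC": {"Clinton": 21, "Obama": 52, "Edwards": 27},
}

# Roughly 20% of voters decide in the final three days (figure from the text).
LATE_DECIDER_SHARE = 0.20


def average_share(candidate):
    """Unweighted mean of a candidate's late-decider share across states."""
    values = [state[candidate] for state in late_deciders.values()]
    return sum(values) / len(values)


# Obama's average lead over Clinton among late deciders, in points.
late_margin = average_share("Obama") - average_share("Clinton")

# Assumption: the late deciders' margin counts toward the overall result in
# proportion to their share of the electorate.
margin_contribution = LATE_DECIDER_SHARE * late_margin

print(f"Late-decider margin: {late_margin:.1f} points")
print(f"Contribution to overall margin: {margin_contribution:.1f} points")
```

The unrounded averages are 31.2 for Clinton and 39.0 for Obama, a 7.8-point gap; one fifth of that is about 1.6 points, which is where "roughly 2 points" comes from.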
#2. Cellphones. A couple of pollsters now include a sample of cellphones. Interestingly, this includes the Selzer/Des Moines Register poll in Iowa, which came very close to the actual result when most other polls did not. It also includes Gallup, although that is a very recent development -- since the first of the year.
Generally speaking, the academic consensus has been that excluding cellphones from a polling sample is not a huge problem. However, we may have passed some kind of a tipping point. The number of cellphone-only voters has increased quite rapidly. But more significantly, you have one particular candidate, Barack Obama, whose supporters' demographics overlap almost perfectly with the cellphone-only set: younger, more urban, more tech-savvy, but less likely to be married or a homeowner. To some extent, it may be possible to get around this problem by re-weighting your samples (for example, weighting younger voters more to compensate for the ones you can't reach). But there are enough overlapping and cross-cutting demographic trends that this can be hard to do accurately, and many pollsters don't bother to do it at all.
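To make the re-weighting idea concrete, here is a minimal sketch in Python that weights a sample by age group alone. Every number in it is hypothetical and invented for illustration (the age groups, the sample composition, the candidate support, and the electorate shares); none of it comes from a real poll, and real pollsters weight on more variables than this.

```python
# HYPOTHETICAL poll results, for illustration only. Each age group has its
# share of the landline sample and its candidate support within the group.
sample = {
    "18-29": {"sample_share": 0.10, "obama": 0.55, "clinton": 0.35},
    "30-64": {"sample_share": 0.60, "obama": 0.40, "clinton": 0.45},
    "65+": {"sample_share": 0.30, "obama": 0.30, "clinton": 0.55},
}

# HYPOTHETICAL share of each age group in the actual electorate. Younger
# voters are under-represented in the sample relative to this target.
electorate_share = {"18-29": 0.18, "30-64": 0.57, "65+": 0.25}


def topline(group_shares, candidate):
    """Candidate's overall support given a set of group shares."""
    return sum(group_shares[g] * sample[g][candidate] for g in sample)


raw_shares = {g: row["sample_share"] for g, row in sample.items()}

# Weight applied to each respondent in a group: target share / sample share.
weights = {g: electorate_share[g] / raw_shares[g] for g in sample}

raw_margin = topline(raw_shares, "obama") - topline(raw_shares, "clinton")
weighted_margin = (
    topline(electorate_share, "obama") - topline(electorate_share, "clinton")
)

print(f"Raw Obama-Clinton margin:      {raw_margin * 100:+.1f}")
print(f"Weighted Obama-Clinton margin: {weighted_margin * 100:+.1f}")
```

With these made-up inputs, weighting moves the margin about three points toward Obama. Note what the sketch cannot fix: it assumes the young voters you did reach by landline vote like the cellphone-only ones you missed, which is exactly the assumption that may be breaking down.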
#3. Underestimating turnout. Another problem is that many pollsters discriminate against voters who do not have long voting histories. For example, a typical screening mechanism might terminate calls with voters who had never before participated in a primary despite having been old enough to vote in one.
The problem is that such a screen will necessarily exclude new voters -- and from what we've seen, the plurality of first-time voters are voting for Obama. And, of course, there have been a lot of new voters in this election cycle. Turnout increased by roughly 90% in Iowa and South Carolina, and 30% in New Hampshire. It also increased by 125% in Florida and over 1000% in Nevada, although those states did not have highly competitive primaries in 2004. It may be worth noting that the one state where turnout increased the least (New Hampshire) was the one where Obama underperformed his polls.
#4. GOTV. Related to problems with modeling turnout are Get Out the Vote (GOTV) operations. Irrespective of any problem with the statistical models of the pollsters, perhaps the Obama campaign has been more effective in getting its voters to canvass for him?
The problem with post-facto evaluations of GOTV operations is that they tend to be rather tautological. That is, if a campaign outperforms its polls, it is necessarily assumed that it had a superior GOTV operation, and vice versa. There are a couple of neutral observers whose judgment I've come to trust; for example, Matt Stoller at Open Left has developed a pretty good sense for campaign ground games, and called both Obama's superior GOTV in South Carolina and Hillary's superior GOTV in Nevada in advance. But mostly people just use this as a 'catch-all' category when they can't otherwise explain the result. Also, it's worth noting that Obama outperformed his polling averages in Florida, where there were no official GOTV operations.
But suppose that Obama is better than Hillary at GOTV, all else being equal. Is this likely to be an advantage on Super Tuesday? I think you can make arguments either way. On the one hand, there hasn't been nearly as much time to invest in infrastructure in the Super Tuesday states. But on the other hand, voters in Super Tuesday states have had less chance to engage directly with the candidates, and so the motivation provided by GOTV may be more important. To the extent that this is an advantage for Obama, I suspect that it will mostly manifest itself in the caucus states, where participation is lower and voting more of a chore.
#5. Reverse Bradley Effect. There has been much speculation about Bradley Effects and Reverse Bradley Effects: the idea that voters will mislead pollsters about their preferences for reasons related to race. I find these explanations mostly unconvincing, especially since the other reasons I've outlined in this essay would tend to do a sufficient job of explaining the polling discrepancies. But there are any number of explanations that the more creative among you might come up with. For example, perhaps there isn't a racial Bradley Effect, but there is a gender-based Bradley Effect, in which female voters don't want to tell an interviewer that they're voting against Hillary? Interestingly, there is some evidence that robopollsters like Rasmussen, SurveyUSA, and PPP have outperformed live-interview pollsters thus far in this cycle (presumably, it isn't worth lying to a computer).
[...]
Although I'm sure I'll be accused of drinking the Big O Kool-Aid, my intuition is that these effects, or some combination of them, are likely to give Obama a couple of points' worth of advantage on Super Tuesday. The incumbent rule (#1) and the cellphone problem (#2) in particular seem to be fairly tangible. Of course, we will all find out soon enough. But given that the polls have had a rough cycle, it's yet another reason to expect a close contest on Super Tuesday.