One thing compounding the pain of the 2016 election (on top of, y’know, the actual consequences of losing) was the way the loss seemed to come out of nowhere. Polls, for the most part, showed Hillary Clinton winning, both at the national level and in the key states that decide the electoral college. And predictive models, which by definition aren’t any better than the polls fed into them, accordingly gave Clinton very high odds of winning overall: she led outside the margin of error in enough states to clear the 270 mark in the electoral college, meaning the only way Donald Trump could win was through catastrophic error throughout the polling industry. And yet, here we are today!
To their credit, the nation’s pollsters have been studiously trying to figure out what went wrong since then, in an effort to make sure it doesn’t happen again. Pollsters, after all, are social scientists, and on the rare occasion that the experiment goes awry and burns down the laboratory, the proper response is to track down the source of error and account for it, not to say “This was a one-time fluke; nothing to see here.” People who are going around saying “I’ll never trust another poll again; polls are broken,” whether they’re Republicans crowing in unexpected triumph or Democrats trying to rationalize their loss, are doing themselves a disservice, because the polling field, like any other scientific endeavor, is always self-correcting, adding lessons learned from its mistakes to its body of knowledge.
The polling industry’s professional association, the American Association for Public Opinion Research (AAPOR), recently held its annual convention, and the New York Times’ Nate Cohn (who, in addition to his poll-aggregation work, has also worked directly with innovative new polling techniques, like 2016’s NYT/Siena voter-file-informed polls of swing states) reported back on Wednesday with an excellent summary of the self-diagnosis that went on at the meeting.
Possibly the biggest problem (and one that I’ve talked about myself in 2016 post-mortems) is the surprisingly large role that education level played in predicting voter behavior in 2016. Pollsters typically weight for factors like race and age. In other words, they make adjustments post-sample, in order to make sure that the sample matches, percentage-wise, the race and age distribution of the actual population they’re sampling. (The other alternative is quota-style sampling, where pollsters seek out the right number of people from each race, age bracket, and so on; that was prevalent in polling’s early days in the mid-20th century but isn’t done anymore.) Weighting for education didn’t use to be important, because until recently education didn’t have much of a relationship with how people voted. That correlation shot up in 2016, to the extent that having a college education was almost as strongly associated with voting Democratic as being non-white.
The problem for pollsters, though, is that college-educated voters tend to show up disproportionately in pollsters’ samples. In other words, they’re simply likelier to participate in a poll when a pollster calls. This may seem counterintuitive (you might think college-educated voters would be busier, or less likely to be sitting next to their landline), but it probably comes down to their higher level of civic engagement, which mirrors the fact that well-educated people are also likelier to vote. So, when college-educated and non-college voters were equally likely to go Democratic or GOP (as was the case in the ’80s and ’90s), the lack of education weighting didn’t matter. But today, it’s suddenly important to weight for education (and adjust for the fact that the raw sample probably has too many college-educated respondents in it), or else you’ll have a sample that’s too Democratic-friendly.
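To make that concrete, here’s a minimal sketch of how post-stratification weighting by education might work. All the percentages are made up for illustration, not taken from any actual poll:

```python
# A minimal sketch of post-stratification weighting by education.
# All numbers here are illustrative, not from any actual poll.

# Suppose 55 percent of respondents are college graduates, but only
# 35 percent of the actual electorate is.
sample = {"college": 0.55, "non_college": 0.45}
electorate = {"college": 0.35, "non_college": 0.65}

# Each respondent's weight is (population share) / (sample share).
weights = {group: electorate[group] / sample[group] for group in sample}

# Say college grads go 58 percent Democratic and non-college voters
# go 44 percent. Unweighted, the poll shows the Democrat at:
support = {"college": 0.58, "non_college": 0.44}
unweighted = sum(sample[g] * support[g] for g in sample)

# Weighted to match the real electorate, the same responses show:
weighted = sum(sample[g] * weights[g] * support[g] for g in sample)

print(f"unweighted: {unweighted:.1%}")  # 51.7%
print(f"weighted:   {weighted:.1%}")    # 48.9%
```

The weight for each group is just its share of the real electorate divided by its share of the sample, so an overrepresented group gets scaled down and an underrepresented one gets scaled up; here, correcting the education skew shaves nearly 3 points off the Democrat’s apparent support.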
National polls in 2016 generally did weight for education. As you might remember, the national polls were actually very close to the final result in the nationwide popular vote, as most major pollsters coalesced around a 2-to-3-point Clinton advantage in their final polls. However, most state-level polls did not weight by education, and this was especially a problem in states like Ohio, Pennsylvania, and Wisconsin, where the electorates are disproportionately white working-class.
Education weighting isn’t a magic bullet, though, unfortunately. Even some of the high-quality state-level pollsters who did weight by education were significantly off: most notably Marquette, considered the gold-standard pollster in Wisconsin, but also Quinnipiac in its polling of Ohio and Pennsylvania. The same pattern also seemed to apply to the Clinton campaign’s internal pollsters, which may have contributed significantly to the tactical decision to forgo a big push in Wisconsin at any point.
What the pollsters worry about here is a problem that can’t simply be fixed through better weighting, and it gets back to the whole “civic engagement” theory again: less-educated people may not even be picking up the phone when pollsters call. If the segment that isn’t participating in polls is also disproportionately inclined to vote for one particular candidate (Trump, in this case), then you have partisan non-response bias, a source of error that can’t be fixed purely through weighting.
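A toy model (again, with made-up numbers) shows why weighting can’t reach this kind of error: weighting scales every respondent in an underrepresented group by the same factor, so if the group’s respondents lean differently than its non-respondents, the skew survives intact.

```python
# A toy model (made-up numbers) of partisan non-response bias, which
# weighting alone can't fix.

# Suppose the true non-college electorate splits 55/45 for Trump, but
# Trump supporters in that group answer the phone at half the rate.
true_split = {"trump": 0.55, "clinton": 0.45}
response_rate = {"trump": 0.05, "clinton": 0.10}

# Composition of the non-college respondents who actually answer:
answered = {c: true_split[c] * response_rate[c] for c in true_split}
total = sum(answered.values())

print(f"Trump share among respondents: {answered['trump'] / total:.1%}")
# ~37.9 percent, versus the true 55 percent. Weighting the non-college
# group up to its correct share of the electorate multiplies both
# candidates' respondents by the same factor, so this internal skew
# never gets corrected.
```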
Cohn suggests, though, that there isn’t much evidence that the Trump surge came primarily from people with low levels of civic engagement who aren’t regular voters. And my own tinkering with turnout data showed there wasn’t a sudden surge of new Trump voters in pro-Trump areas who were previous non-voters; in other words, it was the same pool of voters as usual, and the real problem was people who switched from voting for Barack Obama in 2012 to Trump in 2016.
And this leads to the other big issue that Cohn discusses: the problem of late-breaking undecided voters. Part of the way that many of the Obama-to-Trump voters flew under the radar in states like Pennsylvania and Wisconsin was that they showed up in polls as “undecided” or “don’t know”; many of them decided at the very end to throw in with Trump. The fact that Clinton was polling below 50 percent in late polls of states like Pennsylvania and Wisconsin (at, say, 47 or 48, even while leading by 5 or 6 points) should have been more of a warning sign to aggregators (Daily Kos Elections included) that victory wasn’t assured if the undecideds broke significantly in one direction.
And that break in one direction was exactly what happened: Cohn aggregated post-election polls to see how people whose pre-election voting intention was “undecided” actually broke, and they went for Trump over Clinton 37-18 (with an unusually large 29 percent going third party, and 15 percent remaining “none,” meaning they probably didn’t vote). In addition, people whose pre-election intention was third party mostly stayed third party, but those who moved to one of the major candidates disproportionately went for Trump as well.
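Plugging Cohn’s break rates into a hypothetical late state poll shows how quickly a seemingly comfortable lead erodes; the 47-42-with-11-undecided starting point is invented for illustration:

```python
# Applying Cohn's post-election break among undecideds (37 percent
# Trump, 18 percent Clinton, the rest third party or nonvoting) to a
# hypothetical late poll showing Clinton up 47-42 with 11 undecided.
clinton, trump, undecided = 47.0, 42.0, 11.0
break_rate = {"clinton": 0.18, "trump": 0.37}

clinton_final = clinton + undecided * break_rate["clinton"]  # 49.0
trump_final = trump + undecided * break_rate["trump"]        # 46.1

print(f"lead shrinks from {clinton - trump:.0f} points "
      f"to {clinton_final - trump_final:.1f} points")
# A 5-point lead becomes roughly 2.9 points, before accounting for any
# other polling error, like the education-weighting problem above.
```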
There’s one other problem that Cohn discusses; it’s not as well-defined as the others and may overlap with the undecided-voter problem, but it’s potentially a source of error as well. That’s the question of who makes it through a likely voter screen: people intending to vote for Clinton seem to have been likelier to identify as “likely voters” during the screening questions at the start of a poll, which would tilt the screened sample toward Clinton. Trump voters were less likely to say they were likely to vote, but that comports with post-election findings that a significant share of Trump voters backed him reluctantly.
The good news here is that all of these problems are fixable. Pollsters can weight for education. They can also push “leaners” harder to avoid high levels of undecided voters, and, maybe most importantly of all, they can field as many very late polls as possible, both to capture more last-minute deciders and to make sure that the effects of any late-breaking developments get recorded.
For instance, one thing that Cohn’s article doesn’t mention, and that may have pushed some of those reluctant last-minute deciders off the fence and into Trump’s embrace, was the late recurrence of the Comey story (specifically, the “OMG new Weiner files” letter). An event like that may simply be too much of a “black swan” to ever really prepare for, but polling very late in the game, instead of wrapping up a week or two before the election, can mitigate this kind of surprise in the future.