Over the past month I’ve posted a lot of my concern and frustration with how polling is reported and debated by the Daily Kos Elections (DKE) community. In short: people are taking polls, just about any polls, as a true measure of the state of a race and a predictor of election results. If they don’t like a poll, they then try to find any way to “disprove” it, usually by pointing to odd results in the crosstabs. At its worst, this verges on unskewing, the notoriously laughable practice by some Republicans in 2012 of “unskewing” polls to magically show Mitt Romney defeating Barack Obama. Overall, these discussions over polls have come to dominate discussion on DKE. A poll is reported; commence arguing, or celebrating, or announcing your own rating change, or attacking the ratings of other forecasters. Needless to say, the more surprising the poll result, the longer and more intense the commentary.
If you’ve been following the Siena/NYT partnership that is live-polling dozens of House races, you’ve seen many districts that have literally never had a public poll before. Watching the responses tallied up in real time is captivating, exciting stuff. But I think a prerequisite for following these polls is also following Nate Cohn’s Twitter, where he has shown incredible transparency and commentary on this venture he is a part of. He notes things such as the result shifting a point with literally the final call, and asks questions like: what if the poll had ended just one call earlier? Some candidates have led by 10 points through the first hundred calls, then trailed by 10 points a few hundred calls later. What if the first hundred calls had looked like the next couple hundred? He’s noted the challenge of getting good response rates: some of their polls have required 40,000 calls just to get 500 people to talk to them. When nearly 99% of people won’t even talk to the pollster, how do you know the remaining 1% is representative? He’s tried to get people to think critically about what polling is and isn’t, and drawn attention to some of its most difficult aspects.
So here’s the secret: polling is hard. Polling is really hard. If you haven’t read Nate Cohn’s article on that subject, you really should. Polling isn’t just calling 500 random people out of a phone book [insert note explaining what a phone book is to our younger readers]; if you don’t weight the responses correctly, you’ll get laughably bad results. A hallmark of a cheap fly-by-night pollster is having most respondents over the age of 65. But weighting can be tricky, more of an art than a science at times. Age, race, gender…but what about party identification vs. party registration, which can be two different things? What about education? And how do you figure out who will actually vote? It’s one thing to report an opinion poll like “59% of Americans think weed should be legal.” But if you want to predict an electorate, and an actual election result, it gets so much harder.
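To make the weighting idea concrete, here’s a toy sketch of the simplest kind of demographic weighting. Every number in it is hypothetical: an invented sample that skews old, and an invented “true” electorate age mix to reweight against.

```python
# Toy post-stratification sketch -- every number here is made up.
# The raw sample skews old (300 of 500 respondents are 65+), so we
# reweight each age group to match a hypothetical electorate's age mix.

# age group -> (number of respondents, share supporting Candidate A)
sample = {
    "18-34": (50, 0.60),
    "35-64": (150, 0.50),
    "65+":   (300, 0.40),
}
# hypothetical share of the actual electorate in each age group
electorate_share = {"18-34": 0.30, "35-64": 0.45, "65+": 0.25}

n = sum(count for count, _ in sample.values())

# unweighted estimate: just tally the raw responses
raw = sum(count * support for count, support in sample.values()) / n

# weighted estimate: each group counts according to the electorate,
# not according to who happened to answer the phone
weighted = sum(electorate_share[group] * support
               for group, (_, support) in sample.items())

print(f"raw: {raw:.1%}, weighted: {weighted:.1%}")
# raw: 45.0%, weighted: 50.5%
```

Five and a half points of difference from weighting alone. And note that this assumes we even know the electorate’s true age mix, which is itself a guess the pollster has to make.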
Here’s an expanded form of an analogy I’ve made before. Polling isn’t just like sifting through a bag of millions of marbles. If it were, you could pull out several hundred of them, some red and some blue, and tally up the results. Based on how many you pulled, you’d get a statistical value called the margin of error (MoE) that describes how closely your sample is likely to reflect the overall percentages of the whole bag. Easy, right? Well, here’s the problem. Real election polling is like if the whole bag of marbles isn’t even the thing you want to measure; rather, you want a read on a much smaller set of marbles that will (for a variety of difficult-to-measure reasons) self-select out of the bag. You don’t even know the exact percentage that will self-select: generally it’s around 60% in presidential years and around 40% in midterms, but those numbers always vary (and, more importantly, aren’t equal among red and blue marbles). Grabbing a truly random set of marbles is also difficult, because most tend to move out of the way of your hand (some extremely so), while others hop right in (aka senior citizens with landlines). So maybe you know that some marbles are less likely to be grabbed based on observable attributes, and you make it a point to give the ones you do manage to grab added weight in the final number (millennials, people of color). But what if you got an unrepresentative marble in this overweighted set? In the end, with your best efforts, let’s say you get a value like 42% red, 39% blue. A number of the marbles don’t truly reveal their color without a little pushing. Now it’s 46% red, 44% blue. But you’re not sure which of them will make it to the election. So you ask them: are you sure? Are you really sure? Maybe you just try to predict this based on whether they’ve participated in previous elections. But what if this year is different? In the end, with all these subjective choices, you get a tie: 47% red, 47% blue.
The last 6% just stay gray, refusing to reveal their color one way or the other. And you’re trying to do all of this weeks or even months before the election, while unpredictable events may change which marbles self-select, and even flip marbles between red and blue. Maybe the grays don’t vote; maybe they break overwhelmingly red or blue right at the end. Who knows? Election day happens, and it’s 56% blue, 44% red. That’s a miss way outside the margin of error. Did you screw up? Was your poll bad? Or is this just an unavoidable risk even for high-quality, professional pollsters? I say the last is entirely possible, given all the difficulties in getting a representative sample in your raw data and then making the right choices in the secret sauce that is a “likely voter model”. But then…
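The self-selection problem in the analogy can be sketched with a tiny simulation. Everything here is hypothetical: a bag that is 52% blue, with red marbles self-selecting (turning out) at a higher rate, so that even a flawlessly random poll of the whole bag misses the actual election result.

```python
# Hypothetical marble-bag simulation: every number is invented.
import random

BAG_BLUE = 0.52                          # share of blue marbles in the bag
TURNOUT = {"blue": 0.55, "red": 0.65}    # unequal self-selection into voting

def election_result():
    """Blue's share among the marbles that actually self-select (vote)."""
    blue_votes = BAG_BLUE * TURNOUT["blue"]
    red_votes = (1 - BAG_BLUE) * TURNOUT["red"]
    return blue_votes / (blue_votes + red_votes)

def poll_the_bag(n=500, seed=0):
    """A perfectly random poll of the whole bag -- no turnout model at all.

    Returns the sampled blue share, which lands within a few points of
    52% depending on sampling noise.
    """
    rng = random.Random(seed)
    return sum(rng.random() < BAG_BLUE for _ in range(n)) / n

print(f"poll of the bag: {poll_the_bag():.1%}")
print(f"actual election: {election_result():.1%}")  # about 47.8%
```

Under these invented turnout numbers, a 52–48 poll of the bag, executed flawlessly, still corresponds to a roughly 48–52 loss on election day. That gap is exactly what a likely voter model tries, and often fails, to close.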
Interesting question! What indeed is the point of polling, if you can never be really sure whether the result is accurate, a little off, or in another reality altogether? There is certainly value in internal polling for campaigns, to judge how their messaging is working and where their weak points are. But what value does public election polling serve, if after the poll the answer is still “we don’t know for sure what will happen”? Well, the truth is you can always just wait for the election results to roll right in. But if you, like me, belong to a small group of people who obsessively follow and analyze elections, that will not be a satisfying answer.
As I’ve said before, polling is fuzzy, and just one piece of the puzzle. If you wanted to be the best election forecaster in the business, would you just regurgitate whatever polls happen to be made public? Or would you look further? You could take the district, consider its partisanship, how it has voted for president and downballot, how the incumbent has fared in previous elections, how the district may be changing with time. You could look at fundraising and spending, both from the candidates and from outside groups (hint: the more outside groups spend, the more competitive they think the race is). You could try to analyze the candidates themselves, judge their strengths and weaknesses, consider their political bases and where they might struggle. Maybe even talk to political insiders from both sides, and see their evaluations of the state of play. And then yes, look at what polls are saying…the more polls to average, the better, but always remember that garbage in equals garbage out. This multifaceted approach is what professional election forecasters like Charlie Cook and Larry Sabato do. And even they get it wrong sometimes. Because the endless, ultimate, eternal truth is that nobody can predict a close election result with 100% certainty, no matter how many polls they see.
A final note on MoE: when a poll reports a “4% margin of error,” it means that with 95% confidence each result should fall within 4 points, plus or minus, of the reported number. So a 47-47 poll in this case would have a 95% chance of landing within the boundaries of 43-51 and 51-43. That still leaves a 1-in-20 chance of being outside that range! Which, if you are analyzing dozens of polls, means a few of them could be really out of bounds. The 95% figure is the one most commonly reported, but you can calculate a margin of error for any confidence level. A tighter 99% confidence level would push that same poll’s MoE to around 5%, meaning only a 1-in-100 chance of falling outside roughly 42-52 to 52-42. Drop the confidence to 80% and the MoE shrinks to about 2.5%: if you are fine with being wrong 1 in 5 times, you could say the results should land roughly between 44.5-49.5 and 49.5-44.5.
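The arithmetic behind those numbers is the standard normal approximation for a proportion. Here’s a small sketch; the sample size of 600 is chosen because it is roughly what produces a 4-point MoE at 95% confidence.

```python
import math

# two-tailed critical z-values for common confidence levels
Z = {0.80: 1.282, 0.95: 1.960, 0.99: 2.576}

def margin_of_error(n, confidence=0.95, p=0.5):
    """Worst-case (p = 0.5) margin of error for a sample of size n."""
    return Z[confidence] * math.sqrt(p * (1 - p) / n)

n = 600  # roughly the sample size behind a "4% MoE" poll
for conf in (0.80, 0.95, 0.99):
    print(f"{conf:.0%} confidence: +/- {margin_of_error(n, conf):.1%}")
# 80% confidence: +/- 2.6%
# 95% confidence: +/- 4.0%
# 99% confidence: +/- 5.3%
```

Note that the MoE only shrinks with the square root of the sample size: quadrupling the number of respondents merely halves the margin, which is part of why 40,000 calls for 500 completes is such a painful trade.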
All of this is basic statistics assuming a purely random sample of the correct population, i.e. just normal marbles in a bag. That potential error is amplified considerably by all the modeling and weighting difficulties mentioned above, so it is no surprise that polls regularly (i.e. way more than just 5% of the time) miss election results by more than the reported margin of error. Harry Enten has analyzed past election polls and results and found that 5% of Senate polling averages missed by over 12 points, which is just incredible. The median miss was almost 4 points, meaning half of all Senate polling averages can be expected to be off by 4 points or more. Take that into account when trying to evaluate tossup races!
So take polls for what they are, and what they aren’t.