Polls are very important to election analysts, since they can tell us a great deal about a given race. But they also have their limitations, which is why we at Daily Kos Elections have strict requirements before we'll write up a poll and analyze it in our daily newsletter, the Morning Digest. Below, we'll discuss those requirements, as well as other factors we look at in order to best derive meaning from survey data.
Requirements
1) Pollster name. Most pollsters have reputations, some good and some bad. Without knowing who conducted a poll, we have no way to place it in the context of that pollster's other work. And if a pollster is new to us and doesn't have a track record, that's something we want to know, too—and convey to our readers. Put another way: If a pollster isn't proud enough to put its name on its work, what does that say about the quality of the work?
Unfortunately, along with the ongoing problem of fake news, there are also fake polls. Yes, there are destructive people out there who will create press releases or even entire websites touting fake polls from nonexistent pollsters. Fortunately, these are rare, but sniffing them out is part of our job, and if we encounter an unfamiliar name, we'll always do our best to determine that outfit's bona fides before discussing any of their work.
2) Pollster's partisan affiliation. If a pollster is a partisan outfit—that is to say, it works for either Republican or Democratic clients—we want to know that, because studies show that partisan polls that become public tend to lean in favor of their own side. We're not accusing partisan pollsters of stacking the deck (though some might), since most generally want to get an accurate read of the races they're surveying. But unlike independent pollsters, partisan outfits only release their data selectively, and usually only if it's good for their team.
3) Client (if any). Some pollsters conduct polls independently (i.e., for themselves), and many do so for nonpartisan news organizations. But if a pollster has a client with skin in the game, that's important to know, for reasons similar to those just above. We don't necessarily require the client's exact name, but at the very least, we want a general description of who the client is (or isn't). If we can't determine whether or not there was a client, or who that client might be, we'll always say so, so that our readers can deploy an appropriate measure of skepticism.
When campaigns or those with a rooting interest in a race release internal polls, they almost always do so to support a particular narrative, or to boost fundraising, or both. Always be on the lookout for such possibilities.
4) Sample size. If a pollster has contacted too few voters, the results are more apt to be off-base—and also more apt to fluctuate without cause from poll to poll. (Here's a more detailed discussion of the issue.) In the polling industry, a sample size of 300 is generally considered the bare minimum acceptable for a poll, and it's therefore the minimum we require in order to write up a poll.
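To see why small samples wobble, it helps to look at the standard margin-of-error formula for a simple random sample. The quick Python sketch below is our own illustration (not anything a pollster publishes), using the usual worst-case assumption that a candidate sits at 50% support:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a proportion p estimated from a random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (300, 600, 1000):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} points")

# n=300: +/- 5.7 points
# n=600: +/- 4.0 points
# n=1000: +/- 3.1 points
```

At 300 respondents, a candidate's true support could plausibly sit nearly 6 points away from the reported figure in either direction, which helps explain why 300 is a floor rather than a target.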
5) Field dates. Knowing when a poll was actually conducted is crucial. Is the data recent? Or is it old? Naturally we strongly prefer fresh numbers, but campaigns and other organizations will often release dusty polls that may no longer reflect the state of play. In those cases, we will often decline to write them up—they're old news. In addition, sometimes a major development will take place while a poll is in the field, or immediately afterwards. By knowing the field dates, we can be on the lookout for any news that might have affected the numbers but that the poll, because of its timing, could not capture.
If a poll is in the field too long, that can render the data suspect. It increases the chance that voters' views may have shifted from the start of the field period to its end, especially if there's been a noteworthy intervening event. Likewise, if a poll is in the field for just a single day, that reduces its chances of reaching a representative sample of voters. Ideally, a poll should be fielded for three to five days. If a poll has been in the field for more than 14 days, our practice is not to write it up.
6) Toplines. It’s not enough to just know a candidate is ahead by 3 points. There’s a world of difference between a race where one candidate leads another 51-48, where there aren’t many voters left undecided, and one where a candidate leads 23-20, where most voters are up for grabs. We therefore need to know the actual percentage of the vote each candidate is getting in the poll in question, not just the margin between them. (All too often, press reports will only supply the latter.)
7) Undecided voters. One of the most important numbers in a poll is how many voters are still undecided between candidates. If a pollster does not allow voters to say they’re still making up their minds and instead forces them to choose a side, they're leaving out a critical piece of information about the state of the race—and not adhering to best practices.
8) Voter screen. A pollster should always indicate whether they've spoken with registered voters (i.e., those whose names simply appear on voter registration rolls) or likely voters (those who, through some method, the pollster thinks are actually likely to vote in the election being asked about). While these are the two most common ways of screening voters, pollsters will sometimes use more customized models of the electorate. In such cases, pollsters should explain their methodology.
Sometimes pollsters will provide results for more than one model. In such cases, we always report data for likely voters (or whichever model comes closest to approximating a traditional definition of likely voters). Occasionally, a pollster will release a poll surveying only "adults" rather than voters. We don't report on these sorts of polls because they don't reflect the electorate (many adults are not registered to vote).
9) Partisan identification of candidates. In partisan elections (which represent the overwhelming majority of races we cover), many voters make decisions about which candidates to support based on their partisan affiliation. Therefore, pollsters should identify candidates to respondents by the party label they'll carry on the ballot. If a pollster doesn't include this, then they're leaving out important information and failing to accurately mimic the way voters will make their choices when they actually cast their ballots.
If someone publishing a poll doesn't make all of the above information publicly available, then we consider it incomplete and won't write it up. You'll often see reports of alleged polls with limited information about them—polls that might even be missing the name of the pollster. Treat such limited releases with maximum skepticism.
Other things we look at
In addition to the requirements above, there are a number of other things we look at when examining every poll we come across. Not all of these apply to every single poll, but the questions below are always important to think about.
Did the pollster push leaners? Sometimes, pollsters will ask undecided voters which candidate they "lean" toward. This practice is known as "pushing leaners" and usually yields useful information about voter preferences. If a pollster provides numbers with and without leaners, we will report the numbers that include leaners.
What kinds of questions did the pollster ask before getting to the horserace (i.e., the head-to-head matchups between candidates)? Pollsters sometimes ask issue-related questions, and best practices dictate they should be asked after the horserace. That's because these kinds of questions can "prime" voters to lean one way or the other, especially if they're on contentious topics or asked in ways that suggest the pollster has an axe to grind. (Here's one example.)
How did the pollster ask about an office-holder's job approval rating? The best way is to ask whether voters simply approve or disapprove, but sometimes, pollsters will ask on a four-point scale along the lines of "excellent," "good," "fair," or "poor." That "fair" (or sometimes "just fair") is a very tricky phrase: To some voters it means fine, to others, decidedly meh. Because of that ambiguity, we generally do not report on approval ratings when they're asked this way.
Did the pollster include every credible candidate in the race? This seems like a simple one, but sometimes, pollsters will leave out a candidate. For primary polling, this can make the numbers significantly less useful. (By contrast, sometimes a candidate will have dropped out after a pollster has conducted a survey but before it releases the data. This is a problem, too.)
It's also possible that a pollster won't test all general election matchups between credible candidates. We see this most often when a campaign releases a poll of a general election prior to the primary that only provides numbers for its own candidate. This doesn't call into question the accuracy of the poll, but it does leave open the question as to whether other opponents might fare better.
What languages did the pollster interview respondents in? The vast majority of the time, English alone will suffice. But in some heavily Latino districts, it's hard if not impossible to get an accurate read on the electorate without also conducting Spanish-language interviews.
Has the pollster also provided a breakdown of its sample composition, or crosstabs? In an ideal world, pollsters would always tell us the basics about their sample, including the proportion of respondents by gender, party affiliation, age, and race/ethnicity. Unfortunately, many don't, which requires us to trust that they've produced a sample that's likely to reflect the electorate (and therefore requires us to exercise greater skepticism).
Similarly, crosstabs allow us, for instance, to see the proportion of men vs. women who say they'll vote for a particular candidate. That gives us richer data to analyze, and also helps us evaluate the accuracy of a poll. But always be careful: Sub-samples are going to be smaller than the total sample, so they are correspondingly likely to be less accurate. For instance, a national poll of 500 people might only include 60 black respondents, so examining the views of black voters on the basis of such a small sample is something we strongly discourage.
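To put rough numbers on that caution, here's our own illustration using the same worst-case margin-of-error formula sketched earlier:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a proportion estimated from a random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

print(f"Full sample (n=500): +/- {margin_of_error(500) * 100:.1f} points")
print(f"Subgroup (n=60):     +/- {margin_of_error(60) * 100:.1f} points")

# Full sample (n=500): +/- 4.4 points
# Subgroup (n=60):     +/- 12.7 points
```

A swing of a dozen points within a 60-person subgroup can be pure noise, which is why we discourage leaning on such small crosstabs.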
How has the poll been weighted? After a pollster has contacted respondents, they’ll almost certainly wind up with a sample that does not reflect the electorate. In particular, white people, women, and older people tend to respond to polls at a higher rate than other populations. Therefore, pollsters need to "weight" their sample so that it resembles the electorate they expect to see, giving more weight to under-represented groups and less to over-represented ones.
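As a simplified illustration of the mechanics, here's how weighting on a single variable might look. The response counts and electorate shares below are invented, and real pollsters typically weight on several variables at once, often with an iterative technique known as raking:

```python
# Hypothetical example: a 600-person sample weighted by age group alone.
raw_counts = {"18-44": 180, "45-64": 200, "65+": 220}        # who actually responded
target_shares = {"18-44": 0.40, "45-64": 0.35, "65+": 0.25}  # assumed electorate

n = sum(raw_counts.values())
weights = {group: target_shares[group] / (raw_counts[group] / n)
           for group in raw_counts}

for group, weight in weights.items():
    print(f"{group}: weight {weight:.2f}")

# 18-44: weight 1.33  (under-represented, so each respondent counts for more)
# 45-64: weight 1.05
# 65+: weight 0.68    (over-represented, so each respondent counts for less)
```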
There are many different criteria by which a poll can be weighted, with race, gender, and age some of the most obvious and common. Education is another. In 2016, many pollsters did not weight by education levels, because they were not previously thought to correlate strongly with political preferences. However, a political divide along educational lines had been emerging, especially among white voters, leading many analysts to conclude that a failure to weight by education contributed materially to the 2016 polling "miss." Now, when the information is available, we will often look to see if a pollster has taken education into account.
There's also one factor that pollsters sometimes weight by that’s a source of great controversy: party identification. Unlike, say, race or age, party preference is easily changed and therefore quite fluid. Trying to ensure your sample matches pre-set proportions of Democrats, Republicans, and independents may mean that you miss real movement in how voters are choosing to identify themselves. On the other hand, because voters are less likely to take polls when the news is bad for "their" side, you can easily wind up with apparent surges in party identification for the party that’s enjoying good news, even if that's not matched by reality.
Note that party identification, which is self-reported, is different from party registration, which is based on information maintained on each voter by election officials in states where it’s possible to register as a member of a political party. Party registration statistics also change over time, but they are less fluid and can easily be verified by checking what’s known as the "voter file."
What methods did the pollster use to survey respondents? Most pollsters contact respondents using live telephone interviews, automated telephone interviews, or the internet, or a combination of these methods. Each has advantages and drawbacks. There are long-running debates within the polling industry and among academics and analysts about which approach (or approaches) yield the most accurate results. With such debates unlikely to be resolved soon, and with no definitively superior survey method, we tend not to focus on methods but rather on results: We think it best to judge pollsters based on their accuracy and reliability.
What technique did the pollster use to sample respondents? Before conducting a poll, a pollster must decide how to come up with a representative sample of voters. In traditional polling (also known as "probability polling"), the two most common methods are random-digit dialing ("RDD," where a pollster will call phone numbers within the jurisdiction at random) and registration-based sampling ("RBS," where a pollster will only call voters whose names and phone numbers appear in databases of voters, known as "voter files").
In internet-based polling (also known as "non-probability polling"), pollsters will generally recruit large panels of potential respondents, then contact panelists to create a representative sample. As with survey methods, there are debates on the merits of probability vs. non-probability polling, and on RDD vs. RBS. Again, we tend to focus on a pollster's results rather than its techniques.
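To make the distinction between the two probability-sampling approaches concrete, here's a toy sketch; the area codes and voter-file fields are made up purely for illustration:

```python
import random

def rdd_sample(k, area_codes=("202", "301", "703")):
    """Random-digit dialing: dial numbers generated at random within the jurisdiction."""
    return [random.choice(area_codes) + f"{random.randrange(10**7):07d}"
            for _ in range(k)]

def rbs_sample(voter_file, k):
    """Registration-based sampling: draw only from registered voters with listed phones."""
    listed = [voter for voter in voter_file if voter.get("phone")]
    return random.sample(listed, k)
```

RDD can reach voters who don't appear in any database, at the cost of dialing many non-voters; RBS starts from known registrants but misses anyone whose record lacks a usable phone number.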
Note: There are a tiny number of pollsters whose data we do not take the time to write up in the Morning Digest, due to persistent concerns about the quality of their work. However, to avoid concerns about picking and choosing polls when it comes to calculating poll averages, we do include all polls we come across in our database (which we will soon publicly re-launch) as long as we have all the necessary information described above, unless we have reason to believe a survey is fake.