On Tuesday, a new Granite State Poll of the two House races in New Hampshire was released. The results showed Democrat Carol Shea-Porter with a three-point lead in NH-01, and Republican Marlinda Garcia with a four-point lead in NH-02.
Here's how those results look in the context of previous polls from the same pollster this year:
Wow! Those are two very dramatic races—up, down, and back again.
Well, maybe not so much. The problem? The number of voters in these surveys has ranged from a high of 307 to a low of just 184.
This means that random error alone, excluding all the other delightful and varied sources of error in polling, can account for the vast majority of the movement seen in the polling numbers.
To demonstrate this, I simulated a poll of a population with an actual margin of zero (a tie) 50 times in a row (with no undecideds). I used only 250 respondents. This is the result:
The simulated poll bounces around like crazy, and in about the same range as the polls we see from UNH. In other words, a poll with only 250 respondents is utterly useless if you want to follow changes in voter preference over time for all but the most dramatic races.
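For anyone who wants to try this at home, here is a minimal sketch of that kind of simulation. It is not the code behind the graphs in this post: the function name `simulate_polls`, the fixed random seed, and the choice of Python's standard library are all assumptions made for illustration. Each respondent is treated as an independent coin flip between two candidates, which matches the setup described above (a true tie, no undecideds).

```python
import random
import statistics


def simulate_polls(n_respondents, n_polls=50, seed=0):
    """Return the margin, in points, of each simulated poll of a tied race.

    Hypothetical helper for illustration; not the author's original code.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    margins = []
    for _ in range(n_polls):
        # Each respondent backs candidate A with probability 0.5:
        # a true tie, with no undecided voters.
        votes_a = sum(rng.random() < 0.5 for _ in range(n_respondents))
        votes_b = n_respondents - votes_a
        # Margin = A's lead over B, in percentage points.
        margins.append(100 * (votes_a - votes_b) / n_respondents)
    return margins


if __name__ == "__main__":
    for n in (250, 500, 1000):
        margins = simulate_polls(n)
        # Show the lowest margin, the highest margin, and the spread.
        print(n, round(min(margins), 1), round(max(margins), 1),
              round(statistics.pstdev(margins), 1))
```

Changing the seed gives a different set of 50 polls, and the exact swings will differ from the graphs here, but the overall size of the bouncing should look similar for the same number of respondents.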
But on the bright side, the way to get rid of this crazy behavior is simple (but costly): increase your sample size. Follow me below the fold to see how this changes the behavior of the simulated poll.
Here are 50 simulated polls with 500 respondents:
You can see the noise has been drastically reduced, although the margin regularly bounces back and forth between +5 and -5. Polls with 500 respondents are starting to be useful; still, about 80 percent of the general election polls in the Daily Kos Elections polling database have more than 500 respondents.
Now here are more simulated polls, this time with 1000 respondents:
The simulated polls are now much closer to the red line. About 55 percent of them are between +2 and -2. Still, small trends would be hard to discern, and false trends can emerge. If you look from Poll 35 to Poll 45, for example, you can see a series of descending poll values—an entirely nonexistent trend.
What the graphs above are doing, essentially, is providing a visual representation of the famous Margin of Error. The margin of error explodes upwards as the number of respondents in a poll shrinks:
For example, the margin of error (at 50 percent) for a poll of 1000 people will be 3.1 percentage points, while it is twice that, 6.2 points, for a poll of 250.
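Those two numbers can be checked with the textbook margin-of-error formula for a simple random sample. The sketch below assumes the usual 95 percent confidence level (a z-value of 1.96), which is what reproduces the 3.1 and 6.2 figures; the function name `margin_of_error` and its parameters are made up for illustration.

```python
import math


def margin_of_error(n_respondents, share=0.5, z=1.96):
    """Margin of error, in percentage points, for one candidate's share.

    Assumes a simple random sample and a 95 percent confidence
    level (z = 1.96). Hypothetical helper for illustration.
    """
    return 100 * z * math.sqrt(share * (1 - share) / n_respondents)


for n in (250, 500, 1000):
    print(n, round(margin_of_error(n), 1))
```

Because the sample size sits under a square root, cutting the sample to a quarter of its size doubles the margin of error, which is why 250 respondents gives exactly twice the figure for 1000. Keep in mind that this is the margin of error on a single candidate's share; with no undecideds, the lead between two candidates moves twice as much as either share does.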
However, as the graphs above show, concealed within these simple numbers is another story when it comes to political utility. As we are often concerned with changes in the polling margin of as little as 3 or 5 points, polls with only 250 people simply have too much noise to be useful.