Whenever we see a poll, we see a margin of error, or confidence interval. These are always wrong. They are wrong even when there are only two candidates, and more wrong still when there are more than two. But they are simple.

The truth is complicated.

This complication exists even if we assume that the sample is a perfectly random sample of the population of voters. This assumption is ludicrous, but without it, things get really hairy. In fact, the truth is more complicated than this diary makes it out to be.

If you have only two candidates, then the results follow what is known as a binomial distribution. If you have more than two, they follow what is known as a multinomial distribution. "Distribution" is itself a statistical term. It means an assignment of probability to each possible outcome; in this case, to each proportion of the vote a candidate might get. In sampling, we try to estimate a population distribution from a sample distribution. Of course, our estimate isn't perfect, but, again assuming the sample is random, we can estimate how badly off it might be.

There are a few problems with the way margins of error (MoE) are usually presented in polls.

First, we interpret them wrongly. Even if we used the right MoE (see below), our interpretation would be off. A confidence interval (CI) is given by the estimate plus or minus the MoE. The correct interpretation of a 95% confidence interval is that, if we repeated the poll many times and computed a 95% CI from each sample, about 95% of those intervals would contain the population value. What we usually assume is that, since the sample estimate is X, we can be 95% sure that the population value is within the 95% CI around X. That's wrong: the 95% describes how the procedure behaves over repeated samples, not the probability that any one interval is right. The wrong interpretation is VERY common; I've even fallen into it myself.
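One way to see what the 95% refers to is to simulate it. The sketch below is only an illustration under assumptions of my own: a made-up population value of 52%, polls of 400 people, a perfectly random sample, and the classical MoE formula given further down. The function names are hypothetical. It runs many simulated polls and counts how often the interval contains the population value.

```python
import math
import random

def wald_interval(successes, n, z=1.96):
    """Estimate plus or minus the classical 95% MoE."""
    p_hat = successes / n
    moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - moe, p_hat + moe

def coverage(true_p, n, polls, seed=0):
    """Fraction of simulated polls whose interval contains true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(polls):
        # Each respondent independently backs the candidate with probability true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = wald_interval(successes, n)
        if low <= true_p <= high:
            hits += 1
    return hits / polls

if __name__ == "__main__":
    # Assumed values: population value 52%, 400 respondents, 2000 repeated polls.
    print(coverage(true_p=0.52, n=400, polls=2000))
```

The fraction printed should land close to 0.95. Note that it is a statement about the whole collection of intervals; any single interval either contains the population value or it doesn't.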

A second wrong interpretation is that we assume either a) that all values within the CI are equally likely or b) that values outside the CI are impossible. Neither is correct. If our poll estimates that 52% will vote for Joe Shmo, then the most likely result is 52%; the farther you go from 52%, the less likely the result. The likelihood of any particular result is given by the likelihood function, and ANY result from 0% to 100% is possible; it's just that results far from 52% are very unlikely. (You COULD flip a fair coin 100 times and get 100 heads; it's not LIKELY, but it's POSSIBLE.)
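To make the falling-off concrete, here is a small sketch of the binomial likelihood. The poll itself is hypothetical (I am assuming 208 of 400 respondents, i.e. 52%, said they back Joe Shmo), and `binomial_likelihood` is just a name chosen for this illustration.

```python
import math

def binomial_likelihood(p, successes, n):
    """Probability of seeing `successes` out of n if the true proportion is p."""
    return math.comb(n, successes) * p**successes * (1 - p)**(n - successes)

# Hypothetical poll: 208 of 400 respondents (52%) back Joe Shmo.
n, successes = 400, 208
peak = binomial_likelihood(0.52, successes, n)

# Likelihood of several candidate population values, relative to the peak at 52%.
for p in (0.42, 0.47, 0.52, 0.57, 0.62):
    relative = binomial_likelihood(p, successes, n) / peak
    print(f"p = {p:.2f}  relative likelihood = {relative:.4f}")

# Chance of 100 heads in 100 flips of a fair coin: tiny, but not zero.
print(0.5 ** 100)
```

The relative likelihood is 1 at 52% and shrinks steadily on either side, without ever reaching exactly zero for any value strictly between 0% and 100%.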

But we also give the wrong MoE, because we give a single MoE for each poll, and that's not right. The classical formula for a 95% MoE is

1.96*(pq/n)^.5,

where p is the proportion saying something, q = 1-p and n is sample size.

This is approximately accurate, and the approximation is pretty good for polls, where n is usually pretty big and we aren't interested in very rare events. It doesn't work well for estimating very rare things, like the prevalence of rare diseases. Note that it gives a different MoE for each candidate. When there are two candidates who get all (or almost all) of the votes, this difference doesn't matter too much. For example, if we poll 400 people and 60% say they will vote for Obama, 35% for Bachmann (should she be the Repub. nominee) and 5% for someone else, then the MoEs for these three are
Obama  4.80%
Bachmann 4.67%
Someone else 2.14%
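For anyone who wants to check figures like these, the classical formula is a one-liner. This is a minimal sketch; `margin_of_error` is a name I made up, and the shares and sample size are the ones from the example above.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Classical 95% MoE for a proportion p from a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# Shares from the example above; n = 400 respondents.
shares = {"Obama": 0.60, "Bachmann": 0.35, "Someone else": 0.05}
for name, p in shares.items():
    print(f"{name}: {100 * margin_of_error(p, 400):.2f}%")
```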

But the pollsters like to give ONE MoE, so they use an even simpler formula:
0.98/n^.5; this comes from setting p = q = .5 in the classical formula, so it is only exactly correct if p = .5

For the above, it would give
Obama  4.9%
Bachmann 4.9%

not far off.
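The single-number formula is really the worst case of the classical one, because pq is largest when p = .5. A quick sketch to check that, scanning p in steps of 1% for a poll of 400 (the function names and the grid are my own choices for illustration):

```python
import math

def classical_moe(p, n, z=1.96):
    """Classical 95% MoE: 1.96*(pq/n)^.5."""
    return z * math.sqrt(p * (1 - p) / n)

def simple_moe(n):
    """Single-number MoE: the classical formula evaluated at p = 0.5."""
    return 0.98 / math.sqrt(n)

n = 400
# Scan p from 1% to 99% and find where the classical MoE is largest.
grid = [i / 100 for i in range(1, 100)]
worst_p = max(grid, key=lambda p: classical_moe(p, n))
print(worst_p, classical_moe(worst_p, n), simple_moe(n))
```

So quoting the simple figure for every candidate overstates the MoE for anyone who isn't near 50%, and the overstatement grows as the candidate's share shrinks.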

But what if we are polling a primary?  A recent Iowa poll of 500 Repubs gave these results

Bachmann 25%
Romney 21%
Pawlenty 9%
Cain 9%
Paul 6%
Gingrich 4%
Santorum 2%
Huntsman 1%

It said the MoE was 4.4%; that uses the simple formula .98/n^.5. But the right ones, with the formula 1.96*(pq/n)^.5  are different for each candidate and they are

Bachmann 3.8%
Romney 3.6%
Pawlenty 2.5%
Cain 2.5%
Paul 2.1%
Gingrich 1.7%
Santorum 1.2%
Huntsman 0.9%

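There are still problems with Huntsman's: 1% of 500 is only about 5 respondents, and the approximation behind the formula is poor for counts that small. But these are much more reasonable figures than a flat 4.4% for everyone. They are asymptotically accurate; that is, they get better as n grows.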
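The Iowa figures can be reproduced with a few lines. This is a sketch using the poll numbers quoted above; the variable names are mine, and the head counts are just the percentages times 500, rounded, not something the poll reported.

```python
import math

N = 500  # respondents in the Iowa poll quoted above

poll = {
    "Bachmann": 0.25, "Romney": 0.21, "Pawlenty": 0.09, "Cain": 0.09,
    "Paul": 0.06, "Gingrich": 0.04, "Santorum": 0.02, "Huntsman": 0.01,
}

def classical_moe(p, n, z=1.96):
    """Classical 95% MoE: 1.96*(pq/n)^.5."""
    return z * math.sqrt(p * (1 - p) / n)

reported = 0.98 / math.sqrt(N)  # the single figure from the simple formula
print(f"reported MoE: {100 * reported:.1f}%")

for name, p in poll.items():
    moe = classical_moe(p, N)
    supporters = round(p * N)  # approximate head count behind the percentage
    print(f"{name:<9} {100 * p:>4.0f}% +/- {100 * moe:.1f}%  (about {supporters} respondents)")
```

The head counts show why the bottom of the table deserves the most suspicion: the smaller the count, the worse the approximation.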
