Today's entry is about margins of random sampling error in political surveys. I'm qualified to teach you about this because, before I sold out and became a lawyer, I was a math major in college.
Disclaimer: The most important kind of survey error comes from picking a sample that is in some way skewed or biased. Essentially all surveys you see in newspapers, in press releases, and on blogs are telephone surveys. They do their best. But non-answers, unlisted numbers, people without phones, people with multiple phones, and more can skew these samples. Also, people sometimes lie to survey takers. None of this is captured by the analysis below.
The analysis below looks at a basic problem. You have a target group. Typically, for a general-population survey, this is likely voters. Typically, for a Democratic primary, it is likely primary voters or caucus attenders, as the case may be. There are millions of these critters out there. The survey typically asks 300 to 1,500 people in the relevant group some questions. The cases we care most about involve support for a list of candidates for a particular office.
No matter how perfect your sampling methods are, random chance will cause the 300 to 1,500 people you pick not to prefer candidates in exactly the same proportions as the entire target population. But it is possible to show, with some fancy mathematics, that as your sample gets larger, your results will look more and more like the general population. Random fluke results tend to average out in larger surveys. Fancy mathematics also shows that if the survey is repeated, the distribution of results from the same population tends to cluster around one value, and that the more distant values are quite unlikely.
The flukiness of survey results due to random sampling differences from the population is very well defined mathematically. In cases where the target population is significantly larger than the survey sample, there are only two formulas that really matter, plus a couple of corollaries.
Any particular result in a survey has its own margin of error. For example, in a survey with a sample size of 408, if Dean polls 32%, the margin of error of this result at the 95% confidence level is 4.5%. Popular convention reports the margin of error at the 95% confidence level, which means that if the survey were repeated over and over again, 95% of the results would fall within the margin of error range.
This convention is arbitrary. A 95% confidence level means results within 1.96 standard deviations of the "mean" result. A 99% confidence level means results within 2.58 standard deviations of the "mean" result. A 90% confidence level means results within 1.65 standard deviations of the "mean". A single standard deviation from the mean is a 68% confidence level -- the results are within that range roughly two-thirds of the time. A useful rule of thumb is that two-thirds of the time, a survey result will be within half the margin of error.
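If you'd rather not memorize a Z table, a few lines of Python recover these values with nothing but the standard library (the helper name z_for_confidence is my own, not a library function):

    from statistics import NormalDist

    def z_for_confidence(level):
        # Two-sided interval: the leftover probability is split between
        # the two tails, hence (1 + level) / 2.
        return NormalDist().inv_cdf((1 + level) / 2)

    for level in (0.90, 0.95, 0.99):
        print(f"{level:.0%} confidence -> Z = {z_for_confidence(level):.2f}")
    # Prints Z = 1.64, 1.96, and 2.58 respectively.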
The margin of error formula for large target population sizes is as follows:
MOE=Z*SQRT(P*(1-P)/N)
Where MOE is the margin of error at the confidence level for the Z chosen, Z is the number of standard deviations from the mean for the confidence interval being created, P is the percentage result expressed as a decimal, and N is the survey sample size. SQRT means "square root of" and is the symbol that looks like a checkmark on your calculator. Hence, the Dean example above looks like this:
MOE=1.96*SQRT(.32*(1-.32)/408), which works out to 4.5%.
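If you would rather check this on a computer than on a calculator, here is a minimal Python sketch of the same formula (the function name moe is my own label, not anything standard):

    from math import sqrt

    def moe(p, n, z=1.96):
        # p: result as a decimal, n: sample size,
        # z: standard deviations from the mean (1.96 = 95% confidence).
        return z * sqrt(p * (1 - p) / n)

    print(f"{moe(0.32, 408):.1%}")  # Dean at 32% in a sample of 408: 4.5%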
When the margin of error for an entire survey is presented, the "P" figure used is 50%, which is the point at which a survey is least accurate and hence a conservative estimate. The margin of error for individual results is generally lower. The MOE of a survey is thus purely a function of survey size. It is as follows:
Survey Size   MOE
100           9.8%
200           6.9%
300           5.7%
400           4.9%
500           4.4%
600           4.0%
1,000         3.1%
1,500         2.5%
3,000         1.8%
10,000        0.98%
50,000        0.44%
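For what it's worth, plugging P=.5 and Z=1.96 into the formula collapses the whole table to 0.98/SQRT(N), so a short Python loop reproduces it (to rounding):

    from math import sqrt

    # Whole-survey MOE at the 95% confidence level:
    # 1.96 * sqrt(.5 * .5 / N) simplifies to 0.98 / sqrt(N).
    for n in (100, 200, 300, 400, 500, 600, 1000, 1500, 3000, 10000, 50000):
        print(f"{n:>6,}  {0.98 / sqrt(n):.2%}")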
Most political surveys are conducted with samples of 400 to 1,500. Subsamples are often 100 to 300 in size. The largest survey I use on a regular basis is the American Survey of Religious Identification, which has a sample size of 50,000 and subsamples of 1,000.
Now, on to the issue of comparing two results.
Suppose that you have a survey with a sample size of 408, and one candidate, Dean, has 32% of the people supporting him, while Gephardt has 22% supporting him. What is the likelihood that the gap is a statistical fluke?
The formula for this is as follows:
MOE of gap=Z*SQRT((P1*(1-P1)/N)+(P2*(1-P2)/N))
As applied in this case, we use Z=1.96 for the customary 95% confidence level, P1 is Dean's level of support, and P2 is Gephardt's level of support. The gap P1-P2=10%. But how accurate is that? The survey size is 408, so:
MOE=1.96*SQRT((.32*(1-.32)/408)+(.22*(1-.22)/408))
This produces an MOE of the gap of 6.05%. So, the real gap between Dean and Gephardt is 10% +/- 6.05% (i.e. there is a 95% chance that the gap is between 4% and 16%), with about two-thirds of the results likely to fall between 7% and 13%.
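Here is the same calculation as a Python sketch (gap_moe is again my own label for the formula above):

    from math import sqrt

    def gap_moe(p1, p2, n, z=1.96):
        # MOE of the gap between two results from the same survey of
        # size n, treating the two results as independent.
        return z * sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)

    print(f"{gap_moe(0.32, 0.22, 408):.2%}")  # prints 6.05%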
Now, if this is too much trouble, there is a good approximation of the gap MOE formula that is fairly easy to calculate and better than most crude methods.
The MOE of the gap is approximately equal to the average of the MOE of candidate one's result and the MOE of candidate two's result, times a factor of 1.4.
For example, in the example above, Dean has an MOE of 4.5% and Gephardt has an MOE of 4.0%. The average is 4.25%, and that times 1.4 is 5.95%, which is quite close to the real answer. (This works because the number inside the square root of the gap formula is the sum of the two candidates' MOE factors, which is quite close to two times the average of those factors, and the square root of 2 is about 1.41; with some algebra you can see that it comes out quite close.)
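A quick Python check of the shortcut against the exact formula (the shortcut prints 5.98% rather than the 5.95% above because it uses unrounded MOEs):

    from math import sqrt

    n = 408
    exact = 1.96 * sqrt(0.32 * 0.68 / n + 0.22 * 0.78 / n)
    dean_moe = 1.96 * sqrt(0.32 * 0.68 / n)
    gephardt_moe = 1.96 * sqrt(0.22 * 0.78 / n)
    shortcut = 1.4 * (dean_moe + gephardt_moe) / 2
    print(f"exact: {exact:.2%}  shortcut: {shortcut:.2%}")
    # exact: 6.05%  shortcut: 5.98%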