(Cross-posted at the Princeton Election Consortium)
Many readers of this site know that pollsters vary in their methods. However, existing ways of handling those differences, such as correcting for bias, have attracted controversy. As a result, one prolific pollster can dominate the discussion.
Here I present a solution based on a very simple -- and neutral -- statistical principle. If adopted by Pollster.com, TalkingPointsMemo, and FiveThirtyEight, it would make poll aggregation far more useful.
It is well known that pollsters vary in their methods. The American Association for Public Opinion Research has established common standards of practice and encouraged transparency, driven in part by Mark Blumenthal of Pollster (the Princeton Election Consortium's data source). But poll aficionados know that Rasmussen Reports' results consistently trend more Republican than those of other organizations.
I should note that this problem is not limited to Rasmussen. The recent Pew survey showing a 10-point lead for Obama is also an outlier. In principle, a good solution would do something constructive with both of these sources of information.
Differences like these present a challenge to poll aggregators. An obvious solution is to estimate the size of each pollster's bias, then subtract it. However, this generates three new problems: (1) Who is the neutral reference point? Gallup? Quinnipiac? Rasmussen? (2) What to do about pollsters who do very few polls? (3) What if the pollster changes methods mid-season?
For my Meta-analysis I have chosen a simple solution that gets rid of most of the bias: use median-based statistics. Here's how it works.
Imagine the following two similar sets of poll margins between candidates A and B:
Data set 1: A +2%, A +4%, tie, A +3%, A +1%.
Data set 2: A +2%, A +4%, tie, A +3%, B +4%.
The difference is that in the second case one pollster is shifted by 5 points toward candidate B, approximately corresponding to the Rasmussen effect. This single outlier poll pulls the average margin toward candidate B and increases the uncertainty considerably:
Data set 1 (averages): Candidate A leads by 2.0 ± 0.7% (mean ± SEM), win probability 98%.
Data set 2 (averages): Candidate A leads by 1.0 ± 1.4%, win probability 74%.
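For readers who want to check the averages, here is a minimal Python sketch of the mean ± SEM calculation. It assumes the SEM is the sample standard deviation divided by the square root of the number of polls; the names `data_set_1`, `data_set_2`, and `mean_and_sem` are mine, chosen for illustration.

```python
import math
import statistics

# Poll margins in percentage points; positive means candidate A leads.
data_set_1 = [2, 4, 0, 3, 1]
data_set_2 = [2, 4, 0, 3, -4]  # last poll shifted 5 points toward B


def mean_and_sem(margins):
    """Return the mean margin and its standard error.

    Assumption: SEM = sample SD (n - 1 denominator) / sqrt(n).
    """
    n = len(margins)
    mean = statistics.mean(margins)
    sem = statistics.stdev(margins) / math.sqrt(n)
    return mean, sem


for name, margins in [("Data set 1", data_set_1), ("Data set 2", data_set_2)]:
    mean, sem = mean_and_sem(margins)
    print(f"{name}: A leads by {mean:.1f} +/- {sem:.1f}")
```

Under that assumption the sketch reproduces the 2.0 ± 0.7 and 1.0 ± 1.4 figures above: the single outlier halves the mean margin and doubles the SEM.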
Now use medians instead. The two data sets have the same median, 2.0%. Median-based statistics also allow calculation of an estimated SD, defined as (median absolute deviation)*1.4826; dividing it by the square root of the number of polls gives the estimated SEM. This gives:
Data set 1 (medians): Candidate A leads by 2.0 ± 0.7% (median ± estimated SEM), win probability 98%.
Data set 2 (medians): Candidate A leads by 2.0 ± 1.3%, win probability 90%.
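The median-based numbers can be checked the same way. The sketch below is illustrative rather than the Meta-analysis code itself: the function names are mine, and the win probability is an assumption, since the method behind it is not spelled out here. I take it to be a one-sided Student t probability with n - 1 degrees of freedom, evaluated by simple numerical integration, which happens to match the figures quoted.

```python
import math
import statistics

MAD_TO_SD = 1.4826  # estimated SD = (median absolute deviation) * 1.4826


def median_and_estimated_sem(margins):
    """Return the median margin and the MAD-based estimated SEM."""
    median = statistics.median(margins)
    mad = statistics.median([abs(m - median) for m in margins])
    estimated_sd = MAD_TO_SD * mad
    return median, estimated_sd / math.sqrt(len(margins))


def t_cdf(t, df, steps=2000):
    """Student t CDF, integrating the density from 0 to t (Simpson's rule)."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3


def win_probability(center, sem, n):
    """Assumed model: P(true margin > 0) from a t distribution, n - 1 df."""
    if sem == 0:  # all polls agree; avoid dividing by zero
        return 1.0 if center > 0 else 0.0 if center < 0 else 0.5
    return t_cdf(center / sem, n - 1)


for name, margins in [("Data set 1", [2, 4, 0, 3, 1]),
                      ("Data set 2", [2, 4, 0, 3, -4])]:
    median, sem = median_and_estimated_sem(margins)
    p = win_probability(median, sem, len(margins))
    print(f"{name}: A leads by {median:.1f} +/- {sem:.1f}, win probability {p:.0%}")
```

With the outlier present, the median absolute deviation only doubles from 1 to 2 and the median itself does not move, which is why the estimate stays at 2.0 while the uncertainty grows modestly.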
Generally speaking, using medians gets rid of most of the bias from a single outlier. In this example, the race is taken most of the way out of the toss-up category.
You might ask: if medians are so great, then why don't popular aggregators like FiveThirtyEight use them? One big reason is that media organizations want to maintain an appearance of neutrality. I argue that a simple tool, the median, solves the problem, improves the quality of aggregated data, and helps cut through the noise -- which is why we like poll aggregation in the first place.