(Cross-posted at the Princeton Election Consortium)

Many readers of this site know that pollsters vary in their methods. However, existing solutions, such as correcting for bias, have attracted controversy. As a result, one prolific pollster can dominate the discussion.

Here I present a solution based on a very simple -- and neutral -- statistical principle. If adopted by Pollster.com, TalkingPointsMemo, and FiveThirtyEight, it would make poll aggregation far more useful.

It is well known that pollsters vary in their methods. The American Association for Public Opinion Research has established common standards of practice and encouraged transparency, driven in part by Mark Blumenthal of Pollster (the Princeton Election Consortium's data source). But poll aficionados know that Rasmussen Reports' results consistently lean more Republican than those of other organizations.

I should note that this problem is not limited to Rasmussen. The recent Pew survey showing a 10-point lead for Obama is also an outlier. In principle, a good solution would do something constructive with both of these sources of information.

Differences like these present a challenge to poll aggregators. An obvious solution is to estimate the size of each pollster's bias, then subtract it. However, this generates three new problems: (1) Who is the neutral reference point? Gallup? Quinnipiac? Rasmussen? (2) What to do about pollsters who do very few polls? (3) What if the pollster changes methods mid-season?

For my Meta-analysis I have chosen a simple solution that gets rid of most of the bias: use median-based statistics. Here's how it works. 

Imagine the two following similar sets of poll margins between candidates A and B:

Data set 1: A +2%, A +4%, tie, A +3%, A +1%.

Data set 2: A +2%, A +4%, tie, A +3%, B +4%.

The difference is that in the second case one pollster's result is shifted by 5 points toward candidate B, approximately corresponding to the Rasmussen effect. This single outlier poll pulls the average margin toward candidate B and increases the uncertainty considerably:

Data set 1 (averages): Candidate A leads by 2.0 ± 0.7% (mean ± SEM), win probability 98%.

Data set 2 (averages): Candidate A leads by 1.0 ± 1.4%, win probability 74%.
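For readers who want to check the arithmetic, here is a minimal Python sketch of the average-based numbers. It assumes the SEM is the sample standard deviation divided by the square root of the number of polls, which reproduces the figures quoted above; the function name `mean_and_sem` is my own label for illustration.

```python
import math
import statistics

# Margins for candidate A in percentage points (negative = candidate B leads).
data_set_1 = [2, 4, 0, 3, 1]
data_set_2 = [2, 4, 0, 3, -4]

def mean_and_sem(margins):
    """Return (mean, standard error of the mean) using the sample SD."""
    sem = statistics.stdev(margins) / math.sqrt(len(margins))
    return statistics.mean(margins), sem

for name, margins in [("Data set 1", data_set_1), ("Data set 2", data_set_2)]:
    mean, sem = mean_and_sem(margins)
    print(f"{name}: A leads by {mean:.1f} +/- {sem:.1f}")
```

The single B +4 poll moves the mean by a full point and doubles the SEM.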

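Now use medians instead. The two data sets have the same median, 2.0%. Median-based statistics also allow calculation of an estimated SD, defined as (median absolute deviation)*1.4826; the factor 1.4826 scales the median absolute deviation to match the SD of normally distributed data. This gives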

Data set 1 (medians): Candidate A leads by 2.0 ± 0.7% (median ± estimated SEM), win probability 98%.

Data set 2 (medians): Candidate A leads by 2.0 ± 1.3%, win probability 90%.
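The median-based numbers can be checked the same way. This is a sketch, not the Meta-analysis code itself: it assumes the estimated SEM is the estimated SD (median absolute deviation times 1.4826) divided by the square root of the number of polls, which matches the values quoted above. The names `MAD_TO_SD` and `median_and_estimated_sem` are hypothetical.

```python
import math
import statistics

# Margins for candidate A in percentage points (negative = candidate B leads).
data_set_1 = [2, 4, 0, 3, 1]
data_set_2 = [2, 4, 0, 3, -4]

MAD_TO_SD = 1.4826  # scale factor quoted in the text

def median_and_estimated_sem(margins):
    """Return (median, estimated SEM) based on the median absolute deviation."""
    center = statistics.median(margins)
    mad = statistics.median([abs(m - center) for m in margins])
    estimated_sd = mad * MAD_TO_SD
    # Assumption: estimated SEM = estimated SD / sqrt(number of polls).
    return center, estimated_sd / math.sqrt(len(margins))

for name, margins in [("Data set 1", data_set_1), ("Data set 2", data_set_2)]:
    center, sem = median_and_estimated_sem(margins)
    print(f"{name}: A leads by {center:.1f} +/- {sem:.1f}")
```

The outlier widens the uncertainty (the median absolute deviation goes from 1 to 2) but leaves the central estimate at 2.0.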

Generally speaking, using medians gets rid of most of the bias from a single outlier. In this example, the race is taken most of the way out of the toss-up category.
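The win probabilities quoted above are stated without a formula. One reading that happens to reproduce all four figures is a one-sided Student's t probability with 4 degrees of freedom (five polls minus one), applied to the margin divided by its SEM. The sketch below rests on that assumption; the helper names are hypothetical, and the t distribution is integrated numerically so that only the standard library is needed.

```python
import math

def student_t_cdf(t, df, steps=2000):
    """CDF of Student's t, via Simpson's rule on the density from 0 to t."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return norm * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3

def win_probability(margin, sem, n_polls):
    """Assumed model: P(true margin > 0) from a t distribution with n_polls - 1 df."""
    return student_t_cdf(margin / sem, n_polls - 1)

# (label, margin, SEM) as quoted in the text; five polls in each data set.
cases = [
    ("Data set 1 (averages)", 2.0, 0.7),
    ("Data set 2 (averages)", 1.0, 1.4),
    ("Data set 1 (medians)", 2.0, 0.7),
    ("Data set 2 (medians)", 2.0, 1.3),
]
for label, margin, sem in cases:
    print(f"{label}: win probability {100 * win_probability(margin, sem, 5):.0f}%")
```

Under this assumption the four cases round to 98%, 74%, 98%, and 90%, matching the figures in the text.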

You might ask: if medians are so great, then why don't popular aggregators like FiveThirtyEight use them? One big reason is that media organizations want to maintain an appearance of neutrality. I argue that a simple tool, the median, solves the problem, improves the quality of aggregated data, and helps cut through the noise -- which is why we like poll aggregation in the first place.
