View Diary: The Meta-Analysis Of State Polls is back! (w/poll) (31 comments)

  •  How did you get the 95% CI? (0+ / 0-)

    based on some Monte Carlo work?

    Confidence intervals around quantiles are notoriously tricky

    •  No - Monte Carlo is unnecessary (0+ / 0-)

      I calculate the exact distribution of outcomes given the individual state probabilities. Then it's a simple matter of finding the 5th and 95th percentiles.

      In fact, a key point is that the calculation is exact. Go read about it on the site!
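
      A minimal sketch of how such an exact distribution could be computed, assuming the states are treated as independent and each has a single win probability. The function names and the toy inputs are hypothetical illustrations, not the code or data used on the site. Each state is folded into the running distribution in turn, so the cost grows with the number of states times the number of possible EV totals, not with the number of win/loss combinations.

```python
def ev_distribution(states):
    """Return dist where dist[k] = P(candidate wins exactly k electoral votes).

    states: list of (electoral_votes, win_probability) pairs.
    Assumes state outcomes are independent (an assumption of this sketch).
    """
    dist = [1.0]  # before any state is counted: 0 EV with probability 1
    for ev, p in states:
        new = [0.0] * (len(dist) + ev)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)  # state lost: total unchanged
            new[k + ev] += mass * p     # state won: total rises by ev
        dist = new
    return dist


def percentile(dist, q):
    """Smallest EV total whose cumulative probability reaches q."""
    cum = 0.0
    for k, mass in enumerate(dist):
        cum += mass
        if cum >= q:
            return k
    return len(dist) - 1


# Hypothetical toy inputs, not real polling data.
toy_states = [(3, 0.5), (4, 0.5), (5, 0.5)]
dist = ev_distribution(toy_states)
low, high = percentile(dist, 0.05), percentile(dist, 0.95)
```

      With real inputs, `states` would hold one pair per state, and `low` and `high` would be read off the resulting distribution in the same way.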

      •  I gotta go look at the site (0+ / 0-)

        definitely right up my alley

      •  exact..ish (0+ / 0-)

        I realize the word exact has a specific meaning here, but it tends to imply for readers that the result is indeed "exact".  But it's the "given the state probabilities" part that people may miss.  

        Since the majority of the uncertainty in the EV estimate comes from the uncertainty in the state probabilities and not from how those probabilities are combined, I think the use of the word exact may be a little misleading.  If someone used Monte Carlo methods instead of your formula to come up with the EV distribution, I would expect that they would get pretty much the same answer -- although they wouldn't be able to call it exact ;)

        •  Incorrect - Monte Carlo is inadequate (0+ / 0-)

          In regard to Monte Carlo simulation, think again.

          As of today, based on polling data alone there are 12 states with intermediate win probabilities, i.e. between 5% and 95%. So even if you assume that states with extreme probabilities belong safely to Obama or McCain, I have 2^12 = 4096 possible combinations of state outcomes. Because the likely combinations are distributed unevenly, this is already hard to sample using Monte Carlo methods.

          The problem is compounded in the case of fivethirtyeight.com, where many states are rated as being intermediate. Today, 29 states have intermediate probabilities, leading to over 500 million combinations. Monte Carlo simulation is totally inadequate to sample this range of possibilities.

          There are two points here. First, an approximate numerical method is a poor choice for making the EV estimate. Second, the exact calculation (or closed-form calculation, if you like) gives a precise answer that has a very low degree of uncertainty, and therefore can be tracked over time to detect small changes in the direction of the race.

          •  really?... (0+ / 0-)

            Computers are quite fast, and performing millions or even hundreds of millions of Monte Carlo draws from a simple distribution is relatively trivial -- it means waiting a few minutes for a result instead of a few seconds.  In addition, you still haven't addressed the point that the results are NOT exact or even very precise in the usual meaning of these words if there are many "toss-up" states.  The fundamental uncertainty in the state polls still dominates the overall uncertainty.
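
            For readers who want to try the sampling approach, here is a rough sketch, assuming independent states with fixed win probabilities. The state list, function name, and draw counts are hypothetical and are not taken from either site. It compares the simulated mean EV with the mean computed directly from the probabilities, so the sampling error at each draw count is visible.

```python
import random

# Hypothetical (electoral_votes, win_probability) pairs, not real polling data.
toy_states = [
    (55, 0.9), (34, 0.8), (27, 0.5), (21, 0.6), (20, 0.4),
    (15, 0.3), (13, 0.5), (11, 0.7), (10, 0.2), (9, 0.5),
]


def simulate_totals(states, n_draws, seed=0):
    """Draw n_draws simulated elections and return the EV total of each."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = []
    for _ in range(n_draws):
        # A state is won when a uniform draw falls below its win probability.
        totals.append(sum(ev for ev, p in states if rng.random() < p))
    return totals


# Mean EV total implied directly by the state probabilities.
exact_mean = sum(ev * p for ev, p in toy_states)

# Sampling error of the simulated mean at two draw counts.
errors = {}
for n_draws in (1_000, 100_000):
    totals = simulate_totals(toy_states, n_draws)
    errors[n_draws] = sum(totals) / n_draws - exact_mean
```

            The sampling error of the mean shrinks roughly with the square root of the number of draws, so 100 times as many draws buys about one more decimal digit. None of this touches the uncertainty in the state probabilities themselves.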

            •  "Trivial" - but wrong (0+ / 0-)

              Yes, computers can calculate that quickly. But you are missing the point. It is inexact to carry out such simulations, much the same way that one would not make a serious calculation of pi by repeatedly measuring the circumferences and diameters of circular objects.

              You seem to have an interest in statistical methods. Read the writeup on my site. Then, if you still have questions, please write to me at sswang at princeton dot edu.

              In the meantime, here is a direct example of what I am talking about.

              Here is the result of 10,000 simulations done yesterday at fivethirtyeight.com:

              and here is the exact distribution of probabilities of all >2 quadrillion outcomes, calculated from the state probabilities posted there:

              Assuming that the EV estimate posted on that site is based on the simulations, the value posted there today is currently off by 5 EV. Considering the care with which the state probabilities are calculated, this is a major oversight.

              •  I understand (0+ / 0-)

                I understand your approach, and I also understand that 10,000 simulations might be less than ideal.  A million would probably be better.  I just think that words like exact and precise don't really apply to a meta-analysis of polling data three months before the election.

                BTW, I think your formula is pretty cool and Poblano should use it.  But it's his method for weighting polls and imputing results for thinly polled states through the demographic regressions that seem to make his projections better than the simple meta-polls.  That piece of the analysis is even more important than whether he does 10,000 simulations, or 1 million, or realizes that there's an easier way forward to get at the distribution (like your formula).
