There are a lot of conflicting stories in the news about early voting totals. A TPM blog post on this topic notes that recent polls have estimated the D advantage at anywhere between 19 and 52 percentage points (a gigantic range of estimates), and that the R team is doing slightly better at early voting than the McCain team did. However, a Republican spokesman reported that the D advantage is only 7 points in OH.
This means that the actual advantage could be anywhere from 7 to 52 points as of yesterday. In other words, there don't seem to be any accurate estimates of early voting totals or advantages in OH. We know next to nothing, other than that there has been a D advantage, though not (quite? nearly?) as large as in 2008.
Anyone with solid information on early voting in OH (and other states) is welcome to share it here at dkos, and also to tell us how different polls are monitoring early voting.
Because we don't really know (1) the early voting numbers, or (2) which polls are properly monitoring early voting, the polling data in states with early voting carry an extra source of error variance. In other words, as much disagreement as there has already been, the polling data will become even more chaotic now that the early voting uncertainties are part of the picture.
With respect to PPP, does anyone know whether early voting is part of their data, and if so, how they assess it?
In Iowa, PPP has just released an estimate that Romney is up 1%. But it is widely reported that the largest Dem early voting advantage is in Iowa. Further, 7 of the last 9 polls in Iowa have reported Obama ahead, by margins of 4 to 5% (the other most recent poll, NBC/Marist, has Obama +8%). Since the PPP result is an outlier, something appears to be amiss with the PPP data in Iowa, and the question arises whether they are using any means to account for early voting.
The Iowa PPP data, accurate or not, are of interest not only with respect to early voting, but also because they are part of a series of recent PPP polls that have departed dramatically from earlier PPP polling, which tended to show a slight (D) house effect, in Nate Silver's words (he estimated PPP's house effect at around 3% in early summer, and noted that it had diminished noticeably by September).
It appears that Nate Silver's numerous discussions of house effects may have caused both PPP and Rasmussen to tighten up procedures that accentuated house effects. However, it should be noted that Rasmussen continues to show a clear R house effect or "bias" (i.e., Rasmussen remains "R-friendly").
In the case of PPP, Nate Silver's words appear to have had a particularly strong, and indeed growing, impact. Nate recently wrote about his appreciation for polls that make no adjustment for demographics or the partisan alignment of voters (in other words, polls that just use raw data without weighting or adjustment).
It is debatable whether procedures such as weighting data to known population characteristics are the optimal approach to polling with samples of 500 or 1,000. More importantly, the methods for weighting and adjusting data vary widely; no one has established that there is a single best way to do it. The science of polling remains somewhat unresolved on this point, which is why polls handle these issues differently.
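To make the weighting idea concrete, here is a minimal sketch of post-stratification weighting by party identification. All numbers are invented for illustration; real pollsters weight on many variables at once (age, race, gender, region, etc.) and use more elaborate methods:

```python
# Hypothetical poll of 1,000 respondents that came back 30% D, 40% R, 30% I,
# weighted toward an assumed electorate of 35% D, 32% R, 33% I.
sample = {"D": 300, "R": 400, "I": 300}          # raw respondent counts
population = {"D": 0.35, "R": 0.32, "I": 0.33}   # assumed electorate shares
n = sum(sample.values())

# Each respondent in group g gets weight = population share / sample share,
# so over-represented groups count less and under-represented groups count more.
weights = {g: population[g] / (sample[g] / n) for g in sample}

# Hypothetical candidate support within each partisan group.
support_obama = {"D": 0.92, "R": 0.06, "I": 0.45}

raw = sum(sample[g] * support_obama[g] for g in sample) / n
weighted = sum(sample[g] * weights[g] * support_obama[g] for g in sample) / n

print(f"unweighted Obama share: {raw:.3f}")      # 0.435
print(f"weighted Obama share:   {weighted:.3f}") # 0.490
```

The same raw interviews yield a topline about 5 points apart depending on whether, and to what, the sample is weighted, which is exactly why the choice of weighting targets matters so much.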
There is a disadvantage for the (D) team if PPP has abandoned its previous efforts to ensure that its samples were representative of the American voting population in terms of demographics and partisan alignment: PPP is the only truly D-friendly polling firm that regularly monitors swing states and national vote preferences. We Dems had only one horse in the race, PPP, and it appears that Nate Silver may have persuaded the people at PPP to let the data come in as they come in, sometimes randomly or representatively, and sometimes not, depending on how the polling winds blow from day to day.
The bottom line is that if PPP is listening to Nate Silver's general statement that it is a nice thing to have some polls that don't weight their data, and is drawing from this the conclusion that Nate is somehow demanding that PPP stop weighting and adjusting (or is pressuring PPP in some way to do so), then this might be exactly the explanation for the wacky changes in PPP polling recently.
In other words, many have been complaining about changes in PPP polling in the past 10 days. This diary puts forth one idea: that Nate Silver may have cajoled or persuaded PPP in some way to get rid of its "house effect" entirely, by no longer weighting or adjusting its data to match features of the population.
Now, it is controversial whether all polls should suddenly abandon weighting, adjusting, or oversampling. These procedures are used throughout the social sciences, and they are considered very important in fields like public health, where it is essential to weight data to offset non-representative samples. No one has argued that weighting or oversampling should be abandoned in those sciences; indeed, the trend has been toward more (not less) use of these procedures in recent decades.
Nate has one good point: from time to time, it is important to collect raw data from a very large sample to establish what the population's features actually are, so that other polls can weight and oversample accurately. He has also made the point that it seems like a good thing for some polls to use these procedures while others don't; that way, on his website, he can study the differences between the various methods.
PPP, it appears, has been bending over backwards, to a greater and greater extent, to seem completely "neutral" and no longer "D-friendly," even though its funders are kos and organized labor (SEIU). The Dems who commission PPP's polling have somehow gotten stuck with a polling firm that is no longer trying to offset the effects of the R polling firms that intentionally weight and oversample to favor R candidates.
If this is correct (and it is possible that there are other, more mysterious reasons why PPP has started to report Romney-friendly numbers in swing states and nationally, such as changes in its sampling or weighting procedures), then the one polling firm that Dems could rely on to provide information from samples weighted to ensure that adequate numbers of Dems and Independents were sampled is no longer providing that kind of information.
One could argue that all the R firms should do the same (though, as noted above, it is not at all clear that abandoning weighting and oversampling altogether maximizes accuracy). Regardless, the R firms are not going to stop weighting and adjusting their samples or oversampling. Gallup is a historically R-friendly firm that has no qualms about continuing to report data from samples that appear to Dems to be unrepresentative and R-friendly.
The problem raised here for discussion is whether PPP has made the correct decision at this crucial moment, or has made some other systematic changes in its polling methods, with the result that PPP now stands alone as the only previously (D) or "D-friendly" poll to cease the efforts it had made to ensure balance among polling firms (i.e., to offset the overwhelming effects of the many (R) polling firms).
As reported at DailyKos in recent diaries this week, PPP is an automated phone poll (robopoll) that relies on land lines, but uses statistical procedures to offset the effects of that reliance. It does not appear that the adjustment for cell phone users has stopped. The question raised here and elsewhere is whether PPP has made a more and more stringent effort since September to eliminate any traces of D-friendliness and to erase the formerly detectable PPP "house effect."
Nate himself has written that his 538.com estimates of "house effects" are themselves very imperfect. For example, he has noted that there is no way to know that a polling firm with a "house effect" of 0 will be the most accurate poll. Indeed, as Nate has acknowledged, it would be possible to throw the "house effect" estimates off by several percentage points, for example by flooding the polling world with an abundance of Republican-friendly polls that would skew the entire polling average and make it less and less accurate.
Here is a thought experiment illustrating Nate's acknowledgement that the polls themselves determine how "house effects" are calculated. Suppose there were 20 (R) polls, 10 non-partisan polls with varying D or R lean from week to week, and 2 (D) polls. When these were all averaged together, the 2 (D) polls would appear to have an overwhelming "left-wing" bias or "house effect." Most of the (R) polls would sit close to the average, so few or none of them would show any detectable house effect greater than 1%.
So (using game theory, for example, and knowing how Nate computes "house effects"), the funders who underwrite (R) polls would logically conclude that the correct move is to overwhelm the polling field with as many new (R) polls as they possibly can. By thus "gaming" the system, as they have actually done, they have flooded the election 2012 websites with information from the (R) polling perspective. And further, if the most recent observations of PPP polling are accurate, they have also succeeded in intimidating PPP by making it appear that PPP has had an unduly strong house effect. In order to offset this (R-perpetrated) perception, PPP has had to stop making the efforts it had made to ensure that the voting population was sampled accurately, from a Democratic standpoint.
This diary is written on the day following PPP's publication of a poll indicating that Obama's lead is currently only 1% in OH, an estimate identical to that of the (R) firm Rasmussen, with the (R) firm Gravis estimating a tie in OH. In comparison, other recent polls include O+3 (Survey USA, Fox), O+4 (YouGov), and O+6 (NBC, WSJ, Marist). Oddly, PPP itself reported O+5 when the YouGov poll was published, which underscores that PPP has reported at least some findings showing a strong Obama lead within the past week.
There is no way of knowing whether the recent PPP data are actually accurate and right on the money, but there is information indicating that several recent PPP samples had notably more Republican respondents than usual.
Thus the one question that can be most strongly justified is this: Why would PPP suddenly start reporting poll data based on samples in which Republican voters were over-represented? One hypothesis has been advanced in paragraphs above.
Other hypotheses are also welcome. Also of interest: do D-leaning kos readers recognize the importance of having polling firms with D-friendly sampling and weighting procedures? Or do they think that PPP should stop making any effort to set sampling parameters in a way that ensures demographic and partisan balance as Democratic voters currently understand it?