What I would like to write is an introduction to the basic science of global warming or some dark musings on oxytocin and the evolution of parochial altruism. For today, however, it's best to stick to some housekeeping: follow-up to our article on the statistical anomalies in the R2K polling.
Most readers find it hard to follow technical arguments, even ones that include explanations for laypersons. The unfortunate implication is that argument by authority is often needed as back-up. So here goes.
The two obvious blogs to follow to see how authorities on polling and statistical forensics reacted to our arguments were Nate Silver's 538 site and Pollster.com.
Nate has run a series of responses. In his latest, he states a conclusion that goes beyond what we actually claimed, and endorses it unequivocally. "Grebner et al. argue that this phenomenon could only be a reflection of human intervention -- no naturally occurring statistical process could produce it. In my view, that conclusion is correct beyond the shadow of a doubt."
On Pollster.com, the initial responses were respectful but very cautious. The Pollster group was waiting for a guest blogger with sufficient statistical expertise, Douglas Rivers, to make a more definitive evaluation. Rivers wrote that the article "convincingly demonstrate[d] that something is seriously amiss with the research reported by Research 2000, which may well be due to fraud." He did raise, however, the possibility that we had overlooked a subtle technical issue. Our exchange on that question concluded as follows:
"Michael Weissman:
[snip]
Oy, do we sound like a couple of academics! We agree on
- the possibility of subcategory stabilization reducing Var.
- and (I think you agree) that by far the most dramatic instance of that for R2K, per their cross-tabs, is party ID.
- and Party ID is useless for stabilization within Party ID categories.
- Nonetheless, in principle other stabilization, e.g via gender, could have a very subtle effect on E(Var) within party ID categories.
- No such effect is remotely close to removing the huge anomaly for the R2K IND-OTH.
- In an academic paper, such issues ought to be mentioned.
So we're doing the academic fight thing over the question, I guess, of whether we should have mentioned that in the blog? Truce?
Posted on June 30, 2010 4:33 PM
__________________
Douglas Rivers:
Michael Weissman:
Yes. Yes. Yes. Yes. Yes. Yes. Yes. I'm closing my browser now."
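The "subcategory stabilization" question in the exchange above can be illustrated with a small simulation. The sketch below uses hypothetical party shares and approval rates (not R2K's actual values): post-stratifying a simulated poll on party ID shrinks the variance of the topline, but it cannot shrink the variance of an estimate within a party-ID category, since reweighting whole party blocks leaves each block's internal split unchanged.

```python
import random
import statistics

random.seed(0)

# Hypothetical population (illustrative values, not R2K's):
PARTY_SHARE = {"Dem": 0.35, "Rep": 0.30, "Ind": 0.35}
APPROVE = {"Dem": 0.85, "Rep": 0.15, "Ind": 0.50}

def draw_poll(n=400):
    """One simulated poll: a list of (party, approves?) pairs."""
    parties = random.choices(list(PARTY_SHARE),
                             weights=list(PARTY_SHARE.values()), k=n)
    return [(p, random.random() < APPROVE[p]) for p in parties]

def raw_topline(sample):
    """Unweighted approval rate for the whole sample."""
    return sum(approves for _, approves in sample) / len(sample)

def weighted_topline(sample):
    """Post-stratify on party ID: weight each party's in-sample
    approval rate by its fixed population share."""
    est = 0.0
    for party, share in PARTY_SHARE.items():
        sub = [a for p, a in sample if p == party]
        if sub:
            est += share * sum(sub) / len(sub)
    return est

polls = [draw_poll() for _ in range(2000)]
var_raw = statistics.variance(raw_topline(s) for s in polls)
var_wtd = statistics.variance(weighted_topline(s) for s in polls)
print(f"topline variance, raw:      {var_raw:.2e}")
print(f"topline variance, weighted: {var_wtd:.2e}")
# Weighting on party ID stabilizes the topline, but the Ind approval
# estimate inside the Ind cross-tab is identical either way: rescaling
# party blocks cannot change any block's internal split.
```

This is why weighting on party ID, however heavy, is irrelevant to the variance anomaly inside the party-ID cross-tabs themselves; only stabilization via some other variable (e.g. gender) could even subtly touch it, and no such effect approaches the size needed.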
One person we wanted to hear from was Walter Mebane, considered the leader in the field of forensic statistics. He was contacted by Pollster's Mark Blumenthal. Mebane described the evidence as "convincing". He also emphasized, in complete agreement with our arguments, a point that Nate put as follows: "he [R2K] could be using real data and making it look fake" by some peculiar alterations.
Blumenthal himself somewhat mis-summarized Mebane's remarks (I checked this with Mebane), confusing the lack of evidence that no real data had been used with the existence of evidence that some real data had been used. (Important Note: Mark Blumenthal has written that I misunderstood his meaning in my criticism here. He was not implying that there was evidence that any real polling data were used in the study. I apologize for my misunderstanding.) However, Blumenthal went on to argue, correctly, that questions of that sort should be resolved by entirely different lines of evidence, and he cited some of that evidence.
In one follow-up, a very young Kos blogger attempted to challenge the arguments. Following the discussion, one sees that he had nothing new to add except a deep confusion about the difference between marginally significant results and ones of overwhelming statistical significance.
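That distinction, between a merely suggestive result and one whose chance occurrence is essentially impossible, can be made concrete with a toy binomial calculation (the counts below are illustrative, not R2K's actual cross-tab numbers):

```python
from math import comb, log10

def binom_tail(n, k, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

# Marginally significant: 60 heads in 100 fair-coin flips.
# This happens by chance a few percent of the time; suggestive, not proof.
p_marginal = binom_tail(100, 60)

# Overwhelmingly significant: a 50/50 event coming out the same way
# in 700 of 778 independent trials (illustrative counts).
p_overwhelming = binom_tail(778, 700)

print(f"marginal:     p = {p_marginal:.3f}")
print(f"overwhelming: p = 10^{log10(p_overwhelming):.0f}")
```

A p-value of a few percent can easily reflect noise plus selective attention; a p-value more than a hundred orders of magnitude below any conventional threshold cannot arise from any plausible sampling process, and that is the sense in which the anomalies are overwhelming.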
The most dramatic follow-up, however, has not come from statisticians but from the head of R2K. In a letter to TPM he wrote: "Yes we weight heavily and I will, using te margin of error adjust the top line and when adjusted under my discretion as both a pollster and social scientist, therefore all sub groups must be adjusted as well." [sic]
Mark Grebner, who started all this, responded on Pollster:
"The argument the Weissmans and I made - that the published results could not have arisen from the proper analysis of properly conducted polling - seems established now by Del Ali's own words. If he 'adjusts' his numbers 'within the margin of error' based solely upon the limits of his 'discretion', there is no surprise that we see patterns which could not arise stochastically.
The controversy may continue, but our work is done."