
Diary: Punish CNN for Bogus Poll: Remove Candy Crowley as a Moderator (281 comments)


  •  OK, but I think we mostly have that already (1+ / 0-)
    Recommended by:
    elmo

    They reinterviewed respondents from a recently completed survey who planned to watch the debate and agreed to be reinterviewed. (They would lose anyone whom they couldn't reach on the phone after the debate, or who admitted to not having watched it.)

    The biggest "known unknown" is whether and how the results were weighted.

    Election protection: there's an app for that!
    Better Know Your Voting System with the Verifier!

    by HudsonValleyMark on Thu Oct 04, 2012 at 07:39:44 PM PDT

    [ Parent ]

    •  How were those respondents selected? (0+ / 0-)

      We know that they were supposedly "nationwide."  And they were a "random national sample" -- do we know of what universe?  Do we know the sampling frame?  How shocked will you be if it was one that was more likely to pick up cable TV viewers?

      Pro-Occupy Democratic Candidate for California State Senate, District 29 & Occupy OC Civic Liaison.

      "I love this goddamn country, and we're going to take it back." -- Saul Alinsky

      by Seneca Doane on Thu Oct 04, 2012 at 10:44:20 PM PDT

      [ Parent ]

      •  "supposedly 'nationwide'"? (0+ / 0-)

        What a weird conversation this is. What is the point of asking methodological questions if even the most straightforward parts of the answers are subject to scare-quoting? It makes sense, given that you already launched the Facebook crusade. But as a research strategy, it's bent.

        I'd be pretty surprised if the pre-debate survey had a particular bias toward "cable TV viewers." I say "particular" because any raw sample might be biased toward cable TV viewers, especially once non-respondents are winnowed.

        Once the panel sample is limited to people who (1) watched the debate, (2) agreed to be reinterviewed, and (3) actually picked up the phone, then ORC is contending with both the "bias" in viewership and any factors that influenced (2) and (3). It isn't facially obvious what is the best, or "least worst," way of dealing with that challenge.

        Election protection: there's an app for that!
        Better Know Your Voting System with the Verifier!

        by HudsonValleyMark on Fri Oct 05, 2012 at 07:34:11 AM PDT

        [ Parent ]

        •  I don't really know (0+ / 0-)

          how likely it is that a survey intended to be nationwide would end up with what appears to be a disproportionate number of respondents from a single region.

          Certainly, I suspect the results of the poll would have been dramatically different if a disproportionate share of the respondents had happened to come from the Northeast, for example.

          •  these simply aren't the facts in evidence (2+ / 0-)

            (Sorry I missed your comment earlier.)

            The MoE of +/- 8.5% for respondents from the South implies a sample size of about 138 (with ample room for rounding error), which would be 32% of the 430 respondents. (If we just looked at the ratio of that MoE to the overall MoE, the estimate would be smaller.)

            The South's reported vote share in the 2008 exit poll was 32%.
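
            A quick back-of-the-envelope check (a sketch in Python; the z-value and the worst-case p = 0.5 are my assumptions, not anything CNN published):

                def implied_n(moe, p=0.5, z=1.96):
                    """Sample size implied by a reported margin of error.

                    Inverts moe = z * sqrt(p * (1 - p) / n), using the
                    worst-case p = 0.5 and a 95% confidence z of 1.96.
                    """
                    return (z / moe) ** 2 * p * (1 - p)

                # Direct estimate from the South's reported +/- 8.5% MoE:
                print(implied_n(0.085))            # ~133 respondents

                # Ratio method: scale the full sample (430, +/- 4.5%) by the
                # squared MoE ratio, since MoE shrinks like 1/sqrt(n):
                print(430 * (0.045 / 0.085) ** 2)  # ~120 respondents

            Depending on rounding in the published MoEs, that's roughly 28-32% of the 430 respondents -- in the ballpark of the 2008 exit poll's 32%, not wildly out of line with it -- and the ratio method is indeed the smaller of the two estimates.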

            It's no wonder that the diarist eventually pivoted to emphasizing the age distribution -- though what I really expected him to do was fix the diary.

            The age distro raises doubts about the generalizability of the poll -- well, sort of. Snap polls are never going to be all that generalizable, and we already knew from the Romney favorables that the sample was red, although we don't know to what extent that reflects viewership bias vs. poll bias. Wading through the crosstabs is, at best, a pretty oblique way of getting at the problem.

            But the way the diarist characterized the N/A columns has helped perpetuate an urban legend. From now on, there will be Kossacks who think they learned from a reputable source that this poll interviewed no one, or hardly anyone, who wasn't a 50+ white southerner. That bothers me.

            Election protection: there's an app for that!
            Better Know Your Voting System with the Verifier!

            by HudsonValleyMark on Sat Oct 06, 2012 at 03:35:54 PM PDT

            [ Parent ]

        •  The best way of dealing with that challenge (1+ / 0-)
          Recommended by:
          HudsonValleyMark

          is to do so transparently, which, judging from the CNN headlines, it did not do.  (That's foreshadowing my conclusion.)

          I don't know if your background is in academia rather than the rough-and-tumble world, but maybe you just don't have enough experience with disingenuous and misleading polling.  (Search for the name "Probolsky," a political activist/consultant whose work appears uncriticized in the Orange County Register, and you'll see what I'm dealing with out there.)  Take the word "nationwide."  If I interview 350 people from a wealthy Georgia suburb and 1 person apiece from each of the other 50 states (counting DC), I have a nationwide survey.  (One could of course lack respondents from many states and still call it "nationwide.")  Before you 'splode, I don't think they did this.

          You would be wrong to call it a "nationwide random sample," of course, which is why I look for that kind of language in poll descriptions.  In this one, we sort of get it -- but we really don't:

          Interviews with 430 adult Americans who watched the presidential debate conducted by telephone by ORC International on October 3, 2012. The margin of sampling error for results based on the total sample is plus or minus 4.5 percentage points.

          Survey respondents were first interviewed as part of a random national sample on September 28-October 2, 2012. In those interviews, respondents indicated they planned to watch tonight's debate and were willing to be re-interviewed after the debate.

          My first question is: what's the sampling frame?  One would like to infer that it was a sample of all American registered or perhaps likely voters -- but it doesn't say that.  All we know is that the population from which they sampled is nationwide -- it doesn't say "cable TV watchers," but it doesn't say "registered voters" either! -- and that having obtained that population, they sampled from it randomly.

          Great -- but that's the obtained sample from the first stage of the survey: it's not the obtained sample from which they report the results.  The methodological problem here is one of "differential extinction."  (Those aren't scare quotes; it's a technical term -- or at least it was when I was in grad school; the concept still exists even if the terminology has changed.)  For the benefit of other readers, rather than, I suspect, for you: it refers to the problem that, in a "panel study" (where one tests or surveys people at one point and then again one or more times subsequently), the rate at which people stop participating or providing usable data may be correlated with a demographic factor or (if an experiment is being done) with the "treatment" condition, so later waves stop resembling the original sample.
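
          To make "differential extinction" concrete, here's a toy simulation (Python; every number in it -- the retention rates, the age split -- is illustrative, invented for the sketch, not taken from the poll):

              import random

              random.seed(1)

              # Hypothetical retention rates by age group: the chance that a
              # first-wave respondent watches the debate, agrees to a
              # reinterview, AND answers the phone afterward.
              RETENTION = {"18-49": 0.20, "50+": 0.45}

              # A first wave that roughly mirrors the adult population
              # (again, made-up proportions).
              wave1 = ["18-49"] * 550 + ["50+"] * 450

              # Second wave: each respondent survives with their group's rate.
              wave2 = [r for r in wave1 if random.random() < RETENTION[r]]

              for group in ("18-49", "50+"):
                  print(group, f"{wave2.count(group) / len(wave2):.0%}")

          The second wave comes out roughly two-thirds 50+ even though the first wave wasn't, purely because attrition was correlated with age -- which is the whole problem.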

          The likelihood that the obtained sample in the second (post-debate) administration of the poll was quite discrepant from the voting population (or even the debate-watching population) seems quite high.  That's evident in the age data, for one thing -- although some of that data is hidden by a presentation procedure that, while it may be routine in some situations, is inexcusable in this one.  It's inexcusable given the importance and influence of this sort of poll (which I don't recall you yet acknowledging), the likelihood of this sort of confounding factor in this research situation, and the great ease with which it could have been solved: simply give the damned raw numbers, with an asterisk if need be noting that the sampling error for that subgroup exceeded the convention CNN customarily uses for reporting data.  Reporting a bunch of "N/A" cell results like this is unnecessary and unconscionable -- not necessarily from a statistics perspective, but from a journalistic and political communications perspective.
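
          To make the "asterisk instead of N/A" point concrete, here's a sketch of the two conventions (Python; the cutoff value is my hypothetical stand-in for whatever internal threshold CNN applies before suppressing a subgroup):

              def report_cell(value_pct, n, z=1.96, moe_cutoff=0.08):
                  """Format one crosstab cell two ways: suppressed vs. flagged."""
                  moe = z * (0.25 / n) ** 0.5  # worst-case p = 0.5
                  too_noisy = moe > moe_cutoff
                  suppressed = "N/A" if too_noisy else f"{value_pct}%"
                  flagged = f"{value_pct}%" + ("*" if too_noisy else "")
                  return suppressed, flagged

              print(report_cell(67, 90))   # ('N/A', '67%*') -- small subgroup
              print(report_cell(67, 300))  # ('67%', '67%')  -- ample subgroup

          The flagged form costs nothing and keeps the raw number visible; the suppressed form is the one that produced the wall of N/As.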

          It seems highly likely that the age distribution of the respondents was discrepant enough from what lay consumers of the poll would presume as to require a responsible journalistic organization to flag it -- not once in a separate document, not buried in the middle of a story, but in the headlines themselves.  "Elderly sample judges Romney winner by 67% to 25% margin" might do; and the statements bouncing around CNN's site that (to paraphrase) this was the highest share ever recorded for a debate's winner -- in fact it had never before reached 60% -- were astonishingly irresponsible.

          Do you think that one could or should compare a 2012 sample that was highly skewed in the elderly direction with a 2008 or 2000 or 1980 sample that for all we know was not?  Scientifically, it is irresponsible.  Journalistically, it is an order of magnitude beyond merely irresponsible.  In political journalism, in what everybody knows would be a highly influential poll after a highly influential event (especially with that result) a month before an election, it's two or three orders of magnitude beyond that.  It's stunningly bad practice.

          What you're really objecting to, I recognize, is my following a previous diarist in saying that they had had no respondents from various categories instead of (perhaps you'd agree) disproportionately few of them, without putting a bright red flag on it when reporting results.  Yep, you're right, especially about the South.  I did recognize that they had to have interviewed some people in the N/A categories, but my intuition in looking at it was that it was shockingly few.

          In reaching that intuition, I was primarily influenced by the first line, the age split of 50+ vs. <50, which (rightly, I think) put blood in my eye.  When the marginals for the 50+ group -- itself quite skewed towards 65+; check the standard errors -- were one point off from the total margin (67/24 vs. 67/25), and the marginals for the "attended college" group were only two points off (67/23 vs. 67/25), that was pretty much all I had to see to deliver a verdict of guilty of a political journalistic crime.  That should have required an asterisk on every headline the size of John Boehner's liver.  CNN didn't provide one because it was in CNN's interest not to provide one -- JUICY HEADLINE, GUYS! -- and I think it is extremely likely that, presuming they were monitoring the demographic skew (and probably the differential extinction) in their follow-up survey, they knew about it in advance and had planned to be able to run with the story exactly as they did.
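
          For readers who want the arithmetic behind that intuition: if the total is a weighted mixture of the two age groups, you can solve for what the <50 group's number would have had to be.  A sketch (Python; the 50+ weights tried below are hypothetical, since the poll doesn't publish one):

              # total = w_old * old + (1 - w_old) * young, so:
              def implied_young_share(total, old, w_old):
                  return (total - w_old * old) / (1 - w_old)

              # Obama at 25% overall and 24% among the 50+ group:
              for w_old in (0.5, 0.6, 0.7):
                  young = implied_young_share(25, 24, w_old)
                  print(f"50+ weight {w_old:.0%}: implied <50 Obama share {young:.1f}%")

          Unless the 50+ share of the sample was enormous, the <50 respondents can't have been much bluer than their elders: either the young skewed nearly as red, or there were very few of them.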

          (Note: I had acted here on intuition based on experience with datasets; once I calmed down the next day, I decided to check whether my intuition was right.  You'll find that painfully detailed analysis here, in a comment down near the bottom of this section.  I think that my noting the maximum possible discrepancy (and the much lower likeliest discrepancy) in the <50 group helps to establish that my snap-judgment revulsion at what had happened here, and at how it was being taken, was absolutely justified.)

          You asked me a legitimate question about standard errors regarding geography, which I brushed off because I was focused primarily on the top two results, for age and college.  That was wrong on my part, which I know is what you've wanted to hear; but I was also very angry at the massive gulf between the apparent composition of the second obtained sample and the way it was being excitedly presented by and received on CNN.com.

          There's a difference between interviewing no one in a category and interviewing way too few of them.  In the heat of the moment, I didn't honor it.  I think that what CNN did in knowingly slanting the reportage about the debate performance -- and my opinion remains that it was knowing -- based on what they knew would be a dramatically skewed poll was reprehensible.

          I don't really expect Markos to acknowledge that even if he gets it; I do expect that you should.  The purpose of polling in political journalism is to inform.  The poll invited misinterpretation -- and while the use of N/A is justified in most circumstances, in this one it clearly was not.  They looked like they were trying to hide something -- most likely because they were trying to hide something.

          I expect transparency from major players like CNN.  I got opacity.

          Pro-Occupy Democratic Candidate for California State Senate, District 29 & Occupy OC Civic Liaison.

          "I love this goddamn country, and we're going to take it back." -- Saul Alinsky

          by Seneca Doane on Sat Oct 06, 2012 at 01:29:52 PM PDT

          [ Parent ]

          •  what I wanted was for you to fix the mistake (1+ / 0-)
            Recommended by:
            Seneca Doane

            I mean, y'know, 20/20 hindsight.

            Whatever loopholes exist in the survey description, I see no positive basis here for thinking that ORC cooked the survey. I think your main gripe is that CNN didn't go nearly far enough in communicating the limitations of the survey. That would be a fair gripe, but I don't think it's worth blowing a gasket over. (I don't think CNN's overall narrative would have been much different with a bluer sample or a more nuanced story on the poll.)

            (I also think that CNN should be a lot more transparent about polling methodology generally -- but this didn't play out as a good test case for that.)

            I think it would have been better to say, "Look, the favorables alone tell us two things: that this sample was unusually pro-Romney, and that the debate had basically zero impact on favorability. Nothing here supports the game-changer narrative."

            Dunno. Maybe in the rough and tumble world, demonizing CNN got some people's minds off other things. But I thought it made us look kind of dumb, kind of petty, and kind of desperate, for no good reason. I'm not saying that to snipe, just offering my point of view for whatever it is worth.

            Election protection: there's an app for that!
            Better Know Your Voting System with the Verifier!

            by HudsonValleyMark on Sat Oct 06, 2012 at 04:05:02 PM PDT

            [ Parent ]

            •  I shouldn't spend more time rehashing the points (0+ / 0-)

              but I'll respond just to what you think it would have been better to say:

              (1) "the sample was unusually pro-Romney."  That's pallid and even if you tossed out the 54-42 pre-debate breakdown, I don't think that voters would really get it.  "They interviewed a group of people that you'd mostly expect to support Romney and then they pretended that they hadn't" is more powerful.

              (2) The impact of a debate on favorability is not something that happens independently inside individuals' minds.  There's a snowball effect.  Your sense of the debate -- even your recollections and misrecollections of it -- changes as you get a sense of what others thought.  We use social norms -- including what other people thought of something -- to determine, refine, and revise our own social perceptions.

              That's why these insta-polls are so very important and persuasive: they have the potential to kick off a snowball effect.  In that respect they are very unlike the slow tracking polls to which few non-geeks closely attend, which is why Markos's comparison of me to an "unskewer" was absurd.  This matters more -- and more quickly.

              There may have been zero favorability change inside people's individual heads at the time they saw the debate, but social information about how others reacted was bound to (and did) push it in a given direction and roll it downhill, gathering steam.  If we were able to contain it, it was partly because networks and others did more fact-checking this time around than usual -- which is not the sort of thing I'd like to depend on.

              If you want to know why I jumped on it immediately with cleats on, that's why.  And I really don't much mind if it looked petty or dumb.  This is war, not a photo op.

              Pro-Occupy Democratic Candidate for California State Senate, District 29 & Occupy OC Civic Liaison.

              "I love this goddamn country, and we're going to take it back." -- Saul Alinsky

              by Seneca Doane on Sun Oct 07, 2012 at 02:14:47 PM PDT

              [ Parent ]
