
A bit over two weeks ago, a group of statistics wizards (Mark Grebner, Michael Weissman, and Jonathan Weissman) approached me with a disturbing premise -- they had been poring over the crosstabs of the weekly Research 2000 polling we had been running, and were concerned that the numbers weren't legit.

I immediately began cooperating with their investigation, which concluded late last week. Daily Kos furnished the researchers with all available and relevant information in our possession, and we made every attempt to obtain R2K's cooperation -- which, as I detail in my reaction post here, was not forthcoming.  The investigators' report is below, but its conclusion speaks volumes:

We do not know exactly how the weekly R2K results were created, but we are confident they could not accurately describe random polls.

The full report follows -- kos

R2K polls: Problems in Plain Sight

Mark Grebner, Michael Weissman, and Jonathan Weissman

For the past year and a half, Daily Kos has been featuring weekly poll results  from the Research 2000 (R2K) organization.  These polls were often praised for their "transparency", since they included detailed cross-tabs on sub-populations and a clear description of the random dialing technique. However, on June 6, 2010, FiveThirtyEight.com rated R2K as among the least accurate pollsters in predicting election results. Daily Kos then terminated the relationship.

One of us (MG) wondered if odd patterns he had noticed in R2K's reports might be connected with R2K's mediocre track record, prompting our investigation of whether the reports could represent proper random polling. We've picked a few initial tests partly based on which ones seemed likely to be sensitive to problems and partly based on what was easy to read by eye before we automated the data download. This posting is a careful initial report of our findings, not intended to be a full formal analysis but rather to alert people not to rely on R2K's results. We've tried to make the discussion intelligible to the lay reader while maintaining a reasonable level of technical rigor.

The three features we will look at are:

  1. A large set of number pairs which should be independent of each other in detail, yet almost always are either both even or both odd.
  2. A set of polls on separate groups which track each other far too closely, given the statistical uncertainties.
  3. The collection of week-to-week changes, in which one particular small change (zero) occurs far too rarely. This test is particularly valuable because the reports exhibit a property known to show up when people try to make up random sequences.

1. Polls taken of different groups of people may reflect broadly similar opinions but should not show any detailed connections between minor random details.  Let's look at a little sample of R2K's recent results for men (M) and women (F).

6/3/10       Favorable     Unfavorable    Undecided
Question    Men  Women     Men  Women    Men  Women

Obama        43    59       54    34        3     7
Pelosi       22    52       66    38       12    10
Reid         28    36       60    54       12    10
McConnell    31    17       50    70       19    13
Boehner      26    16       51    67       33    17
Cong. (D)    28    44       64    54        8     2
Cong. (R)    31    13       58    74       11    13
Party (D)    31    45       64    46        5     9
Party (R)    38    20       57    71        5     9

A combination of random sampling error and systematic difference should make the M results differ a bit from the F results, and in almost every case they do differ. In one respect, however, the numbers for M and F do not differ: if one is even, so is the other, and likewise for odd. Given that the M and F results usually differ, knowing that say 43% of M were favorable (Fav) to Obama gives essentially no clue as to whether say 59% or say 60% of F would be. Thus knowing whether M Fav is even or odd tells us essentially nothing about whether F Fav would be even or odd.

Thus the even-odd property should match about half the time, just like the odds of getting both heads or both tails if you tossed a penny and nickel. If you were to toss the penny and the nickel 18 times (like the 18 entries in the first two columns of the table) you would expect them to show about the same number of heads, but would rightly be shocked if they each showed exactly the same random-looking pattern of heads and tails.
   
Were the results in our little table a fluke? The R2K weekly polls report 778 M-F pairs. For their favorable ratings (Fav), the even-odd property matched 776 times. For unfavorable (Unf)  there were 777 matches.

Common sense says that that result is highly unlikely, but it helps to do a more precise calculation. Since the odds of getting a match each time are essentially 50%, the odds of getting 776/778 matches are just like those of getting 776 heads on 778 tosses of a fair coin. Results that extreme happen less than one time in 10^228. That’s one followed by 228 zeros. (The number of atoms within our cosmic horizon is something like 1 followed by 80 zeros.) For the Unf, the odds are less than one in 10^231. (Having some Undecideds makes Fav and Unf nearly independent, so these are two separate wildly unlikely events.)
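For readers who want to check that figure, here is a minimal sketch of the binomial tail calculation in Python (ours, for illustration; the report's own computations were done in Matlab and an online tool):

```python
from math import comb

def tail_prob(n, k):
    """Probability of at least k matches in n fair coin tosses."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

# 776 or more even-odd matches out of 778 M-F pairs
print(tail_prob(778, 776))  # ~2e-229, i.e. less than one time in 10^228
```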

There is no remotely realistic way that a simple tabulation and subsequent rounding of the results for M's and F's could possibly show that detailed similarity. Therefore the numbers on these two separate groups were not generated just by independently polling them.

This does not tell us whether there was a minor "adjustment" to real data or something more major. For that we turn to the issue of whether the reports show the sort of random weekly variations expected to arise from sampling statistics.

2. Polls taken by sampling a small set of N respondents from a larger population show sampling error due to the randomness of who happens to be reached. (The famous "margin of error" (MOE) describes this sampling error.) If you flip a fair coin 100 times, it should give about 50 heads, but if it gives exactly 50 heads each time you try, something is wrong. In fact, if it is always in the range 49-51, something is wrong. Although unusual poll reproducibility can itself occur by accident, just as exactly 50 heads can happen occasionally, extreme reproducibility becomes extremely unlikely to happen by chance.

To see whether the results showed enough statistical variation, we use several techniques to isolate the random sampling fluctuations from changes of opinion over time. First, we focus on small demographic subsets, since smaller samples show more random variation. Second, we consider the differences between categories whose actual time course should approximately match to remove those shared changes, making it easier to see if there is enough statistical noise. Finally, we make use of the different time courses of opinion change and random statistical noise. The former tends to show up as slow, smooth, cumulative changes, while the latter appears as abrupt movement in poll numbers up or down that aren't sustained in subsequent polls. An easy way to separate the fast from the slow is to look only at the differences from week to week, in which the fast changes are undiminished but the slow changes hardly show up.
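As a concrete illustration of that last step, here is a minimal sketch in Python (ours, using the Independents margins from the table below as example data):

```python
import numpy as np

# Example: the Independents margins from the first 8 weekly polls tabulated below
margins = np.array([43, 45, 67, 64, 52, 52, 53, 57], dtype=float)

weekly_changes = np.diff(margins)  # fast noise survives; slow trends mostly cancel
print(weekly_changes)              # [  2.  22.  -3. -12.   0.   1.   4.]
```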

At one point R2K changed its target population, sample size (N), and the categories used for crosstabs. For the first 60 weeks, R2K reported N=2400, and the set of questions changed occasionally; during the final 14 weeks, N=1200 and the same 11 three-answer questions were used each week. We analyzed the two sets of results separately since most simple statistical tests are not designed for that sort of hybrid.

We took advantage of the small numbers with reported political party affiliations of "independent" (about 600 per week) and "other" (about 120 per week) in the first 60 weeks.  We tracked the difference between Obama's margin (Fav-Unf, the percent favorable minus the percent unfavorable) among independents and "others". This quantity should have a large helping of statistical noise, not obscured by much systematic change.

We quantify the fluctuations of the difference between those margins via its variance, i.e. the average of the square of how far those margins differ from their own average. (The units for variance here are squared-percent, which may take some getting used to.) The expected random variance in either margin is known from standard statistical methods:

variance = (100*(Fav+Unf) - (Fav-Unf)^2)/N.   (1)

(Essentially the same calculation is used all the time to report an MOE for polls, although the MOE is reported in plain percents, not squared.)

The expected variance of the sum or difference of two independent random variables is just the sum of their expected variances. (That simple property is the reason why statisticians use variance rather than say, standard deviation, as a measure of variation.)  For the Obama approval rating (the first and only one we checked) the average expected variance in this difference of margins over the first 60 weeks was 80.5, i.e. a range of +/- 9% around the average value.
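To make Eq. (1) concrete, here is a minimal sketch in Python (ours; the ~600 and ~120 subsample sizes come from the report, while the percentages are illustrative):

```python
def margin_variance(fav, unf, n):
    """Expected sampling variance of the margin (Fav - Unf), per Eq. (1),
    with Fav and Unf in percent and n the subsample size."""
    return (100.0 * (fav + unf) - (fav - unf) ** 2) / n

# Independents (~600/week) and "other" (~120/week); percentages are illustrative
v_ind = margin_variance(70, 20, 600)    # ~10.8 squared-percent
v_oth = margin_variance(72, 21, 120)    # ~55.8 squared-percent
print(v_ind + v_oth)                    # ~66.6: variances of independent margins add
```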

With that in mind, consider what R2K reported in their first 8 weekly polls:

 Attitude toward Obama among "Independents" and "Other"
Week ended    Independents        Other          Diff
             Fav Unfav Marg   Fav Unfav Marg    
1/ 8/09      68 - 25 = 43     69 - 23 = 46       -3
1/15/09      69 - 24 = 45     70 - 22 = 48       -3
1/22/09      82 - 15 = 67     84 - 13 = 71       -4
1/29/09      80 - 16 = 64     82 - 16 = 66       -2
2/ 5/09      73 - 21 = 52     72 - 23 = 49       +3
2/12/09      72 - 20 = 52     71 - 22 = 49       +3
2/19/09      71 - 21 = 53     72 - 22 = 50       +3
2/26/09      77 - 20 = 57     75 - 21 = 54       +3

There's far less noise than the minimum predicted from polling statistics alone.

Looking over the full 60-week set, the variance in the reports was only 9.947. To calculate how unlikely that is, we need to use a standard tool, a chi-squared distribution, in this case one with 59 degrees-of-freedom. The probability of getting such a low variance via regular polling is less than one in 10^16, i.e. one in ten million billion.
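One way to approximate that probability is the standard chi-squared test, sketched here in Python with scipy (ours, not necessarily the authors' exact procedure):

```python
from scipy.stats import chi2

df = 59                    # 60 weekly reports -> 59 degrees of freedom
observed_var = 9.947       # reported variance of the difference of margins
expected_var = 80.5        # average expected sampling variance from Eq. (1)

# Under the null, df * (observed / expected) is roughly chi-squared(df)
stat = df * observed_var / expected_var
print(chi2.cdf(stat, df))  # ~2e-17: consistent with "less than one in 10^16"
```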

What little variation there was in the difference of those cross-tab margins seemed to happen slowly over many weeks, not like the week-to-week random jitter expected for real statistics. Since the weekly random error in each result should be independent of the previous week, the squared random weekly changes should average twice the variance. (Remember, for independent variables the variance of the sum or difference is just the sum of the variances.) That's 2*80.5 = 161 in this case. The actual average of the square of the weekly change in the difference between these reported margins was 1.475. It is hardly necessary even to do statistical analysis to see that something is drastically wrong there, much worse even than in the reported variances, which were already essentially impossible.

So far we have described extreme anomalies in the cross-tabs. We have not yet directly described the top-lines, the main results representing the overall population averages. Could these have been obtained by standard methods, regardless of how the cross-tabs for sub-populations were generated? The top-line statistics require more care, because they are expected to have less statistical jitter and because there is no matching subset to use to approximately cancel the non-random changes over time.

For the data from the first 60 weeks, before the change in N, we found no obvious lack of jitter in the top-lines. For the next 14 weeks, the top-line margins give the immediate impression that they don't change as much in single week steps as would be expected just from random statistics. A detailed analysis does show very suspiciously low weekly changes in those margins, but the analysis is too complex for this short note.

3. We now turn instead to another oddity in the week-to-week changes in the top-lines. For background, let’s look at the changes in the Obama Fav from Gallup's tracking poll, with 162 changes from one independent 3-day sample of N=1500 to the next. There is a smooth distribution of changes around zero, with no particular structure. That’s just as expected.

[Figure: Gallup tracking poll -- distribution of week-to-week changes in Obama Fav]

Now let’s look at the same for the weekly changes in R2K's first 60 weeks. There are many changes of 1% or -1%, but very few of 0%. It's as if some coin seemed to want to change on each flip, rarely giving heads or tails twice in a row. That looks very peculiar, but with only 59 numbers it's not so extremely far outside the range of what could happen by accident, especially since any accidental change in week 2 shows up in both the change from week 1 and the change to week 3, complicating the statistics.

[Figure: R2K first 60 weeks -- distribution of week-to-week changes in top-line Obama Fav]

If we now look at all the top-line changes in favorability (or the first of the three answers for each question) for the last 14 weeks, we see a similar pattern.

[Figure: R2K last 14 weeks -- distribution of week-to-week changes in top-line first answers]

A very similar missing-zeros pattern also appears in the complete first 60-week collection, as we see here for the 9 three-answer questions asked in each of those weeks. (Here, just to be extra cautious, we only show every other weekly change, so each weekly result shows up in only one change.)

[Figure: R2K first 60 weeks, all 9 three-answer questions -- distribution of week-to-week changes]

How do we know that the real data couldn't possibly have many changes of +1% or -1% but few changes of 0%? Let's make an imaginative leap and say that, for some inexplicable reason, the actual changes in the population's opinion were always exactly +1% or -1%, equally likely. Since real polls would have substantial sampling error (about +/-2% in the week-to-week numbers even in the first 60 weeks, and more in the later weeks) the distribution of weekly changes in the poll results would be smeared out, with slightly more ending up rounding to 0% than to -1% or +1%. No real results could show a sharp hole at 0%, barring yet another wildly unlikely accident.

By "unlikely" we mean that, just looking at the last 14 weeks alone, and counting only differences between non-overlapping pairs of weeks just to make sure all the random changes are independent,  the chances are about one in a million that so few would be 0% out of just the results that came out +1%, 0%, or -1%. When those first 60 weeks of data are also included, the chances become less than 1 in 1016. (There are some minor approximations used here, but not ones of practical import.)

The missing zeros anomaly helps us guess how the results were generated. Some fancy ways to estimate population percentages based on polling results and prior knowledge give more stable results than do simple polls. So far as we are aware, no such algorithm shows too few changes of zero, i.e. none has an aversion to outputting the same whole number percent in two successive weeks. On the other hand, it has long been known that when people write down imagined random sequences they typically avoid repetition, i.e. show too few changes of zero. (Paul Bakan, "Response-Tendencies in Attempts to Generate Random Binary Series," The American Journal of Psychology, Vol. 73, No. 1, Mar. 1960, pp. 127-131.)

People who have been trusting the R2K reports should know about these extreme anomalies. We do not know exactly how the weekly R2K results were created, but we are confident they could not accurately describe random polls.

We thank J. M. Robins and J. I. Marden for expert advice on some technical issues and Markos Moulitsas for his gracious assistance under very trying circumstances. Calculations were done in Matlab, except for calculations of very low probabilities, which used an online tool.

Postscript: Del Ali, the head of R2K, was contacted by Markos concerning some of these anomalies on 6/14/2010.  Ali responded promptly but at the time of this writing has not yet provided either any explanations or any raw data. We sent a draft of this paper to him very early on 6/28/2010, with an urgent request for explanations or factual corrections.  Ali sent a prompt response but, as yet, no information.

Mark Grebner is a political consultant.

Michael Weissman is a retired physicist.

Jonathan Weissman, a wildlife research technician, is opening a blog with Michael for more accessible in-depth explanations, Q&A, arguments, etc. on the technical side of these poll forensics.

Originally posted to Daily Kos on Tue Jun 29, 2010 at 10:01 AM PDT.


    •  Now I'm going to puzzle over this all day (13+ / 0-)

      Without a doubt, anomaly (1) could not be the result of random chance, but something else.  The question is, what on Earth is that something else?  

      What sort of mistake could possibly result in the Men/Women pairs (almost) always having the same residue modulo 2?  

      One possible explanation is that integer rounding occurred in the average of two numbers.  Suppose I have numbers M=49 and W=52.  I decide to store their average A=(M+W)/2=50.5, and this is mistakenly stored in an int as A==50.  If the final numbers are somehow derived from this value (e.g. M=2A-W), you'd get pairs of numbers that have equal parity.
       
      Alternately, I may try to fabricate data by randomly generating a target average A for both men and women, and then picking a small deviation D to produce realistically diverging pairs M=A-D, W=A+D.  This will produce the same result.

      So I'm guessing that this could simply be some kind of data processing error, or something far less ethical, but in either case an unwanted integer cast is to blame.
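      Caj's truncated-average hypothesis is easy to demonstrate. A minimal sketch in Python (illustrative only, not R2K's actual code):

      ```python
      import random

      def buggy_pair():
          m = random.randint(20, 80)   # true male percentage
          w = random.randint(20, 80)   # true female percentage
          a = (m + w) // 2             # average mistakenly truncated to an int
          return 2 * a - w, w          # M reconstructed from the truncated average

      # The reconstructed pair always shares parity: (2a - w) - w = 2(a - w) is even
      pairs = [buggy_pair() for _ in range(1000)]
      print(all((m - w) % 2 == 0 for m, w in pairs))  # True
      ```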

      •  If that were the case (4+ / 0-)

        the pollsters should have been able to supply either an explanation or the raw data, as they promised to do when the questions came up.

        I call crickets...

        Before you win, you have to fight. Come fight along with us at TexasKaos.

        by boadicea on Tue Jun 29, 2010 at 11:01:16 AM PDT

        [ Parent ]

        •  I don't see your logic (2+ / 0-)
          Recommended by:
          steve04, Sharon Wraight

          How does that follow from what I said?  

          If it was indeed the case that they generated fake data with a mistaken integer cast, then they wouldn't be able to supply a satisfactory explanation or hand over the raw data.

          Again, I'm not asking whether this is faked data or misprocessed real data; I'm asking what specific type of bug would produce the weird anomaly of even-even and odd-odd pairs.

          •  I think mistaken integer cast is right (6+ / 0-)

            There's a possibility that the numbers published are post-processed by a reporting script, which itself may be sloppy with integer vs. double casting.  

            For example, suppose the raw data by sex was initially broken down and stored as two values: the population average, and half the difference between male & female. When it is read back into memory to form a report, if the difference is read in as an integer, it would produce this error.

            For example, the population average could be 50.2, and if the male/female difference divided by 2 is 3 (an integer), then they'd end up with 47.2 and 53.2, both rounding to odd numbers.  With the integer +/- split, no matter what decimal-precision population average is used as a starting point, the final breakdown will always be odd-odd or even-even.

            Perhaps a similar thing could be at work with the week to week variation avoiding 0--it seems like they aren't converting from decimal to integer correctly in their number handling and something that should round down to 0 is always rounding up to +/- 1.

            If the above hypotheses are true, they would indicate several things: (1) not following the rule of carrying maximum precision data through a calculation until the end, where it should be rounded to the least number of significant digits of the input data, and (2) not understanding machine storage of numbers, which is pretty inexcusable for a statistics firm.

            In combination, (1) and (2) would point toward a lazy and ill-educated operation that might produce bottom of the pack accuracy, as already shown by 538.com

            No on Prop 8::Sometimes I get to hitch a ride on the Democratic Bus--they let me stand on the back bumper.

            by steve04 on Tue Jun 29, 2010 at 12:36:12 PM PDT

            [ Parent ]

            •  problem is... (1+ / 0-)
              Recommended by:
              SarahLee

              You're not considering the proportion of the population that is male (or female).

              You're considering the proportion of the male population (resp. female population) that has a favorable response to a certain question.

              In particular, you are considering

              fav(male)/tot(male)

              and

              fav(female)/tot(female)

              Why would an improper integer cast produce exactly the same behavior for both of these numbers?  You don't even see that kind of correlation between fav(female)/tot(female) and unfav(female)/tot(female), where it might be more excusable.  

              Why would they always both be simultaneously even or simultaneously odd?  

              I'm keen to hear any feasible explanation.  

              Gentlemen, you can't fight in here! This is the War Room.

              by RickD on Tue Jun 29, 2010 at 02:31:36 PM PDT

              [ Parent ]

              •  The point is not related to proper consideration (1+ / 0-)
                Recommended by:
                Sharon Wraight

                It is statistically conclusive that the numbers, as reported, cannot accurately reflect a legitimate poll.

                The question is whether the poll was falsified, or whether the processing of the poll data was done incorrectly (without appropriate consideration of population statistics, integer/decimal handling, etc.).  I don't think the odd odd / even even split is possible unless there was an error in digital handling of numbers representing the polling "data."

                No on Prop 8::Sometimes I get to hitch a ride on the Democratic Bus--they let me stand on the back bumper.

                by steve04 on Tue Jun 29, 2010 at 02:50:11 PM PDT

                [ Parent ]

                •  sure it's possible (0+ / 0-)

                  if it was done intentionally.

                  I'm curious as to why it would exist as part of botched "digital handling of the numbers".  Just how would you be handling the numbers so this kind of behavior would be the result?  

                  I cannot see this being the result of anything remotely innocuous.  Data entry isn't that hard, nor is it hard to calculate percentages.  I cannot even see how this arises as a result of a bogus RNG, even though in such a case I concede that code might exist that would produce such a result.  But I'd have to be shown such code and given an explanation as to why it was used before believing that it was the cause.

                  Look - it's 2010, not 1951.  The amount of number-crunching involved here can be handled by any spreadsheet program written in the 1980s.  That is, if the numbers were generated in any legitimate fashion.  

                  Gentlemen, you can't fight in here! This is the War Room.

                  by RickD on Tue Jun 29, 2010 at 03:07:55 PM PDT

                  [ Parent ]

                  •  Please just trust people who program (0+ / 0-)

                    Number crunching in computers depends on how the numbers are stored.  I tried to explain the issue in terms of integer vs. double precision--I'm really not interested in giving a computer science 102 class.

                    And all your talk of spreadsheets is nice--but Excel's recent release had bugs in how it did math that took many months to be fixed.

                    No on Prop 8::Sometimes I get to hitch a ride on the Democratic Bus--they let me stand on the back bumper.

                    by steve04 on Tue Jun 29, 2010 at 03:25:12 PM PDT

                    [ Parent ]

                    •  "trust people who program" (1+ / 0-)
                      Recommended by:
                      WillR

                      Um, I program.  I also have three degrees in mathematics (BA, MS, and PhD) and understand probability pretty well.  

                      I doubt that any bugs in the recent releases of Excel dealt with dividing one integer by another.  And, again, I would have to be shown some example of why the bugs would appear in exactly this fashion.  Saying "there might be bugs" doesn't bring us very far to explaining why a bug would produce exactly this behavior.

                      (And lose the condescension while you're at it.)

                      Let's put it this way: it would be extremely easy to write a program to intentionally produce this behavior.  It'd be pretty difficult to write a program that produced this behavior accidentally.  Why on Earth is the approval rating for men related in any way to the approval rating for women?  

                      You're focusing on the wrong questions.

                      Gentlemen, you can't fight in here! This is the War Room.

                      by RickD on Tue Jun 29, 2010 at 09:21:04 PM PDT

                      [ Parent ]

                      •  The Basic Issue is Why Would This Happen (1+ / 0-)
                        Recommended by:
                        steve04

                        The basic issue I see for the odd/even pairing is that it's highly unlikely someone just picking numbers at random would come up with such a consistent pairing.  This strongly suggests the results in case 1) weren't generated by a human being randomly picking numbers.

                        In contrast to case 2) or 3), both of which can be logically explained by human actors attempting to (poorly) simulate randomness, the results of case 1) are just plain bizarre.  It’s still obvious that the results aren’t random (1 in 10^216?), but that doesn’t make them any less weird, even for a computer. Unless a number-generating program were specifically designed to produce the flawed odd/even pairing, it seems unlikely this would arise by accident.

                        Caj is merely trying to suggest a plausible way these results might have arisen, without having been intentionally built into the program.  Also, Caj isn't claiming that the results weren't fabricated, only that in case 1) even a pure fabrication would be hard-pressed to produce these results.  After all, who would deliberately write a number-faking program to fake numbers in such a deliberately systematic way?

                        One other possibility is that they were deliberately put there by a human actor - either for mere amusement or possibly as a flag.  That doesn't make a lot of sense either, but neither do most of the other explanations.

                        •  rounding explains point #1 (0+ / 0-)

                          I can produce an algorithm in which rounding exactly produces the results found in point #1.

                          I will use specific numbers for an example. I will first calculate results without rounding until the end. Then I will show how rounding in the middle at two points produces the observed results. Skip down to ANALYSIS WITH ROUNDING to get a general idea of the explanation without too many mathematical details.

                          ANALYSIS WITHOUT ROUNDING

                          1.

                          First I'll give the example data. Start with a sample size of 600. This is split between 306 females and 294 males.

                          Of the 306 females, 143 have a favorable opinion of a particular person.

                          Of the 294 males, 129 have a favorable opinion of that person.

                          2.

                          Find percentages for each group.

                          143 of the 306 females comes to 46.73%.

                          129 of the 294 males comes to 43.88%.

                          3.

                          Find the percentage for the full population.

                          In all, 272 (45.33%) of all 600 respondents have a favorable opinion of that person.

                          4.

                          Given the total result in step 3, calculate how many males and how many females would have been expected to favor this person.

                          45.33% of 306 females is an expectancy of 138.72.

                          45.33% of 294 males is an expectancy of 133.28.

                          5.

                          Calculate the excess or deficiency of each number.

                          For females, there were 143 minus 138.72 extra females who favored this person. The excess is 4.28 females.

                          For males, there were 133.28 minus 129 fewer males who favored this person. The deficiency is 4.28 males.

                          The two numbers calculated above will always be exactly equal.

                          6.

                          Turn the excess or deficiency into a percentage.

                          For females, divide 4.28 by 306 to get an excess of 1.40%.

                          For males, divide 4.28 by 294 to get a deficiency of 1.46%.

                          These numbers are close but not exactly equal because the denominators differ. This is because there were unequal numbers of males and females in the sample.

                          7.

                          Add the excess percentage or subtract the deficient percentage to the percentage for the full population to get the final percentage for each group.

                          For females, 45.33% plus 1.40% gives 46.73%.

                          For males, 45.33% minus 1.46% gives 43.88% (I kept extra digits of precision not shown here to get this result).

                          These are the correct numbers calculated more directly in step 2. If we now round these final results, we get 47% and 44%, an odd-even pair.

                          ANALYSIS WITH ROUNDING

                          Now we add rounding.

                          In step 3, we round to 45%. In the general case, call this X.

                          In step 6, we round the excess and the deficiency to 1%. In the general case, call this Y.

                          In step 7, we add/subtract these rounded results.

                          45% +/- 1% gives 46% versus 44%.  This is an even-even pair, even though the correct result is an odd-even pair.

                          With this rounding, we will always get a result of X+Y and X-Y, where X and Y are each integers. The difference between these is 2*Y, so we will always end up with odd-odd or even-even pairs.

                          Now, note that the excess and deficiency in step 6 are not exactly equal. Normally, they will round to the same integral percentage. Occasionally we will hit a boundary in which one rounds up and one rounds down. If we have a sample size of 1200 and the difference between male opinions and female opinions is often in the range of 5% to 10%, then there is roughly a one in 500 chance of this happening. In fact, this study found that there were three cases of odd-even or even-odd pairs out of 1556.

                          Why would the pollster use such a strange calculation? It is normal to weight poll results to account for differences between the sample sizes of subgroups and their known distribution in the population. The type of calculation performed here could easily come about as a result of such weighting with incorrect use of rounding of intermediate results.

                          It is plausible that a similar rounding problem could explain the other two unexpected features of the poll results. I can't come up with an algorithm that does that, but I can't rule it out.
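                          The algorithm described above is straightforward to code. A minimal sketch in Python following the commenter's steps (illustrative, not R2K's actual pipeline):

                          ```python
                          def weighted_split(fav_f, n_f, fav_m, n_m):
                              """Round the topline (step 3) and the excess/deficiency
                              (step 6) separately, then combine (step 7): the output is
                              always X+Y and X-Y, so the pair always shares parity."""
                              n = n_f + n_m
                              x = round(100 * (fav_f + fav_m) / n)         # rounded topline: X
                              expected_f = (fav_f + fav_m) / n * n_f       # expected female count (step 4)
                              y = round(100 * (fav_f - expected_f) / n_f)  # rounded excess: Y
                              return x + y, x - y                          # female, male

                          print(weighted_split(143, 306, 129, 294))  # (46, 44): the even-even pair above
                          ```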

                          •  Anything's possible. (0+ / 0-)

                            Once we agree that the published crosstabs don't correctly represent actual polling responses, we are left to wonder where they really came from.  A process like the one proposed, with multiple intermediate calculations, each using improperly rounded factors, could easily have played a part.

                            In the actual data set, there were many divergences of reported male and female opinions over 20%, which would have resulted in more cases of non-matched parity.  Of course, the actual series of calculations may have been more tangled than we imagine.

                            I don't understand why this would have been needed for rounding; it would seem more natural as a step to synthesizing M/F splits that match a previously synthesized topline result.  That is, take the rounded topline and apply an offset to create each of the required subgroups, using weights that assure they add up to the required total.

                            Speculation on how it may have been done is a very feeble tool.  A statement from Del might clear it all up in an instant.

                      •  Look elsewhere in the thread. (0+ / 0-)

                        I've given an example of how the data could have been stored/read back in during an intermediate processing step that would produce the odd odd/even even result.  It just so happens that the only other hypothetical bug proposed by anyone in this thread was proposed by Caj, and is the same one I came up with without too much time spent.  If we can come up with one explanation, there are surely others.  As such, what you refer to as a "proof" by the "expert witnesses" is not a proof but a hypothesis that ignores potential conflicts.

                        No on Prop 8::Sometimes I get to hitch a ride on the Democratic Bus--they let me stand on the back bumper.

                        by steve04 on Wed Jun 30, 2010 at 12:52:24 PM PDT

                        [ Parent ]

                      •  Here's one of the excel bugs (0+ / 0-)

                        http://blog.meteorit.co.uk/...

                        Your doubt about dividing one integer by another is confusing to me--I never said anything about needing two integer casts to produce the odd odd/even even split, and if you knew your computer math well, you would know that if you start a calculation with a single integer, no matter how many higher precision numbers are used subsequently in the calculation, the intermediate value is stored in the computer with integer precision.

                        As for whether it's easy to produce this data intentionally, we can discount that case.  R2K is sunk as a business if they did that--they were a real polling firm before Kos hired them, and they would like to continue to have a business as a real polling firm.  Similarly, I think we can discount a human manually entering those numbers, because a human would easily see the pattern they were generating and would take steps to make it look more random, with the intent of continuing to have a business and a job.

                        No on Prop 8::Sometimes I get to hitch a ride on the Democratic Bus--they let me stand on the back bumper.

                        by steve04 on Wed Jun 30, 2010 at 01:00:50 PM PDT

                        [ Parent ]

          •  Entomology (4+ / 0-)
            Recommended by:
            steve04, kurt, kingneil, pixxer

            "what specific type of bug would produce the weird anomaly of even-even and odd-odd pairs"

            a) One variable ended up dependent on the other when they're not supposed to be, which could happen if a randomizing factor was supposed to have a 0.5 probability but ended up being closer to 1.0 one way or the other, or

            b) The raw generated results are sanitized through a filter to eliminate "impossible" results, and "impossible" at runtime ended up actually being "different".

            Either of these kinds of errors could result from as little as a single misplaced decimal point in the code, even if the equations the code was implementing were correct, which they might not be.

            Spend a few decades programming like I have, you will no longer see anything weird about anomalies like this showing up in your models, I assure you...

      •  Invented data (2+ / 0-)
        Recommended by:
        JVolvo, Coach Jay

        Read all the other polling data and then invent your own.

        The US Senate is begging to be abolished. Let's fulfill its request.

        by freelunch on Tue Jun 29, 2010 at 11:07:56 AM PDT

        [ Parent ]

        •  Again, that isn't an explanation (10+ / 0-)

          That doesn't explain why the numbers would occur in pairs with this weird identical parity.  

          If you honestly poll tons of people, this sort of pattern will never happen.  But if you fake all the numbers with a random number generator, this pattern will never happen either.  Try generating hundreds of fake poll numbers, and see if the results magically fall into even-even and odd-odd pairs.

          There must either be an integer conversion error in the processing of real data, or an integer conversion error in the generation of fake data.  Either way, it's a puzzle.

        •  No, if they were inventing, they would have been (4+ / 0-)

          careful not to make it obvious. This is obvious.  I agree that it is some kind of rounding error.

          •  But rounding doesn't produce matched even-odd (1+ / 0-)
            Recommended by:
            Lefty Mama

            Agree that my first thought was rounding error.  But the rounding of male numbers is independent of the rounding of female numbers.  I.e., assume they erroneously always rounded any fraction up to the nearest whole number.  If male is 56.3 or 56.8, you get 57.  If the female number is 45.3 or 45.8, assuming the same rounding error (up), your result will be 46.  One odd, the other even.

            Even if the rounding error were more complex and bizarre and unlikely, like "round all fractional odd numbers down, all fractional even numbers up," you don't get these results where there's a variance from paired-odds to paired-evens from week to week.  I.e., male 47.8 is rounded to 47.  Female 56.3 is rounded to 57.  They're both odd, but under this rule, ALL the results would ALWAYS be odd.  Rounding odds down and evens up would always get you male and female matched odds, but would mean no poll results were ever even, and these were.

            It can't be rounding error.

            Thought is only a flash in the middle of a long night, but the flash that means everything - Henri Poincaré

            by milton333 on Tue Jun 29, 2010 at 11:56:16 AM PDT

            [ Parent ]

            •  I suggested elsewhere in this thread (4+ / 0-)
              Recommended by:
              Caj, steve04, TexasLiz, MooseHB

              That the even/odd correlation was the result of artificially splitting doubled numbers (which would of course have to be even).  Any pair of numbers which have to add up to an even number are going to be either both odd or both even.

              That could be wrong, but it seems as plausible as anything else.  If correct, it suggests that R2000 at the very least massaged their initial numbers to create fraudulently detailed crosstabs.  It's just possible that the basic numbers are correct, but the crosstabs are faked; but I'm not holding out too much hope for that.

            •  As detailed by several others... (0+ / 0-)

              ...it is possible for a rounding error, albeit a fairly contrived one, to produce this type of effect.

              You're assuming that the rounding is applied to each number independently; if instead rounding is mistakenly applied to an average from which both numbers are derived, it can give them both the same residue modulo 2.

              But more to the point:  if it wasn't rounding, what on Earth could it be instead?

              •  rounding error is hardly an explanation (0+ / 0-)

                You're telling me that if you take number 1 minus number 2, the result is always, always, always going to be even. I'm not sure it's possible to come up with a rounding scheme to do that.

                I mean, if you have percentages of 2.1 and 8.9, how can you possibly round those to be either 2 and 8 or 3 and 9? How?  There is no consistent way to round.  And let's say "well, they always round up."  Well, what if it's 1.9 and 8.9 then?

                It couldn't even be something where, if it's odd they always round up (to the next even number) and if it's even, they always round down (to the existing even) because it's the margin that's always even, not the numbers.

                Someone brought up the suggestion that it would be "impossible" for the same thing to happen with a random number generator.

                That would be correct, which means we can conclude that they didn't use a random number generator (which they probably wouldn't anyway) but it in no way makes their numbers more valid.  Obviously whatever program they wrote to create these numbers, and I'm pretty convinced they were pulled out of thin air, it was written so that the margin in the crosstabs was pretty much always even.

                Perhaps to reduce suspicion, they then "calculated up" so that one didn't see the same numbers in the full numbers, but even then that doesn't explain the "lack of zero" problem, unless ALL the numbers were made up, but a check was done to make sure that if someone decided to manually re-calculate the total from the crosstabs, they still matched.

      •  This looks like a possibility. From TPM: (3+ / 0-)
        Recommended by:
        Gorette, MooseHB, Blue Shift

        Here's our initial write up of the Kos/R2K story. Our reporter got in touch with Del Ali, head of R2K who is declining to comment on advice of his lawyer but says "I will tell you unequivocally that we conducted EVERY poll properly for the Daily Kos."

        http://www.talkingpointsmemo.com/

        It is interesting that the response emphasizes having CONDUCTED the polls correctly. Not necessarily the analysis, though.

    •  Kudos on Transparency and this DOES explain Ohio (3+ / 0-)
      Recommended by:
      Gorette, greenearth, Tamar

      Thank you, Markos, for having the integrity to publish this.

      This goes a long way toward explaining the mysterious divergence in polling data for Ohio this cycle. Generally, this cycle, PPP, Rasmussen and the Ohio Poll -- all reputable in the case of Ohio -- have closely tracked each other on the major races for Governor and US Senate, showing Strickland with a consistent recent lag (except during the final primary months when Democrats dominated TV advertising).

      (One other caveat -- the Ohio Poll underwent major internal changes in late winter after a number of member newspapers withdrew. It's no longer the same polling outfit.)

      On the other hand, Quinnipiac has dominated Ohio polling, with blatantly screwy results, diverging from the above three. Quinn has consistently shown Strickland winning by a wide margin in northwest Ohio, which is impossible given historic patterns. It's been clear that Quinn is undersampling Independents, who are trending Republican this year.

      R2000 has been the only pollster to back Quinn's offbeat Ohio results, to the extent of reproducing Quinn's impossible numbers for northwest Ohio.

      This has not been legit. Someone was reproducing Quinn's flawed methodology with intention. I hope it's now corrected.

      PPP is out with a new poll today that shows Republican Kasich leading Strickland by 2 points, but Strickland stuck at 41%, with still a huge undecided segment. A Democratic incumbent in Ohio cannot win with only 41% this late in the game (and only 37% approval).

      Quinn is also out with a new Ohio poll today with the same garbage pattern and not worth reporting.

      Let's at least be clear on what's happening in Ohio.

      "Politics: The conduct of public affairs for private advantage." -- Ambrose Bierce

      by Ohiobama on Tue Jun 29, 2010 at 12:06:35 PM PDT

      [ Parent ]

      •  I don't think... (1+ / 0-)
        Recommended by:
        ItsSimpleSimon

        The polls support what you are claiming is happening in Ohio.  Consistently, different pollsters have put Strickland ahead of Kasich, including a brand new poll that just came out TODAY.  I think you're reading way too much into this without much supporting data.

        •  You didn't read what I said (0+ / 0-)

          The new poll that shows a Strickland lead is yet another by Quinnipiac. Look at the crosstabs. They are totally screwy -- impossible by historic regional trends.

          If you take R2000 out of the mix, that leaves only Quinn as the outlier giving Strickland a chance. Quinn is unreliable to the point where it needs to be disregarded.

          All other polls had Strickland lagging and declining, with the exception of near the end of the primary season, but that's now gone.

          "Politics: The conduct of public affairs for private advantage." -- Ambrose Bierce

          by Ohiobama on Wed Jun 30, 2010 at 05:13:15 AM PDT

          [ Parent ]

  •  Well done, Kos. (40+ / 0-)

    You show honesty and respect for the community by disclosing all this and you show integrity and commitment to truth.

    I hope R2K burns for this.

    We have just enough religion to make us hate, but not enough to make us love one another. -- Jonathan Swift

    by raptavio on Tue Jun 29, 2010 at 10:12:36 AM PDT

  •  Wow (10+ / 0-)

    What more can I say....

    Religion gives men the strength to do what should not be done.

    by bobtmn on Tue Jun 29, 2010 at 10:13:21 AM PDT

  •  Statistics detectives (17+ / 0-)

    nice work!

    "Empty vessels make the loudest sound, they have the least wit and are the greatest blabbers" Plato

    by Empty Vessel on Tue Jun 29, 2010 at 10:14:25 AM PDT

  •  Kos (29+ / 0-)

    I applaud your transparency on this issue.

    We did not come to fear the future. We came here to shape it. -President Barack Obama

    by BDsTrinity on Tue Jun 29, 2010 at 10:14:36 AM PDT

  •  I am still reading this, but wow. (14+ / 0-)

    Burning question: Why? Just for the sake of money? or for some other reason?

    There are moments when the body is as numinous as words, days that are the good flesh continuing. -- Robert Hass

    by srkp23 on Tue Jun 29, 2010 at 10:15:19 AM PDT

    •  Actually calling people is hard work (17+ / 0-)

      Seems like they studied the management style of GWB just a tad too closely.

      Better to just make it up and speak decisively.

      -2.38 -4.87: Damn, I love the smell of competence in the morning!

      by grapes on Tue Jun 29, 2010 at 10:24:02 AM PDT

      [ Parent ]

      •  Not only will the pollster (7+ / 0-)

        get the money to be paid, but pocket the money they would have paid in phones and personnel.

        "Empty vessels make the loudest sound, they have the least wit and are the greatest blabbers" Plato

        by Empty Vessel on Tue Jun 29, 2010 at 10:39:14 AM PDT

        [ Parent ]

        •  And can/will kos go after them for their (4+ / 0-)

          shenanigans?  All those polls since, what 2006-7? (dunno); that's gotta be some serious ching.  We could all have powered recliners with plasma vidscreens for dkos blogging for all that $$$

          Just found this via TPM

          Calling into question years worth of polls, Daily Kos founder Markos Moulitsas said today his site will sue pollster Research 2000 after a statistical analysis showed that R2K allegedly "fabricated or manipulated" poll data commissioned by Kos.

          Go gettum, kos!  I want my heated recliner with floating plasma vidscreen*!

          *Ahem, strictly for blogging here...

          The Republican motto: "There's been a lot of progress in this country over the last 75 years, and we've been against all of it." ~ Hillbilly Dem's 78-yo Dad

          by JVolvo on Tue Jun 29, 2010 at 11:47:31 AM PDT

          [ Parent ]

      •  Kos should hire a reputable pollster (1+ / 0-)
        Recommended by:
        G2geek

        (Mark Penn?) to do some polling to find out if the expected number of people were actually polled by R2K.

        That ought to solve the mystery of whether or not this data is simply made up out of thin air?

        •  Sample sizes are tiny. (1+ / 0-)
          Recommended by:
          G2geek

          We're talking about 2,000 people representing maybe 200,000,000. So one in every hundred-thousand. The margin of error on these polls tends to be in the mid single digits, and we'd be trying to measure something we expect to come in at 0.001%.

          “If I can't dance to it, it's not my revolution.” — Emma Goldman

          by Jyrinx on Tue Jun 29, 2010 at 11:40:21 AM PDT

          [ Parent ]

          •  That's 2,400 people for 60 weeks (1+ / 0-)
            Recommended by:
            Jyrinx

            or an order of magnitude more than you say.

            That's way more people than watch Channel 297, yet they presumably can measure that by sampling methods . . . .

            •  Okay, so it's 0.06%, not 0.001%. (0+ / 0-)

              Still way too small to measure by national telephone polling.

              “If I can't dance to it, it's not my revolution.” — Emma Goldman

              by Jyrinx on Tue Jun 29, 2010 at 12:30:55 PM PDT

              [ Parent ]

              •  Then add the "six degrees of Kevin Bacon" (0+ / 0-)

                twist to the polling efforts . . . just ask each of the respondees to find somebody who has.  It shouldn't be that difficult . . .

                Six degrees of separation (also referred to as the "Human Web") refers to the idea that everyone is at most six steps away from any other person on Earth, so that a chain of, "a friend of a friend" statements can be made to connect any two people in six steps or fewer.

              •  Nope (0+ / 0-)

                It is too small to get a sense state by state, but the margin of error that goes with the polling is reliable. Statistical analysis works very well when done properly.

                The US Senate is begging to be abolished. Let's fulfill its request.

                by freelunch on Tue Jun 29, 2010 at 12:37:25 PM PDT

                [ Parent ]

        •  Probably way too much noise? (0+ / 0-)

          I assume (haven't checked though) that R2K claims to have polled people selected randomly for each poll from a large sample pool.

          I suspect that very few people who are polled even pay attention to the name of the organization that polled them let alone recall it weeks later. Perhaps those polled by a really well known name such as Gallup would be more likely to recall, but after a couple weeks, I suspect recollection of even that would be rare among most people. Most types of polls don't suffer from this recall problem as they are either asking for current opinion, current state (such as "Do you own more than two cars"), or very significant past events (such as "Has one or more of your children ever died under the age of 18 in a car accident where they were driving?")

          So the percentage of responses that are reliable would be tiny.

          If you asked "Have you been polled via phone by any pollster about political issues in the past six months?" you might get fairly reliable results. Although you would probably get a number of false negatives from people who thought the poll was more than six months ago when it was not and a lot of false positives by those who recalled the event as having occurred in the last six months when it didn't (I think people are biased towards answering "yes", or its equivalent, partially due to cultural influences and partially due to feeling more important by being part of a select group).

          Once someone has, hopefully correctly, indicated they were polled by some organization, I suspect almost none could correctly name the pollster in a "fill in the blank" type of question unless it was Gallup where they might occasionally. A lot would also, I suspect, incorrectly say "Gallup". If you give them a list, people are likely to pick one inaccurately for many reasons and would probably trend towards names they had heard through other sources. This is kind of like the memory test where a subject is given a list of items such as "hay, pin, game, thimble, sharp, medication" and then, a day later, asked to iterate the list and they are likely to include terms like "haystack", "needle", "thread", "sew", and "hypodermic". Don't discount the "Alvin Greene Effect"!

          Note it would be very unusual for one individual to have been polled twice by R2K so they would not have the benefit of repetition to recall the event ("Who just called dear? Oh, it was R2K yet again - I don't mind answering polls occasionally, but those guys call me every week.")

          So, a lot (perhaps most) of the answers would be unreliable which coupled with difficulties others have noted would make this a daunting task. Perhaps one could calibrate some of the influences I mentioned by running some fake polls followed by a real poll to check recall of polling events. This calibration might help quantify the degree of uncertainty of the R2K validation polls. In these fake polls, random people would be told they are being polled by [cycle through "PollsRUs", "Honest Abe Polling", "Gallup", "Data Cooking Inc"] and then ask these people some of the actual questions R2K asks. Then call these same people at various times over the following nine months with a "Have you been polled in the past six months and by whom?" poll.

    •  I kept thinking "why?", too. (4+ / 0-)
      Recommended by:
      Wee Mama, G2geek, JVolvo, jds1978

      What bubble do these people live in? With Markos putting out all the data, how did they ever think they could get away with this??

      Just sorta makes my head explode.

      It's only water. What could go wrong???

      by MrSandman on Tue Jun 29, 2010 at 10:42:11 AM PDT

      [ Parent ]

      •  If they aren't sued successfully (0+ / 0-)

        and perhaps even if they are, they may yet get away with it.

        "They paved paradise, and put in a parking lot."
        "...Don't it always seem to go, that you don't know what you've got 'til it's gone?"
        - Joni Mitchell

        by davewill on Tue Jun 29, 2010 at 10:54:25 AM PDT

        [ Parent ]

        •  That assumes (2+ / 0-)
          Recommended by:
          HudsonValleyMark, sydneyluv

          that they place zero value on their reputation as a polling company.

          Just because you can doesn't mean you should, but if you can, and you should, then you ought.

          by firant on Tue Jun 29, 2010 at 11:05:17 AM PDT

          [ Parent ]

          •  In "The Sociopath Next Door" Martha Stout (4+ / 0-)
            Recommended by:
            G2geek, JVolvo, Justus, sydneyluv

            points out that most sociopaths flame out because they have so little regard for good people that they figure that they can get away with anything.

            The US Senate is begging to be abolished. Let's fulfill its request.

            by freelunch on Tue Jun 29, 2010 at 11:10:15 AM PDT

            [ Parent ]

          •  If "they" are employees, then yes. (1+ / 0-)
            Recommended by:
            G2geek

            It's pretty common for a company to have employees who don't give a rat's ass about the company's reputation or mission or legacy.  

          •  Isn't that obvious? (1+ / 0-)
            Recommended by:
            G2geek

            If they placed value on their reputation, they wouldn't have cheated in the first place.

            "They paved paradise, and put in a parking lot."
            "...Don't it always seem to go, that you don't know what you've got 'til it's gone?"
            - Joni Mitchell

            by davewill on Tue Jun 29, 2010 at 12:29:21 PM PDT

            [ Parent ]

            •  Allow me to revise and extend... (2+ / 0-)
              Recommended by:
              RF, G2geek

              stating that they got away scot-free requires the underlying predicate that there is no value in the company's reputation.

              This is a polling company, to be caught making up numbers is to be destroyed.

              Just because you can doesn't mean you should, but if you can, and you should, then you ought.

              by firant on Tue Jun 29, 2010 at 06:25:10 PM PDT

              [ Parent ]

              •  Too bad (0+ / 0-)

                County election officials throughout each and every county in the country didn't go by the same edict-

                ...to be caught throwing away ballots is to be destroyed...

                Reality is more like...

                ...to be caught throwing away ballots is to be ignored...

                Thinking Ohio, Arkansas, South Carolina and the infamous Florida butterfly toss, just to name a few notable eyebrow raisers ...

                Good polling can make red sore counties unfavorably inflammatory; I for one would welcome a cleansing and microscopic inspection of intent at the ballot box. If this much scrutiny were focused on the actual voting process, recent results in South Carolina and Arkansas, and the distant elections held in Ohio in 2004 and Florida in 2000, would be much harder for even a Roberts SCOTUS to square-

                For now, this type of work and forensics is very good for America; IMHO most probably not an opinion shared by all political persuasions-

                Evidence that contradicts the ruling belief system is held to extraordinary standards, while evidence that entrenches it is uncritically accepted. -Carl Sagan

                by RF on Wed Jun 30, 2010 at 07:39:58 AM PDT

                [ Parent ]

                •  Fairly off topic ... (1+ / 0-)
                  Recommended by:
                  RF

                  but I don't believe there's anyone (at least here) who would disagree that messing with the vote is a whole different level of evil.

                  On the other hand, the reputations of the county organizers and the state attorneys are a) sheltered by the denunciation that any vote-destruction evidence is part of a kooky left-wing conspiracy theory, and b) the gains of vote fraud are significantly greater than the losses, given that local law enforcement in those cases was generally on the side of the fraudsters.

                  Just because you can doesn't mean you should, but if you can, and you should, then you ought.

                  by firant on Thu Jul 01, 2010 at 01:25:24 AM PDT

                  [ Parent ]

      •  maybe the same hubris that allows (1+ / 0-)
        Recommended by:
        G2geek

        so many Congressmen to have affairs; so many gay political and religious celebrities to think they can hide their private sexual encounters behind publicly and fervently expressed homophobia.
        The combined arrogance and stupidity of some people is always astounding.

        If, in our efforts to win, we become as dishonest as our opponents on the right, we don't deserve to triumph.

        by Tamar on Tue Jun 29, 2010 at 02:54:09 PM PDT

        [ Parent ]

    •  I have noticed over the last 10 years (2+ / 0-)
      Recommended by:
      JD SoOR, G2geek

      the subtle creep of incompetence and fraud into all lines of business.  I can't help thinking much of that is related to the tone set by the "leadership" of Bush and Cheney.  After a while, when people in high places are seen to be complete fuckups, there are an awful lot of people at the grassroots who take that as a stamp of approval.

      It doesn't help that guys like Bush, Cheney, Brownie, Libby...and on and on and on...got away with it.

  •  Damn. (7+ / 0-)

    But I'm not givin' in an inch to fear, 'cause I promised myself this year. I feel like I owe it to someone.

    by Its the Supreme Court Stupid on Tue Jun 29, 2010 at 10:17:23 AM PDT

  •  Very curious indeed (14+ / 0-)

    They had to use some technique for developing these numbers, and unless they are incredibly stupid (a real possibility), there should not have been any way to come up with these anomalies. Even making the numbers up from thin air should have eliminated the even-odd anomaly at the top. Perhaps they used much smaller populations, and used statistical techniques to pretend that they were the size reported.

    Do Pavlov's dogs chase Schroedinger's cat?

    by corwin on Tue Jun 29, 2010 at 10:17:54 AM PDT

    •  We don't tend to hear about the smart greedy (14+ / 0-)

      people until after they go too far and we start referring to them as stupid greedy people.

      Republican ideas are like sacks of manure but without the sacks.

      by ontheleftcoast on Tue Jun 29, 2010 at 10:33:32 AM PDT

      [ Parent ]

    •  There's a damned if you do and damned if you (5+ / 0-)
      Recommended by:
      corwin, Swordsmith, dnta, G2geek, Justus

      don't problem.

      Real random numbers are always going to do something that looks suspicious over time.  When our brains hear "random," they really picture something neatly interleaved.  But real randomness is clumpy.

      My favorite example would be a one-and-done coin flipping tournament.  No matter how many people you have in the tournament, the winner will call it correctly every single time.

      The overall ratio for the entire tournament will be a nearly perfect 50% if you've got enough flippers, but the winner will be a perfect 100%.  Likewise, half the losers plus 1 will be a perfect 0%.
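
      (A toy Python sketch of that tournament, purely illustrative: exactly half the field advances each round, so the eventual winner has called ten straight flips correctly by luck alone.)

        n = 1024        # entrants in a call-the-flip knockout
        rounds = 0
        while n > 1:
            n //= 2     # each round, half the callers guess right and advance
            rounds += 1
        print(rounds)   # 10 -- the winner went "10 for 10" by pure chance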

      So -- if you're faking numbers, do you want to make them look fishy to most people or look fishy to an expert who probably never gets down and dirty with your numbers because you have a track record?

      The answer is probably avoid looking fishy to the lay people because that might cause enough agitation that somebody hires an expert to look your stuff over.

      Free speech? Yeah, I've heard of that. Have you?

      by dinotrac on Tue Jun 29, 2010 at 10:56:13 AM PDT

      [ Parent ]

    •  An alternate explanation (6+ / 0-)

      This could be simply due to a software bug.

      It would be hard for it to be due to a software bug in the software for interpreting their results, though I suppose possible. But it's much more likely to be due to a software bug in the software they use to fabricate their results.

      Insufficient QA strikes again? I wonder.

      -fred

      •  Yes. (8+ / 0-)

        Of course, it could be a software bug in processing legitimate poll data, or it could be a software bug in generating fake data.

        I think either way there has to be a bug, to produce such a weird and consistent anomaly.

        It would be hard for it to be due to a software bug in the software for interpreting their results, though I suppose possible. But it's much more likely to be due to a software bug in the software they use to fabricate their results.

        I can't say for sure, because I've seen some pretty nutty bugs.  

        But overall I share your suspicions:  I'm trying to think how this could happen by accident to natural data, and how it could happen when making fake data, and everything I can think of for the former case just seems a bit too contrived.

        •  Totally agree. (0+ / 0-)

          Nobody falsifying data would do the odd-odd / even-even thing.  A software bug -- of a variety of types, but most involving integer-handling errors -- could explain several of the anomalies.  Kos needs to be careful with public statements on this, lest it be shown to be an innocent data-handling bug rather than malicious fabrication of input data; if it's the former, Kos's public statements might constitute slander.

          No on Prop 8::Sometimes I get to hitch a ride on the Democratic Bus--they let me stand on the back bumper.

          by steve04 on Tue Jun 29, 2010 at 12:45:18 PM PDT

          [ Parent ]

          •  bullsh*t (2+ / 0-)
            Recommended by:
            G2geek, WillR

            "Kos' public statements might constitute slander."

            Kos says he doesn't have confidence in the numbers.  He says they haven't stood up to analysis by other statisticians.  He has given them the opportunity to respond.

            Where's the slander?  

            Gentlemen, you can't fight in here! This is the War Room.

            by RickD on Tue Jun 29, 2010 at 02:37:58 PM PDT

            [ Parent ]

            •  Really? (0+ / 0-)

              Kos is great and all, but he's walking a fine line between protecting his reputation and making unprovable/irresponsible statements about R2K before he even launched his lawsuit.  I don't envy his position at all--not least because he has a book at the printer right now based on R2K numbers.

              No on Prop 8::Sometimes I get to hitch a ride on the Democratic Bus--they let me stand on the back bumper.

              by steve04 on Tue Jun 29, 2010 at 03:27:35 PM PDT

              [ Parent ]

              •  Someone's about to lose a ton of credibility (1+ / 0-)
                Recommended by:
                G2geek

                Somebody is mistaken or worse. To me, R2K is suspicious. As Nate Silver pointed out, they produced nearly 'perfect' data as well as data that doesn't track well with other pollsters. What's most disconcerting isn't only that the data seems, if forged, clumsily so. It's that the problems only reveal themselves after closer examination, just as they would if people were trying to deceive.  A code bug wouldn't produce data that was superficially convincing but hollow on closer examination. Bugs aren't that intelligent - they don't command the 'tactical deception' cognitive module. A bug would probably produce data that very obviously strained credibility.

                We should wait until all the facts are in before jumping to conclusions, but I think that something very serious is afoot here. The fact that Research 2000 wouldn't put in the legwork to keep a valuable client - and also prevent said client from potentially destroying their business - by backing up its own data with proof raises some serious red flags.

              •  "unprovable/irresponsible statements about R2K" (1+ / 0-)
                Recommended by:
                G2geek

                Apparently you are not qualified to judge the proofs offered by the expert witnesses.  This case is a slam dunk.

                You do not see behavior like this in data that has been collected by a random sampler.  The evidence given is conclusive.  

                Gentlemen, you can't fight in here! This is the War Room.

                by RickD on Tue Jun 29, 2010 at 09:27:12 PM PDT

                [ Parent ]

                •  I am qualified, I'm not impressed (2+ / 0-)
                  Recommended by:
                  steve04, G2geek

                  I'm a graduate student at Princeton doing statistical consulting for the neurology department, and work in Polling Analysis with Sam Wang at the Princeton Election Consortium.

                  The results are odd but not conclusive. This seems more likely to be the result of a software bug, and possibly some funky stratification, than outright fraud.

                  It doesn't help that the accusers, Weissman and co, have a bit of a reputation in the poli-sci community as being whack-jobs.

                  •  Seems unlikely. (0+ / 0-)

                    It seems unlikely that all these anomalies could be explained by "a" software bug. It would seem to require multiple pretty amateurish bugs.

                    This is not impossible, of course. I see it mostly in those who seem to program by random keystrokes (well, not quite, but seemingly close) until the program compiles, then iterate through a fix/unit-test cycle until it passes the tests and is declared done. This works only if your tests actually provide 100% coverage (and the tests correctly reflect the "requirements") -- but the first assumption is effectively impossible to achieve, and the second is not possible without formal specification and test-validation methods which, as far as I can tell, are not commercially viable. Note that in this process, qualified independent detailed review of requirements, designs (if they exist), code, and tests is rare. This is the approach of "Testing Quality In" (a.k.a. "Teaching the Software to the Test") rather than "Designing Quality In and Validating as a Check" -- the former just doesn't work well.

                    Anyway, the number of bugs that would have to exist in the software to create just these errors suggests that ALL the code is probably unreliable, and ALL results from it should be considered useless until independent parties review or independently redesign/rewrite the programs and test them against data sets OTHER than the R2K data sets. Once the new programs are fully tested, they should be applied to R2K's RAW data to see whether the results match all "corrected and final" (if any) R2K results.

                    [I find many companies that claim to be following an "agile" development methodology are actually doing something like this.]

                •  You just don't understand what has been proven. (0+ / 0-)

                  I don't deny that the results as reported cannot accurately represent a random sampling.

                  The part that hasn't been proven relates to fraud, intent, and scope.

                  Please, spare me the simplistic conclusions about my qualifications to judge matters relating to digital processing of numbers and statistical sampling.  Would you care to share what your qualifications are, such that you're in a position to judge mine?

                  No on Prop 8::Sometimes I get to hitch a ride on the Democratic Bus--they let me stand on the back bumper.

                  by steve04 on Wed Jun 30, 2010 at 12:47:47 PM PDT

                  [ Parent ]

          •  there is no way this is a bug of valid data (1+ / 0-)
            Recommended by:
            WillR

            And R2000's lack of statement or response further supports this.  If it were a bug, what they should have done was say, "our bad, we think there may be a software problem, we'll get back to you."  But they didn't.

            I cannot think of any possible rounding method that could come up with the results that we see, where the difference between the numbers is ALWAYS even.

            Occam's razor would suggest that the numbers were flat out made up.

            •  asdf (1+ / 0-)
              Recommended by:
              steve04

              I cannot think of any possible rounding method that could come up with the results that we see, where the difference between the numbers is ALWAYS even.

              I can:  you store the mean of the two numbers, and the (signed) deviation from that mean.  If these numbers are accidentally stored as integers, any data derived from them will have this property.

              Note that this will happen with every possible rounding method.  Whether you round up, or down, or to the nearest integer, however you handle 0.5, it doesn't matter:  as long as the average and deviation are converted to integers, (A+D) and (A-D) will differ by an even number.
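
              (A minimal Python sketch of that hypothesized bug, with invented numbers -- not R2K's actual code, of course:)

                def buggy_store(m, w):
                    a = int((m + w) / 2)   # mean, accidentally truncated to an integer
                    d = int((m - w) / 2)   # deviation, same accidental truncation
                    return a + d, a - d    # reconstructed (m, w): differ by 2*d, always even

                for m, w in [(34.767, 43.126), (34.226, 43.126), (43.0, 59.9)]:
                    rm, rw = buggy_store(m, w)
                    print(rm, rw, "same parity:", rm % 2 == rw % 2)   # True every time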

              •  well (1+ / 0-)
                Recommended by:
                Nowhere Man

                I can at least confirm your math on that part.  However:

                There are two complete and utter fails if this is what happened:

                1. storing the numbers as integers, which would be a massive programming failure, AND
                2. Why are you even calculating the numbers in that way anyway?  It still takes two variables/spaces/whatever to save the data, so why not straight up save it as M and W instead of A and D?  It makes no sense, and a decision to store them in such a way defies all comprehension.

                I guess you've shown that it's possible for a pollster to conduct a valid poll and end up with statistically impossible-looking data, given the assumptions that whoever wrote their program was both an absolute moron and a braindead programmer, AND that no one ever noticed that, hey, our M and W numbers changed between going into the program and coming out.

                It still doesn't explain why R2000 didn't immediately say something like "we are launching an investigation as to why this is the case" instead of saying "we'll give you our raw data" and then failing to do so.

                And further, it only explains away one of the three problems they found.  There are still the other two to deal with.  Or were those a result of stupidity as well?

                •  asdf (1+ / 0-)
                  Recommended by:
                  steve04

                    1. storing the numbers as integers, which would be a massive programming failure, AND

                  While this is a big mistake, it is a depressingly common one.  Several famous bugs are due to improperly storing data as signed int variables.
                   
                  It doesn't help that in many programming languages, expressions like (100+1)/2 evaluate to a whole number by default.

                    2. Why are you even calculating the numbers in that way anyway?  It still takes two variables/spaces/whatever to save the data, so why not straight up save it as M and W instead of A and D?

                  Good question.  It is indeed daft and contrived to do it this way, and it doesn't sound like a plausible explanation for that reason.

                  But bear in mind that any other mistake that generates this type of anomaly would also be pretty daft and contrived.  I defy you to think of any other way to accidentally produce even-even and odd-odd pairs that doesn't also sound idiotic in retrospect.

                  •  Caj and I each independently came up with (0+ / 0-)

                    the same method.  I haven't seen anyone else attempt an explanation for the statistically impossible odd odd and even even pairings.  I can't easily come up with a second explanation, though surely others do exist.

                    In my day-to-day work crunching data, I will often read in a raw dataset and translate it into a slightly processed version that is easier to handle.  If you naively set up a model for tracking this data to prevent polling values from adding up to 101% rather than 100%, you might force some type of bug like the odd-odd/even-even breakdown that the R2K crosstabs have.  The difference is that I compare full-precision calculations to final rounded values to make sure they are accurate to the expected level of precision based on the inputs -- but that's because I'm an engineer, and if my calculations are off, the product doesn't work as well, and real-world physics comes home to roost in ways that are incontrovertible.

                    R2K has no incentive to falsify polling data in a way that is so obviously impossible--a human being generating those numbers would have produced a more pseudo-random pattern.

                    No on Prop 8::Sometimes I get to hitch a ride on the Democratic Bus--they let me stand on the back bumper.

                    by steve04 on Tue Jun 29, 2010 at 05:47:15 PM PDT

                    [ Parent ]

    •  Is it possible (1+ / 0-)
      Recommended by:
      G2geek

      that they have a program, where they plug in stuff like a range for each population group, a range for which way each population group would swing, and then a place to basically put in the final numbers you want, and have the program generate the rest.

      That could explain the "rare zero" anomaly: it suggests human-decided final results, with automated cross-tab numbers generated to appear valid even on a cursory review.

      I doubt that the crosstab numbers were generated manually, because most people probably WOULD vary the even/odd pairings, which suggests that whatever program was used to generate them had a glitch that caused the difference between the two numbers to always be even (save for 2 cases, apparently).

    •  Maybe they didn't make up the numbers (1+ / 0-)
      Recommended by:
      G2geek

      just misrepresented the number of contacts.  Rather than sample 2400, maybe they only sampled 1200 and multiplied each number by 2.  That would always get you an even number.  If the contact is the biggest time consumer, using a small number of actual contacts and just multiplying everything by 2 or 4 always makes for even numbers.

      "War is Peace, Freedom is Slavery, Ignorance is Strength", George Orwell, "1984" -7.63 -5.95

      by dangoch on Tue Jun 29, 2010 at 04:02:50 PM PDT

      [ Parent ]

  •  1 in 10^228 (14+ / 0-)

    (That's a 1 followed by 228 zeros.)

    Yeah, that's pretty damn unlikely.

    The play's the thing wherein I'll catch the conscience of the king.

    by KroneckerD on Tue Jun 29, 2010 at 10:18:32 AM PDT

  •  Great lesson in statistics (15+ / 0-)

    Great detective work by the 3 esteemed gentlemen! I hope Kos sues the pants off of these Research 2000 fraudsters. This is just outrageous fraud.

    I wonder if anyone has bothered to run similar analyses on other pollsters. Polling is like the Wild West - there is very little regulatory oversight. Something tells me R2000 is not the only pollster who has been tempted to cheat.

    •  Strategic Vision (12+ / 0-)

      Strategic Vision also flunked statistical tests. (Check old posts on fivethirtyeight.com.)  It does take a lot of work, however, to make sure the results are air-tight. It also takes a good database. More of these checks should be run.

    •  There is little regulatory oversight, but for (0+ / 0-)

      most polling companies, political polling is only a small, although very visible, part of their business. Getting a good reputation in political polling is how they bring in the real business: consumer research. And a good reputation in political polling is measurable, because it is the only polling where you can compare the sample the polling company takes against the whole population, i.e. the election. Well, technically you can with other polling as well, but by then you have wasted millions of dollars introducing a new product that nobody wants.

      The FOX is a common carrier of rabies, a virus that leaves its victims foaming at the mouth and causes paranoia and hallucinations.

      by Calouste on Tue Jun 29, 2010 at 12:06:58 PM PDT

      [ Parent ]

  •  User-Generated Content (4+ / 0-)
    Recommended by:
    clyde, milton333, JVolvo, Justus

    The lack of zeros is pretty disturbing. Did the crosstabs display characteristics of cooked data when sub-populations were evaluated longitudinally across multiple polls? That is, was there anything you could determine by comparing the way white females tracked vs. black females, or vs. white males, from week to week?


    "I play a street-wise pimp" — Al Gore

    by Ray Radlein on Tue Jun 29, 2010 at 10:20:06 AM PDT

  •  I Could Not Follow Everything in the Report (8+ / 0-)

    But I was able to grasp that the numbers provided by the polling company were extremely anomalous -- a "one in a million chance" of occurring.  I would not know how to independently validate their argument, but things look really bad.

    And on to other polls:

    How do we know when/if other polling companies are generating fraudulent data?

    "Simon Wiesenthal told me that any political party in a democracy that uses the word 'freedom' in its name is either Nazi or Communist."

    by bink on Tue Jun 29, 2010 at 10:20:42 AM PDT

    •  We don't (1+ / 0-)
      Recommended by:
      kurt

      A major part of the reason R2000 was selected, if I recall correctly, was that they were willing to provide full crosstabs and allow DKos to publish them. So many other pollsters could be doing similar things and we'd never know. (One ray of hope is that the crosstabs should be available to paying subscribers, and they may now be prompted to have statistical analysis done to make sure they're not paying for garbage.)

  •  Time for a lawsuit, kos? (7+ / 0-)

    Your money was stolen from you, plain and simple. Time to get it back.

    •  It feels to me (1+ / 0-)
      Recommended by:
      beauchapeau

      like our money was stolen from us.

      "the human animal is well adapted to a great many different diets. The Western diet however, is not one of them." Michael Pollan

      by greycat on Tue Jun 29, 2010 at 11:39:29 AM PDT

      [ Parent ]

    •  Yep, per TPM (0+ / 0-)

      Hammertime!

      Calling into question years worth of polls, Daily Kos founder Markos Moulitsas said today his site will sue pollster Research 2000 after a statistical analysis showed that R2K allegedly "fabricated or manipulated" poll data commissioned by Kos.

      Two weeks ago, after Kos dropped R2K for inaccuracy, a group of three of what Kos calls "statistics wizards" began looking at some of the firm's data and found a number of "extreme anomalies" that they claim may be the result of some kind of randomizer.

      The Republican motto: "There's been a lot of progress in this country over the last 75 years, and we've been against all of it." ~ Hillbilly Dem's 78-yo Dad

      by JVolvo on Tue Jun 29, 2010 at 12:00:21 PM PDT

      [ Parent ]

  •  Outstanding work (11+ / 0-)

    The structure and role of variation is arguably the worst understood major component of modern (also presumably ancient) life.

    We have the statistical tools to use patterns of variation as powerful diagnostic tools. In certain cases, they become a nearly infallible lie-detector.

    Yet a large percentage of the American public is totally unaware of any of this.

    Kudos to the team for direct, succinct analysis and a clear and obvious explanation.

    -2.38 -4.87: Damn, I love the smell of competence in the morning!

    by grapes on Tue Jun 29, 2010 at 10:21:47 AM PDT

  •  Second pollster to go down recently (12+ / 0-)

    FiveThirtyEight recently uncovered similar statistical anomalies in a Republican pollster.

    The time is definitely NOW to go over all other pollsters' results with a fine-tooth comb. It's a very easy industry in which to simply lie, and any liars will probably learn their lesson from these two examples and start adding plausible sampling error to all their out-of-the-ass made-up numbers.

    Senate rules which prevent any reform of the filibuster are unconstitutional. Therefore, we can rein in the filibuster tomorrow with 51 votes.

    by homunq on Tue Jun 29, 2010 at 10:21:51 AM PDT

    •  I still think the liars are the exception... (3+ / 0-)

      ... but I'd bet there are a few more of them lurking out there nonetheless. Catch them now, before they wise up.

      Senate rules which prevent any reform of the filibuster are unconstitutional. Therefore, we can rein in the filibuster tomorrow with 51 votes.

      by homunq on Tue Jun 29, 2010 at 10:23:15 AM PDT

      [ Parent ]

      •  Wouldn't the market drive towards liars? (1+ / 0-)
        Recommended by:
        theran

        If most liars aren't caught, and you increase profits in the short term by lying, the market would reward lying. That money would lead to more investment in the firm, giving you capital to extend your operations, improve marketing, etc.

        So, if there are more than 1 or 2 other liars, the best bet is that most of the pollsters are liars rather than the reverse. The only question is the average time to discovery.

    •  This is, sad to say, part of what science is about (0+ / 0-)

      Thank God there are people like Nate Silver and Mark Blumenthal and the folks, Grebner, Weissman and Weissman, who did this investigation, to keep tabs on this stuff.

      Science is about discovering mistakes, errors or gaps in understanding and correcting them. Granted these mistakes and errors aren't supposed to be - and aren't the great majority of the time - deliberate, but it's still good for everybody for someone to catch them. The more the unscrupulous and the fraudulent get weeded out of the public opinion business, the better vision we have of what public opinion actually is.

  •  Difference Between Progressives and Conservatives (23+ / 0-)

    While Kos will get pounded by the right wing media for this, his honest admission of the polling problems is something I suspect we'll never see from anyone using Rasmussen to bolster their ideology. The progressive political sites and commentators regularly correct their errors, while the right wing pundits only correct them under the greatest duress, if at all.

    Good job, Kos! This is what we need to see much more of in politics.

  •  Polling is a great business... (8+ / 0-)

    if you don't actually do the polling. Imagine the profit margin if you're just making the stuff up!

    More than one person at R2K must have known about this. Reminds me a little of Madoff's boiler room.

  •  you might be able to apply Benford's rule to (3+ / 0-)
    Recommended by:
    ArkDem14, Justus, James Robinson

    some of those sets of numbers to check for true randomness.  I'm no expert, but it's a simple check that your average person can use to check any group of allegedly random numbers for "naturalness".
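
    (For the curious, a rough Python sketch of such a first-digit check -- with the caveat that Benford's law only really applies to data spanning several orders of magnitude, so bounded percentages are a poor fit:)

      from collections import Counter
      from math import log10

      def leading_digit(x):
          return int(f"{abs(x):e}"[0])    # first significant digit

      def benford_table(numbers):
          digits = [leading_digit(n) for n in numbers if n != 0]
          counts = Counter(digits)
          for d in range(1, 10):
              expected = log10(1 + 1 / d)         # Benford's predicted share
              observed = counts[d] / len(digits)  # share actually seen
              print(d, f"{expected:.3f}", f"{observed:.3f}")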

  •  While I agree with the overall analysis (1+ / 0-)
    Recommended by:
    grapes

    the claim that the M/F favorability numbers have no possible correlation is bunk. If the pollster calls households with both men and women in them, there is some probability that the opinions will be the same. It won't be 1.0, but it'll be greater than 0.0. Thus the 50/50 chance of the values being the same is bogus. Now, the fact that 777 of them matched? That's still not remotely plausible, but the odds are probably not as extreme as the 1E+228 claimed.

    Republican ideas are like sacks of manure but without the sacks.

    by ontheleftcoast on Tue Jun 29, 2010 at 10:24:26 AM PDT

    •  IMHO... (10+ / 0-)

      In my limited experience, I have never had a pollster ask me to hand the phone over to the wife AFTER I've completed their poll. I suspect that SOP is to collect a response from only 1 person per household.

      "Leave it as it is. You cannot improve on it. The ages have been at work on it and man can only mar it." -President Theodore Roosevelt

      by DemHikers on Tue Jun 29, 2010 at 10:27:09 AM PDT

      [ Parent ]

    •  Depends on polling methodology (2+ / 0-)
      Recommended by:
      milton333, ontheleftcoast

      Of course, here it looks like they didn't actually have a polling methodology.

      But if they did, the decision to poll multiple individuals on one phone call would have to be part of the polling design. Given the likely correlation, it would be hard to justify except as a cost-saving measure - and then the analysis methodology would have to take that into account.

      -2.38 -4.87: Damn, I love the smell of competence in the morning!

      by grapes on Tue Jun 29, 2010 at 10:28:42 AM PDT

      [ Parent ]

    •  I may have misread, but I think they're saying (10+ / 0-)

      that it's unusual how the favorability numbers correlated in terms of always being both even or odd:

      In one respect, however, the numbers for M and F do not differ: if one is even, so is the other, and likewise for odd.

      In other words, not that there's no correlation between them, but that they correlate on what should be an otherwise arbitrary detail.  You'd still have a 50/50 chance of that, no matter how many households were being polled.

      Why, for example, if favorability among men was at 34%, should favorability among women always have been at 46% instead of 45% or 47%?  That part doesn't make sense.

      Saint, n. A dead sinner revised and edited. - Ambrose Bierce

      by pico on Tue Jun 29, 2010 at 10:32:59 AM PDT

      [ Parent ]

      •  Data doesn't distribute into buckets (2+ / 0-)
        Recommended by:
        steve04, pico

        evenly. I'm wondering if the size of the buckets had any impact on these results. For example, say we poll 101 people and get 50 yes and 51 no votes. Does that show as 49/51, 50/50, or even 50/51? Any way you do it, you're slightly misrepresenting the results. 49.5/50.5 is closer to the truth, but by standard rounding rules you round 49.50495 up to 50 and 50.49505 down to 50, so you'd report 50/50.

        Republican ideas are like sacks of manure but without the sacks.

        by ontheleftcoast on Tue Jun 29, 2010 at 10:44:02 AM PDT

        [ Parent ]

        •  Sure, but here we're talking about (7+ / 0-)

          two separate sets of buckets, right?  There might be some level of correlation between them insofar as some of the people polled live in the same household, but there's really no way that both numbers will turn out even, or both odd, every time, even if you manipulate the data to get round numbers.  No matter the method, you'd still end up with a 50/50 shot at odd/even on each of the results, right?  There's just no way these things should correlate at all.

          Saint, n. A dead sinner revised and edited. - Ambrose Bierce

          by pico on Tue Jun 29, 2010 at 11:03:43 AM PDT

          [ Parent ]

          •  It really doesn't matter that much (2+ / 0-)
            Recommended by:
            steve04, pico

            Coin flips aren't really 50/50, but people make that claim all the time. Small errors creep into statistics all the time; that's what I was trying to get at. There could be some honest mistake in the methodology used to collect the data, or some unknown correlation between the two numbers, but making the claim that the two values are absolutely, guaranteed, 100% completely independent and extrapolating that to the 228-digit conclusion is just as bogus. But I guess the only point that matters is calling R2K the worst of the worst of the worst to the umpteenth power.

            Republican ideas are like sacks of manure but without the sacks.

            by ontheleftcoast on Tue Jun 29, 2010 at 11:33:25 AM PDT

            [ Parent ]

            •  Sure, well as they say above (1+ / 0-)
              Recommended by:
              ontheleftcoast

              if you keep flipping 100 coins and getting 50/50 every time, something is wrong there, too!  It's not impossible that men's and women's responses would correlate on even/odd values nearly every time, it's just insanely unlikely - and that's one of the reasons they've asked R2K to respond, to figure out how this happened.  I'm not a statistician, but I can't imagine it happening without at least some kind of data manipulation being involved.  As I outlined in a comment below, even basic ways of rounding the data should result in a high level of disparity.

              As you said, if there are a lot of couples living together, we'd expect maybe there'd be correlation in areas like: do we see both numbers rising or falling?  For even/odd, though, I just can't imagine any 'natural' reason for that, and even the best-case scenario looks bad for R2K.

              Saint, n. A dead sinner revised and edited. - Ambrose Bierce

              by pico on Tue Jun 29, 2010 at 11:49:30 AM PDT

              [ Parent ]

    •  even-odd vs. average values (9+ / 0-)

      Whoa, you're confused on multiple grounds. We carefully explained that the M-F percents were generally quite a bit different from each other. It only takes 1% to switch from even to odd!
      And on a minor point- that's not how polls are taken.

      •  And you're focusing on an arbitrary (1+ / 0-)
        Recommended by:
        sneakers563

        bit that may or may not truly be independent. There are all sorts of ways that correlation can occur; yes, fraud is one of those ways, and given the other evidence here I agree that is possible.

        Republican ideas are like sacks of manure but without the sacks.

        by ontheleftcoast on Tue Jun 29, 2010 at 10:50:32 AM PDT

        [ Parent ]

        •  "all sorts of ways" (2+ / 0-)
          Recommended by:
          RickD, Pender

          Look, if you have a thought about innocent reasons why the numbers would tend to be both even or both odd, what is it?

          Just saying that it's an "arbitrary bit that may/may not truly be independent" is handwaving, pure and simple.

          •  Actually it's a well known problem (0+ / 0-)

            in computing. The low-order bit of a pattern/signal can have all sorts of goofy statistical problems. This comes up all the time in signal/image processing and compression. I don't know if there was a stupid and possibly honest mistake in the methodology or some unknown correlation behind the data. I'm strongly suspicious of fraud. But the "it's got to be 50/50" claim doesn't pass the sniff test.

            Republican ideas are like sacks of manure but without the sacks.

            by ontheleftcoast on Tue Jun 29, 2010 at 12:57:24 PM PDT

            [ Parent ]

            •  50-50 (0+ / 0-)

              We didn't say that the last bit should be 50-50 even-odd. (It wasn't far off, however; see Jon's blog.) We said that the last bits of two separate random variables, which are substantially different from each other and each contain uncorrelated random noise with sigma greater than 2, should be nearly uncorrelated. This claim was vetted in detail by two of the best statisticians in the country.
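
              (A quick Python simulation of that claim, with made-up means and sigma = 3: the parities of two such independent, rounded variables agree only about half the time.)

                import random

                matches, trials = 0, 100_000
                for _ in range(trials):
                    m = round(40 + random.gauss(0, 3))   # men's %, independent noise
                    w = round(50 + random.gauss(0, 3))   # women's %, independent noise
                    matches += (m % 2 == w % 2)

                print(matches / trials)   # ~0.5 -- nowhere near the ~1.0 in the R2K tables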

            •  more handwaving (1+ / 0-)
              Recommended by:
              HudsonValleyMark

              the bit parity is happening in pairs of variables that one would expect to be independent (while not happening in pairs of variables that we know are not independent).  

              There's no good explanation for this.  Saying "well, there might be a good explanation for this" while muttering stuff about parity doesn't constitute an explanation.  

              Gentlemen, you can't fight in here! This is the War Room.

              by RickD on Tue Jun 29, 2010 at 02:41:05 PM PDT

              [ Parent ]

      •  Could it have been some kind of rounding (2+ / 0-)
        Recommended by:
        steve04, milton333

        algorithm? The original data had to be converted to fractions and would typically have had to be rounded for the tables. Could there have been some kind of programming glitch so that the rounding always went in the same direction?

        •  Even if that were the case, (3+ / 0-)
          Recommended by:
          milton333, Pender, KroneckerD

          you can put together some hypothetical data points to see why you'd still end up with even:odd differences:

          M = 34.767
          W = 43.126

          Drop the decimals, E:O.  Round up, O:E.  Round to the nearest, O:O.

          M = 34.226
          W = 43.126

          Drop the decimals, E:O.  Round up, O:E.  Round to the nearest, E:O.

          M = 34.767
          W = 43.512

          Drop the decimals, E:O.  Round up, O:E. Round to the nearest, O:E.

          See what I mean?  Regardless of what rounding mechanism you choose, you're still going to get a lot of different even:odd values.
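
          (If you'd rather not check by hand, a little Python confirms it for these hypothetical pairs: no rounding rule forces matched parity.)

            import math

            pairs = [(34.767, 43.126), (34.226, 43.126), (34.767, 43.512)]
            rules = {"drop": math.floor, "round up": math.ceil, "nearest": round}

            for m, w in pairs:
                for name, f in rules.items():
                    rm, rw = f(m), f(w)   # apply the same rounding rule to both
                    print(name, rm, rw, "same parity:", rm % 2 == rw % 2)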

          Saint, n. A dead sinner revised and edited. - Ambrose Bierce

          by pico on Tue Jun 29, 2010 at 11:07:29 AM PDT

          [ Parent ]

          •  But you're doing it right. (7+ / 0-)

            The data could be processed in a brainlessly wrong way that would consistently produce E-E and O-O pairs.

            For example, I could take your numbers and store them as an average and a deviation:

             M = 34.767   -->  A = 38.9465
             W = 43.126   -->  D = -4.1795

            Then I'd retrieve the data as M=A+D, W=A-D.  If I mistakenly stored A and D in integer fields, I would have consistently even-even and odd-odd pairs.

            Now, I can't for the life of me figure why anyone would do this, except that I've seen impressively dumb code for a long time.

            On the other side of the coin, if I tried to generate fake numbers using the same mistaken technique, I'd get the same anomaly.  That strikes me as a more plausible explanation, because a faker of data may decide to choose a target approval rating, and then choose a random deviation between men and women to make it seem like natural data.
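
            (A speculative Python sketch of that faking route -- choose a whole-number target and a whole-number spread, and the parity anomaly appears on its own. All numbers invented.)

              import random

              def fake_pair(target):
                  a = target                  # chosen approval rating (an integer)
                  d = random.randint(-8, 8)   # invented male/female spread (an integer)
                  return a + d, a - d         # pair differs by 2*d: always both even or both odd

              for _ in range(5):
                  m, w = fake_pair(random.randint(30, 60))
                  print(m, w, "same parity:", m % 2 == w % 2)   # True every time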

            •  Good points. n/t (1+ / 0-)
              Recommended by:
              steve04

              Saint, n. A dead sinner revised and edited. - Ambrose Bierce

              by pico on Tue Jun 29, 2010 at 12:15:25 PM PDT

              [ Parent ]

            •  You put it much more clearly than I (1+ / 0-)
              Recommended by:
              pico

              That's the point I tried to make up-thread.

              No on Prop 8::Sometimes I get to hitch a ride on the Democratic Bus--they let me stand on the back bumper.

              by steve04 on Tue Jun 29, 2010 at 12:52:31 PM PDT

              [ Parent ]

            •  Red'd for "impressively dumb code" n/t (1+ / 0-)
              Recommended by:
              pico

              Electronic media creates reality - Meatball Fulton

              by zeke7237 on Tue Jun 29, 2010 at 01:20:57 PM PDT

              [ Parent ]

            •  Good point, but... (2+ / 0-)
              Recommended by:
              steve04, pico

              That would be incredibly stupid code -- and yes, I've run across that too, so we can't rule it out -- but why hasn't Research 2000 responded?  I mean, it should be fairly easy to find the routine where those values are calculated and see what's going on.

              R2000's silence speaks volumes.

              •  I think it's clear they messed up (2+ / 0-)
                Recommended by:
                RF, pico

                Publicly admitting as much just makes it easier for Kos to extract financial damages.  They obviously don't want to refund Kos $$$ for "work" already performed, however badly, because they've presumably paid people for the work done and can't easily recover costs themselves.  It'll all come out in discovery in the lawsuit--but at this point it's clear they did not deliver the product that was paid for, whether by accident or on purpose.

                No on Prop 8::Sometimes I get to hitch a ride on the Democratic Bus--they let me stand on the back bumper.

                by steve04 on Tue Jun 29, 2010 at 02:54:03 PM PDT

                [ Parent ]

        •  Research 2000 (5+ / 0-)
          Recommended by:
          OLinda, G2geek, TexasLiz, Pender, KroneckerD

          has had 2 weeks to explain the issue. They've refused. If it was this simple an explanation, they would've offered it.

          And god knows, I would've killed to have the raw data or a valid explanation to justify the anomalies.

          •  If you're suing them, why would they comment (0+ / 0-)

            publicly?  Kos, you need to be very careful here with your public statements.  If it turns out they performed the polls and just screwed up in their number-handling code, I think that's less damning than fabricating results, which is what you are saying publicly.

            No on Prop 8::Sometimes I get to hitch a ride on the Democratic Bus--they let me stand on the back bumper.

            by steve04 on Tue Jun 29, 2010 at 12:53:54 PM PDT

            [ Parent ]

          •  See this comment by Caj on potential mistakes (0+ / 0-)

            http://www.dailykos.com/...

            The key thing is that if they process the data in a stupid way, it could produce the odd odd / even even pairing.

            No human massaging numbers by hand would generate that odd-odd / even-even pattern, so I think it has to be a mistake in number-precision handling, with either falsified or legitimate data. The statistical analysis here merely proves that the numbers reported are not plausible; nothing can be proven about the raw inputs based on the reported outputs.

            No on Prop 8::Sometimes I get to hitch a ride on the Democratic Bus--they let me stand on the back bumper.

            by steve04 on Tue Jun 29, 2010 at 01:07:10 PM PDT

            [ Parent ]

        •  i'm not seeing it (0+ / 0-)

          Let's put it plainly: numbers that are measured and stored in a table do not have this behavior.  At no point should it be necessary to store any fractional data as an integer.  

          If you were measuring numbers and storing them in this way, why would you pair %fav(men) with %fav(women)?  The natural thing to do (when calculating the numbers) would be to group the numbers with the same denominator, i.e. %fav(men) with %unfav(men).  

          Gentlemen, you can't fight in here! This is the War Room.

          by RickD on Tue Jun 29, 2010 at 02:45:24 PM PDT

          [ Parent ]

          •  asdf (0+ / 0-)

            At no point should it be necessary to store any fractional data as an integer.

            That's why doing so would be considered a bug.

            •  not a bug (0+ / 0-)

              an idiocy.  This is 2010.  Any moderately intelligent programmer understands the existence of non-integer data types.

              It'd be like "accidentally" having an ATM greet the user with "Guten Tag!" when it's supposed to greet in English.

              Gentlemen, you can't fight in here! This is the War Room.

              by RickD on Tue Jun 29, 2010 at 09:24:35 PM PDT

              [ Parent ]

    •  how do i put this nicely? (4+ / 0-)
      Recommended by:
      theran, codairem, Pender, IL JimP

      Your argument is ridiculous.

      Whether the approval rating among men is even or odd (or rounds to even or odd), as opposed to the same statistic for women, is about as close to a pair of independent random variables as you can get.

      I challenge you to demonstrate otherwise.  

      The male population and the female population clearly are disjoint sets of opinion-makers.  Whatever correlation there is between the values measured does not extend to a parity test.

      If you cannot formulate a mathematical model that would explain correlation of this nature, perhaps you can justify your hypothesis by looking at data sets from other sources?  Show a significant difference in the parity measure for other sets of random data.  

      Gentlemen, you can't fight in here! This is the War Room.

      by RickD on Tue Jun 29, 2010 at 10:59:38 AM PDT

      [ Parent ]

  •  Very impressive, if only other pollsters were so (0+ / 0-)

    open with their cross tabs.

    That was a great description of solving a puzzle.

  •  Wow, I had WONDERED where the R2K polls went! (4+ / 0-)

    I have to admit, I was VERY concerned.  For the past couple of weeks, the Obama Fav/Unfav that greeted us on the banner of Kos has inexplicably vanished.

    Was the Grinch responsible?  The Cookie Monster?  It looks like a combination of both.

    The terribleness of this news cannot be overstated.  I had seen the rankings on 538 as well, and had become very concerned then...but held my peace, not being an official statistician, just someone very good with statistics and very interested in them.

    The ONE shining silver lining is that it is FAR better to have this all out in late JUNE than in late SEPTEMBER.

    The type of data the R2K polls provided was invaluable for getting a good measure of the opinions of the people in as unbiased a manner as possible.  I hope we find another outfit as soon as is prudently possible.

    Somewhere there is a kettle of rotten fish.  I hope we get to the bottom of it soon!

    What separates us, divides us, and diminishes the human spirit.

    by equern on Tue Jun 29, 2010 at 10:25:29 AM PDT

  •  For those of us who are not versed in statistics, (5+ / 0-)

    does this boil down to "we can't prove it, but it seems that some or all of the poll numbers were simply made up?"

  •  This couldn't have been easy to do. (6+ / 0-)

    And I'm not talking about the statistics. Great work by Mark, Michael and Jonathan.

    Markos, well done facing down the problem openly & transparently - just as you'd hoped the polls were done. It's a shame that this happened, but at least the hard work done to identify the fraud will remove another leech from the body politic.

    There are enough dirty actors in politics obfuscating political reality, and enough dirty media outlets bending their services to appease their subjects, that we don't need dirty filters like R2K clouding the landscape before we even reach the morass that is Washington, DC.

    I'm not an actor, but I play one on TV.

    by zeitshabba on Tue Jun 29, 2010 at 10:29:11 AM PDT

    •  It's very strange - and very dumb (13+ / 0-)

      The analysis strongly suggests that they made these numbers up subjectively.

      It would have been a lot easier to build a program (Excel would do it at a stretch) to generate data that would have passed these forensic tests - or at least been ambiguous. Start with the desired averages and a good random number generator and work backward.
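
      (A Python sketch of what that would look like -- draw each result from an actual binomial sample around the desired average, so the output carries genuine sampling noise. Sample size and averages here are invented.)

        import random

        def simulated_poll(true_fav, n=400):
            # draw n Bernoulli responses, then report the rounded percentage
            fav = sum(random.random() < true_fav for _ in range(n))
            return round(100 * fav / n)

        men   = [simulated_poll(0.43) for _ in range(10)]   # real sampling error,
        women = [simulated_poll(0.59) for _ in range(10)]   # uncorrelated parities
        print(men)
        print(women)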

      Assuming that they were cutting corners on other clients, why did they not build a better tool to fake the data?

      More worrisome is the possibility that there are other polling firms that have built a more effective poll simulator.

      -2.38 -4.87: Damn, I love the smell of competence in the morning!

      by grapes on Tue Jun 29, 2010 at 10:37:23 AM PDT

      [ Parent ]

        •  Fear of this could open crosstabs & auditing (3+ / 0-)

          If there is enough fear raised about this type of fraud, the real answer would be to institute independent, polling-firm paid auditing of the methods and analysis systems.

          Companies hire CPAs to do their books and issue letters to their annual reports.

          Same deal.

          -2.38 -4.87: Damn, I love the smell of competence in the morning!

          by grapes on Tue Jun 29, 2010 at 10:46:34 AM PDT

          [ Parent ]

      •  Worst RNG ever? Perhaps. (8+ / 0-)

        The problem I see in building a better method to randomly generate results is: what scope do you allow the random number generator to operate in? To keep the numbers realistic and believable, you have to ensure that they do not stray far from the real pollsters yet convey some level of uniqueness. Provide a baseline and a reasonable level of variation, and voila! Your cooked numbers should look good.

        I buried my programmer past in a shallow grave over a decade ago and the trained mathematician within me lies dormant, but damn I know I could write a program within a month or two that could produce results that would pass the initial scrutiny that led to this investigation.

        My guess on why no better tool to cheat? They were too lazy, too arrogant, and too confident the charade wouldn't be exposed. Seriously, if you know your client is transparently offering all your cross-tab data and you still think this hokey fake data will stand up, you are a damn fool and deserve every bit of misery that the lawyers will bring down upon you.

        I'm not an actor, but I play one on TV.

        by zeitshabba on Tue Jun 29, 2010 at 10:45:37 AM PDT

        [ Parent ]

        •  That's easy -- if you start with a known (1+ / 0-)
          Recommended by:
          Justus

          distribution. You just randomly produce data for your poll from that distribution -- and good random number generators are in most operating systems.

          I bet they did have a cooked program -- but wrote their own piss-poor random number generator, or used one of the bad random generators that are faster.

      •  I'm not an expert, but isn't it likely (2+ / 0-)
        Recommended by:
        milton333, Justus

        they simply used much, much smaller samples, then massaged the data to look more "real"? That would explain most of this stuff.

        "They paved paradise, and put in a parking lot."
        "...Don't it always seem to go, that you don't know what you've got 'til it's gone?"
        - Joni Mitchell

        by davewill on Tue Jun 29, 2010 at 11:02:09 AM PDT

        [ Parent ]

      •  You assume too much thought was involved (0+ / 0-)

        Assuming that they were cutting corners on other clients, why did they not build a better tool to fake the data?

        Consider for a moment that after all these years, we still convict criminals with fingerprint evidence, even though any idiot can prevent this by wearing gloves.

        If indeed someone fell behind and decided to fake data with a computer program, it's quite plausible that they didn't know that much about what they were doing, or didn't have time to do it right.

        •  Faking could be harder than getting real results. (0+ / 0-)

          The better the fake results, the more effort and money it costs to develop the software, hand-create fake data, and so on.  Basically, anyone greedy enough to fake something like this is probably also greedy enough not to spend a lot of money on really good faking software or processes.  At the extremes, it eventually becomes easier to do the poll right than to fake it.

  •  Were they telling us what we wanted to hear? (9+ / 0-)

    I was wondering if they were telling us what we wanted to hear, or adjusting numbers to justify the polling-

    i.e., on a small order like DKos's: if they show a tie in a Senate primary, we will likely order 12 more polls.  If they show an 8-point difference, we may order 3; if they show a 15-point gap, we may order none.

    So I wonder if they were "fixing" the results to drive poll ordering???

    I also note that even though they were "bad" polls (though no worse than Rasmussen), R2K's polling of races no one else was polling at least helped draw some other pollsters into some races.

    •  What was their ultimate goal with fake data? (3+ / 0-)
      Recommended by:
      SLKRR, milton333, Gay In Maine

      Were they acting like the House of Ras, trying to steer the narrative with bad data?

      Or just moronic con artists with no horse in any race?

      I hope it's the second for entertainment purposes.

      I'm not an actor, but I play one on TV.

      by zeitshabba on Tue Jun 29, 2010 at 10:50:11 AM PDT

      [ Parent ]

      •  I think its an important question (3+ / 0-)
        Recommended by:
        zeitshabba, boadicea, milton333

        A poster above (one of the authors of the report?) said that they didn't go into a lot of detailed analysis beyond what's written up here, because hey--once you know you can't trust the numbers, you just DON'T, and that's it.

        But I am VERY curious about whether they tried to suggest any particular conclusions with their fake numbers, for profit (like the idea above of a tie leading to more polls being ordered) or otherwise (for example [insert any conspiracy theory you care to cook up about why someone would want to screw Markos/DailyKos]).

        •  rereading my comment, I apologize... (2+ / 0-)
          Recommended by:
          RickD, Justus

          I think I may have misrepresented docmidwest's comment a bit, so I'll quote the one I'm talking about and let dmw speak for him/herself rather than putting words in his/her mouth:

          I don't think they had the crosstabs broken down that far. We don't want to get into discussing all the other statistical features, because for obvious reasons it's prudent to stick with ones that we repeatedly checked, vetted, etc. If you find a unicorn, a griffin, and a sphinx in your yard, you don't really have to keep looking that much further to know things are a little strange.

        •  Why would anyone want to screw over DailyKos? (2+ / 0-)
          Recommended by:
          kydoc, annominous

          Oh...
          Um...
          Nevermind.

          I'm not an actor, but I play one on TV.

          by zeitshabba on Tue Jun 29, 2010 at 11:26:22 AM PDT

          [ Parent ]

    •  That makes sense as to the explanation of why (1+ / 0-)
      Recommended by:
      milton333

      they were so stupid -- if they were trying to "up the orders", it's difficult not to inject human errors since you'd be massaging data on a case by case basis, rather than simply running a good statistical program to produce your data in the first place.

      Much better conspiracy than some of the attempts at "why?"

  •  Even Disraeli would be amazed (6+ / 0-)

    Pretty bad when the statistics themselves are clearly "damned lies".

    I am become Man, the destroyer of worlds

    by tle on Tue Jun 29, 2010 at 10:29:41 AM PDT

  •  Great Job (0+ / 0-)

    Thanks for making the community trustworthy again.

    Timing is everything: I met my wife at a bookstore and now we both have Kindles. . .

    by Gilmore on Tue Jun 29, 2010 at 10:31:33 AM PDT

  •  Selective or across clients? (6+ / 0-)

    It would be very time consuming, but wondering if such issues exist across their client base or are selective? You would know for example, which of your clients had better data than others which could lead one to exert leverage based on the outcomes.

    The perfect plan, Is not the man Who tells you, You are wrong

    by dss on Tue Jun 29, 2010 at 10:31:44 AM PDT

  •  So where did the data come from? (3+ / 0-)
    Recommended by:
    Timaeus, milton333, Justus

    In my albeit limited experience with fraudulent data, people rarely make things up out of whole cloth. That's too hard. If you are going to "dry lab it", it is easier to take real data and change it.  So, the other half of the sleuthing in this case would be to see where the numbers that were purported to be polling data actually came from. Aggregates of other published polling data? Extrapolations from a small data set polished to look better?

    Poking holes like this is important and probably reaches the "reasonable doubt" threshold, but there is always room for argument and rebuttal.

    •  rebuttal (5+ / 0-)

      If there were "room for argument and rebuttal" we would not have published.

      •  Yeah, those probabilities are unarguable (1+ / 0-)
        Recommended by:
        KroneckerD

        I'm still interested in both how and why they did it, but it doesn't look like there's much room for (honest) rebuttal

      •  Ok. But even a case supported by strong physical (1+ / 0-)
        Recommended by:
        Justus

        evidence can be helped by a motive, kwim?  If the original pollsters were simply making things up, they would not have made such an obvious error as the M/F O/E one.  That is clearly a red flag. So putting aside outright invention, knowing where the numbers did come from would be helpful. For example, R2K could argue that the primary data were solid, but there was a programming error that corrupted the end figures.

        •  The assumption that they wouldn't have made (0+ / 0-)

          an obvious error is unwarranted. If you are trying to fool a layperson-consumer (Kos) you may use a method that is obvious to a statistician but non-obvious to the layperson.  They may not have expected that anyone would bother to do the work to prove the errors weren't random.

          In any event, the analysts are saying that these errors are ONLY reproducible by making up the numbers.

          Two wrongs don't make a right, but three rights do make a left.

          by Simian on Tue Jun 29, 2010 at 12:49:21 PM PDT

          [ Parent ]

          •  As I said above... (2+ / 0-)
            Recommended by:
            Simian, Russycle

            The better the fake the more effort it is, and the more effort the more it costs.  So, if you are faking in order to make more money, you are probably greedy.  If you are greedy enough to fake data, you are probably greedy enough to skimp on faking it.  Finally, at the extremes, faking the data is probably harder and more expensive than just doing the poll.

      •  Rebuttal? (0+ / 0-)

        I think the phrase you're looking for is 'rectal extraction'.

        Oh, wait.  You aren't talking about the generation of these numbers...

  •  Well, polls in general (3+ / 0-)
    Recommended by:
    blueoasis, Justus, annominous

    should be taken with a grain of salt.

    And basing future reliability on past results is not necessarily very predictive.  That's to say, just because AP/GfK did really well in, say, 2008, doesn't mean they're going to continue to do well.  In the end it's a crap shoot.

    (Wasn't Zogby considered super-reliable at one point?)

    Pollsters know this.  They know that for all the statistical reliability data, we're talking about moving parts in real time which are in constant flux.  Attitudes and perceptions don't lend themselves to easy "1 or 2" binary thinking.  And when you ask "man on the street" to give you his opinion in binary ways, you're not likely to get meaningful results.

    So in this way, when you ask, "Do you approve or disapprove of the job President Obama is doing?" you're getting a very kneejerk reaction - a snapshot of a moment in time.  That's good as far as it goes, but there are more meaningful and valid scales.

    I'll give you a couple for-instances...

    1.  On a scale of 1-10, 10 being outstanding and 1 being terrible, how would you rate the President's job performance?  (Then you can take the scale of numbers, which is more complex ... and you're very likely to end up with pretty small variance because of regression to the mean, but you can still read some tea leaves.)

    But to really know what "man on the street" is thinking, you need to listen more.  That's one reason I think it's so valuable to do follow-up interviews that give people more opportunity to go deeper.

    Full Disclosure: I am not Ben Leming. But I think he's pretty cool.

    by Benintn on Tue Jun 29, 2010 at 10:37:55 AM PDT

  •  Shit (4+ / 0-)

    I just realized that Kos based much of his forthcoming book on Research 2000's polling!

    I think it's already in the pipeline to come out any day now.

    Sorry Kos.  I hope it is not yet printed and you have time to fix it.

    "Empty vessels make the loudest sound, they have the least wit and are the greatest blabbers" Plato

    by Empty Vessel on Tue Jun 29, 2010 at 10:42:05 AM PDT

  •  Wow... (0+ / 0-)

    If I were a pollster interested in self-preservation, this would make me LESS likely to offer to publish my cross-tabs.  The only way it happens is if clients demand it en masse.

    •  If the pollster is honest, they have nothing to (7+ / 0-)

      ...fear from providing the raw data and the polling methodology for independent checking.

      A pollster who does that would benefit from an open, community review process, just like open-source software does.

      "Certainly the game is rigged. Don't let that stop you; if you don't bet, you can't win." Lazarus Long

      by rfall on Tue Jun 29, 2010 at 10:45:20 AM PDT

      [ Parent ]

      •  That's the head-scratcher (4+ / 0-)
        Recommended by:
        tunesmith, Lefty Mama, eyesoars, Justus

        If you know you're cooking the numbers, and know that your client will post your cross-tabs . . .?  I mean, I'm convinced that the numbers are faked, I'm just having trouble understanding why R2k believed it could fake the numbers knowing that the cross-tabs were posted.

        Thought is only a flash in the middle of a long night, but the flash that means everything - Henri Poincaré

        by milton333 on Tue Jun 29, 2010 at 11:33:57 AM PDT

        [ Parent ]

        •  yes. (1+ / 0-)
          Recommended by:
          milton333

          This is a key point to me.  For all we know there could be a major PR volley coming back from R2K, including libel accusations, etc.

          •  Although (0+ / 0-)

            Even with the cross-tabs posted, no one noticed this problem for, what, a year and a half?  Until R2K was so far off-base with Lincoln/Halter?  I mean, it seems obvious when it's laid out here, but they did get away with it for all this time.

            Thought is only a flash in the middle of a long night, but the flash that means everything - Henri Poincaré

            by milton333 on Tue Jun 29, 2010 at 12:06:04 PM PDT

            [ Parent ]

            •  Or maybe they haven't been "cheating" all along, (0+ / 0-)

              perhaps that is a relatively new phenomenon for them. Perhaps when they changed their methodology for polling? That signaled a change of some sort in the company's approach. Maybe new personnel or financial constraints were involved.

              Getting away with something is easier if you have a good reputation to begin with, too. Given the relatively few people able/interested in delving at depth into statistics, particularly when there is no reason to suspect them, the greater surprise may be that they were caught so soon. Without the raw data, this type of analysis is challenging enough to deter many people, too.

              Those who are too smart to engage in politics are punished by being governed by those who are dumber. ---Plato

              by carolita on Tue Jun 29, 2010 at 02:06:46 PM PDT

              [ Parent ]

        •  The ancient Greeks had a word for it (0+ / 0-)

          It starts with "h".

          If Nixon was cocaine for the resentful psyche, Palin is meth—Andrew Sullivan

          by ebohlman on Tue Jun 29, 2010 at 06:05:18 PM PDT

          [ Parent ]

  •  Wow, Mark...A long way from pinball at Case Hall. (3+ / 0-)
    Recommended by:
    milton333, Justus, James Robinson

    Very interesting analysis.

    I've noticed a lot of little oddities in the R2K numbers, but have assumed them to be typos -- probably were, in fact.

    The no zero analysis is very telling.  In the absence of major news, zero should be the most common difference, especially when you consider random error -- the pollster's version of quantum uncertainty.

    One question ---

    Did you look for any oddness related to big news events, or would the numbers be too small?

    It would be interesting to know what the little statfakers did to ensure credible results when the news required jumps in the numbers.

    Guess it just goes to show: "Grebner. No worse than the rest"

    Free speech? Yeah, I've heard of that. Have you?

    by dinotrac on Tue Jun 29, 2010 at 10:45:52 AM PDT

  •  Creators of reality. (1+ / 0-)
    Recommended by:
    G2geek

    This begs the question: how to ensure that the other pollsters don't simply generate their results? After all, it is cheaper.

    So where's all the outrage against anti-atheist bigotry?

    by skeptiq on Tue Jun 29, 2010 at 10:46:46 AM PDT

  •  Wow (1+ / 0-)
    Recommended by:
    G2geek

    Thank you for the detailed, interesting, and reasonably accessible statistical take-down of R2K.

    This really makes me sad. Not only that Kos was defrauded, but that we really can't trust any polling after this. How will I get my polling fix during election season?

  •  One thing you should not do is beat yourself up. (2+ / 0-)
    Recommended by:
    sydneyluv, James Robinson

    Smart people steeped in numbers always have the upper hand when you trust them, and you should never do business with people you don't trust.

    Bottom line:

    You're going to get burned sometimes.

    It happens to EVERYBODY -- E V E R Y B O D Y

    Beautiful way to handle it, BTW.

    Free speech? Yeah, I've heard of that. Have you?

    by dinotrac on Tue Jun 29, 2010 at 10:49:24 AM PDT

    •  that's why you must understand numbers too! (2+ / 0-)
      Recommended by:
      Pender, James Robinson

      Unfortunately too many people can analyze books, magazines, newspapers and politics, but not enough can read a graph and tell you if it makes sense or not.

      In a democracy, everyone is a politician. ~ Ehren Watada

      by Lefty Mama on Tue Jun 29, 2010 at 11:01:19 AM PDT

      [ Parent ]

      •  That's a good thing, but understanding numbers (1+ / 0-)
        Recommended by:
        James Robinson

        and delving into patterns that show up only over time and making sense of them is not the same thing.

        The folks at R2K understand numbers very well and are well-equipped to do a little wool-pulling.

        Free speech? Yeah, I've heard of that. Have you?

        by dinotrac on Tue Jun 29, 2010 at 11:06:29 AM PDT

        [ Parent ]

        •  Apparently not. (0+ / 0-)

          If they had cheated competently, they never would have gotten caught. Just look at competing polls to assess averages and variances overall and in each cross-tab category, and then use a random number generator to reconstruct approximately the same means and variances from the ground up.

          That kind of cheating is essentially impossible to catch without full data disclosure (i.e. exactly whom you called, at what time, and what their answers were, for each and every one of 2400 calls). But they faked it stupidly. They cut corners.
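
          As a sketch of how little code "competent" cheating would take (hypothetical Python; the consensus value and the 0.5-point weekly drift are invented parameters):

              import random

              def fake_weekly_series(consensus, weeks=60, n=2400):
                  # Drift the "true" value a little each week, then add real
                  # binomial sampling noise on top, so both the mean and the
                  # week-to-week variance match a genuine tracking poll.
                  series, level = [], float(consensus)
                  for _ in range(weeks):
                      level += random.gauss(0, 0.5)  # modest real opinion drift
                      hits = sum(1 for _ in range(n) if random.random() < level / 100.0)
                      series.append(round(100.0 * hits / n))
                  return series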

  •  what this smells like (3+ / 0-)
    Recommended by:
    SLKRR, Lefty Mama, KroneckerD

    The parity issue shows that not only were these numbers not generated as a result of actual polling, but also that they were not created randomly.  

    So why, if a polling firm were going to create numbers intentionally, would they do so in a way that would contain a flaw that would be obvious to any minor statistical analysis?  You have to really want the pattern to be detected eventually.  Otherwise, there's no rational explanation for it to be there.

    My conclusion?  Research 2000 was engaged in an attempt to ratf*ck Kos.  The plan would be to publish bogus numbers that were favorable to Obama for a couple years, and then when election season came, somebody would blow the whistle on them.

    Sound unlikely?  Look at what happened to Dan Rather.  

    Good on Kos for sussing this out (and to Nate Silver, for being the one to first notice the problem with Research 2000).

    Gentlemen, you can't fight in here! This is the War Room.

    by RickD on Tue Jun 29, 2010 at 10:51:37 AM PDT

    •  Or they were just lazy? (0+ / 0-)

        Hired an incompetent programmer who didn't know the difference between a good PRNG and a bad one?

      •  Doubtful that any PRNG was involved (2+ / 0-)
        Recommended by:
        milton333, Justus

        Even the default, non-cryptographic quality PRNGs on most systems are much better than this. This has all the hallmarks of someone sitting in front of Excel, making up numbers by hand.

        Moreover, a novice programmer is much more likely to use the rand() function straight out of the standard C library (or a wrapper around it in most other languages) than attempt to create a new one. He might fail to seed it properly, but that generates an entirely different class of statistical anomalies.

        The sort of flaws you see in non-crypto grade random number generators are difficult if not actually impossible to detect in small datasets like this. If R2K actually went to the trouble of generating fake raw data, that's where you'd see predictable series in the numbers. Since they have been unable to crank out phony raw data to go along with their phony results -- which wouldn't be terribly hard to do -- that again suggests that their operation wasn't sophisticated enough to involve even a newbie programmer, and we're back to hand-generated data.

        Yes we can! The president, however, I'm not so sure about.

        by eodell on Tue Jun 29, 2010 at 11:29:45 AM PDT

        [ Parent ]

        •  Actually, a game library is quite likely I'd say (0+ / 0-)

          "Even the default, non-cryptographic quality PRNGs on most systems are much better than this."

          There's a specific class of not-really-RNGs that are used to provide perceived randomness instead of actual randomness, though, and I very strongly suspect that's what's going on here.

          What a human perceives as a "random" distribution is actually not random at all, but a "uniform normal" distribution. And in the case of games -- like, for instance, when I open this chest, is it full of diamonds or is it an Acme Exploding Boom trap? -- a purely random number generator will sooner or later produce a chain of results that the user perceives as not being random, and will get really quite amazingly upset about. To fix this problem, game companies have specific probability generators -- you can't really call them "random number" generators -- that provide the human illusion of true randomness by redistributing the actual results.

          And one of the telltale signs that such a not-really-random generator is in use is that they have a very VERY strong bias against repeating consecutive results, as humans incorrectly perceive even a single instance of repetition as lack of randomness quite readily.

          I strongly suspect that the lack of 0 changes is a smoking gun piece of evidence that somebody who was tasked with putting together the FakePollGeneratingEngine™ was just aware enough of programming theory to realize that random data will occasionally produce suspicious-looking runs, and thought that using a uniform distribution engine as described above, whose whole purpose is to eliminate those, would be a better idea than using their environment's version of rand(). (Which isn't very random in any case, I trust at least arc4random() would be the alternative, but I digress...)

          Unfortunately for them, they were not aware enough of programming theory to realize that the instant you mathematically analyze the results the lack of proper randomness is very obvious indeed.
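
          For what it's worth, the no-consecutive-repeats behavior takes only a few lines (hypothetical Python sketch of such a "perceived randomness" generator):

              import random

              class NoRepeatRNG:
                  # Rejects any draw equal to the previous one -- exactly
                  # the bias that would erase week-to-week changes of zero.
                  def __init__(self, lo, hi):
                      self.lo, self.hi, self.last = lo, hi, None

                  def next(self):
                      while True:
                          x = random.randint(self.lo, self.hi)
                          if x != self.last:
                              self.last = x
                              return x

          Feed weekly toplines through something like this and a change of exactly zero can never occur -- which is just the anomaly the report describes.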

          •  "uniform normal"?? (0+ / 0-)

            Which is it, uniform or normal?

            Ordinarily, these two terms are used to refer to quite different things.  

            Gentlemen, you can't fight in here! This is the War Room.

            by RickD on Tue Jun 29, 2010 at 02:15:29 PM PDT

            [ Parent ]

            •  It's naming-by-sarcasm (0+ / 0-)

              That's pretty much the point, actually, that perceived randomness is reached by a sequence which is neither uniform nor normal.

              "Evenly distributed overall with medium to large gaps between individual items in the sequence" would be an accurate way to describe a distribution that people perceive as random. But "uniform normal" is funnier.

    •  No evidence of this. (0+ / 0-)

      Probably just an error-ridden crappy generator program.

      So where's all the outrage against anti-atheist bigotry?

      by skeptiq on Tue Jun 29, 2010 at 11:07:45 AM PDT

      [ Parent ]

      •  what kind of errors? (0+ / 0-)

        Seriously, what kind of "crappy generator program" would produce this kind of output?  The evidence is in the numbers: you don't see numbers like this randomly.  It's not like the problem of random number generation is novel.  It's older than I am (and I was born in the late 60s.)

        The argument for detecting non-randomness is that the statistic varies considerably from what would be seen randomly.  But this behavior is not only non-random, it is absurdly non-random.  

        Gentlemen, you can't fight in here! This is the War Room.

        by RickD on Tue Jun 29, 2010 at 02:19:08 PM PDT

        [ Parent ]

    •  Man, get that tin foil hat on, quick (2+ / 1-)
      Recommended by:
      MadEye, steve04
      Hidden by:
      RickD

      How much would R2000 have to be paid to make it worth their while to deliberately screw with dkos, knowing they would eventually get caught, ruining their reputation and ending their business?  

      More than anyone would be willing to pay, is my guess.

      •  Now we're back to Dunning-Kruger (0+ / 0-)

        People who are really good at their jobs underestimate on average how good they are. People who are lousy at them overestimate how good they are.

        The US Senate is begging to be abolished. Let's fulfill its request.

        by freelunch on Tue Jun 29, 2010 at 12:44:57 PM PDT

        [ Parent ]

      •  not really needed (0+ / 0-)

        your sarcasm isn't appreciated.  

        What is your explanation for what has happened here?  Why would this pattern have appeared, if not intentionally?  

        I don't need a tin-foil hat, I have a Ph.D. in mathematics, thank you very much.

        Gentlemen, you can't fight in here! This is the War Room.

        by RickD on Tue Jun 29, 2010 at 02:20:19 PM PDT

        [ Parent ]

        •  Just goes to show you what a PhD is worth these (0+ / 1-)
          Recommended by:
          Hidden by:
          RickD

          days.  Who knows what they did with (or to) the data, but I'm pretty sure it wasn't part of some evil Republican plot.  And perhaps you are way beyond the tin foil hat stage.

          •  you want a CV? (0+ / 0-)

            You are certainly in way over your head here.

            HR for douchebaggery.

            Gentlemen, you can't fight in here! This is the War Room.

            by RickD on Tue Jun 29, 2010 at 09:29:57 PM PDT

            [ Parent ]

            •  You are engaging in HR abuse; read the rules of (1+ / 0-)
              Recommended by:
              WillR

              this site and kindly remove your HRs.  Whether or not you have a PhD in mathematics is irrelevant to your claim that this whole R2000 business is part of some plot to bring down dailykos.  That is the wildest of speculation, and you have provided no evidence whatsoever in support of that speculation.

          •  well (0+ / 0-)

            you certainly are at the "beyond the ability to interact in a polite way with people" stage.

            Gentlemen, you can't fight in here! This is the War Room.

            by RickD on Tue Jun 29, 2010 at 09:31:50 PM PDT

            [ Parent ]

    •  Remember William of Occam and his razor.... (0+ / 0-)

      Beware convoluted conspiracy theories.

      •  and... (0+ / 0-)

        What would Occam say about this?  

        What is the "simple explanation" for this phenomenon?

        "Random number generator" doesn't work.

        Gentlemen, you can't fight in here! This is the War Room.

        by RickD on Tue Jun 29, 2010 at 02:22:36 PM PDT

        [ Parent ]

        •  Simpler is (0+ / 0-)

          that some department within R2K was running over budget/deadline and decided to cut corners, most likely by polling fewer respondents than they were supposed to and then "amplifying" their results.

          Either way, Kos didn't get what he paid for.

          If Nixon was cocaine for the resentful psyche, Palin is meth—Andrew Sullivan

          by ebohlman on Tue Jun 29, 2010 at 06:40:23 PM PDT

          [ Parent ]

          •  "some department within R2K" (0+ / 0-)

            You mean the guys working out of Kinko's?

            I'm still waiting to hear how the numbers given could be produced by any method other than fraud.  

            Gentlemen, you can't fight in here! This is the War Room.

            by RickD on Tue Jun 29, 2010 at 09:28:53 PM PDT

            [ Parent ]

  •  Fascinating (0+ / 0-)

    I haven't thought about some of those statistical concepts since grad school . . .

  •  I'm guessing they slipped into fraud accidentally (2+ / 0-)
    Recommended by:
    milton333, sneakers563

    It doesn't make sense that a fraudster wouldn't cook the books in an undetectable way. If I consciously wanted to make a fraudulent poll result, I'd write a program to do it, and I'd use a good random number generator plus some other pollster's recent averages as the parameters. I wouldn't do it by hand!

    So, perhaps they didn't get all the calls they needed one week - so they made some up - then it quickly became easier to make up numbers than track actual phone calls...

    In a democracy, everyone is a politician. ~ Ehren Watada

    by Lefty Mama on Tue Jun 29, 2010 at 10:58:51 AM PDT

    •  Why would you think this (0+ / 0-)

      People said the same thing about Bernie Madoff, but it was a Ponzi all along.  Markos got ripped off, which is a bummer, but the scammer was caught and will be dealt with.

      I don't see any reason to make excuses when we don't know anything about the motives.

    •  triple negatives (3+ / 0-)
      Recommended by:
      zeitshabba, Justus, Book of Hearts

      Please don't do this.  It makes my head spin.

      It doesn't make sense that a fraudster wouldn't cook the books in an undetectable way.

      Gentlemen, you can't fight in here! This is the War Room.

      by RickD on Tue Jun 29, 2010 at 11:03:47 AM PDT

      [ Parent ]

    •  I guess their generator was crappy, (3+ / 0-)
      Recommended by:
      Lefty Mama, RandomSequence, TexasLiz

      possibly the programmer's mistake.

      So where's all the outrage against anti-atheist bigotry?

      by skeptiq on Tue Jun 29, 2010 at 11:06:38 AM PDT

      [ Parent ]

      •  maybe they just trusted a poor RNG (1+ / 0-)
        Recommended by:
        milton333

        I guess I have training in this stuff (master's in statistics and computer science). All random number generators are pseudo-random and need lots of testing to see if they actually mimic random behavior -- which is a great exercise for graduate students, IMHO.

        A typical random number generator is of the form:

        ( 914593 * (seed) + 592643 ) mod 296430 = (new seed)

        Then you have to normalize it so the result is between 0 and 1 with 6 digit precision. If it works well, you get something that looks like a uniform distribution. You can feed that into formulae to get different distributions like normal, exponential, etc.
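
        In code, that recipe is only a couple of lines (Python sketch using the constants above, which are illustrative rather than carefully vetted choices):

            def lcg(seed):
                # Linear congruential generator: new_seed = (a*seed + c) mod m,
                # normalized to a float in [0, 1).
                while True:
                    seed = (914593 * seed + 592643) % 296430
                    yield seed / 296430.0

        Whether the output actually behaves like uniform random numbers depends entirely on the choice of a, c, and m -- hence all the testing.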

        In a democracy, everyone is a politician. ~ Ehren Watada

        by Lefty Mama on Tue Jun 29, 2010 at 11:17:57 AM PDT

        [ Parent ]

    •  If the partners actually did the programming... (2+ / 0-)
      Recommended by:
      milton333, Lefty Mama

      but if they hired some kid to put the program together -- or just used Excel without reading up on PRNG issues?

      Easy enough to screw up.

      •  IIUC, issues with PRNGs (1+ / 0-)
        Recommended by:
        Lefty Mama

        tend to manifest as things like periodicity (they repeat themselves if you run them long enough) and repeating patterns in the last several digits. It'd be much more subtle than this — what we've found is a very consistent pattern in the last bit (i.e. the even-or-odd bit).

        So for things like generating cryptographic keys, where you need unpredictable results even if you generate a ton of random numbers, PRNGs have problems. But for fudging two-digit polling data they'd work just fine, I imagine, or at least the issues wouldn't be this blatant.

        “If I can't dance to it, it's not my revolution.” — Emma Goldman

        by Jyrinx on Tue Jun 29, 2010 at 11:47:59 AM PDT

        [ Parent ]

  •  A while back I joked... (10+ / 0-)

    about a new business model:

    I am going to be a "pollster".  I'll charge less than other people, and report the median of other polls plus normally distributed noise.  

    Apparently R2K wasn't even as smart as my joke.  
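
    The whole "business model" fits in a few lines (hypothetical Python sketch; the 3-point noise level is made up):

        import random, statistics

        def lazy_pollster(other_polls, sigma=3.0):
            # Report the median of everyone else's numbers plus Gaussian
            # noise scaled like plausible sampling error.
            return round(statistics.median(other_polls) + random.gauss(0, sigma))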

    This is a very nice catch by the authors.

  •  Question about point 2 (1+ / 0-)
    Recommended by:
    milton333

    I think - overall - that this makes a convincing case, but I have a question about point 2.

    The week-to-week ratings constitute a time-series. The variance of each is not independent from week to week - people's feelings about the president are not only determined by what he did or said in the last week. they might remember a bit more than that! Second, it might be a reasonable assumption that results positively correlate between the two groups: people who call themselves "Others" MIGHT react in the same way to an action of the president as people who call themselves "Independent". Both of these are likely to inflate within-series variance in relation to differences between the series.

    As you describe it, treating the data as independent points and fitting to a simple Chi-squared distribution would not take these innocent factors into account, and would tend to generate the same type of anomaly observed. Maybe I am missing something from the abbreviated description of the statistical tests?

    •  good question, and more rigorous than mine nt (0+ / 0-)

      "It's called the American Dream, 'cause you have to be asleep to believe it" Mr. Geo. Carlin

      by Mark B on Tue Jun 29, 2010 at 11:06:53 AM PDT

      [ Parent ]

    •  correlations (2+ / 0-)
      Recommended by:
      elfling, Justus

      For detailed questions, it might make sense to step over to Jon's new blog.

      But meanwhile, short answer: all the points you describe raise the variance. What we found was a variance that was way too low. To the extent any of those factors are left (we used the techniques described to remove most of them), we overestimate the p-value, i.e. overstate the likelihood the results could be legitimate.

      •  which variance? (1+ / 0-)
        Recommended by:
        Justus

        Thanks for your answer! I had a (quick) look over at the other blog, but didn't find details about this analysis yet...

        But I think there is still an issue here that I don't get that is quite non-technical. If I understand correctly (I probably don't!), the expected variance of the differences between the groups was calculated from the variance within the groups themselves, e.g.:

        "The expected variance of the sum or difference of two independent random variables is just the sum of their expected variances. "

        If, however, the week-to-week variation within the two groups is not independent (that is, Independents react in somewhat the same way to external events as Others), surely that would reduce the variance of the difference between the groups compared to what might be expected.

        I am sure that you have indeed used some technique to remove this problem, but I disagree with you in that I think that the basic method should underestimate the p-value. In particular, I am not sure the comparison of the expected variance of 80.5 with the actual 9.947 is fair.

        Thanks for engaging in the comments: making the data available is step 1 towards transparency; helping people properly understand it is step 2.

        •  random variance vs. actual change (0+ / 0-)

          The actual change in the two groups should be correlated, as you say. The random statistical variation is uncorrelated because these are disjoint sets. That's why we looked at the difference -- we only wanted to look at random statistical variance. Just one problem -- it wasn't there.

          •  I still don't get it (0+ / 0-)

            well I should wait until I can read the detailed report.

            I understand why the difference in margins was calculated.

            I still don't understand how the expected variance for the difference in margins was calculated... or rather, I don't understand how the description of how it was calculated could be correct. It can't have been as simple as summing the two within-group variances because the groups aren't independent. I must be missing something.

            thanks again.

            •  summing variances (0+ / 0-)

              There are two parts of the variance:

              1. due to actual changes
              1. due to random sampling.

              These should add up. The total we found was way less than (2) alone. And for (2), unlike (1), the variance of the differences is indeed the sum of the variances. That simple.
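
              A toy check of that additivity (Python sketch; two disjoint groups of 600 with the same true 45% favorability):

                  import random

                  def pct(true_p, n):
                      # One sampled percentage for a group of n people.
                      return 100.0 * sum(1 for _ in range(n)
                                         if random.random() < true_p) / n

                  diffs = [pct(0.45, 600) - pct(0.45, 600) for _ in range(10000)]
                  mean = sum(diffs) / len(diffs)
                  var = sum((d - mean) ** 2 for d in diffs) / len(diffs)
                  print(var)  # close to 2 * (45 * 55 / 600) = 8.25

              The simulated variance of the difference comes out near the sum of the two groups' individual sampling variances, as expected for disjoint samples.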

              •  technical version of report? (0+ / 0-)

                I still don't get how adding the variances of separate groups to create an expected variance for the differences is valid: the correlation between the groups invalidates an assumption of the theory.

                Is a technical write-up of the report available?

                •  adding variances (0+ / 0-)

                  We add the calculated statistical variances, which are indeed of independent random variables. The other stuff you're talking about is all add-ons, about which we make no assumptions. We don't have to, because the reported variance is already way too low for the statistical part alone. Whatever non-statistical part might have leaked through only makes things worse.

      •  Stratified sampling? (0+ / 0-)

        They could have done some stratification or smoothing that easily would have reduced variance. Then the chi-squared test wouldn't have been appropriate.

  •  Well Done! (2+ / 0-)
    Recommended by:
    OLinda, theran

    Glad to see this problem and subsequent analysis made public. Hold their feet to the fire!

    Cheers to improving the rigor of discourse on this site.

  •  Reminds me of Mendel's scam. (1+ / 0-)
    Recommended by:
    James Robinson

    He had too many zeros, instead of missing zeros.

    The difference is, Mendel was right -- he unnecessarily scammed data to make it look "better", when that "better" was actually worse.

  •  but is it possible that they have built an (1+ / 0-)
    Recommended by:
    sneakers563

    error into their algorithm inadvertently? I know they'd still be bad pollsters, but some kinds of rounding algorithms can generate bad data. And, I am surrounded by immunologists who are better statisticians than me, and they often see things I don't.

    I'm sure your analysts have thought of this, maybe they could still give this layman an answer?

    "It's called the American Dream, 'cause you have to be asleep to believe it" Mr. Geo. Carlin

    by Mark B on Tue Jun 29, 2010 at 11:05:19 AM PDT

  •  Not just faked, but *stupid* (6+ / 0-)

    I have a degree in computer science. What I know about statistics stems from two 400-level courses: a semester on probability, and a semester on statistics.

    I could have faked this polling data much more convincingly than these people did. In fact, evidence suggests that this was faked by someone who not only doesn't have any kind of real training, but in fact someone who has never even really thought much about it.

    Hell, if I had to guess, I would say that these numbers were actually generated by hand, by someone with a pencil and a piece of paper. When anyone with the slightest grain of sense knows that anything that's supposed to look somewhat random should NEVER be generated by a human. (Or, at least, not out of whole cloth. There are ways of turning a human's actions into random-looking numbers, even without a computer, but 'write down a random series of numbers' is NOT one of them.)

    If this was generated by computer, it was by an amazingly badly-written program. I, with my mumbledy-year-old background in statistics, mostly forgotten, could generate a program that could create reasonable-looking numbers for these values.

    The fact that R2000 couldn't even do that? Wow. They're even incompetent at being fraudulent.

    -fred

    •  All you need is a generator (3+ / 0-)
      Recommended by:
      theran, Jyrinx, Jampacked

      and the mean you want your result generated around.

      It will look really good.

      The US Senate is begging to be abolished. Let's fulfill its request.

      by freelunch on Tue Jun 29, 2010 at 11:19:12 AM PDT

      [ Parent ]

    •  If they were competent at being fraudulent, (0+ / 0-)

      they wouldn't have got caught and we would never have known.

      The FOX is a common carrier of rabies, a virus that leaves its victims foaming at the mouth and causes paranoia and hallucinations.

      by Calouste on Tue Jun 29, 2010 at 12:41:36 PM PDT

      [ Parent ]

    •  that's what is puzzling me (0+ / 0-)

      It's like trying to pass a phony $1 bill with Denzel instead of George Washington.

      Or a blond Mona Lisa.

      Gentlemen, you can't fight in here! This is the War Room.

      by RickD on Tue Jun 29, 2010 at 02:51:47 PM PDT

      [ Parent ]

      •  That's why I think... (0+ / 0-)

        ...that it's GOT to be a bug. These kinds of results don't happen if you're paying close attention to your inputs and outputs; they happen if you have a system (e.g. a piece of software) that you're confident in, such that you don't feel you have to vet the results any more.

        It's like having a piece of software running, and not bothering to check the logs or the output day to day for error conditions, because that piece of software has been running for a couple years and never had problems. There's always the chance that that piece of software has encountered a new situation and has started having problems. There's also always the chance that it's simply been performing the wrong task for the last two years, and you never realized it. At some point you do have to stop intensively verifying everything, because there's always something new to do, but most people don't have a good sense of when it's time to stop, and when you should do a little checkup now and then afterward.

        -fred

  •  How to keep pollsters honest (2+ / 0-)
    Recommended by:
    RandomNonviolence, Jasper47

    One of my main concerns here is that several poll aggregation/methodology sites have talked about the basic information pollsters should provide to be considered trustworthy. Since this is the second time such fraud has been uncovered, and Research 2000 was one of the most "open" pollsters, those reporting standards are clearly not enough.

    In addition, these types of tests are only possible after a pollster conducts a large number of polls. Thus, anyone wanting to commit fraud can get away with it for a while.

    In addition, for every new method to catch poll fraud, a dishonest pollster can simply learn new rules to follow when generating data.

    The question is what do pollsters need to reveal to confirm that they are honest. For phone polls, do they need to work with phone companies to confirm that each poll has 1000 connected phone calls that lasted more than 3 minutes each? Are independent auditors of their raw data needed? What's happening now is simply not enough.

    •  Let's Hear it for Peer Review (0+ / 0-)

      Those who will not or have not provided their data to others for peer review have a good reason for their actions -- fear. If one is afraid of what others may find using the same data, that person or organization should expect to be sued and put out of business.

      Shewhart said the same thing back in the 1930s: don't trust anyone who will not let you see their data. You can see clearly why that is true today.

      •  modern peer review won't even help here (0+ / 0-)

        It usually means a peer examines the methodology and results very carefully to make sure everything makes sense. Outright fraud can get through the process because most people don't think their peers could be so dishonest. When so much money rests on dishonesty and there are so few ways to be caught, the risk is much larger.

  •  wow this diary made my head spin (1+ / 0-)
    Recommended by:
    Mariken

    the chi-squared distribution thing flew way over my head. But what I am getting here is that you were scammed. I'm sorry, kos.

    You're watching Fox News. OH MY GOD--LOOK OUT BEHIND YOU

    by rexymeteorite on Tue Jun 29, 2010 at 11:26:09 AM PDT

  •  Runs test for randomness (3+ / 0-)
    Recommended by:
    OLinda, Jyrinx, Jampacked

    Takes me back almost 30 years to my thesis.  I used a "runs up and down test", essentially a non-parametric test, to test for nonrandomness in species sample populations.  Whether it would work here however, I don't know.  Guess it all depends on the number of samples and the population size.
    Great article, and technical enough to make me bring out the statistics textbooks.
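
    For the curious, a bare-bones runs up-and-down counter (hypothetical Python sketch; ties are simply dropped):

        def runs_up_down(xs):
            # Signs of successive changes; a new run starts whenever
            # the direction flips.
            signs = [1 if b > a else -1 for a, b in zip(xs, xs[1:]) if b != a]
            return 1 + sum(1 for s, t in zip(signs, signs[1:]) if s != t) if signs else 0

    For a truly random sequence of n values, the expected number of such runs is about (2n - 1) / 3; a count far from that flags non-randomness.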

    "Sometimes paranoia's just having all the facts." William S. Burroughs

    by SaltWaterCroc on Tue Jun 29, 2010 at 11:37:44 AM PDT

  •  Your hate mail should go up in quality after this (5+ / 0-)

    You may get some nastygrams from R2K employees, who won't call you what your usual hate-mailers call you, but who will probably call you a number-lover, an anti-statisticite, or a homodata.

    (snark!)

    -8.75, -6.72 If it's 15000 ft below sea level, maybe it should stay there.

    by SciMathGuy on Tue Jun 29, 2010 at 11:39:30 AM PDT

  •  Are you asking for your money back? (0+ / 0-)

    Progressive -> Progress; Conservative -> Con

    by nightsweat on Tue Jun 29, 2010 at 11:49:13 AM PDT

  •  Sir Humphrey on Polling (2+ / 0-)
    Recommended by:
    RandomNonviolence, Mariken

  •  There is so much heuristic fudging (0+ / 0-)

    ...in all polls I wonder how much any two numbers can be said to be independent, even if by gut feeling they should be. It's a black art, the "scientific" moniker notwithstanding.

    For example, polls often try to balance Democrats, Republicans and independents based on some prior idea of how many there should be of each in the sample. I suppose they may doctor the "women" or "men" categories based on party affiliation and other factors (race, region, etc.) as well.

  •  Golly, and here I've been saying for years (0+ / 0-)

    that polls are crap info. Go figure.

    You can make a poll say anything you want it to say.

    •  sure, if you lie (1+ / 0-)
      Recommended by:
      G2geek

      Lying with numbers is not different in nature than lying with letters.

      I can type "Ohio borders Nebraska" and somebody who doesn't know the map might believe me.  But anybody who knew the map would know it's a lie.

      The same principle holds with numerical analysis.  It's a lot tougher to lie with numbers to people who work quantitatively for a living.

      Gentlemen, you can't fight in here! This is the War Room.

      by RickD on Tue Jun 29, 2010 at 02:54:02 PM PDT

      [ Parent ]

  •  Reminds me of an Enigma (4+ / 0-)

    Back in WW II, the Germans used a commercially-produced encryption machine called the Enigma.  It looked sort of like a typewriter and had a few code wheels where you set the key.  It was a fairly advanced encryption system, probably the best of its day, and for a while it worked.

    But the codebreakers (I think this was at Bletchley Park) eventually found its vulnerability.  Enigma never, ever substituted a letter for itself.  Its cryptotext was pretty random, but the designers thought that it was a bad idea to ever let a single letter go through unchanged.  This mistake was the mathematical opening that let the codebreakers in.

    No zeroes?  Hmmm...

  •  Ah thank you for providing occupations. (0+ / 0-)

    I wondered if these guys were trained in statistical analysis. Now I know they aren't. That doesn't mean they are wrong, but it does mean I am going to wait for the results in court before I say whether R2K was fraudulent or not.

    The Raptor of Spain: A Webserial
    From Muslim Prince to Christian King (Updated Nov. 24)

    by MNPundit on Tue Jun 29, 2010 at 12:13:37 PM PDT

  •  two concerns (0+ / 0-)

    We've picked a few initial tests partly based on which ones seemed likely to be sensitive to problems and partly based on what was easy to read by eye before we automated the data download.

    You picked the initial tests after looking at the data?  After MG noticed apparent patterns in the data?  Discussions regarding statistical significance need to take these things into account, don't they?

    A large set of number pairs which should be independent of each other in detail, yet almost always are either both even or both odd.

    This assumes that the digits 0 through 9 are equally likely in the final digit and that the male and female numbers are uncorrelated; this isn't necessarily the case in political polls (particularly on questions that don't correlate much with gender).  Lastly, would you have suspected and analyzed the same effect had you seen matching odd/even properties for (D) and (R), white and black, etc?  What about fav/unfav, etc?  I agree that the result you see is still anomalous, but the coin flip analogy is insufficient.

    That said, the results you find are very compelling; I'm just concerned you're overstating your certainty here.  None of the things I point out are remotely severe enough to explain the effects you see.
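
    A quick simulation bears out that last sentence (toy Python; the +/-6 correlation between the M and F numbers is invented): even strongly correlated pairs share parity only slightly more than half the time.

        import random

        trials, match = 100000, 0
        for _ in range(trials):
            m = random.randint(20, 70)
            f = m + random.randint(-6, 6)  # F tracks M closely
            match += (m % 2) == (f % 2)
        print(match / float(trials))  # ~0.54, nowhere near "almost always"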

  •  Shocked???? (0+ / 0-)

    This is what I already knew for at least a year. lol

    Good to admit it though...although the charade was allowed to go on too long.

    When their polling didn't match the majority of other polls....not even the avg of what others were giving... I stopped taking the Dkos polls seriously.

    I suggest folks follow the Dave Leip site...which I check regularly (great resource).

  •  Can we test other pollsters with this approach? (0+ / 0-)

    This was a brilliant and inspired analysis. But I'm wondering whether Research 2000 is the only pollster fabricating data. I'd love to see these analyses applied to other outfits.

  •  Thank you for the honesty! (0+ / 0-)

    Kudos to Kos for disclosure. :)

    -Joe

  •  Holy Moses (1+ / 0-)
    Recommended by:
    Jampacked

    This makes 2 pollsters.

    It also makes your famous "GOP is crazy poll" extremely suspect.

    Sue their asses, Markos, and then sue their ass hairs, and then sue their dingleberries.

    Very frustrating.

    In theory, there is no difference between theory and practice; but in practice, there always is a difference. - Yogi Berra

    by blue aardvark on Tue Jun 29, 2010 at 01:28:03 PM PDT

  •  Is there enough information on the (0+ / 0-)

    non-randomness to reverse-engineer, even in only a speculative way, what they did do?

    I'm gonna go eat a steak. And fuck my wife. And pray to GOD - hatemailapalooza, 052210

    by punditician on Tue Jun 29, 2010 at 01:29:06 PM PDT

  •  Well that explains why the polls in Arkansas (0+ / 0-)

    were so off in the second district for Halter vs Lincoln. (26 points!)

  •  The odd/even correlation is so obvious... (0+ / 0-)

    it makes me wonder if someone on the inside was trying to send up a red flag, although I can't see why they didn't just make an anonymous phone call.  Or maybe one of the perps is obsessive-compulsive and couldn't resist the symmetry.

    Is there a psychiatrist in the house?

    •  something like that (0+ / 0-)

      Well that is at least a plausible explanation.  

      I'm baffled by this because most of the explanations given by the software engineers don't really stand up to scrutiny.  

      Gentlemen, you can't fight in here! This is the War Room.

      by RickD on Tue Jun 29, 2010 at 03:00:20 PM PDT

      [ Parent ]

  •  More data to add on (0+ / 0-)

    With the Men/Women thing...Just a look at the one poll linked above shows this:

    I have the results for each question, with the final number, the number for men, the number for women, and the difference from the total.  Now, unless the ratio of men to women is exactly 50/50, the difference shouldn't be the same.  For example:

    If a poll has 55% women, and 45% men, and women supported something 45/53 and men supported it 47/50, the final results would be thus:

    46/52

    When one subtracts and takes the absolute value of each for/against for each sex vs. the average, one gets this:

    For/Women: 1
    For/Men: 1

    Against/Women: 1
    Against/Men: 2

    As you can see, while one of them is the same, one of them is not, showing that the sample is not split evenly 50/50.

    So let's look at these numbers in the poll linked to above:

    Obama For: Men: -8, Women: +8
    Obama Against: Men: +10, Women: -10

    Pelosi For: Men: -15, Women: +15
    Pelosi Against: Men: +14, Women: -14

    Reid For: Men: -4, Women: +4
    Reid Against: Men: +3, Women: -3

    McConnell For: Men: +7, Women: -7
    McConnell Against: Men: -10, Women: +10

    Boehner For: Men: +5, Women: -5
    Boehner Against: Men: -8, Women: +8

    Cong. Dems For: Men: -8, Women: +8
    Cong. Dems Against: Men: +5, Women: -5

    Cong. GOP For: Men: +9, Women: -9
    Cong. GOP Against: Men: -8, Women: +8

    Democratic Party For: Men: -7, Women: +7
    Democratic Party Against: Men: +9, Women: -9

    GOP For: Men: +9, Women: -9
    GOP Against: Men: -7, Women: +7

    Direction Right: Men: -4, Women: +4
    Direction Wrong: Men: +4, Women: -4

    Gen. Ballot Dems: Men: -8, Women: +8
    Gen. Ballot GOP: Men: +8, Women: -8

    Vote Likelihood Def: Men: +2, Women: -2
    Vote Likelihood Vote: Men: +2, Women: -2
    Vote Likelihood Not Likely: Men: -3, Women: +3
    Vote Likelihood Def. Not: Men: -1, Women: +1

    Support HC Repeal: Men: -6, Women: +6
    Reject HC Repeal: Men: +9, Women: -9

    They SAY that the demographic breakdown is 52% women, 48% men.

    As one can see, in every single instance, the difference of men from the average is the same as women from the average.  That would seem to be unusual unless the breakdown was exactly 50/50.

    Now, if Caj's theory is right, and they were somehow storing valid data as average and variance, and accidentally storing it as integers, you would get this result (or at least, when I tested it over my 100 randomly generated examples in Excel) with a gender breakdown of 52/48.

    However, the reverse could be true:

    They could easily put in the average as an integer, put in the variance as an integer, and then make the numbers that way.

    So if they wanted the final Obama approval to be 47%, and put in a variance of 6%, voilà, you'd have a men's percentage of 41% and women's percentage of 53%.  They would both always be odd or always be even.  This would actually be a sensible (if simplistic) way of faking data if you wanted to.

    Now, having stated all this, there are two things I don't know:

    1. I don't know if this trend continues in other polls because I haven't checked, and
    1. Given the gender breakdown, I don't know what the probability of ALL of the variances matching like that is.  Maybe it's actually more than I suspect.

    I just thought I would note this.
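
    For concreteness, the fabrication recipe described above is trivially small (hypothetical Python; 47 and 6 are the example numbers from the comment):

        def split_topline(avg, offset):
            # Integer average +/- integer offset: the two results differ
            # by 2*offset, an even number, so they always share parity.
            return avg - offset, avg + offset

        men, women = split_topline(47, 6)  # -> (41, 53), both odd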

    •  ok, this is kind of crude (0+ / 0-)

      but I ran the numbers on the non-averaged numbers (i.e., calculating the normal way) and I got a non-equal difference in 61 of 100 sets.

    •  Right. (0+ / 0-)

      However, the reverse could be true:

      They could easily put in the average as an integer, put in the variance as an integer, and then make the numbers that way.

      Exactly.  This could be buggy processing of real data, or buggy fabrication of fake data.  The bug in question could manifest in either case, and it doesn't rule out one case or the other.

      On the other hand, as you said above, this seems like a pretty contrived, pointless and zany way to process real data (why average two numbers just to take them apart again?), while it comes across as a fairly straightforward if simplistic way to manufacture fake data.  

      That might lead one to suspect the latter case, but it's really hard to second-guess such things.

  •  Technical Mistake (0+ / 0-)

    In your report, you say: "the average expected variance in this difference of margins over the first 60 weeks was 80.5, i.e. a range of +/- 9%"

    But the 95% confidence interval is 2*sqrt(var), so I think you mean +/- 18%.  This actually makes your point stronger.
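
    (For the arithmetic: sqrt(80.5) is about 9.0, so +/- 9% is the one-standard-deviation range; the conventional 95% interval is about 2 * sqrt(80.5), i.e. +/- 18%.)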

  •  thanks for being honest (2+ / 0-)
    Recommended by:
    SDorn, G2geek

    rather than cover it up and let it affirm our conventional wisdom, which would have been very easy to do. Honesty really is the best policy.

  •  Is a warning disclaimer appopriate? (0+ / 0-)

    It's great to see a rapid and open response by DK to this issue.

    Perhaps this has already been addressed elsewhere...

    The web, unfortunately, has a long memory. Links to DK stories which reference and perhaps rely on now suspect R2K results will exist for a long long time.

    Unfortunately, someone following one of these links who is/was not a regular visitor to DK in this timeframe will usually have no knowledge of the R2K problem. They may even create additional links that use the stories to defend their position. Meanwhile, others who do know about the R2K problem are likely to add possibly frothy commentary such as:

    "DK sucks, they stood behind this data for a year before disclaiming it, you're an idiot to trust DK"

    This leaves the person who inadvertently relied on the link -- even though DK effectively renounced the R2K data, it did not do so in a persistent and visible way that exists forever -- with a bad taste in their mouth about DK and a reduced willingness to rely on them in the future. It also leaves a "defendable" comment trashing DK in Google et al. forever.

    It would seem that many, although probably not all, of the front page stories written before today that likely relied on R2K data could be located with some simple searches (such stories would mention R2K or Research 2000 or, recursively, contain a link to such a story or to an R2K site).

    I'd suggest these articles be identified and be marked in the database as "Possibly R2K Dependent" (PRD). Then, when any story with the PRD mark is displayed, at the very top of the story, a standard disclaimer in a prominent box would be displayed that read something like (and, this could change over time as more is known):

    This story may rely on data from the polling firm R2K that DK contracted with to perform custom polls from xx/xx to yy/yy.

    Due to anomalies that were reported on zz/zz [link to Kos story on this], DK became concerned that some R2K polls did not track actual voter preferences adequately and discontinued their relationship with R2K.

    Subsequently, very serious statistical anomalies [link to this story perhaps] in the R2K results were reported after additional analysis [link to this story perhaps] by outside parties.

    This has led DK to conclude that, at this time, they are unable to stand behind the results of any R2K polls. The current status of this evolving situation can be accessed here [link to a new FAQ/Info page that reflects the current state of the situation in much more detail].

    At this time, DK urges caution in relying on anything in this story that is based on an R2K poll result.

    (Yes, it should be much shorter - I lack the pithy gene! I conclude it's either recessive or is passed down only from the maternal parent.)

    Some number of superusers (at least anyone who is trusted to "represent" the site!) should be able to remove the PRD mark (not sure if this should apply recursively though). Perhaps TUs could be allowed to mark a story for review for removal of the PRD mark and if the reviewer disagrees with the removal, TUs would no longer see the 'review for PRD mark removal'.

    This really is a pretty big thing, not something DK is really "responsible" for in some sense (although, as we know, the buck stops at the top even if the CEO couldn't even reasonably know about a problem!).

    Although I don't think DK considers itself to be a reliable source of facts the same way the NYT or WSJ does, some R2K poll results were commissioned by DK and reported as fact rather than the opinion of DK or R2K. Hence, it seems appropriate and a good way to get ahead of the curve to publish errata warnings in this particular case.

  •  Shocked??? (0+ / 0-)

    Adding to earlier comment...

    Not really a surprise, especially for this site. These sites become victims of tunnel vision and the echo chamber effect. Readers should have been questioning the poll results earlier...and should have felt safe enough to question them openly. But what I have noticed here is that any questioning or dissent (no matter how thoughtful) means you will be attacked, mocked, trashed, disrespected, etc.

    I suggest folks follow the Dave Leip site...which I check regularly (great resource). Site follows Gov. and Senate races...but not House races. It provides a wider breadth of polling to give a better sense of what opinion really is. It also allows you to get a sense of trends...and takes in all numbers to come up with likely outcomes. My other favorite resources are CQpolitics polls (maps/districts page) which looks at a number of polls for each contest, & gives an idea of what's in play, what's safe, likely gains/losses, etc. Polling Report and Pollster for opinion numbers and for Pres. favorability rating. It's always better to look at a wider variety of polls IMO...to get a sense of what's going on. But if I am going to look at a singular poll.... I use Gallup.

    Would like to see Dailykos go to a system more along the lines of Polling Report and/or Pollster... where we could look at 3, 4, 5 polls for each question/topic. There would be a much better mix. Would be nice to have an interactive map like NPR's, CQpolitics or Dave Leip on the Dailykos frontpage at all times... something dynamic that updates by the hour or by the minute...based on polling inputs. This is something that kos could do now with existing polling info that is already available.

    Relying on one polling organization was one of the big mistakes here IMO. I know it probably had to do with Markos wanting to provide some sort of exclusive service for the site...but it might have made more sense to partner with Pollster, Polling Report, 538, Pew or one of the other established polling/research sites. Results would have been more balanced and rock solid (trustworthy).

  •  Maybe they can look at the National Elections (0+ / 0-)

    of 2000, 2002, 2004 and 2006 while they're at it.  A lot of those numbers did not seem right, either.

    Obama needs to channel TR+FDR: Walk Softly, Carry a Big Stick and Welcome Their Hatred. He has Walk Softly down pat. Time to get on with the rest...

    by FightTheFuture on Wed Jun 30, 2010 at 05:05:22 PM PDT
