This is a very interesting explanation of why the Michigan polling debacle happened. The problem stems from the Michigan chaos of 2008. This is a bit wonky and about polling, not candidates.
…
Shortly before the 2008 primaries, Michigan, like Florida, tried to increase its relevance by jumping the line on when its primary would be held. The legislature passed a bill moving the primary up to January 2008, contrary to the party's plans. So Howard Dean, then the chair of the Democratic National Committee, led an effort within the party to strip the states' delegates (although eventually they were given half a delegate for each delegate they should have had).
Barack Obama never filed paperwork to be an official Michigan candidate, and when the day of the election rolled around, his team encouraged people to vote “uncommitted.” Hillary Clinton won the state with 54.6 percent of the vote, and uncommitted came in second.
Total turnout was only about 600,000, since many voters didn’t bother to turn out for a race that didn’t mean anything.
And right there is the problem.
For pollsters to know what an electorate will look like in an election, they look at what's happened in past contests. That gives them a sense of what composition of poll respondents will best reflect the actual turnout on Election Day. But in Michigan, the most recent election was a weird one, in which turnout was dampened. Michigan so far is the only state that has seen more Democratic votes in 2016 than in 2008, with about 1.2 million people turning out Tuesday.
So if you based your polling model, in whole or in part, on the 2008 race, you were probably going to have a problem. If you didn't use the 2008 race, you had to go back to 2000, which presented problems of a different type. There were also some underestimates of youth turnout and gender ratios that would have occurred anyway, but probably to a much lesser extent.
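As a toy illustration of why the turnout model matters (all numbers here are invented, not from the actual Michigan data), the same raw poll responses weighted to two different electorate compositions can produce noticeably different topline estimates:

```python
# Toy sketch: weighting identical candidate support to two assumed
# electorates. A model built on a dampened-turnout year underweights
# groups (e.g., young voters) who actually show up in larger numbers.

def weighted_estimate(support_by_group, turnout_share):
    """Weight each group's candidate support by its assumed share
    of the electorate."""
    return sum(support_by_group[g] * turnout_share[g] for g in support_by_group)

# Hypothetical support for a candidate among young and older voters.
support = {"young": 0.70, "older": 0.40}

# Electorate model based on a low-turnout year (few young voters)...
model_low_turnout = {"young": 0.15, "older": 0.85}
# ...versus the electorate that actually shows up.
actual_turnout = {"young": 0.30, "older": 0.70}

print(weighted_estimate(support, model_low_turnout))  # 0.445
print(weighted_estimate(support, actual_turnout))     # 0.49
```

With these made-up numbers, the stale turnout model understates the candidate by about 4.5 points before any sampling error enters the picture.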
So can we expect similar errors in polling going forward? Not very likely. Sam Wang over at the Princeton Election Consortium has created a very nice chart showing the accuracy of polls so far in this primary season.
In the graph above, the closer a point lies to the diagonal where the x and y values are equal, the more accurate the polling was.
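The accuracy measure the chart implies can be sketched simply: each point plots a final polling margin against the actual result, so its distance from the diagonal is the size of the miss. The contests below are hypothetical, not taken from the chart:

```python
# Sketch of the chart's implicit accuracy metric: a point's vertical
# distance from the y = x diagonal is the polling error.

def polling_error(poll_margin, actual_margin):
    """Absolute miss between the polled margin and the actual margin,
    in percentage points."""
    return abs(actual_margin - poll_margin)

# Hypothetical contests (margins in points for the polling leader).
print(polling_error(5.0, 6.0))    # 1.0  -> near the diagonal, accurate
print(polling_error(21.0, -1.5))  # 22.5 -> a Michigan-sized miss
```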
Although this was a significant polling error, it stands in contrast to many polls that did better. Polling requires many judgment calls in sampling and weighting, and professional pollsters sometimes get them wrong in ways that are visible only in retrospect. Failure happens, and it is useful to understand why. Since such failures of judgment are inevitable, it is also useful to know that even a 20-point lead does not assure a win; for future cases, it might be best to imagine that such a lead comes with, say, a 2 percent probability of a surprising outcome.
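One implication of that rough 2 percent figure is worth making explicit: even if any single 20-point lead is very safe, surprises accumulate over a long primary season. A minimal sketch, assuming independent contests and treating the 2 percent as given:

```python
# If a big lead still fails, say, 2% of the time, the chance of seeing
# at least one upset grows quickly with the number of contests
# (figures illustrative; contests assumed independent).

def prob_at_least_one_surprise(p_surprise, n_contests):
    """Chance of one or more upsets across n independent contests."""
    return 1 - (1 - p_surprise) ** n_contests

print(round(prob_at_least_one_surprise(0.02, 1), 3))   # 0.02
print(round(prob_at_least_one_surprise(0.02, 50), 3))  # 0.636
```

Under these assumptions, a season of 50 contests makes at least one Michigan-style surprise more likely than not.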
From a pollster's perspective, this was probably a good lesson to learn.