We’ve got good news on a couple of different fronts at the Daily Kos Elections presidential forecast. One is that Hillary Clinton’s chances have ticked up a bit over the course of the week, from 70 percent on Tuesday to 75 percent today. Granted, that’s not a huge change, but what will probably make people feel better is the trend. If there’s one thing that Democrats are good at, it’s panicking even when they’re in the lead, but the unease is understandable when the odds seem to be deflating a couple of points a day over a period of several weeks. Over the fold, we’ll go into more detail about what specifically caused this turnaround.
The other good news is that this week, we added some additional cool features that show the data in new ways. If you start out at our landing page (linked above), you’ll see the overall chances of a Democratic victory in the presidential race and the median number of electoral votes. But if you click through to the President 2016 page (you can find the link on the bar at the top), you’ll see a variety of new features on that page. First and foremost, you’ll see the map shown above. States that aren’t competitive are shown in dark colors, states that are somewhat competitive are in light colors, and states that are true tossups (where the odds are between 45 and 55 percent for each side) are grey. If you mouse over a state, it’ll tell you the odds in that state as well as its number of electoral votes.
There’s a persistent problem with the usual political maps, though; they tend to have too much red on them. And I don’t mean that in the sense of there being too many Republicans; what I mean is that even a map showing an election where the Democratic candidate wins has more red than blue on it, since Republican strength is concentrated in large, empty states and Democratic strength is concentrated in denser, more populous states. If you want to solve that problem, all you need to do is click on the map, and it changes instantly to a cartogram: a map where the size of each state is weighted according to how many EVs it has.
The only problem, of course, is that cartograms inevitably distort the shapes that you’re used to. Our solution is to leave a big smoldering hole where the Mountain West is, as if the Yellowstone Caldera just erupted … but that underscores just how few people live there even today. Instead the cartogram emphasizes how big an impact the nation’s ten or so most populous states (many of which are safely blue) have on the electoral college.
If you look a little further down on the President 2016 page, you’ll see a rank ordering of all the states according to Democratic chances, and also a ticker showing the most recent state-level polls in the race. (As I’ve pointed out several times before, we don’t use national polls in the model, so you won’t see national polls in that list. We don’t even collect them for our database.)
Finally, at the upper left, you’ll also see what’s called a histogram, which is a chart that shows the distribution of all possible outcomes in the electoral college. If you’re familiar with the “central limit theorem” (the idea that when you add up a lot of independent random outcomes, the totals tend to pile up in a bell curve), you’ll notice that our histogram of possible electoral college outcomes looks a lot like the textbook picture of a “normal distribution.”
You may have seen something called a “Galton box” in your high school math class or at the science museum; it’s a board with pegs that you drop balls down, where they have a roughly 50-50 chance of bouncing one way or the other each time they hit a peg. Drop enough balls into the board, and they’ll start piling up in a pattern that looks just like our histogram. In our case, instead of dropping thousands of balls down a peg board, we’re dropping thousands of simulations into our model. About half the time the ball bounces left when it hits Florida and the other half of the time it bounces right, for instance.
Because the ball tends to bounce right whenever it hits Colorado, Virginia, Pennsylvania, and New Hampshire, though, the large majority of times the total adds up to more than 270, which is why Hillary Clinton’s odds of winning are pretty high. Sometimes the ball keeps bouncing left over and over instead; those are the times that Donald Trump wins. Usually not, though, and the median result (where 50 percent of outcomes are higher and 50 percent are lower) is in the 290s.
If you look carefully, though, you’ll notice that’s not the result that happens the most times (which you’d call “mode” instead of median); there are spikes for the few likeliest permutations. For instance, you’ll notice that one of the highest spikes is at 308, which is what you get if Clinton wins Florida, Pennsylvania, Nevada, New Hampshire, Virginia, and Colorado, but loses Ohio, Iowa, and North Carolina, which is one of the most common permutations right now. (Think of it this way: a lot of football games end with scores like 14 or 21, given the likeliest ways to score points. You don’t see a lot of scores of 4 or 11.)
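To make the peg-board analogy concrete, here’s a minimal Monte Carlo sketch in Python. The win probabilities, the set of swing states, and the 217 “safe” electoral votes are all illustrative assumptions, not our model’s actual inputs; the point is just that summing a lot of weighted coin flips produces a bell-shaped pile of outcomes whose median and mode you can read right off.

```python
import random
from collections import Counter
from statistics import median

# Hypothetical win probabilities and electoral votes for the swing states
# mentioned in the post; the numbers are illustrative, not the model's inputs.
SWING = {
    "FL": (0.50, 29), "PA": (0.75, 20), "OH": (0.45, 18),
    "NC": (0.45, 15), "VA": (0.80, 13), "CO": (0.75, 9),
    "IA": (0.45, 6),  "NV": (0.60, 6),  "NH": (0.70, 4),
}
SAFE_DEM_EV = 217  # EVs assumed safely blue in this sketch

def simulate(n_trials=20_000, seed=1):
    """Drop n_trials 'balls' through the board: flip a weighted coin in
    each swing state and sum the electoral votes that land blue."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        ev = SAFE_DEM_EV + sum(votes for p, votes in SWING.values()
                               if rng.random() < p)
        totals.append(ev)
    return totals

totals = simulate()
win_prob = sum(t >= 270 for t in totals) / len(totals)
med = median(totals)                            # 50/50 split point
mode_ev, _ = Counter(totals).most_common(1)[0]  # single most common total
```

Because each state adds a fixed chunk of EVs, the distribution isn’t smooth: it spikes at the handful of likeliest permutations, which is exactly why the mode and the median can land on different numbers.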
And that brings us to the slight uptick in Clinton’s chances in the last few days. What’s driving that? It’s almost entirely from a source you might not have expected: the 50-state poll taken by Survey Monkey for the Washington Post. If you just looked at some of the more ‘oh crap’ moments of the Survey Monkey set, you might have expected it would have a negative impact overall (for instance, they found Clinton narrowly trailing in Iowa and Ohio, and leading by only two in the ‘blue wall’ states of Michigan and Wisconsin). The model, however, is already pretty pessimistic on Iowa and Ohio right now, and there are enough other polls with bigger leads in Michigan and Wisconsin that it didn’t change the trajectory much there.
The most important effect of that set of polls, instead, is that it sent Clinton’s chances bouncing back up in Nevada, where they gave her a 48-43 lead. That’s actually the biggest Clinton lead we’ve seen in the Silver State (the previous best was a +4 from Monmouth in July). More significantly, we simply hadn’t seen any Nevada data in a number of weeks, and when the model isn’t seeing any new data in states, it starts taking cues from the broader trend in other more active states; when the trend is downward in other states, that tends to pull the states that aren’t getting polled downwards as well. With Clinton above water in Nevada now, its six EVs are getting added to her pile in a lot more simulations, which gets her over the top in a number of otherwise-close permutations (like ones where she wins Colorado, Virginia, and Pennsylvania, but somehow loses New Hampshire; Nevada still gets her past 270 in those ones).
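As a toy illustration of that fill-in-the-blanks behavior (this is a sketch of the idea, not our actual model), you could nudge a state with stale polling by the average movement seen in actively-polled peer states since that state’s last poll:

```python
# Illustrative sketch only: infer movement in an unpolled state from the
# average drift in actively-polled peer states over the same stretch.

def adjusted_margin(stale_margin, peer_margins_then, peer_margins_now):
    """stale_margin: Dem lead (pct points) in the state's last poll.
    peer_margins_then / peer_margins_now: Dem leads in actively-polled
    peer states at the time of that poll vs. today."""
    drift = sum(now - then
                for then, now in zip(peer_margins_then, peer_margins_now)
                ) / len(peer_margins_then)
    return stale_margin + drift

# Say Nevada's last poll was Clinton +4, and the peer states have since
# slipped by 3 points on average; the inferred Nevada margin sags too.
print(adjusted_margin(4.0, [6.0, 5.0, 8.0], [3.0, 2.0, 5.0]))  # prints 1.0
```

That’s why a single fresh poll like the Survey Monkey +5 can produce an outsized bounce: it replaces an inferred, trend-dragged number with actual data.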
More subtly, that was also happening in some other states that you probably wouldn’t think of as competitive in the first place, but that have been sort of medium-blue turf over the last decade and where there’s been a dearth of polls (precisely because no one is treating them as competitive), states like Connecticut, Minnesota, New Mexico, and Oregon. If you’ve been looking carefully at our totem pole showing the odds in all the states, you might have noticed that recently the odds in those states had started to drift down into the 80s, precisely because we weren’t seeing polls there and the model was filling in the blanks in a pessimistic manner.
The Survey Monkey poll confirmed that these states aren’t competitive (with Clinton +12 in Connecticut, +9 in Minnesota, +14 in New Mexico, and +19 in Oregon), which boosted her odds in all those states up to the 99 percent range. That forecloses odd runs where Clinton was, say, winning Colorado, Virginia, Pennsylvania, and New Hampshire, but losing Connecticut, which were happening a not-insignificant number of times and gradually adding up.
The Survey Monkey poll is an unusual method, though, so we had to make the tough decision of whether to include it at all. It’s an online poll that uses a “non-probability” sample, which casts a large net among a huge pool of people who’ve agreed to be contacted for online polls and then uses demographic weighting to overcome the not-exactly-random nature of the sample. It’s a good way to overcome the big problem that’s been plaguing telephone-based pollsters (response rates plunging down into the single digits, thanks to cellphones, voice mail, and caller ID), but it’s still an unproven method.
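Here’s a bare-bones sketch of what that demographic weighting looks like, with made-up groups and shares (real pollsters weight on many dimensions at once, often via raking): each respondent counts in proportion to how under- or over-represented their group is in the sample.

```python
# Made-up example: the panel skews older, so younger respondents get
# weighted up and older respondents get weighted down.
population_share = {"under_45": 0.48, "over_45": 0.52}
sample_share     = {"under_45": 0.30, "over_45": 0.70}

# weight = population share / sample share, per demographic group
weights = {g: population_share[g] / sample_share[g] for g in population_share}

def weighted_support(responses):
    """responses: list of (group, supports_candidate) tuples.
    Returns the weighted share supporting the candidate."""
    total = sum(weights[g] for g, _ in responses)
    yes   = sum(weights[g] for g, s in responses if s)
    return yes / total
```

The open question with non-probability panels is whether weighting on observable demographics is enough to fix whatever unobserved ways the panel differs from actual voters, which is why the method is still considered unproven.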
But we were sold on it thanks to Survey Monkey’s gigantic sample sizes. Even the smallest sample, in Hawaii, was a robust-enough 546, and in Texas, they got a mind-bendingly huge sample size of 5,147. (Speaking of Texas, you might remember the most breathlessly-reported part of the Survey Monkey poll was that Clinton was actually leading in Texas, 46-45. Unfortunately, there are enough other polls in Texas that look nothing like that that it didn’t really push her odds above one percent there. However, if we start seeing more polls like that, that could have a profound effect on the model. Even if she starts winning only 10 percent of simulations in Texas, given Texas’s 38 electoral votes, that would be the equivalent of dropping a nuclear bomb on the electoral college each time that happens. There’s really no plausible path for a Republican nominee to make up for the loss of the Texas firewall and still win.)
You might have noticed that in recent weeks, Ipsos and Morning Consult also put out what purported to be 50-state poll sets, using similar non-probability methods of gathering responses. We aren’t including those in our model, however. The Ipsos polls didn’t have anywhere near the same sample sizes; in some states, it was as low as 100 or 150, which isn’t enough to tell us anything. And the Morning Consult polls didn’t even report sample sizes or margins of error; instead, what they did, apparently, was to use a regression model to extrapolate data from their national poll respondents in every state.
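Sample size is doing the work here. A rough yardstick is the worst-case 95 percent margin of error for a simple random sample (non-probability panels don’t strictly meet that assumption, so treat these as ballpark figures rather than exact ones):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Worst-case 95% margin of error for a simple random sample of
    size n, at the p = 0.5 point where variance is largest."""
    return z * math.sqrt(p * (1 - p) / n)

# Ipsos's smallest samples vs. Survey Monkey's Hawaii and Texas samples
for n in (100, 546, 5147):
    print(f"n={n}: ±{margin_of_error(n):.1%}")
# n=100: ±9.8%
# n=546: ±4.2%
# n=5147: ±1.4%
```

A ±10-point margin of error can’t distinguish a tied race from a blowout, which is why samples of 100 or 150 don’t tell us anything useful.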
Those are both interesting and possibly worthwhile ways to create a 50-state poll, but we didn’t feel there was enough “there” there, compared with the Survey Monkey polls. However, this is something we’re going to have to evaluate after the election is over and we have actual results to compare against the pollsters; it may be a practice that we’ll be receptive to in future elections.