Election Modeling Round 1

by terranova108

(This content is not subject to review by Daily Kos staff prior to publication.)

Sunday, Sep. 14, 2014 at 10:31:59am PDT

This is my first diary, and since I am using the opportunity to publish my model-based election predictions (for the Senate only at this time), I think a brief introduction is in order. I work for a living as a biostatistician; that is, I analyze data from clinical trials to help determine the safety and efficacy of new drugs for a pharmaceutical company, with the goal of trying to get the drug through the FDA and on the market (if it is safe and effective).

As part of my work, I have done modeling using meta-analysis methods, where I combine results from multiple studies to try to end up with a single result or outcome. I have not worked professionally with survey or polling data (so I will understand if you take my results with a grain of salt).
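As an illustration of what "combining results from multiple studies" can look like, here is a minimal sketch of inverse-variance (fixed-effect) pooling, one of the simplest meta-analysis techniques. It is not the model used for the predictions in this diary; the function name and the poll numbers are hypothetical and exist only to show the arithmetic.

```python
import math

def pool_inverse_variance(estimates):
    """Fixed-effect pooling of (margin, standard_error) pairs.

    Each estimate is weighted by 1 / se**2, so more precise polls count more.
    Returns the pooled margin and its standard error.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * m for w, (m, _) in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)

# Hypothetical polls: (D-minus-R margin in points, standard error in points)
polls = [(4.0, 2.0), (2.0, 2.0), (6.0, 4.0)]
margin, se = pool_inverse_variance(polls)
print(f"pooled margin: {margin:+.2f} +/- {se:.2f}")
```

The pooled standard error is smaller than that of any single poll, which is the main reason to combine results at all. A real polling model would also have to deal with house effects and poll age, which this sketch ignores.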

These are predictions of the final results, not predictions of the current state of the races. As I am still refining my model-based approach, I will provide more details later, but essentially, the model has 3 sources of data: (1) relevant polls from the last 3 months, (2) turnout estimates for self-identified D, R, and I voters, and (3) an estimate of how the self-identified independents will vote. Inputs (2) and (3) come from exit polling data from both 2006 and 2010. I use state-level exit polls where they exist and fall back on national exit polls only for states that have none.

A word about the exit polling data mentioned above. Note that 2010 was an R-wave election (net change of R+63 seats in the House, national vote of R+6.8%) and 2006 can be seen as a D-wave election (net change of D+31 seats in the House, national vote of D+8%). So there must have been many more Democrats voting in 2006 and fewer in 2010, right? Well, no, not really.

In 2006, based on national exit polls, the electorate was split (self-identified party) 38% D, 36% R, 26% I (note: D+2). How about 2010? 35% D, 35% R, 29% I (D+0).

Let that sink in for a minute. The Democrats actually had decent turnout and came out to vote in 2010. In fact, in both 2006 and 2010, the D's voted for the Democratic candidate more than 90% of the time, and the R's voted for the Republican candidate over 90% of the time. So, what was so different between the two midterm elections? Well, two things:

1. The 2006 independents voted 57% D/39% R, while in 2010, it was 37% D/56% R.
2. In 2006, 81 million people voted nationally; in 2010, 87 million people did.
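The arithmetic behind these exit-poll numbers can be sketched as follows. The party-ID splits and the independents' vote are the figures quoted above; the partisan loyalty and defection rates (92% and 7%) are assumptions chosen only to be consistent with "more than 90%", and the function name is hypothetical. This is a back-of-the-envelope check, not the model itself.

```python
def national_margin(d_share, r_share, i_share, indep_d, indep_r,
                    loyalty=0.92, defect=0.07):
    """Return the D-minus-R national vote margin in percentage points.

    Shares are fractions of the electorate by self-identified party.
    loyalty/defect are ASSUMED rates at which partisans vote for their own
    party or cross over; they are not taken from the exit polls.
    """
    d_vote = d_share * loyalty + r_share * defect + i_share * indep_d
    r_vote = r_share * loyalty + d_share * defect + i_share * indep_r
    return 100.0 * (d_vote - r_vote)

# Exit-poll figures quoted in the text
m2006 = national_margin(0.38, 0.36, 0.26, indep_d=0.57, indep_r=0.39)
m2010 = national_margin(0.35, 0.35, 0.29, indep_d=0.37, indep_r=0.56)

# Counterfactual: the 2010 electorate, but independents vote as in 2006
m2010_cf = national_margin(0.35, 0.35, 0.29, indep_d=0.57, indep_r=0.39)

for label, m in [("2006", m2006), ("2010", m2010), ("2010, 2006 indep. vote", m2010_cf)]:
    print(f"{label}: {'D' if m >= 0 else 'R'}+{abs(m):.1f}")
```

Under these assumed loyalty rates, the sketch lands within a point or two of the actual national margins in both years, and the counterfactual flips 2010 back to a comfortable D lead. Almost all of the swing comes from how independents voted rather than from the party-ID mix.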

So a major difference from 2006 to 2010 was not Democrats sitting at home on election day in 2010. Rather, it was independents breaking for the Republican candidate, coupled with more of them coming out to vote overall (yes, I know, this included Tea Partiers and Republicans who were calling themselves independent, which is again why the final result was R+6.8%). I provide this data (which affects my model for inputs (2) and (3) described above) just to give some insight into my approach. But now, on to my current estimates!

First, the easy ones -- these are well outside of the margin of error, so I will call them safe:

Safe D (alphabetical, no numbers provided):
DE
HI
IL
MA
MN
NJ
NM
OR
RI
VA

Safe R:
AL
ID
ME
MS
MT
NE
OK1
OK2
SC1
SC2
TN
TX
WV
WY

The rest:
SD    Rounds+7
KY    McConnell+4
GA    Perdue+2 (run-off expected; prediction is pre-run-off)
LA    Landrieu+0 (run-off expected; prediction is pre-run-off)
AK    Begich+0.5
KS    Orman+3
AR    Pryor+3
NC    Hagan+4
IA    Braley+5
MI    Peters+8
CO    Udall+8
NH    Shaheen+11 (this is well outside the margin of error, but I'm including it since it is a "battleground" state)

Result
D/I: 52
R: 47
Tied: 1 (LA is a pure toss-up)
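The totals above can be checked with a short tally. The lists and margins are copied from this diary; the holdover counts (34 D/I and 30 R seats not up for election in 2014) are not stated in the diary and are supplied here as an assumption based on the Senate's composition going into the election. Positive margins mean the D or I candidate leads.

```python
# ASSUMPTION (not from the diary): seats not up for election in 2014
HOLDOVER = {"D/I": 34, "R": 30}

SAFE_D = ["DE", "HI", "IL", "MA", "MN", "NJ", "NM", "OR", "RI", "VA"]
SAFE_R = ["AL", "ID", "ME", "MS", "MT", "NE", "OK1", "OK2",
          "SC1", "SC2", "TN", "TX", "WV", "WY"]

# Predicted margins from "The rest"; positive = D/I lead, negative = R lead
COMPETITIVE = {
    "SD": -7, "KY": -4, "GA": -2, "LA": 0, "AK": 0.5, "KS": 3,
    "AR": 3, "NC": 4, "IA": 5, "MI": 8, "CO": 8, "NH": 11,
}

def tally(holdover, safe_d, safe_r, competitive):
    """Count projected seats; a margin of exactly 0 is counted as tied."""
    result = {"D/I": holdover["D/I"] + len(safe_d),
              "R": holdover["R"] + len(safe_r),
              "Tied": 0}
    for margin in competitive.values():
        if margin > 0:
            result["D/I"] += 1
        elif margin < 0:
            result["R"] += 1
        else:
            result["Tied"] += 1
    return result

seats = tally(HOLDOVER, SAFE_D, SAFE_R, COMPETITIVE)
print(seats)
```

Orman (KS) is counted in the D/I column here, matching the "D/I" label in the result above.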

A couple of notes:
1. Again, these numbers do not reflect the current state of the races; they are predictions for election day. This is why states like MI, CO, and NH have slightly higher estimates than current polls (i.e., generally bluer states will tend to have a bluer electorate even if that is not fully reflected in current polls).

2. Pryor is benefiting from a D+5 expected electorate in AR; however, these D's in AR are modeled to vote for Pryor only 80% of the time (compared to over 90% nationally for a Democratic candidate). The data here is thin because Pryor's 2002 election did not have exit polls, and he ran essentially unopposed in 2008 (he got 80% of the vote against only a Green candidate).

3. Begich got a lot of votes against Ted Stevens in 2008 and benefits from early (June/July) polls, but he is hurt by AK having a lot more registered R's than D's, and the momentum is in the other direction.