Pollster.com has a great article today about how likely a candidate is to win, given a particular poll spread before the race. We are pinning our hopes on a great many coin flips landing the way we would like. To understand this quantitatively, I did a little calculation.

UPDATE: As several alert readers have pointed out, this assumes that the results are statistically independent, which is probably not the case. However, it does suggest that our optimism may be fueled by enthusiasm as much as by history and science. The independent odds are 3%, the rest is a wave.

Take the jump for the details.

First, I looked at the odds of victory in 13 races, based on their current average point spreads and on eyeballing the graph on pollster.com. These gave me the following figures:

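| State | Current Poll Margin (+dem/-rep) | Probability of Dem Victory |
|-------|---------------------------------|----------------------------|
| AZ    | -7.3                            | 0.2                        |
| MD    | 3.8                             | 0.65                       |
| MI    | 10.4                            | 0.9                        |
| MO    | 1.6                             | 0.55                       |
| MT    | 3.2                             | 0.6                        |
| NJ    | 6                               | 0.75                       |
| NV    | -15                             | 0.05                       |
| OH    | 11.2                            | 0.9                        |
| PA    | 10.2                            | 0.9                        |
| RI    | 6.2                             | 0.7                        |
| TN    | -7.4                            | 0.3                        |
| VA    | 1.4                             | 0.55                       |
| WA    | 10.8                            | 0.9                        |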

Then I wrote a computer program that went through the probability of every potential outcome of the race. For example,

• Odds of Dems winning MO MT NV OH RI TN VA WA =

LoseAZ*LoseMD*LoseMI*WinMO*...*WinWA =

.8*.35*.1*.55*...*.9 = 1.08056 * 10^-6

• Odds of Dems winning MD MI MT NJ OH PA RI TN VA WA =

LoseAZ*WinMD*WinMI*...*WinWA =

.8*.65*.9*...*.9 = 0.00758061

Then I tallied all of these up, based on whether the Dems won more than 10 of the races (giving them control), exactly 10 races (a tie) or fewer than 10 (Republican control).
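For readers who want to reproduce the tally, here is a minimal sketch of the enumeration described above. It is not the original program: the names `P_DEM`, `outcome_probability`, and `wins_distribution` are made up for illustration, and it keeps the diary's assumption that the races are independent.

```python
from itertools import product

# Win probabilities for the Democrat, copied from the table above.
P_DEM = {
    "AZ": 0.20, "MD": 0.65, "MI": 0.90, "MO": 0.55, "MT": 0.60,
    "NJ": 0.75, "NV": 0.05, "OH": 0.90, "PA": 0.90, "RI": 0.70,
    "TN": 0.30, "VA": 0.55, "WA": 0.90,
}

def outcome_probability(dem_wins, p_dem=P_DEM):
    """Probability of one exact outcome, assuming independent races."""
    prob = 1.0
    for state, p in p_dem.items():
        prob *= p if state in dem_wins else 1.0 - p
    return prob

def wins_distribution(p_dem=P_DEM):
    """Return dist where dist[k] is the probability of exactly k Dem wins."""
    states = list(p_dem)
    dist = [0.0] * (len(states) + 1)
    # Walk through all 2**13 = 8192 win/lose combinations.
    for flags in product((False, True), repeat=len(states)):
        winners = {s for s, won in zip(states, flags) if won}
        dist[len(winners)] += outcome_probability(winners, p_dem)
    return dist

dist = wins_distribution()
p_control = sum(dist[11:])  # more than 10 wins
p_tie = dist[10]            # exactly 10 wins
p_rep = sum(dist[:10])      # fewer than 10 wins
print(f"control={p_control:.3f} tie={p_tie:.3f} republican={p_rep:.3f}")
```

Calling `outcome_probability` with the two sets of states from the examples above should give back the two products shown there, which is a quick way to check the inputs were typed in correctly.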

I've been saying since July...
We're not going to take the Senate... too many things have to happen just right and we have to trust Lieberman not to screw us...

But the House.  Yeah.  We'll take the House.  Big time.

-9.50;-6.62. But it don't mean nuttin if you don't put your money where your mouth is

However, there are always many intangibles.  It is nice to be anchored to realism, though.  Particularly when the spinning begins.

Begins?
Right now, I'm so dizzy I've had to sit down.

wait a second
I don't accept your input numbers.  A 10% chance that Santorum wins?  More like 1%.  Same for DeWine and McGavick.   And 75% for Menendez, 65% for Steele seem awfully low.  I'd go with the numbers on tradesports, input those, and see what happens.

• ##### The numbers come straight from the pollster graph(0+ / 0-)

Apparently we tend to be overconfident with polls.  They are less accurate than we would like.

I was thinking of redoing the analysis with the odds from tradesports.  Perhaps if people are interested ...

right now.

Their individual Senate race probabilities match (roughly) with that probability, otherwise someone would have an opportunity at arbitrage and I know there are people looking for those opportunities because one of my friends analyzed it during the last election cycle (he works with hedge funds).

I agree.

The standard deviation (which is half of the customary margin of error) is on the order of 1.5-2 percentage points in most of these races.  This means that Ohio, for example, is probably a 5-6 standard deviation gap.  This should be better than a 90% likelihood of a Democratic party win.

"Those who can make you believe absurdities can make you commit atrocities" -- Voltaire

That is poll error, not outcome
I think the logic of the pollster data is that the outcome of the election is far sloppier than what is predicted by the polls leading up to the election.  A poll's margin of error describes how people say they will vote, not how they actually vote, so it is not the same thing as the uncertainty in the outcome of the vote.

is that the data set is too small to be making those kinds of conclusions.

As I eyeball the graphs, I see 5 outliers (i.e. candidates who won after trailing by 10 percentage points or more), all in U.S. Senate races.

Over the two-election sample, there should be 66 Senate races, 50 governor's races, and 50 presidential election races, for 166 data points in all.  This is a very small sample on which to base such a key assumption, especially at the extremes.

A good model would hypothesis test the actual results of the entire sample against a pure sampling error model and see how far off it is.  If 5 outliers in 166 data points is within what one would suspect from a null hypothesis that there is no non-sampling error (and one would expect 8 results outside of a 95% MOE give or take one or two, in a sample of that size), then sticking with pure mathematical sampling error would be the best approach.
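A rough version of that check can be sketched in a few lines. This is only an illustration under the comment's own assumptions: 166 data points, a 5% chance per race of landing outside a 95% margin of error, and the 5 eyeballed outliers treated as the count of such misses. The helper name `binom_cdf` is hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n_races = 166     # data points estimated in the comment
p_outside = 0.05  # chance of landing outside a 95% margin of error
observed = 5      # outliers eyeballed from the graph

expected = n_races * p_outside
p_at_most_observed = binom_cdf(observed, n_races, p_outside)
print(f"expected outliers under the null: {expected:.1f}")
print(f"P(at most {observed} outliers under the null): {p_at_most_observed:.3f}")
```

If the second number is not small, seeing only 5 outliers is unremarkable under a pure sampling error model, which is the point being argued here.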

Systematic poll bias one way or the other can be controlled best by using a variety of pollsters for each race (which also reduces sampling error).

Also, five outliers in 166 data points implies a 97% chance of a win when there is a 10 point gap, rather than a 90% chance.  This is a lot closer to what you would expect based upon plain vanilla sampling error theories than 90%.

This is a huge difference in the aggregate probabilities when you have 4 races in the 90% zone.  At 97% each, there is an 88.5% chance that all are wins.  At .9 there is a 65.6% chance that all are wins.
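The arithmetic is easy to check. A minimal sketch, assuming the four races are independent (the same assumption the diary makes); `all_win_probability` is just an illustrative name:

```python
def all_win_probability(p_each, n_races):
    """Chance that n independent races, each won with probability p_each, are all won."""
    return p_each ** n_races

for p in (0.97, 0.90):
    print(f"{p:.2f} each -> {all_win_probability(p, 4):.1%} chance of winning all four")
# 0.97 each -> 88.5% chance of winning all four
# 0.90 each -> 65.6% chance of winning all four
```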

"Those who can make you believe absurdities can make you commit atrocities" -- Voltaire

If the lead is outside of the margin of error

then the probability of the trailing candidate winning is less than 5%.  A lead that is twice the margin of error is less than 1% likely to be false.
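Those two figures can be checked with a short sketch. It assumes the only source of error is normally distributed sampling error and that the margin of error applies to the lead itself; published margins usually describe one candidate's share rather than the gap, so treat this as an illustration of the claim, not a polling model. The function name is hypothetical.

```python
from statistics import NormalDist

Z_95 = NormalDist().inv_cdf(0.975)  # about 1.96 standard deviations

def trailing_win_probability(lead_in_moe_units):
    """P(the trailing candidate is really ahead), given a lead measured in
    multiples of the 95% margin of error and normal sampling error only."""
    z = lead_in_moe_units * Z_95
    return 1.0 - NormalDist().cdf(z)

for multiple in (1.0, 2.0):
    p = trailing_win_probability(multiple)
    print(f"lead = {multiple:.0f} x MOE: P(trailing candidate wins) = {p:.5f}")
```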

Those .9 figures accumulate to give you an artificially low probability of takeover.  I agree the odds are not great, but you have pegged them too low.

Tony Barr for Congress in PA-09.

• ##### This calculation assumes independence(7+ / 0-)

These are not independent events like coin flips. There is a high correlation between doing well in VA and doing well in MO, for example, and the joint probability needs to be calculated from conditional probabilities rather than from a simple product.

It comes down to the overall size of the wave. If the Dems take 30+ House seats, they take the Senate. If they take <25 House seats, they miss out.
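One way to see how much a wave could matter is a small simulation in which every race shares a single national shock. This is a sketch of one possible model, not anything taken from pollster.com or the diary: the correlation `rho` is a made-up parameter with no data behind it, and `prob_more_than` is a hypothetical helper. Each race keeps its own win probability from the diary's table; only the dependence between races changes.

```python
import random
from statistics import NormalDist

# The diary's 13 eyeballed probabilities, in table order (AZ ... WA).
P_DEM = [0.20, 0.65, 0.90, 0.55, 0.60, 0.75, 0.05,
         0.90, 0.90, 0.70, 0.30, 0.55, 0.90]

def prob_more_than(k, p_dem, rho, trials=20000, seed=0):
    """Estimate P(Dem wins > k races) when races share a national swing.

    rho is an assumed correlation between the races' latent scores:
    0 means independent races, values near 1 mean one big wave.
    """
    rng = random.Random(seed)
    std = NormalDist()
    thresholds = [std.inv_cdf(p) for p in p_dem]  # preserves each race's own odds
    hits = 0
    for _ in range(trials):
        national = rng.gauss(0.0, 1.0)  # shared shock, same for every race
        wins = 0
        for t in thresholds:
            local = rng.gauss(0.0, 1.0)  # race-specific shock
            score = rho ** 0.5 * national + (1.0 - rho) ** 0.5 * local
            wins += score < t
        hits += wins > k
    return hits / trials

for rho in (0.0, 0.3, 0.6):
    print(f"rho={rho}: P(more than 10 wins) ~ {prob_more_than(10, P_DEM, rho):.3f}")
```

With `rho=0.0` this reduces to the independent calculation; raising `rho` fattens both tails, so sweeps and wipeouts each become more likely than the coin-flip model suggests.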

I assumed independence, since I have no data to support correlations.  Are there data to use for the correlations?

• ##### Absolutely(1+ / 0-)
For example, if we win Missouri, then national conditions probably mean we take everything except Tennessee.

"This machine kills fascists"--words on Woody Guthrie's guitar

• ##### The variables aren't independent.(7+ / 0-)

Usually, a party will do well everywhere, or nowhere.  Close elections are largely a product of the national mood and that is largely a product of things like mass media.

Good weather in key races that don't have mail in voting (except TN) will help Democrats everywhere.

Whatever else ends up on the front page of the newspaper besides the election will have an impact.

Electoral behavior on the eve of an election may be chaotic, in the sense that slight changes in conditions may cause significant changes in outcome.  But, it is neither truly independent, nor random.

"Those who can make you believe absurdities can make you commit atrocities" -- Voltaire

• ##### Here's the problem with this theory...(9+ / 0-)

...you are assuming that the numbers are all independent... that it's like flipping coins... when an election is a reading of the electorate and the electorate often moves en masse.

I'm not arguing that we shouldn't be cautious, but your odds are wrong.

RedState.com: nowhere can you find a smaller, more irrelevant group of morally bankrupt sycophants with the IQ of a dozen decapitated squirrels.

You don't understand.
I WANT the Senate.

Someday the plain folks of the land will reach their heart's desire, and the White House will be adorned by a downright moron. -- H.L. Mencken

Two comments:
(1)  I'm highly skeptical of pollster.com's methodology here:  They've combined data from 2000 and 2002 to generate the curve, but turnout for a presidential election and turnout for a midterm election are different.

(2)  That said, I think you've misread their chart.  For example, you have the probability of a Democratic win in Arizona lower than in Tennessee, even though the poll spread is ever-so-slightly more favorable.  Similarly, you have a higher probability of a Democratic victory in New Jersey than in Rhode Island, even though the poll spread is tighter in New Jersey.

Combining their errors with yours makes the results of your calculations highly doubtful.  I agree, by the way, that the method you used to calculate the results is sound, but disagree that the data you used in your calculations is valid.

The TN/AZ problem was a result of bad eyeballing.  If they are both set to .2, p(Dem Control) = .026.  If they are both set to .3, p(Dem Control) = .041.

As for point 1, I thought pollster's was a pretty cool analysis. I discovered that I was way too concerned about whether one candidate or another was up in the polls by a few points.  There was much more slop than I had imagined.

An election race with an 8 percentage point spread is way beyond the margin of error and is beyond the range where GOTV can make a difference.  It has a 96+% chance of going in the direction of the leader.

So for all practical purposes it can be counted as a 100% probability.

But your math doesn't 'round up' or down the advantages that way.  So basically I think it's overly pessimistic.

It's too early to play with statistics like this. Too many races are too close to call.  You can get them to say anything you want, based on your starting presumptions.

A nation of sheep will surely beget a government of wolves. www.writtenlandscape.blogspot.com

• ##### Dependent vs. Independent variables.(3+ / 0-)
This is a kind of argument that has been used before, but it gives misleading conclusions.

The problem is that the elections are not like flipping coins.  A coin doesn't remember whether it came up heads or tails so what happened last time has no impact on what will happen next time.  You could flip "heads" 20 times in a row and the odds of the next coin coming up "heads" will still be 50/50.

That isn't true about elections, especially in the context of a national wave.  Given where we are now, if you were told that Ford won in Tennessee that would almost guarantee that Webb won Virginia, Tester won Montana and so forth.  Tennessee would be a sample showing national factors which would help all the other Democrats as well.

So, don't panic.  The odds aren't that bad.

Bush's only "exit strategy" is to let somebody else figure it out in 2009.

• ##### Another suspicious aspect of the graph(1+ / 0-)
is that there are virtually no presidential or gubernatorial race outliers, while there are 5 Senate race outliers.  This suggests to me that the Senate races have inferior quality data.

"Those who can make you believe absurdities can make you commit atrocities" -- Voltaire

• ##### Your Stats Are Wrong - - (2+ / 0-)
Sorry but - -
The a*b*c*d - - -
Assumes independent events.
Senate races this year are not completely independent.
National issues impinge more than in recent memory.
Thus something that impacts the race in Virginia may also impact the race in Missouri - like a disgraced Colorado minister depressing evangelical turnout.

this other diary

http://www.dailykos.com/...

The odds are neither completely independent of each other nor completely dependent on each other.

• ##### What's astonishing is(2+ / 0-)
the number of commenters here who demonstrate a good working grasp of statistics and probability, and are able to identify the flaws in this analysis in the first place.  Not to disparage EngProf's work - at least it got a worthwhile conversation started.

It really is remarkable, the breadth and depth of knowledge to be found in this forum.  All posted by volunteer bloggers on their own time.  Yet one must come here to find such analysis, because the pundits and talking heads of big-money MSM simply haven't the skills to offer it.

This is why Daily Kos strikes fear into the hearts of the wingnuts and their MSM enablers.  Damn, I dig this joint.

"I seek the truth, which never yet hurt anybody. It is only persistence in self-delusion and ignorance which does harm." --Marcus Aurelius

• ##### This is completely stupid(0+ / 0-)

In what universe do we lose WA and gain NV? That would be an incredibly unlikely occurrence.  It makes no sense-- these events are all indirectly related by the political environment.

The reason the "wave" analogy is used is because it reflects the way that certain races are more likely to flip than others in virtually all cases, similar to the way a wave is always more likely to sweep away someone situated on low ground than on high.

Say nothing once, why say it again? - Talking Heads

• ##### you can't calculate one-off events(0+ / 0-)

In my profession I work a lot with probability and statistics. Generally in science we try to construct falsifiable theories. You can't do that with events that happen once. Tomorrow's election is an example of that. There are so many variables other than the ones that were included and, as lots of you have mentioned, they are not independent.
There's no way to construct a falsifiable model here.

Thanks to all Kossacks! It's the turnout that will make the difference.