Now that everyone here seems to have caught on to the idea that polling composites are just plain better than individual polls, I feel I have to write about how Nate Silver’s composite polling model works.
The red meat about the issue is below the fold. The TLDRs (“Too Long, Didn’t Read”s), for those of you who swing that way, are summarized here at the top:
TLDRs:
There are some things in the world that just work, even though we don’t know why, like gravity, mathematics, or that Henry Kissinger got so many women to go out with him. They just do.
Nate’s predictions aren’t predictions, they’re odds. Odds are just probabilities, not certainties.
These odds are computed empirically, looking only at the past. There’s no speculation included. If they work consistently on past elections, why wouldn’t they work now?
If you look at one poll when there are many polls available, you’re a fool.
If someone shows you only one poll, when there are many polls available, they’re trying to make a fool of you.
All of us have a choice to make in embracing this technology of better poll analysis: are we going to get with the times and understand it, or live in the past, and reject it?
Here we go!
Let’s start with something big. Can you think of something that you don’t know how it works, but are certain that it does?
I can name one: gravity. Gravity always works. Always. No exceptions. Throw a ball a million times, and it will hit the ground a million times. There aren’t many things out there that are powerful enough to go from “very likely” to “absolute certainty”, but gravity does.
But, here’s a pertinent question: how does gravity work?
Astronomers will tell you that it corresponds to the mass of an object. Larger objects exert a force on smaller objects; the larger the object, the more force exerted. Personally, I found this hard to believe. I’m 6’ 3”, 275, yet women do not typically gravitate toward me in public. Somehow, these women are intent on fighting physics.
But that’s because I’m not a big enough object, and all that. The consensus is that mass exerts a pull on other stuff. Here’s the rub though:
Why?
Why does mass make things come together? What’s so special about having mass that makes things get attracted to it? One of the laws of motion says an object at rest will stay at rest until an outside force acts on it. How can “being massive” be a force that interacts with other things to move them?
The truth is that we don’t know.
No scientist can tell you why being massive = force exerted. How it works is currently beyond our understanding. If we understood it, we’d be able to manipulate it; those anti-gravity things you see in comic books would be realities.
This is why the Theory of Gravity is called a “theory”…we don’t know why it works, it just does. If you doubt it, you’re more than welcome to jump off a cliff and see what happens. Please just let me know before you do, so I can take out a life insurance policy on you.
The truth of the matter is this…there are some things that we know work, we just don’t know how.
Mathematics falls into this category. It can be used to show relationships between things, but it gives no explanation as to why. Stand by for an example.
Have you heard of the Pythagorean Theorem of Baseball?
No, not the thing for right triangles, although that’s where the name comes from. The Pythagorean Theorem of Baseball is a math formula that attempts to predict a baseball team’s win/loss record based entirely on two things: runs scored and runs allowed.
W%=[(Runs Scored)^2]/[(Runs Scored)^2 + (Runs Allowed)^2]
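If you'd rather see that as code than as algebra, here is a minimal Python sketch of the same formula. The function name `pythagorean_win_pct` and the example team's run totals are made up for illustration; the `exponent` argument is only there so you can try the "why not ^3?" question yourself.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from run totals.

    With exponent=2 this is the formula written above.
    """
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


# Hypothetical team (not a real season): 800 runs scored, 700 allowed.
expected_pct = pythagorean_win_pct(800, 700)
expected_wins = expected_pct * 162  # 162 games in a regular season
print(f"Expected win%: {expected_pct:.3f}, about {expected_wins:.0f} wins")
```

A team that scores exactly as many runs as it allows comes out at .500, which is what you'd hope for.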
Now, let’s investigate. How did we come up with this? More specifically, why didn’t we come up with something like runs ^3 or runs ^5, or something different altogether, like logarithms?
Bill James, the father of modern baseball analysis, wanted to settle an argument. Which baseball team is better than another? We can measure by wins and losses, but sometimes a team gets lucky or unlucky. Can we come up with a more accurate way?
Well, there’s a mathematical technique where you can do this. The bad news is yeah, it’s pretty complicated. The good news is that you can do it on an Excel spreadsheet, no sweat. It’s called regression analysis. Click the link if you like; if not, I’ll save you some time.
Remember those problems in algebra class where they’d give you an equation of a line, and your job was to draw the points that make that line on a graph? Well, the process can be reversed. Instead of starting with a line and cranking out points, you can start out with points and find a line.
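For the curious, "start with points and find a line" looks roughly like this in Python. This is a bare-bones least-squares fit, the simplest kind of regression, and the points are made up so the answer is easy to check. It is a sketch of the general idea, not a claim about the exact procedure Bill James used.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to a set of points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much x varies on its own, and how much x and y vary together.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    co_spread = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = co_spread / spread_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points that sit exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

Real data never sits exactly on a line, of course. The fit just finds the line that misses the points by the least overall.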
Looking at the runs scored and runs conceded “points”, and comparing them to the baseball team in question’s actual win/loss record, the “line” emerged. It was discovered that the formula written above will predict a team’s final record very accurately; show me a team’s run totals, and I’ll show you their final record. With 30 teams a year for decades of baseball seasons, the “Pythagorean” expectation predicts a team’s win/loss record within three wins nearly every time.
It’s not exact. The ’07 Diamondbacks beat their Pythagorean expectation by more than three wins that year. But the other 29 teams didn’t, and neither did all 30 teams the year before, nor all 30 teams the year after.
This makes sense, doesn’t it? Baseball games are decided by who scores more runs.
In politics, elections are decided by who gets more votes. Polls, by themselves, are just small tallies of votes. Why wouldn’t it work here?
In a nutshell, Nate Silver is duplicating the process Bill James used for his Pythagorean formula. He’s taking the points of data (the season’s games), comparing to the final result in that election (the final standings), and designing a formula to link the two. But, the thing to realize is that it’s based entirely on past events.
Seriously…much like Isaac Newton, Nate makes no hypotheses; he simply records what he sees. In the past, when Candidate A leads State B by C points with D days to go for the election, Candidate A wins X% of the time. Nate records a poll with the same information, compares it to previous results, and from there, can yield a percentage chance for a candidate to win that state. Multiply by that state’s electoral votes, do it for every state, and see if he comes out to 270. There you go.
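Here is a toy Python version of that "look it up in the past" idea. Everything in it is an assumption for illustration: the records in `PAST_RACES` are invented, and the way "similar" races are picked (the `lead_window` and `day_window` parameters) is just one simple choice, not Nate's actual method.

```python
# Invented past races: (leader's poll lead in points, days before election, leader won?)
PAST_RACES = [
    (1.0, 30, True), (1.5, 28, False), (2.0, 31, True), (0.5, 29, False),
    (5.0, 30, True), (6.0, 27, True), (5.5, 33, True), (4.5, 30, True),
]


def historical_win_rate(lead, days_out, records, lead_window=1.0, day_window=5):
    """Share of similar past races that the poll leader went on to win.

    "Similar" means the lead was within lead_window points and the poll was
    taken within day_window days of the same distance from election day.
    Returns None when no past race is similar enough.
    """
    similar = [
        won
        for past_lead, past_days, won in records
        if abs(past_lead - lead) <= lead_window
        and abs(past_days - days_out) <= day_window
    ]
    if not similar:
        return None
    return sum(similar) / len(similar)


print(historical_win_rate(1.0, 30, PAST_RACES))  # a narrow lead a month out
print(historical_win_rate(5.0, 30, PAST_RACES))  # a comfortable lead a month out
```

No theory about why voters do what they do, just a count of how often the leader held on.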
Remember, the results of previous elections are easy to find…half a century’s worth, along with plenty of polling for each state too. Much like decades of baseball mentioned earlier, there’s no shortage to compare it to.
So now, we have a percentage chance in each state. We can do the one-off calculation to make a prediction if we want, but if we’re doing our jobs right, we need to account for luck. Unfortunately, there’s really only one way to counteract luck: the test of repeated trials.
Luck is inconsistent. Literally…if you looked up “inconsistency” in the dictionary, you’d see a picture of luck…if there was one. On one result, anything can happen. So, let’s try it more than once.
Using the percentages by state, Nate Silver uses a process called “Monte Carlo” Simulations. You’ve probably heard of this. Some people call it “trial and error”, although there isn’t much error in this case, just a bunch of trials.
A random number generator is used to decide how each state goes. Take Florida, and his 50.4% prediction for Obama. Take a random number generator that spits out a number between 1 and 1000. If it’s 504 or less, give Florida to Obama. If it’s 505 or more, give it to Romney. Do this for every state and you have one mock election. Do this 100,000 times, and you have a good model. You may want to start a pot of coffee first.
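You can try the mock-election idea yourself with a few lines of Python. This is a sketch under loud assumptions: the three "states" in `TOY_STATES` are invented (only the 50.4% figure is borrowed from the Florida example), each state is rolled independently exactly as described above, and the real model certainly has more going on than this.

```python
import random

# Invented toy map: name -> (electoral votes, candidate's chance of winning).
TOY_STATES = {
    "Safe A": (20, 0.95),
    "Tossup": (29, 0.504),
    "Lean B": (18, 0.30),
}


def simulate_election(states, rng):
    """Run one mock election and return the candidate's electoral votes."""
    total = 0
    for votes, win_prob in states.values():
        roll = rng.randint(1, 1000)          # a number from 1 to 1000
        if roll <= round(win_prob * 1000):   # e.g. 504 or less for 50.4%
            total += votes
    return total


def win_share(states, votes_needed, trials=100_000, seed=0):
    """Fraction of mock elections in which the candidate reaches votes_needed."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    wins = sum(
        simulate_election(states, rng) >= votes_needed for _ in range(trials)
    )
    return wins / trials


# 67 electoral votes on this toy map, so 34 is a majority.
print(f"Candidate wins {win_share(TOY_STATES, 34):.1%} of mock elections")
```

On the real map you'd feed in every state and ask for 270. The coffee advice still applies.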
This process is how Nate Silver arrives at his percentage called the “Now-Cast”. If it says Obama is 92% to win, like it did the night before the election, that means Obama won 92% of the mock elections run for that set of poll-generated state-by-state percentages. There is a dampening factor to limit the Now-Cast toward a more likely “Nov 6” Prediction. This is why the Now-Cast favors the candidate in the lead, and why the Now-Cast for Obama was always higher than the Nov. 6 Prediction.
There’s one last thing to bear in mind: polling weights. Which ones are factored in more, and which are less?
(Aside: There was a post here at Daily Kos that showed graphs for each polling company, showing how consistent they were against the average, but I can’t find it. If you know where it is, please post it in the comments.)
Well, let’s define terms first. “Accurate” and “biased” mean two different things. Think about it this way. The electorate has a “true percentage” of its preference of candidates. We’re not sure exactly what that is, because we can’t poll everybody. Even the election doesn’t poll everybody; only around 60% of the electorate voted yesterday.
But, we’ll just assume the “true percentage” is the average of the polls. After all, each poll is one piece of the puzzle. The more pieces, the clearer the puzzle becomes, right? It won’t be exact, but the more pieces we add, the closer we’ll be. Let’s not sweat a few thousandths of percentage points.
So, let’s take the polls for the race and average them. Okay, done. Now, let’s compare each poll to that average. How close are they to it?
If they’re close to it, they’re “accurate”. If they’re far away, they’re “inaccurate”.
Now, polls won’t be accurate all the time. Such is life. But, when they screw up, which side do they screw up on? Democratic or Republican?
A “biased” poll is always on one particular side of the average. An “unbiased” poll screws up on both sides.
These two characteristics aren’t mutually exclusive. Rasmussen, for example, is both biased and inaccurate. It’s also possible to be biased and accurate, if the margins of error are smaller.
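To make the two terms concrete, here is a small Python sketch that scores a pollster on both. The pollsters and their numbers are invented, and the two measures used (average size of the miss for accuracy, "every miss on the same side" for bias) are simply one plain reading of the definitions above.

```python
def accuracy_and_bias(poll_margins, average_margin):
    """Score one pollster's results against the polling average.

    poll_margins: that pollster's reported Democratic leads, in points.
    Returns the typical size of the miss, the average direction of the
    miss, and whether every miss fell on the same side of the average.
    """
    misses = [margin - average_margin for margin in poll_margins]
    typical_miss = sum(abs(m) for m in misses) / len(misses)
    average_lean = sum(misses) / len(misses)
    one_sided = all(m > 0 for m in misses) or all(m < 0 for m in misses)
    return {"typical_miss": typical_miss, "lean": average_lean, "biased": one_sided}


AVERAGE = 2.0  # invented polling average: Democrat +2

# Three invented pollsters.
print(accuracy_and_bias([3.0, 1.0, 3.0, 1.0], AVERAGE))    # misses on both sides
print(accuracy_and_bias([1.5, 1.5, 1.5, 1.5], AVERAGE))    # small misses, always one side
print(accuracy_and_bias([-1.0, -2.0, 0.0, -1.0], AVERAGE)) # big misses, always one side
```

The second pollster is the interesting one: always on the same side of the average, but never by much.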
Your Handy Dandy Polling Accuracy Chart
Pop quiz! What's the most ACCURATE poll listed there? Also, what is the most BIASED poll there?
The most accurate poll is YouGov. The range it comes up with is very small, as opposed to CBS/Times, which is all over the place. As for bias? Seems like a 4-way tie; NPR, Gallup, Rasmussen, and UT/Nat Journal all missed the mark by the most.
Anyway, back to our average. We take each poll, and measure the “error”, or how much they missed by. From there, we square that number. Why square it? Because, all errors aren’t created equal. Missing by a little is expected, missing by a lot is bad. We have to punish the bigger mistakes by bigger and bigger amounts. A small number times a small number will also be a small number. A big number times a big number will be a REALLY big number.
So we now have a list of polls, each with its “error”. Since we know how good each poll is, let’s run the average again, with the errors included, and accounted for. The ones that missed by a lot will count for less, the ones that were more accurate will account for more. From there, recalculate the average, and recalculate each poll’s “second error”.
Our second iteration has made the picture a little clearer, because it includes a little weight. Except, why stop there? We’re checking the average, checking the errors, and then accounting for those errors to get a better average each time. Why not keep doing this again and again? The average will only get better.
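Here is what that loop might look like in Python. Treat it strictly as a sketch: the poll numbers are invented, and the weighting rule (one divided by the squared error, plus a `floor` so no poll gets infinite weight) is an assumption chosen for simplicity, not FiveThirtyEight's actual formula.

```python
def reweighted_average(polls, rounds=50, floor=1.0):
    """Repeatedly average the polls, down-weighting the ones that miss by a lot.

    polls: reported margins, in points.
    floor: added to every squared error so that a poll sitting exactly on
           the average does not get an infinite weight.
    Returns the final average and the weights that produced it.
    """
    average = sum(polls) / len(polls)  # first pass: plain average
    weights = [1.0] * len(polls)
    for _ in range(rounds):
        # Bigger squared miss -> smaller weight.
        weights = [1.0 / (floor + (p - average) ** 2) for p in polls]
        average = sum(w * p for w, p in zip(weights, polls)) / sum(weights)
    return average, weights


# Invented polls: three that roughly agree and one outlier.
polls = [2.0, 2.5, 1.5, 8.0]
average, weights = reweighted_average(polls)
print(f"Plain average:      {sum(polls) / len(polls):.2f}")
print(f"Reweighted average: {average:.2f}")
print("Weights:", [round(w, 6) for w in weights])
```

Notice the weights come out as long, oddly specific decimals. That's just what you get when a computer repeats the same small calculation over and over.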
On the fivethirtyeight site, hover your mouse over the “signal bars” indicating the poll’s weight. You’ll get an absurdly specific number: 1.592834, for example. Why is it so specific? Because a computer took a measure, recalculated, and took another measure a couple hundred thousand times.
Now, we’re not covering the bias here, just the accuracy. It’s a shame for the GOPers out there that the polls that like them happen to be the least accurate. But, whose fault is that? It certainly isn’t ours…
Finally, there are a few other things that Nate Silver includes, such as the stock market and the unemployment percentage, to influence the incumbent’s numbers. Meaning, a movement there can change the outcome.
Well, if you’ve sat there for this long, you should have learned that many polls are better than one. Now, if there’s only one poll available for the race you want to examine, well, that’s life. But, if there’s more than one, you need to read them all, because a polling aggregate is just better than one poll. Again…in one baseball game, anything can happen. Look at a whole season, and you’ll know which team is better.
Some people just don’t want to hear it. They “know” that their man is winning and won’t accept anyone who disagrees with them. But…they’re going to lose that battle, because this model really does count everybody, along with their opinions. If a poll is published, it gets included. Including the polls which aren’t designed to measure opinion accurately, if you know what I mean.
So, we have an argument. It’s the people who understand and embrace the new method (composite polling) and the stubborn people who won't. This isn’t new…this debate comes up all the time. Find people who invested in slide rules, 8-track tapes, or who scoffed at the notion that radio could come with pictures. They’ll always be there.
You, however, have a choice. Whether you wanted Obama to win or not, composite polling was shown to be right, again, in this election. Are you going to take that to heart? Are you going to see one poll’s results and ask, “Why aren’t they showing others? Where can I find them?” Or are you going to swallow what some idiot pundit is peddling this time?
Pundits…I imagine I know how they’ll go. Me? I’m going to get with the times.
UPDATE: Spotlight? Wow. Thanks.
Complaints of lack of visual aids (MOAR GRAFFS!) are completely founded. Hat tip to ptboya for providing a good link. It's also here if you'd like to see it.