
Nate Silver is getting hammered by the right for daring to use math-based math rather than faith-based math. In his defense, I thought it might not hurt to take a quick review, for non-math types, of the mathy side of things.

This will be old hat for some of you. No quiz, I promise. More after the orange ampersands' act of public indecency.

Update: Uh-oh, Community Spotlight! Now I have to fix my typos!


Suppose you've got a cubic yard of M&Ms of various colors, and you want to know how many are blue. The only way to know for sure is to count them all, kinda like that study of the four thousand potholes in Blackburn, Lancashire that John Lennon mentions.

But you can make a statistical assumption - that if you look at a smaller sample, and count the number of blue M&Ms in that, you can derive an estimate of the total, as long as you're willing to allow for some wiggle room. In a sense, the statistics of polling is all about defining what "wiggle room" means.

That wiggle room comes in two flavors: margin of error, and level of confidence. Here's exactly what that means.

AP recently issued the results of a poll showing that 18% of Americans think Obama is a Muslim. Up at the top of the first page, in tiny print, we're told: the sample size is 1037, the margin of error is ±3.04 percentage points, and the confidence level is 95%.

What that means, in English, is this: "We can't ask everybody, but we asked enough people (1037) that, according to the standard mathematics of sampling, we can conclude that there is a 95% chance that the actual number we'd get if we did ask everybody would be somewhere between 18-3.04 and 18+3.04 percent, that is, between 14.96% and 21.04%."

These three numbers -- sample size, margin of error, and level of confidence -- are all related. The formula itself is short, but deriving it takes the calculus behind the bell curve, so I won't drag you through it. The basic trends, though, are what you'd expect: increase the sample size, and you lower the margin of error; to increase the level of confidence without widening the margin of error, you need to increase the sample size; given the margin of error you want, you can calculate the sample size you need; geekery geekery geekery abounds.
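For anyone who does want a peek at the geekery, here is a minimal Python sketch of the textbook relationship. It assumes a simple random sample and the worst-case 50/50 split; that is my guess at how a figure like AP's gets computed, not something the poll spells out, and the function names are made up for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_score(confidence):
    """Two-sided critical value of the bell curve, about 1.96 for 95%."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def margin_of_error(sample_size, confidence=0.95, share=0.5):
    """Margin of error, in percentage points, for a simple random sample."""
    return 100 * z_score(confidence) * sqrt(share * (1 - share) / sample_size)

def sample_size_needed(margin_points, confidence=0.95, share=0.5):
    """Smallest sample whose margin of error is at most margin_points."""
    z = z_score(confidence)
    return ceil(share * (1 - share) * (z / (margin_points / 100)) ** 2)

print(round(margin_of_error(1037), 2))  # 3.04, the figure quoted for the AP poll
print(sample_size_needed(3.0))          # 1068
```

Note the square root: to cut the margin of error in half, you need four times as many people, which is why most polls settle for a thousand or so.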

These three numbers are the Patty, Maxene, and LaVerne of polling; each only makes sense in the context provided by the other two. But there is also a convention that says, if a confidence level is not mentioned, it's the default value of 95%.

What 95% also means is that even the bestest of the bestest poll is going to get it wrong one time out of twenty. Even the best poll throws a piston every now and then. It's the price we pay for not having to count all the M&Ms. And it's also the reason statisticians tell you never to take any single poll as The Truth, because you might be looking at one of those thrown pistons.
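You can watch the pistons get thrown with a small simulation. This is only a sketch: it assumes a made-up question on which the true answer is exactly 50%, polls 1037 imaginary people two thousand times over, and counts how often a perfectly honest poll lands outside its own ±3.04 margin. The function name, the seed, and the number of repetitions are all arbitrary choices of mine.

```python
import random

def simulate_miss_rate(true_share=0.5, sample_size=1037, margin=0.0304,
                       polls=2000, seed=2012):
    """Fraction of simulated polls that land outside true_share +/- margin."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(polls):
        # Ask sample_size random "voters"; each says yes with probability true_share.
        yes = sum(rng.random() < true_share for _ in range(sample_size))
        if abs(yes / sample_size - true_share) > margin:
            misses += 1
    return misses / polls

# Prints a fraction in the neighborhood of one in twenty.
print(simulate_miss_rate())
```

Nothing is wrong with any of the polls that miss; they just drew an unlucky handful of M&Ms.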

Sampling Bias

We've got a built-in assumption here that the sample we're counting really reflects the whole. And this is where polling methodology matters. It's easy to, say, shake up a cubic yard of M&Ms (if you have ready access to a fork lift), and then grab a sample. With people, it's harder. How do you make sure your sample really does match your general population?

This is where variables -- human pollster versus robot, land lines versus cell phones, time of day, etc. etc. -- come into play. Generally, the sample population isn't going to match the general population exactly, and polling firms have to apply corrections, based on their own assumptions of what the general population is actually like. Different polling firms make different assumptions.

Those assumptions, along with the variables I mentioned above, give each polling firm a "house effect" -- that is, a bias. If you know that, for example, Rasmussen is always going to lean two points more Republican than other polls, you can do one of two things. You can decide to chuck Rasmussen results altogether, or -- Nate's approach -- you can estimate the house bias and subtract it from any given poll result. The same holds true for left-leaning polls.

So when Nate figures out, on any given day, how things stand in, say, Ohio, he's not averaging the raw poll results, as the Republican talking points suggest he is; he's calculating from the corrected poll results, after the house effect has been estimated and removed from each poll.
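Here's roughly what "estimate the house effect and remove it" looks like in code. Everything in it is hypothetical: the pollsters, the leads, and the house effects are invented numbers, and the plain average at the end is my simplification, not a description of how Silver actually combines polls.

```python
# Hypothetical numbers for illustration only: (pollster, Obama lead in points,
# estimated house effect in points, positive = leans Democratic).
polls = [
    ("Pollster A", 1.0, -2.0),
    ("Pollster B", 4.0, 1.5),
    ("Pollster C", 3.0, 0.0),
]

def corrected_lead(raw_lead, house_effect):
    """Remove a pollster's estimated lean from its reported lead."""
    return raw_lead - house_effect

def average(values):
    values = list(values)
    return sum(values) / len(values)

raw_average = average(lead for _, lead, _ in polls)
corrected_average = average(corrected_lead(lead, effect) for _, lead, effect in polls)

print(round(raw_average, 2))        # 2.67
print(round(corrected_average, 2))  # 2.83
```

Notice that the correction cuts both ways: the Republican-leaning pollster gets nudged toward Obama, and the Democratic-leaning one gets nudged away from him.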

Why Nate Silver is The Bomb

So then there's a pile of states, each of them having a calculated percent of Obama support and a margin of error. Once you've got that, and throw in some basic stuff about the Gaussian bell curve, you can calculate the odds that Obama wins a given state.
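The "basic stuff about the bell curve" boils down to one question: how much of the curve sits on the winning side of zero? A minimal sketch, assuming the uncertainty in a state's lead really is bell-shaped; the function name and the example numbers (a 3-point lead with a standard deviation of 2 points) are hypothetical.

```python
from statistics import NormalDist

def win_probability(lead, std_dev):
    """Chance the true lead is above zero, given an estimated lead (in points)
    and the standard deviation of that estimate, under a bell-curve assumption."""
    return 1 - NormalDist(mu=lead, sigma=std_dev).cdf(0)

print(round(win_probability(3.0, 2.0), 3))  # 0.933
print(win_probability(0.0, 2.0))            # 0.5
```

A dead-even state is a coin flip, and a lead that looks small in points can still be a strong favorite if the uncertainty around it is small too.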

How does that stack of probabilities turn into a single probability figure for the whole country? The answer there is to use a technique that comes from -- not making this up -- the bomb makers at Los Alamos. Faced with a calculation that was too tricky to work out via pure math on the chalkboard, they tried another approach: doing repeated simulations, with a certain amount of randomness thrown in, and then averaging the results. The more simulations you try, the smaller the margin of error in your final result. Given that each calculation includes, by design, a slice of random chance, the technique, which the Los Alamos mathematician Stanislaw Ulam came up with in 1946, was named the Monte Carlo method, after the casinos there.

So Silver runs an experiment. He randomly generates election results for each state consistent with the polling data, adds up the electoral votes (EVs) each candidate wins, and checks to see who won. Then he does it again, with different results that are both random and consistent with the polling data. And again and again, until he's got ten thousand trial elections, all different. The percentage of those trials that Obama wins? That's Silver's top-level number. It won't be right on the nose, but its margin of error can be calculated. (There is a related level of confidence here too - the more trials, the less blurry the result.)

He can also check those ten thousand trials for questions like: did Romney win the popular vote but lose the electoral college? Would this result have been different if Ohio went the other way (i.e. was Ohio the tipping point)? Will Obama lose any states he won in 2008? And by checking how many of those ten thousand trials fit each criterion, he can estimate the probability of each.
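To make the experiment concrete, here is a toy version in Python. It is a sketch under loud assumptions: the "states" are invented lumps that merely add up to 538 electoral votes, the leads and standard deviations are made up, and each state is drawn independently from a bell curve. It shows the Monte Carlo idea, not Silver's actual model.

```python
import random

# Hypothetical inputs, not real 2012 data: state -> (electoral votes,
# estimated Obama lead in points, standard deviation of that lead).
STATES = {
    "Safe Blue": (237, 12.0, 4.0),
    "Safe Red": (206, -12.0, 4.0),
    "Ohio-ish": (18, 2.5, 3.0),
    "Florida-ish": (29, -0.5, 3.0),
    "Virginia-ish": (13, 1.0, 3.0),
    "Other Tossups": (35, 1.5, 3.0),
}
TOTAL_EVS = sum(evs for evs, _, _ in STATES.values())  # 538

def run_trial(rng):
    """One trial election: draw a random lead per state, total up the EVs won."""
    won = 0
    for evs, lead, std_dev in STATES.values():
        if rng.gauss(lead, std_dev) > 0:
            won += evs
    return won

def win_share(trials=10_000, seed=1):
    """Fraction of trial elections in which Obama takes a majority of EVs."""
    rng = random.Random(seed)
    wins = sum(run_trial(rng) > TOTAL_EVS / 2 for _ in range(trials))
    return wins / trials

share = win_share()
# Margin of error of the simulation itself (95%), same formula as for a poll.
sim_margin = 1.96 * (share * (1 - share) / 10_000) ** 0.5
print(share, sim_margin)
```

With ten thousand trials, the simulation's own margin of error is under one percentage point, which is why the blurriness of the final number comes from the polls and not from the dice.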

There's another level

The way I've described it presumes that every state is independent -- that is, that the way Wisconsin goes has nothing to do with the way Ohio goes, or vice versa. But there is a correlation. In the real world, the odds of, say, Florida going blue are going to be related to the odds that Ohio goes blue; a red Ohio means a blue Florida is less likely than if there were a blue Ohio.

So Nate also builds factors like this into each state's numbers, although I don't think he's described it in detail. What matters is that it isn't being done subjectively but numerically; it's not something that by definition works for or against Democrats.
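Since the details aren't public as far as I know, the following is only one common way to get states to move together, not Silver's method: give every trial a shared "national swing" that nudges all states the same way, plus a separate wobble for each state. The state names, leads, and standard deviations are hypothetical.

```python
import random

# Hypothetical swing states: name -> estimated Obama lead in points.
LEADS = {"Ohio-ish": 2.0, "Florida-ish": 0.5}

def simulate(national_sd, state_sd=3.0, trials=20_000, seed=7):
    """Return (P(Florida-ish blue | Ohio-ish blue), P(Florida-ish blue | Ohio-ish red))."""
    rng = random.Random(seed)
    counts = {True: [0, 0], False: [0, 0]}  # ohio_blue -> [trials, florida_blue]
    for _ in range(trials):
        swing = rng.gauss(0, national_sd)  # shared error that moves every state together
        ohio_blue = LEADS["Ohio-ish"] + swing + rng.gauss(0, state_sd) > 0
        florida_blue = LEADS["Florida-ish"] + swing + rng.gauss(0, state_sd) > 0
        counts[ohio_blue][0] += 1
        counts[ohio_blue][1] += florida_blue
    return (counts[True][1] / counts[True][0], counts[False][1] / counts[False][0])

print(simulate(national_sd=0.0))  # no shared swing: the two numbers come out nearly equal
print(simulate(national_sd=3.0))  # shared swing: a red Ohio drags Florida's odds down
```

The swing is drawn from a bell curve centered on zero, so it is as likely to help one party as the other; it changes how the states hang together, not who is favored.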

Nate also has some guides on how the undecided voters are going to finally pull the proverbial lever, based on a variety of political and economic indicators.

The Republican Attack

Since the results are showing a pretty solid probability of an Obama win -- by definition, the methodology can at best generate probabilities -- various faith-based math types are trying to attack Silver's methodology as being inherently biased somehow. This seems to be the year that the GOP taught its low-information voters the word "oversampling," for example. They're also claiming some ridiculous numbers for how independent voters are breaking. And they are in turn attacking Nate for the assumptions he's making on how the undecided are going to land.

But, again, the assumptions Nate makes aren't partisan; they're based on analysis of previous elections, and he's gone to some trouble over the last few months to spell these assumptions out.

But think like a Republican for a moment. You've spent four years complaining about the Marxist Kenyan Welfare-State Food-Stamp Socialist in the White House. Along comes Mister Math Guy saying: it's gonna be four more years! Now, which is easier to do at that moment - accept hard reality about the President You Hate So Much, or just attack the math guy, 'cuz you were never really all that fond of math guys in the first place?

Originally posted to zemblan on Thu Nov 01, 2012 at 09:02 AM PDT.

Also republished by Community Spotlight.


The odds of the first poll item winning the poll

99% (575 votes)
