Probability for Kossacks: the fallacy of the inverse

by Caj

Wednesday, Dec. 09, 2009 at 10:24:57am PST

In the blogosphere, we often see arguments like this:

The odds of [some event] happening by random chance are incredibly low. Therefore [event] was rigged or influenced to happen that way.

I've seen this argument applied to variations in voting patterns, timelines of terror attacks, the timing of bad news, and coincidences in general. People are prone to suspect an evil actor behind unwanted events, and a mathematical argument seems to confirm our suspicions. It's a very compelling argument.

It is also bogus, as you can observe any time you play cards. Shuffle a deck of cards and look at the outcome: the probability of that shuffle happening by random chance is 1 in 80658175170943878571660636856403766975289505440883277824000000000000. Obviously this was rigged by a conspiracy while you weren't looking!
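If you want to check that number yourself, it is 52 factorial: the count of possible orderings of a 52-card deck. Here is a minimal Python sketch; the variable names are my own, and it assumes a fair shuffle makes every ordering equally likely.

```python
import math

# Number of distinct orderings of a standard 52-card deck: 52!
deck_size = 52
orderings = math.factorial(deck_size)

# Probability of any one specific ordering, assuming a fair shuffle.
p_specific_shuffle = 1 / orderings

print(f"orderings: {orderings}")
print(f"digits:    {len(str(orderings))}")
print(f"Pr[this shuffle | fair] ~ {p_specific_shuffle:.3e}")
```

Every shuffle you have ever seen had this same probability, which is the whole point.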

This is a common logical fallacy known as the confusion of the inverse. I explain below.

The fallacy centers around the ambiguity of this question:

What is the probability that event X happened by chance?

This question has two confusable meanings:

(1) Under random chance, what is the probability of X happening?

(2) If X happens, what is the probability that random chance is the cause?

To write this conditionally, question (1) asks for Pr[X|Chance] and question (2) asks for Pr[Chance|X]. In probability, a "|" means "given that"; or to put it another way, everything to the left of the bar is the thing you are wondering about, and everything to the right of the bar is stuff you either know, or assume to be true.

These expressions are not the same: they are inverses of one another, and have very different meanings and often very different values. The conspiratorial blogger should be making a decision based on (2), but almost always people mistakenly compute (1), yielding an impressively tiny number that doesn't really mean anything, but which can sway an audience of laypeople.

Look everyone! Little tiny numbers!

But this fallacy is bigger than simply using the wrong formula. It also employs pseudoscience and argument from incredulity. First the incredulity: you're supposed to believe that an event couldn't have happened because its probability is so impressively tiny. What is missing is the context: incredibly tiny is in fact perfectly normal.

In reality, almost anything that ever happens by chance has probability well nigh 0. You spill some salt, and the scattering of grains takes on one of inexhaustibly many different outcomes, each with infinitesimal probability. Quantity (1) is small! And yet that incredibly improbable outcome did happen by chance, without any reason to suspect the unseen aid of space lasers.

Even more mind-numbing is the fact that it will never happen again: any individual card shuffle is so unlikely that, once seen, you can be all but guaranteed that it will never be seen again, for the rest of the lifetime of the universe, assuming the shuffling is fair. There is something weirdly unintuitive about observing an event happen right in front of you, and immediately declaring that it can never happen again.

This can be counter-intuitive because when we hear "probability 0," or even "odds of 1 in a million," we think "this can't have happened." But that's not what probability means. A low probability does not mean an event can't have happened; it does mean that if you predict that specific event to happen, in advance, then you are almost certainly not going to be right.

If that confuses you, just remember the old joke about the farmer who would shoot the side of his barn, and then paint a bullseye wherever he hit. The probability of landing on the bullseye is only low if you declare the bullseye in advance. That's what the number Pr[X] ultimately describes.
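The farmer's trick is easy to simulate. The sketch below is only an illustration: it uses a toy 5-card deck (120 possible orderings) so that a prediction made in advance occasionally comes true, and the function name, deck size, and trial count are all my own choices.

```python
import random

def sharpshooter_demo(deck_size=5, trials=10_000, seed=0):
    """Compare calling a shuffle in advance with painting the bullseye afterward."""
    rng = random.Random(seed)
    deck = list(range(deck_size))
    target = tuple(deck)  # the ordering we commit to before any shuffling

    advance_hits = 0
    after_hits = 0
    for _ in range(trials):
        shuffled = deck[:]
        rng.shuffle(shuffled)
        outcome = tuple(shuffled)
        # Bullseye declared in advance: a hit only if the outcome matches it.
        if outcome == target:
            advance_hits += 1
        # Bullseye painted afterward: the "target" is whatever came up.
        painted_target = outcome
        if outcome == painted_target:
            after_hits += 1
    return advance_hits, after_hits

advance_hits, after_hits = sharpshooter_demo()
print(f"called in advance: {advance_hits} hits out of 10000")
print(f"painted afterward: {after_hits} hits out of 10000")
```

The painted-afterward count is always a perfect score, no matter how improbable each individual outcome was. With a full 52-card deck the called-in-advance count would simply sit at zero.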

Now, multiply by the probability of aliens

Okay, the second fallacy behind the confusion of the inverse: usually, the probability Pr[Chance|X] can't even be computed, because it doesn't have a well-defined value. So making a mathematical argument is pseudo-scientific to begin with.

Often we can compute formula (1) (this is why so many people mistakenly use that value), but to get formula (2) you need to know numbers that you can't possibly know. We can see this using Bayes' rule: the expressions (1) and (2) are related as follows:

Pr[Chance|X] = Pr[X|Chance] × (Pr[Chance] / Pr[X])

The right side contains parts that often have no meaningful value. Pr[X] is the probability of X happening by any cause, from coincidence to conspiracies to space lasers. You don't know that number. And Pr[Chance] is the overall probability of no conspiracy, no space lasers. If an election is going to happen tomorrow, what is the probability that it will be in some way rigged, which is 1 - Pr[Chance]? 1? 0.1? 0.001? How do you know?
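To see how much the answer hangs on those unknowable numbers, here is a rough sketch of Bayes' rule in Python. It makes a simplifying assumption that is mine, not part of the argument above: there are exactly two possible causes, chance or rigging, so Pr[X] can be expanded over both. All the input values are made up for illustration.

```python
from fractions import Fraction
from math import factorial

def pr_chance_given_x(pr_x_given_chance, pr_x_given_rigged, pr_chance):
    """Bayes' rule, assuming the only two causes are chance and rigging."""
    pr_rigged = 1 - pr_chance
    # Law of total probability: Pr[X] summed over both possible causes.
    pr_x = pr_x_given_chance * pr_chance + pr_x_given_rigged * pr_rigged
    return pr_x_given_chance * pr_chance / pr_x

tiny = Fraction(1, factorial(52))  # Pr[X|Chance] for one specific shuffle

# Case A (hypothetical): a rigger is no more likely than chance to produce
# this particular order, and rigging is rare.
case_a = pr_chance_given_x(tiny, tiny, Fraction(99, 100))
# Case B (hypothetical): a rigger always produces exactly this order,
# and rigging is as likely as not.
case_b = pr_chance_given_x(tiny, Fraction(1), Fraction(1, 2))

print(f"case A: Pr[Chance|X] = {float(case_a):.4f}")
print(f"case B: Pr[Chance|X] = {float(case_b):.3e}")
```

Both cases share the same impressively tiny Pr[X|Chance], yet one concludes that chance is almost certainly the cause and the other that it almost certainly is not. The tiny number settles nothing by itself; the made-up inputs do all the work.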

How would we even get an estimate of that number? By examining previous elections? That would only tell us the odds of election-rigging happening and being caught. And what elections do we count? All the races in that same district? There aren't enough elections, and election personnel and machines change too rapidly, to get any useful estimate of the odds of a conspiracy. It's like trying to tell if a coin is fair by observing three coin flips.
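To put a number on the three-flip comparison, take the most lopsided result you could see, three heads in a row, and ask how much better a biased coin explains it than a fair one. This is a small sketch; the 70% bias is a hypothetical value chosen only for illustration.

```python
from fractions import Fraction

def pr_all_heads(p_heads, flips=3):
    """Probability that every one of `flips` independent tosses lands heads."""
    return p_heads ** flips

fair = pr_all_heads(Fraction(1, 2))     # fair coin
biased = pr_all_heads(Fraction(7, 10))  # hypothetical coin: heads 70% of the time

# How much more likely is three heads under the biased coin than the fair one?
likelihood_ratio = biased / fair

print(f"Pr[HHH | fair]   = {fair}")
print(f"Pr[HHH | biased] = {biased}")
print(f"ratio            = {float(likelihood_ratio)}")
```

Even the most suspicious-looking outcome is less than three times as likely under the biased coin, and a fair coin produces it one time in eight anyway. Three flips, like a handful of elections, can't tell the two apart.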

It is possible to compute this type of number in very carefully designed experimental circumstances, which is what scientists do. You compare results of drug XYZZY versus the results of a placebo, and you design the experiment so that you know the precise numbers of each case. Likewise, in engineering, we can use these formulas because we know all the probabilities of everything---having built everything. If you want to decide if a received signal represents a 0 or a 1, you know Pr[0] and Pr[1], because you built the transmitter. But none of this applies if you try to analyze, after the fact, an event that happened in the wild.

Adjust your brain resolution

There's one more mistake behind the fallacy of the inverse: the idea that we can draw a box around an event and determine the odds. The odds of what? Where do you draw the box? How much detail do you include?

For example: I roll six dice and get 3, 4, 1, 5, 6, 2. What are the odds? One in 46656? Well, maybe if you only look at the numbers on top. Suppose you consider the exact positions where the dice fell, or their orientation---what are the odds of them landing like that? Much smaller, certainly.

When people examine a real-world event and try to compute the odds of it happening, they have to choose what detail to include and what not to include. This event resolution can make the probability much smaller or much larger, and that much more meaningless as a number.
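The dice example can be worked out by brute force. The sketch below enumerates all ordered rolls of six fair dice and scores the same observed roll at three different resolutions. The three ways of drawing the box are my own choices for illustration; there are many others.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
rolls = list(product(faces, repeat=6))  # all 6**6 ordered outcomes

observed = (3, 4, 1, 5, 6, 2)

# Fine resolution: this exact sequence, die by die.
p_exact = Fraction(sum(r == observed for r in rolls), len(rolls))
# Coarser resolution: every face shows up exactly once, in any order.
p_all_distinct = Fraction(sum(len(set(r)) == 6 for r in rolls), len(rolls))
# Coarsest resolution: "I rolled six dice and got six numbers."
p_anything = Fraction(len(rolls), len(rolls))

print(f"exact sequence:     {p_exact}")
print(f"all faces distinct: {p_all_distinct}")
print(f"any roll at all:    {p_anything}")
```

Same roll, same dice, and the "odds" range from 1 in 46656 to a sure thing depending on where the box is drawn.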

So now that you know how it's done...kids...don't do it

I hope this diary will inoculate your brain against a common mathematical misconception. If ever you see someone arguing over whether the Governor's memo was intentional and they bust out the odds, remember that those odds are often meaningless, and any argument based on them is largely pseudo-quantitative.

I guess the moral of this diary is this: mathematics is a tool for evil, and if any of us try to convince you of something using mathematics, you should assume the opposite is true.

Ha ha, just kidding, the preceding sentence is false. Happy Wednesday.