For those who missed it, here is a link to Episode One
That diary ended with a discussion of availability, one of the key judgment biases to which we poor limited humans fall prey.
This diary will discuss the second judgmental bias/heuristic, that of representativeness.
Join me for today's lesson below the fold...
But first, a question...
Here is the setup...
Imagine that you are on Let's Make a Deal and are offered a choice among three doors. Behind one door is a great prize, and behind the other two is junk. After you have chosen your door, Monty looks at you and says, "Let's show you what's behind one of the doors you didn't choose..."
Surprise, surprise, it's junk.
Then he says, "Would you like to switch to the other door or keep yours?"
So - which is it?
Representativeness:
The probability that event X belongs to set Y (or that X was generated by Y) is judged on the basis of how similar X is to the stereotype of Y.
So, for example, when I ask my class for a stereotypical bird, they say "Robin", or "Crow", when actually the most common bird in the world is a chicken.
Why?
Well, chickens don't look like a "representative" of the bird family because:
- They are large
- They don't fly
- We eat them
This judgmental error comes into play in day to day activities in a number of ways. But first, a quick review of probability theory...
Let p(X) represent the objective likelihood of X being true.
1. For any event A, 0 ≤ p(A) ≤ 1.
2. If the set of events of a given type is denoted by S, then p(S) = 1.
3. Addition: p(A or B) = p(A) + p(B) - p(A and B)
4. If A and B are mutually exclusive, then p(A or B) = p(A) + p(B), as p(A&B) is zero
5. Conditional Probability: p(A|B) = p(A&B)/p(B)
6. Multiplication Rule: p(A&B) = p(A) * p(B|A) = p(B) * p(A|B)
7. Bayes' Rule, used for updating beliefs with new data: p(A|B) = p(A&B)/p(B)
7a. Using the multiplication rule, this can be rewritten as:
p(A|B) = p(A) * p(B|A) / p(B)
7b. Using the disjoint decomposition of p(B), this can be rewritten as:
p(A|B) = p(B|A) * p(A) / (p(B|A) * p(A) + p(B|~A) * p(~A))
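For anyone who would rather compute than derive, here is a minimal Python sketch of rule 7b. The function name `posterior` and its parameter names are my own labels for illustration, and the numbers in the example call are made up rather than taken from any of the problems in this diary.

```python
def posterior(prior, p_b_given_a, p_b_given_not_a):
    """Return p(A|B) from p(A), p(B|A) and p(B|~A), following rule 7b."""
    # Disjoint decomposition of p(B): B occurs either together with A or with ~A.
    p_b = p_b_given_a * prior + p_b_given_not_a * (1 - prior)
    return p_b_given_a * prior / p_b


# Illustrative numbers only: a coin-flip prior, and evidence B that is
# three times as likely when A is true as when it is false.
print(posterior(prior=0.5, p_b_given_a=0.6, p_b_given_not_a=0.2))
```

The only inputs are the base rate and the two conditional probabilities of the evidence, which is exactly what each of the word problems below hands you.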
So, here's an example (and the answer may surprise you):
Suppose you were starting work and had to pass a drug test. The test is very reliable: it will report the presence of an illegal drug 95% of the time when one is present, and it will be negative 95% of the time when no illegal drugs are present.
Assume 5% of the population are regular drug users.
If your test comes up positive, what is the ACTUAL probability that you are using drugs?
Let pos = event that test is positive (~pos = negative)
Let user = event that you are a drug user
We want p(user|pos)
p(user) = 0.05
p(pos|user) = 0.95 (sensitivity: TP / (TP + FN))
p(~pos|~user) = 0.95 (specificity: TN / (TN + FP)), so p(pos|~user) = 0.05
Bayes’ calculation:
p(user|pos) = p(pos|user) * p(user) / (p(pos|user) * p(user) + p(pos|~user) * p(~user))
= (.95 * .05) / ((.95 * .05) + (.05 * .95)) = ONE HALF.
Just as reliable as a coin toss!
I imagine you were as surprised as most of my students.
Why were you surprised? Because 95% accuracy just sounds so good! The problem is, with a low base rate (5%), it takes an EXTREMELY accurate test to move far away from that base rate.
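To see how accurate "EXTREMELY accurate" has to be, here is a small Python sketch that holds the base rate at 5% and varies the test's accuracy. It assumes, as in the drug test example, that the test is equally good at catching users and clearing non-users. The function name `p_user_given_pos` and the particular accuracy levels tried are my own choices for illustration.

```python
def p_user_given_pos(base_rate, accuracy):
    """p(user|pos) when sensitivity and specificity both equal `accuracy`."""
    true_pos = accuracy * base_rate                # users who test positive
    false_pos = (1 - accuracy) * (1 - base_rate)   # non-users who test positive
    return true_pos / (true_pos + false_pos)


# Assumed accuracy levels, chosen only to show the trend.
for accuracy in (0.90, 0.95, 0.99, 0.999):
    result = p_user_given_pos(0.05, accuracy)
    print(f"accuracy {accuracy:.1%}: p(user|pos) = {result:.1%}")
```

Under these assumptions a 90% accurate test leaves you at roughly one chance in three, and it takes about 99% accuracy before a positive result means better than four chances in five.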
As a rule, there are three factors to consider in estimating P(A|B):
1. What is the prior probability of A (base rate)?
2. How strong is the association between B and A?
3. How reliable is the information?
Most people ignore 1 and 3 and just use 2. That is the representativeness heuristic (subtitled, people like a good story).
Another example, to see if you're paying attention.
A taxi cab was involved in a hit-and-run accident at night. Two cab companies operate in the city, the Green and the Blue. You are given the following data:
85% of the cabs in the city are Green, 15% Blue.
A witness identified the cab as Blue. The court tested the witness’s ability to identify cabs under the same conditions that existed the night of the accident. When presented with a sample of cabs (half of which were Blue, the rest Green), the witness made correct identifications in 80% of the cases and erred in 20% of the cases.
Question: What is the probability that the cab involved in the accident was Blue?
Don't do the math right away, but what do you "think" it is?
Well - here are some results from my past classes:
Modal response: 80%
Percentage of class responding with:
- The witness rate (80%): 43%
- The base rate (15%): 7%
Median response: 72.5%
For those of you who didn't jump right into the math, the answer is about 41.4%: (.80 * .15) / ((.80 * .15) + (.20 * .85)) = .12 / .29.
Why are so many people so bad at this? They take the witness’s testimony at face value and use it as their judgment. Basically, they are confusing the probability that the witness said it was Blue given that it was Blue with what they want, which is the probability of it being Blue given that the witness said it was Blue.
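Here is the cab problem as a short Python sketch, so you can see the two conditional probabilities side by side. The variable names are mine; the numbers are the ones given in the problem.

```python
p_blue = 0.15                    # base rate: share of cabs that are Blue
p_green = 0.85                   # share of cabs that are Green
p_says_blue_given_blue = 0.80    # witness correctly calls a Blue cab Blue
p_says_blue_given_green = 0.20   # witness mistakenly calls a Green cab Blue

# Overall chance the witness says "Blue", whichever cab it really was.
p_says_blue = (p_says_blue_given_blue * p_blue
               + p_says_blue_given_green * p_green)

# What the court actually wants: p(Blue | witness says Blue).
p_blue_given_says_blue = p_says_blue_given_blue * p_blue / p_says_blue

print(f"p(says Blue | Blue) = {p_says_blue_given_blue:.3f}")
print(f"p(Blue | says Blue) = {p_blue_given_says_blue:.3f}")
```

The first line printed is the number most people answer with; the second is the number the question asked for.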
One more example, then on to something else. This one is far more severe, and may generate some debate.
100 doctors were told this hypothetical scenario...
Suppose you have examined a woman for breast cancer. The woman has a lump in her breast, but based on many years of experience, you estimate the odds of a malignancy as 1 in 100 (for a woman of this age with these symptoms). Just to be safe, however, you order a mammogram. The mammogram correctly diagnoses about 80% of malignant tumors and about 90% of benign tumors. The test comes back and, much to your surprise, it is positive.
What is the probability that your patient has cancer?
Frighteningly, 95 of the doctors said: 80%
What's the real answer? So that you don't have to do the math, it's .075 (7.5%). So, a number of doctors would scare the crap out of this woman and 93% of the time, they would be WRONG.
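If you do want to check the math, here is a minimal Python sketch using the figures from the scenario. The variable names are my own, and I am reading "correctly diagnoses about 90% of benign tumors" as a 10% chance of a positive result when the lump is benign.

```python
prior = 0.01                 # doctor's estimate: 1 in 100
p_pos_given_cancer = 0.80    # mammogram flags 80% of malignant tumors
p_pos_given_benign = 0.10    # assumption: 100% - 90% of benign tumors are flagged

# Overall chance of a positive mammogram for this patient.
p_pos = p_pos_given_cancer * prior + p_pos_given_benign * (1 - prior)

# Bayes' rule: chance of cancer once the positive result is in.
p_cancer_given_pos = p_pos_given_cancer * prior / p_pos

print(f"p(cancer | pos)    = {p_cancer_given_pos:.3f}")
print(f"p(no cancer | pos) = {1 - p_cancer_given_pos:.3f}")
```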
Let's look at this on a frequency scale...
Of 1000 women given additional screening based upon an abnormal examination (e.g. lump)
Assumptions: cancer base rate = 8%, true positive rate = 92%, true negative rate = 88%
|           | Cancer | No cancer | Total |
| Pos. test | 74     | 110       | 184   |
| Neg. test | 6      | 810       | 816   |
| Total     | 80     | 920       | 1000  |
So, what does this mean?
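Share of positive tests that are false alarms: 110/184 = 60%. (Strictly speaking, the false positive rate is the share of cancer-free women who test positive, 110/920 = 12%; the 60% is the share of positive results that turn out to be wrong.) So, you scare the crap out of a lot of people that didn't need to be scared. Now, I am not advocating against scans, but you really have to be concerned about the accuracy of tests for small base rate events.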
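The frequency table can be rebuilt from the three stated assumptions with a few lines of Python. This is only a sketch: the variable names are mine, and rounding to whole women is my assumption about how the counts in the table were produced.

```python
n = 1000
base_rate = 0.08      # share of screened women who have cancer
sensitivity = 0.92    # true positive rate
specificity = 0.88    # true negative rate

cancer = round(n * base_rate)
no_cancer = n - cancer
tp = round(cancer * sensitivity)      # cancer, positive test
fn = cancer - tp                      # cancer, negative test
tn = round(no_cancer * specificity)   # no cancer, negative test
fp = no_cancer - tn                   # no cancer, positive test

print(f"{'':10}{'Cancer':>8}{'No cancer':>11}{'Total':>7}")
print(f"{'Pos. test':10}{tp:>8}{fp:>11}{tp + fp:>7}")
print(f"{'Neg. test':10}{fn:>8}{tn:>11}{fn + tn:>7}")
print(f"{'Total':10}{cancer:>8}{no_cancer:>11}{n:>7}")

# Two different questions that are easy to mix up:
share_of_positives_that_are_false = fp / (tp + fp)   # out of all positive tests
share_of_healthy_who_test_positive = fp / no_cancer  # out of all cancer-free women
print(f"{share_of_positives_that_are_false:.0%} of positive tests are false alarms")
print(f"{share_of_healthy_who_test_positive:.0%} of cancer-free women test positive")
```

Counting people instead of multiplying probabilities gives the same answer as Bayes' rule, but most of us find the table much easier to reason about.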
OK - Last bit....
But first, more laws of probability:
A compound event p(A & B) cannot be more likely than either of its components p(A) or p(B).
So, let's hear about Linda...
Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice and also participated in anti-nuclear demonstrations.
Please rank the following statements by their probability, using 1 for the most probable and 8 for the least probable.
a. Linda is a teacher in a primary school.
b. Linda works in a bookstore and takes yoga classes.
c. Linda is an active feminist.
d. Linda is a psychiatric social worker.
e. Linda is a member of Women Against Rape.
f. Linda is a bank teller.
g. Linda is an insurance salesperson.
h. Linda is a bank teller and is an active feminist.
Given the rules of probability, h CANNOT be more likely than c or f. However, when people rank these statements, the usual order is c, then h, then f.
WHY?
Because people like a good story. Linda SOUNDS more like h than f. The more specific detail you add to a description of a person, the more likely others are to think the description is accurate, even though every added detail can only make it less probable.
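The reason h can never beat f falls straight out of the multiplication rule: p(teller & feminist) = p(teller) * p(feminist|teller), and that second factor can never exceed 1. Here is a minimal Python sketch. The function name `p_both` is mine, and every number in it is hypothetical, since nobody knows the real probabilities for Linda.

```python
def p_both(p_a, p_b_given_a):
    """Multiplication rule: p(A&B) = p(A) * p(B|A)."""
    return p_a * p_b_given_a


p_teller = 0.02  # hypothetical chance that Linda is a bank teller (statement f)

# However strongly we believe a bank-teller Linda would be a feminist,
# statement h can at best tie statement f.
for p_feminist_given_teller in (0.25, 0.50, 0.95, 1.00):
    p_h = p_both(p_teller, p_feminist_given_teller)
    print(f"p(feminist|teller) = {p_feminist_given_teller:.2f}: "
          f"p(h) = {p_h:.4f} <= p(f) = {p_teller:.4f}")
```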
Anyway - I will close with a piece of advice.
When estimating x, start with the base rate of x. The less you know, the closer your estimate should be to the base rate. Extreme forecasts should only be given from very reliable predictors.
I will leave it to commenters to discuss why this is so important in our soundbite ADHD culture.