Read the following description of a person.
Tom W. is meek and keeps to himself. He likes soft music and wears glasses. Which profession is Tom W. more likely to have? 1) Librarian. 2) Construction worker.
If you picked librarian without thinking too hard, you used the representativeness heuristic: you matched the description to the stereotype while ignoring the base rates.
Ideally, you should have examined the base rate of both professions in the male population, then adjusted based on his description. Construction workers outnumber librarians by 10:1 in the US, so there are likely more shy construction workers than librarians of any kind!
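The base-rate argument can be made concrete with a small calculation. The sketch below is illustrative only: the 10:1 ratio comes from the text, but the two "meek" fractions are made-up assumptions, and the function name is hypothetical.

```python
def posterior_librarian(cw_per_librarian, p_meek_if_librarian, p_meek_if_cw):
    """Chance that a meek man is a librarian, considering only these two jobs."""
    # Work per single librarian so the base rate is just the ratio.
    meek_librarians = 1.0 * p_meek_if_librarian
    meek_construction_workers = cw_per_librarian * p_meek_if_cw
    return meek_librarians / (meek_librarians + meek_construction_workers)


# 10:1 base rate from the text; both "meek" fractions are assumed for illustration.
p = posterior_librarian(cw_per_librarian=10,
                        p_meek_if_librarian=0.60,
                        p_meek_if_cw=0.15)
print(f"P(librarian | meek) = {p:.2f}")
```

Under these assumed numbers, meekness is four times as common among librarians, yet the meek man is still a librarian only about 29% of the time, because the base rate outweighs the stereotype.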
More generally, the representativeness heuristic describes when we estimate the likelihood of an event by comparing it to an existing prototype in our minds - matching like to like. But just because something is plausible does not make it more probable.
The representativeness heuristic is strong in our minds and hard to overcome. In experiments, even when people are given base-rate data (such as the proportion of construction workers to librarians), they tend to ignore it, trusting their stereotype matching more than the actual statistics.
(Shortform note: even after reading this, you might think: but what about self-selection? Don’t meek people tend to seek library jobs and stay away from construction jobs? Isn’t it possible that all the shy librarians outnumber all the shy construction workers, even though there are 10 times more construction workers than librarians? This just goes to show how entrenched the representativeness heuristic is: you seek to justify your stereotype rather than looking at the raw data.)
Here’s another example:
Someone on the subway is reading the New York Times. Is the stranger more likely to have a PhD, or to not have a college degree?
Again, by pure number of people, there are far more people in the latter group than the former.
Representativeness is used because System 1 desires coherence, and matching like to like forms a coherent story that is simply irresistible.
The representativeness heuristic works much of the time, so it’s hard to tell when it leads us astray. Say you’re shown an athlete who’s thin and tall, then asked which sport he plays. You’d likely guess basketball more than football, and you’d likely be correct.
(Shortform note: the representativeness heuristic causes problems when your System 1 forms a coherent story that is inaccurate. Common problems involve stereotypes that cause incorrect snap judgments.)
System 1 desires a coherent story. Take away this convenient story, and you engage System 2. In the Tom W. question above, when students are asked only to estimate the % of the population working in construction or libraries, the guesses are far more accurate. System 1 no longer has a stereotype to be led astray by.
In general, the way to overcome the representativeness heuristic is to use Bayesian statistics. Start by predicting the base rates, using whatever factual data you have. Then consider how the new data should influence the base rates.
For example, in the New York Times example above, start by estimating the % of people who have a PhD and the % who do not have a college degree. Say you think that 2% of people have a PhD, and 50% have no college degree. Therefore, any random person would be 25 times more likely to lack a college degree than to have a PhD. Then, when you receive the new information that the person is reading the New York Times, think about how far that should move you from the base rates you estimated. You’ll likely end up with a more accurate estimate than if you hadn’t estimated the base rate.
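One way to make the "start from the base rate, then adjust" step explicit is Bayes' rule in odds form. This is a minimal sketch: the 2% and 50% figures are the guesses from the text (with 50% taken as the share with no college degree, the group the subway question asks about), while the two readership rates are invented purely for illustration.

```python
def posterior_odds(prior_a, prior_b, p_evidence_if_a, p_evidence_if_b):
    """Odds of A versus B after seeing the evidence (Bayes' rule, odds form)."""
    prior_odds = prior_a / prior_b
    likelihood_ratio = p_evidence_if_a / p_evidence_if_b
    return prior_odds * likelihood_ratio


# A = no college degree, B = PhD.
# Assumed for illustration: 3% of people with no degree read the Times on
# the subway, versus 30% of PhDs.
odds = posterior_odds(prior_a=0.50, prior_b=0.02,
                      p_evidence_if_a=0.03, p_evidence_if_b=0.30)
print(f"no degree : PhD = {odds:.1f} : 1")
```

With these assumed rates, the evidence favors the PhD tenfold, but the 25:1 base rate still leaves "no college degree" the more likely answer. Different readership assumptions would of course give a different result.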
(Shortform note: to counter stereotypes, think about what factors matter, and how you’ll measure whether someone matches those factors. For example, when hiring for a job, think about what skills you need in the job and how you’ll measure whether a job candidate shows those skills. With these objective criteria, you’ll avoid relying on stereotypes.)
Related to the representativeness heuristic is the conjunction fallacy. The best way to illustrate this is with an example.
Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.
Which is more probable?
1) Linda is a bank teller.
2) Linda is a bank teller and is active in the feminist movement.
If you guessed 2, you fell for the conjunction fallacy. Option 1 is broader than option 2: every feminist bank teller is also a bank teller, and there are many bank tellers who aren’t active in the feminist movement, so 1 must be at least as likely as 2. However, 2 tells a coherent story and thus seems more representative of Linda, even though it is statistically less likely.
(If you fell for this, don’t feel bad—over 85% of undergraduate students chose the second option.)
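The rule behind the Linda problem is that a conjunction can never be more probable than either of its parts: P(A and B) = P(A) × P(B given A), and the second factor is at most 1. The short check below illustrates this; the 5% figure for bank tellers is an arbitrary assumption, and the function name is hypothetical.

```python
def p_both(p_a, p_b_given_a):
    """P(A and B) = P(A) * P(B | A)."""
    for p in (p_a, p_b_given_a):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_a * p_b_given_a


p_teller = 0.05  # assumed chance that Linda is a bank teller
for p_feminist_given_teller in (0.0, 0.25, 0.5, 0.9, 1.0):
    joint = p_both(p_teller, p_feminist_given_teller)
    # Holds no matter how likely a bank-teller Linda is to be a feminist.
    assert joint <= p_teller
    print(f"P(feminist | teller) = {p_feminist_given_teller:.2f} "
          f"-> P(teller and feminist) = {joint:.4f}")
```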
Here’s another example. Pick which event is more likely:
1) A massive flood in North America in which more than 1,000 people drown.
2) An earthquake in California that causes a flood in which more than 1,000 people drown.
The latter sounds more plausible because of the vividness of its detail—you can picture the cause of the flood. But it’s certainly less probable.
This is a problem when listening to forecasters—adding details to scenarios makes them more persuasive, but less likely to come true.
Surprisingly, this fallacy is not invoked when a story doesn’t have coherence:
Which is more probable?
1) Mark has hair.
2) Mark has blond hair.
System 1 doesn’t have a chance to build an overall narrative here, so it becomes a pure statistical problem, and System 2 takes over. Many more people get the correct answer here.
Interestingly, there is a way to remove the bias: force people to name specific quantities rather than just estimate percentages. Using the Linda example again, the questions would change to:
There are 100 people who fit the description above. How many of them are:
Bank tellers? __ of 100
Bank tellers and active in the feminist movement? __ of 100
When presented this way, many people realize the fallacy and change their answers to be statistically valid. Kahneman suggests this might be because this framing causes people to visualize 100 people in a room, and then they realize it’s clearly a mistake to have the group of feminist bank tellers be larger than the group of bank tellers.
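The "100 people in a room" picture maps directly onto sets: the feminist bank tellers are the overlap of two groups, so they cannot outnumber the bank tellers. In the sketch below, who belongs to which group is an arbitrary assumption chosen only to show the counting.

```python
people = range(100)  # 100 people who fit Linda's description

# Hypothetical assignment: persons 0-4 are bank tellers,
# persons 3-59 are active in the feminist movement.
bank_tellers = {p for p in people if p < 5}
feminists = {p for p in people if 3 <= p < 60}

# The conjunction is the intersection of the two groups.
feminist_bank_tellers = bank_tellers & feminists

print(f"Bank tellers: {len(bank_tellers)} of 100")
print(f"Bank tellers and feminists: {len(feminist_bank_tellers)} of 100")
```

However the groups are assigned, the intersection is a subset of the bank tellers, so the second count can never exceed the first.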
More antidotes to conjunction fallacy:
The last example in the theme of representativeness is how the average value of a set of items can confuse us about its total value. Here’s an example:
Which is more valuable?
1) 24 dinner plates
2) 30 dinner plates, 5 of them broken
When viewed like this, the question is easy. The second option with 5 broken plates should be strictly more valuable because it has 25 intact dishes, whereas the first option only has 24.
But when the sets are shown separately, people who see only option 1 are willing to pay more than people who see only option 2. The presence of broken dinner plates “pollutes” the set, and people value the whole set less.
As discussed earlier, System 1 is good at considering the average of items, but not so good at calculating the sum of items. Here, people use the heuristic—what is the average value of the plate in each set?—rather than considering the total value of all plates.
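The gap between the two ways of judging the sets is easy to see with numbers. In this sketch the plate counts come from the example above, while the price of $2 per intact plate (and $0 per broken one) is an assumption made only for illustration.

```python
def total_value(intact, broken, value_per_intact=2.0):
    """Sum of plate values; broken plates are assumed to be worth nothing."""
    return intact * value_per_intact


def average_value(intact, broken, value_per_intact=2.0):
    """Value per plate across the whole set, broken plates included."""
    return total_value(intact, broken, value_per_intact) / (intact + broken)


set_1 = {"intact": 24, "broken": 0}   # 24 dinner plates
set_2 = {"intact": 25, "broken": 5}   # 30 dinner plates, 5 of them broken

for name, s in (("Set 1", set_1), ("Set 2", set_2)):
    print(f"{name}: total = ${total_value(**s):.2f}, "
          f"average per plate = ${average_value(**s):.2f}")
```

Set 2 wins on total value but loses on average value per plate, which matches the pattern described: judging by the average makes the larger set look worse.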