Kahneman transitions from Part 1 to Part 2 by introducing more of the heuristics and biases we're subject to.
The general theme of these biases: we prefer certainty over doubt. We prefer coherent stories of the world, clear causes and effects. Sustaining incompatible viewpoints at once is harder work than sliding into certainty. A message, if it is not immediately rejected as a lie, will affect our thinking, regardless of how unreliable the message is.
Furthermore, we pay more attention to the content of a story than to the reliability of the data behind it. We prefer simple, coherent views of the world and overlook the reasons those views are undeserved. We overweight causal explanations and ignore statistical base rates. As a result, our intuitive predictions are often too extreme, and we put too much faith in them.
This chapter will focus on statistical mistakes - when our biases make us misinterpret statistical truths.
The smaller your sample size, the more likely you are to have extreme results. When you have small sample sizes, do NOT be misled by outliers.
A facetious example: in a series of 2 coin tosses, there's a 1-in-4 chance of getting 100% heads. This doesn't mean the coin is rigged.
In this case, the statistical mistake is clear. But in more complicated scenarios, outliers can be deceptive.
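To make this concrete, here's a minimal simulation (our illustration, not from the book) of how often batches of fair coin flips come out 100% heads at different batch sizes:

```python
import random

# Our illustration (not from the book): flip a fair coin in batches of
# different sizes and see how often a batch comes out 100% heads.
# Extreme outcomes are common in small batches, vanishingly rare in large ones.
random.seed(42)
TRIALS = 100_000

for batch_size in (2, 10, 100):
    all_heads = sum(
        all(random.random() < 0.5 for _ in range(batch_size))
        for _ in range(TRIALS)
    )
    print(f"batch of {batch_size:>3}: {all_heads / TRIALS:.3%} all heads")

# Typical output: ~25% for batches of 2, ~0.1% for 10, ~0% for 100.
```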
Case 1: Cancer Rates in Rural Areas
A study found that certain rural counties in the South had the lowest rates of kidney cancer. What was special about these counties - something about the hard work of farming, or the fresh open air?
The same study then looked at the counties with the highest rates of kidney cancer. Guess what? They were also rural areas!
We can infer that the fresh air and additive-free food of a rural lifestyle explain low rates of kidney cancer; we can also infer that the poverty and high-fat diet of a rural lifestyle explain high rates of kidney cancer. But we can’t have it both ways. It doesn’t make sense to attribute both low and high cancer rates to a rural lifestyle.
If it's not lifestyle, what's the key factor here? Population size. The outliers appeared at both extremes merely because the populations were so small. By random chance, some small rural counties would show a spike in cancer rates, and others would show none at all. Small samples produce extreme results.
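A rough sketch of this effect (hypothetical numbers, not the study's data): give every county the identical true cancer rate, and the smallest counties will still dominate both ends of the ranking purely through sampling noise.

```python
import numpy as np

# A hypothetical sketch (our numbers, not the study's): every county gets the
# SAME true kidney-cancer rate. Small counties still dominate both the top
# and the bottom of the observed-rate rankings, purely from sampling noise.
rng = np.random.default_rng(0)
TRUE_RATE = 1e-4  # identical everywhere

populations = rng.choice([1_000, 10_000, 100_000, 1_000_000], size=500)
cases = rng.binomial(populations, TRUE_RATE)   # observed cases per county
observed_rates = cases / populations

order = np.argsort(observed_rates)
print("populations of 5 lowest-rate counties :", populations[order[:5]])
print("populations of 5 highest-rate counties:", populations[order[-5:]])
# Both extremes are populated by small counties; the million-person
# counties all sit near the true rate.
```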
Case 2: Small Schools
The Gates Foundation studied educational outcomes and found that small schools were consistently at the top of the list. Inferring that something about small schools led to better outcomes, the foundation tried to apply small-school practices to large schools, including lowering the student-teacher ratio and decreasing class sizes.
These experiments failed to produce the dramatic gains they were hoping for.
Had they inverted the question - what are the characteristics of the worst schools? - they would have found these schools to be smaller than average as well.
When falling prey to the Law of Small Numbers, System 1 finds spurious causal connections between events. It is too ready to jump to conclusions that make logical sense but are merely statistical flukes. Faced with a surprising result, we immediately reach for a causal explanation rather than questioning whether the result is real.
Even professional academics are bad at understanding this - they often trust the results of underpowered studies, especially when the conclusions fit their view of the world. (Shortform note: Kahneman clearly had a problem with this himself with the priming studies from Part 1!)
The name of this law comes from the facetious idea that “the law of large numbers applies to small numbers as well.”
The only way to get statistical robustness is to compute the sample size needed to convincingly demonstrate a difference of a certain magnitude. The smaller the difference, the larger the sample needed to get statistical significance on the difference.
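As a minimal sketch of that calculation (the standard two-sample formula, not anything specific from the book), here's the per-group sample size needed to detect a difference between two means at 5% significance and 80% power:

```python
from scipy.stats import norm

# A minimal power-analysis sketch (the standard two-sample formula; the
# numbers are ours, not the book's): per-group sample size needed to detect
# a difference `delta` between two means with standard deviation `sigma`,
# at 5% significance and 80% power.
def sample_size(delta, sigma=1.0, alpha=0.05, power=0.80):
    z_alpha = norm.ppf(1 - alpha / 2)   # 1.96 for a two-sided 5% test
    z_beta = norm.ppf(power)            # 0.84 for 80% power
    return 2 * ((z_alpha + z_beta) * sigma / delta) ** 2

for delta in (1.0, 0.5, 0.1):
    print(f"difference of {delta}: ~{sample_size(delta):.0f} per group")
# Halving the difference you want to detect quadruples the required sample.
```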
Consider this result: “In a telephone poll of 300 seniors, 60% support the president.”
If you were asked to summarize this in a few words, you'd likely end up with something like "old people like the president."
You wouldn't react much differently if the sample had been 150 people or 3,000 people. You are not adequately sensitive to sample size.
Obviously, if the figures are way off (6 seniors were asked, or 600 million were asked), System 1 detects a surprise and kicks it to System 2 to reject. (But note weaknesses in small sample size can also be easily disguised, as in the common phrasing “6 out of 10 seniors”.)
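A back-of-the-envelope check (using the standard normal approximation; the numbers are ours, not Kahneman's) shows how much the poll's precision actually changes with sample size:

```python
import math

# Our back-of-the-envelope check (normal approximation): the 95% margin
# of error for "60% support" at various sample sizes.
p = 0.60
for n in (150, 300, 3000):
    moe = 1.96 * math.sqrt(p * (1 - p) / n)
    print(f"n = {n:>4}: 60% ± {moe:.1%}")
# n=150: ±7.8%, n=300: ±5.5%, n=3000: ±1.8% - very different claims
# hiding behind the same headline number.
```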
Extending this further, you don’t always discriminate between “I heard from a smart friend” and “I read in the New York Times.” As long as you don’t immediately reject the story, you tend to accept it as 100% true.
People expect random sequences to look irregular, with no streaks or patterns. For coin flips yielding heads or tails, the following sequences all have equal probability:
HHHTTT
TTTTTT
HTHTTH
However, sequence 3 "looks" far more random. Sequences 1 and 2 are more likely to trigger a search for alternative explanations. (Shortform note: the illusion also arises because there is only one sequence TTTTTT, but dozens of mixed-looking sequences like the third that we lump together as "random-looking" - see the enumeration below.)
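A small enumeration backs up this note (our sketch): for 6 flips there are only 64 possible sequences, each with probability 1/64, but mixed-looking ones vastly outnumber streaky ones as a class.

```python
from itertools import product

# Our enumeration: for 6 flips there are exactly 2**6 = 64 sequences, each
# with probability 1/64. Streaky sequences are rare as a CLASS; mixed-looking
# ones are plentiful, so a specific streak feels more surprising than it is.
sequences = ["".join(s) for s in product("HT", repeat=6)]

def longest_run(seq):
    best = run = 1
    for a, b in zip(seq, seq[1:]):
        run = run + 1 if a == b else 1
        best = max(best, run)
    return best

streaky = [s for s in sequences if longest_run(s) >= 4]
print("total sequences        :", len(sequences))        # 64
print("with a run of 4 or more:", len(streaky))
print("mixed-looking          :", len(sequences) - len(streaky))
```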
Corollary: we look for patterns where none exist.
Other examples: basketball players believed to have a "hot hand," or the WWII bombing of London, where bomb strikes seemed to cluster in meaningful patterns - in both cases, the data are consistent with pure chance.
Evolutionarily, the tendency to see patterns and causes in randomness may have arisen as a margin of safety in hazardous situations. That is, if a pack of lions suddenly seems to double, you don't stop to consider whether this is just random statistical fluctuation. You assume there's a cause and impending danger, and you leave.
Individual cases are often overweighted relative to statistics. In other words, even when we get accurate statistics about a situation, we still tend to focus on what individual cases tell us.
This was shown to great effect when psychology students were taught about troubling experiments like Milgram's obedience experiment, where 26 of 40 ordinary participants delivered the highest-voltage shock.
Students were then shown videos of two normal-seeming people. These people didn’t seem the type to voluntarily shock a stranger. The students were asked: how likely were these individuals to have delivered the highest voltage shock?
The students guessed a chance far below 26/40, the statistical rate they had just been given.
This is odd. The students hadn’t learned anything at all! They had exempted themselves from the conclusions of experiments. “Surely the people who administered the shocks were depraved in some way - I would have behaved better, and normal people like these two folks would as well.” They ignored the statistics in favor of the individual cases given to them.
The antidote was to reverse the direction of inference: students were told about the experimental setup, shown the videos of the two people, told that both individuals had delivered the maximum shock - and only then asked to estimate the overall rate. Generalizing from surprising individual cases to the statistics, their estimates became much more accurate.
Over repeated sampling periods, outliers tend to revert to the mean. High performers tend to deliver disappointing results in the next period; strugglers show sudden improvement.
In reality, this is just statistical fluctuation. However, as the theme of this section suggests, we tend to see patterns where there are none. We come up with cute causal explanations for why the high performers faltered, and why the strugglers improved.
Here are examples of reversion to the mean: an athlete with a spectacular rookie season who slumps in year 2; a mutual fund that tops the rankings one year and falls back the next.
Reversion to the mean occurs when the correlation between two measures is imperfect, and so one data point cannot predict the next data point reliably. The “phenomena” above can be restated in these terms: “the correlation between year 1 and year 2 of an athlete’s career is imperfect”; “the correlation of performance of a mutual fund between year 1 and year 2 is imperfect.”
In other words, when we ignore reversion to the mean, we overestimate the correlation between the two measures. When we see an athlete with an outlier performance in one year, we expect that to continue. When it doesn’t, we come up with causal explanations rather than realizing we simply overestimated the correlation.
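Here's a hypothetical skill-plus-luck simulation (ours, not the book's) that reproduces the effect: athletes with fixed skill and noisy seasonal performance regress toward the mean with no causal story at all.

```python
import numpy as np

# A hypothetical skill-plus-luck model (ours, not the book's): each athlete
# has a fixed skill; each season's performance adds independent noise.
# Year-1 stars regress toward the mean in year 2 with no causal story needed.
rng = np.random.default_rng(1)
N = 10_000
skill = rng.normal(50, 5, size=N)            # stable underlying ability
year1 = skill + rng.normal(0, 10, size=N)    # performance = skill + luck
year2 = skill + rng.normal(0, 10, size=N)    # fresh luck, same skill

top = year1 >= np.percentile(year1, 90)      # year-1 top decile
print(f"top decile, year 1 average: {year1[top].mean():.1f}")
print(f"same athletes, year 2     : {year2[top].mean():.1f}")  # much closer to 50
print(f"year1-year2 correlation   : {np.corrcoef(year1, year2)[0, 1]:.2f}")
```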
These causal explanations can give rise to superstitions and misleading rules (“I swear by this treatment for stomach pain.”)
Antidote to this bias: when looking at high and low performers, question what fundamental factors are actually correlated with performance. Then, based on these factors, predict which performers will continue and which will revert to the mean.
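One way to operationalize this (a sketch in the spirit of Kahneman's corrective recipe, with hypothetical numbers): shrink an observed outlier toward the mean in proportion to the correlation between the two measures.

```python
# A sketch in the spirit of Kahneman's corrective recipe: shrink an observed
# outlier toward the mean in proportion to the correlation between measures.
# The numbers below are hypothetical.
def regressed_prediction(observed, mean, correlation):
    return mean + correlation * (observed - mean)

# League average 50, a rookie scores 80, year-to-year correlation 0.4:
print(regressed_prediction(observed=80, mean=50, correlation=0.4))  # 62.0
```

The lower the correlation between the two measures, the more aggressively the forecast should be pulled back toward the mean.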