Jonathan J. Koehler (1994) Base Rates and the "illusion Illusion". Psycoloquy: 5(09) Base Rate (9)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).
Psycoloquy 5(09): Base Rates and the "illusion Illusion"

Reply to Ayton, Gregson, Hamm, Koonce, McCauley, McKenzie & Spellman
on Koehler on Base-Rates

Jonathan J. Koehler
Department of Management Science and Information Systems
University of Texas at Austin
Austin TX 78712-1175


Abstract: None of the commentators so far has been willing to defend the base rate fallacy. Instead, most offered evidence, theories and criticism reinforcing the view that the existing base rate research program is misleading and in need of examination in ecologically relevant environments.


Keywords: Base rate fallacy, Bayes' theorem, decision making, ecological validity, ethics, fallacy, judgment, probability.

1. Surprisingly, none of those who have commented on my target article (Koehler, 1993a) so far has seriously disputed the premise that we have been oversold on the base rate fallacy. At the descriptive level, commentators (a) provided additional evidence that people use base rates (Koonce, 1993), (b) offered theories for base rate use (Spellman, 1993; see also Ayton's discussion of Kahneman and Lovallo, 1993), and (c) criticized studies that concluded that base rates are ignored (McCauley, 1994; McKenzie, 1994). Likewise, Hamm (1994) -- who was quite critical of people's performance in probabilistic inference tasks -- agreed that the literature does not support the conclusion that people typically neglect base rates (see also McCauley). Even Gregson (1993), who began his commentary with a dismissive remark, referred to the target article as "exhaustive and definitive." Why, then, have so many others reached different conclusions about the base rate fallacy? Where are their voices? Where are those who will defend the oft-repeated conclusion that people ignore base rates, and that the fallacy is a matter of established fact? Where are those who will say that we should continue to study base-rate neglect using the same narrow problems and performance standards that have been used continuously for more than twenty years?

2. Two commentators, McCauley and McKenzie, identified problems associated with determining what constitutes a base rate. McKenzie argued that the classic base rate neglect study by Kahneman and Tversky (1973) fails to distinguish between base rates and prior probabilities, and that the methods used were inadequate for determining whether either was ignored. McCauley argued that there is no clear theoretical distinction between a base rate and individuating information. Using the social judgment literature on stereotyping, McCauley showed that the base rate may be any of several different conditional probabilities in the typical social judgment study. Consequently, violations of Bayesian logic cannot be traced to an underweighting or overweighting of particular cues.

3. To my mind, the McKenzie and McCauley commentaries point up the dangers of mechanically applying Bayes' theorem to determine exactly how much weight people assign or should assign to base rates. None of the probabilities that Bayes' theorem integrates to identify posterior probabilities is properly identified as a base rate. As I have noted elsewhere (Koehler, 1993b), base rates may translate into prior probabilities for obvious reasons, for subtle reasons, and sometimes, for no good reason at all. McCauley has extended this reasoning by showing that base rates may feed into Bayesian likelihoods as well. These translation problems challenge the normative component of the base rate fallacy.

4. In light of this problem, care should be taken to show that the assumptions of base rate studies are defensible. Subjects' prior probabilities should be identified and recorded as a check on the often unjustified assumption that subjects' priors will equal the experimenter-supplied base rate. Then, if Bayesian performance standards must be used, they should be based on subjects' individual responses. Those investigators who have taken care to do this report that subjects appear to pay a great deal of attention to their priors. For example, when Rasinski, Crocker and Hastie (1985) repeated the Locksley, Borgida, Brekke and Hepburn (1980, Study 2) experiment (discussed by McCauley) taking into account the subjects' own stereotypes, the stereotypes were not disregarded when accompanied by individuating diagnostic behavioral information. On the contrary, Rasinski et al. (1985) reported that their subjects "seemed to be overcautious in revising their stereotype-based judgments" (p. 322). A similarly substantial impact for base-rate-induced priors was obtained by Wells and Harvey (1978) and Gigerenzer, Hell and Blank (1988) when individualized normative criteria were used.

5. In a sense, though, even those studies that apply Bayes' Theorem to show that base rates are not ignored miss the larger point. It is one thing to construct a problem that has a single solution, give this problem to laboratory subjects, then argue that they do or do not make reasonably good use of the various information cues. It is quite another thing to argue from such studies that people "generally" do or do not do such and such. Such a conclusion is linked not only to the number and reliability of studies that support the phenomenon, but also to the ecological validity of those studies. If the stimuli, incentives, performance standards and other contextual features in base rate studies are far removed from those encountered in the real world, then what are we to conclude? Shall we assume that people would be richer, more successful and happier if only they paid more attention to base rates? Shall we tell professional auditors -- a population that already seems to pay substantial attention to base rates (Koonce) -- that they too should pay closer attention to base rates? The risk, of course, is that this recommendation may lead them to attach too little weight to other information.

6. Several commentators offered explanations for people's performance in laboratory base rate studies. Spellman drew on implicit learning theory to explain the observation that people seem to use base rates more in some contexts than others. She noted that base rates seem to be most influential when they are learned observationally (e.g., through feedback trials) and when the learning is demonstrated in an equally implicit way (e.g., appropriate response to subsequent stimulus). There is empirical support for Spellman's theoretical distinction in the base rate literature (Christensen-Szalanski and Beach, 1982; Christensen-Szalanski and Bushyhead, 1981; Lindeman, Van Den Brink & Hoogstraten, 1988; Manis, Dovalina, Avis & Cardoze, 1980; Medin & Edelson, 1988). Whether Spellman's theory adequately explains the massive base rate literature or not, I like it because it reminds us that task structure and task environment matter. This simple point, and the data that support it, challenge the descriptive component of the base rate fallacy.

7. The recent work by Gigerenzer and his colleagues on the distinction between objective probabilities (e.g., relative frequencies) and subjective probabilities (e.g., single-event probabilities; Gigerenzer, 1991, 1992, in press; Gigerenzer & Murray, 1987, chapter 5) is also relevant to this controversy. Gigerenzer forcefully argues that the entire heuristics and biases enterprise (of which the base rate fallacy is a major part) is flawed because it blurs the distinction between these two types of probabilities. This is an important failing because people's use of base rates and other data appears to depend on whether the data and subjects' responses are described in terms of single event probabilities or in terms of relative frequencies (Cosmides & Tooby, in press).

8. On the other hand, Hamm correctly noted that there is much evidence that people in different environments have trouble with overtly probabilistic inference problems. For example, people often confuse P(E|H) with P(H|E). This error may be a common and costly one, but it should not be equated with base rate neglect. Among people who mistake P(E|H) for P(H|E), base rates are no more neglected than any other piece of diagnostic information (e.g., P(E|-H)). Once the error of confusing P(E|H) with P(H|E) is committed, there is no reason to expect people to moderate their estimates of P(H|E) on the basis of P(H).
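
The point can be illustrated with a small numerical sketch. The Python below is only an illustration under assumed values: the function names and every number in it are hypothetical and are not taken from any of the studies cited here. It compares the Bayesian posterior P(H|E) with the answer given by a judge who simply reports P(E|H). The confused answer stays the same whether P(H) or P(E|-H) is varied, which is the sense in which the base rate is no more neglected than the other diagnostic input.

```python
def bayes_posterior(p_h, p_e_given_h, p_e_given_not_h):
    """Return P(H|E) computed from P(H), P(E|H) and P(E|-H) via Bayes' theorem."""
    joint_h = p_e_given_h * p_h
    joint_not_h = p_e_given_not_h * (1.0 - p_h)
    return joint_h / (joint_h + joint_not_h)


def inverse_confusion(p_h, p_e_given_h, p_e_given_not_h):
    """Hypothetical judge who reports P(E|H) as if it were P(H|E).

    Both p_h and p_e_given_not_h are accepted but never used: that is the point.
    """
    return p_e_given_h


# Illustrative values only (assumptions, not data from the literature).
P_E_GIVEN_H = 0.80
CASES = [
    (0.01, 0.10),  # low base rate
    (0.50, 0.10),  # higher base rate, same P(E|-H)
    (0.50, 0.40),  # same base rate, higher P(E|-H)
]

for p_h, p_e_given_not_h in CASES:
    bayes = bayes_posterior(p_h, P_E_GIVEN_H, p_e_given_not_h)
    confused = inverse_confusion(p_h, P_E_GIVEN_H, p_e_given_not_h)
    print(f"P(H)={p_h:.2f}  P(E|-H)={p_e_given_not_h:.2f}  "
          f"Bayes P(H|E)={bayes:.3f}  confused answer={confused:.2f}")
```

In this sketch the Bayesian answer moves with both P(H) and P(E|-H), while the confused answer does not move with either.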

9. Rather than writing people off as Bad Bayesians and base rate neglecters on the basis of this type of confusion, perhaps we should think more about how to present probabilities in ways that minimize misunderstandings. Reframing probabilistic information in terms of relative frequencies may be helpful, but this is not the only solution. Macchi (1991) showed that when subjects are provided with statements of likelihood information that are verbally dissimilar to their inverses, probabilistic confusions are reduced dramatically.

10. Such research can have profound practical implications. Gregson gave an example of a statistician who testified in a case involving forensic science evidence and concluded that "legal methods of inference and statistical methods of inference are not necessarily compatible." But does this mean that statistical methods of inference should be ignored in cases that include overtly probabilistic evidence? Consider criminal cases that include evidence of DNA matches between suspects and recovered traces of genetic material. Here, statisticians could play an important clarifying role. They could explain that P(E|-H), that is, the chance that a match [ENDNOTE 1] would occur if the suspect were not the source of the trace, is not identical to P(-H|E), the chance that a matching suspect is not the source of the trace. This can be demonstrated by way of examples. Such a demonstration might also remind judges and jurors that nonforensic factors are relevant to their legal fact-finding mission. In short, though Gregson is right when he says the methods of law and statistics often conflict, it is probably also true that legal confusions about probabilistic evidence can be averted through the well-placed testimony of a (dare I say) Bayesian expert.
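
One such example might run as follows. This is a rough sketch under loudly stated assumptions, not a model of any real case: the match probability and the pool sizes are invented, every person in the pool of possible sources is assumed equally likely a priori to be the source, the true source is assumed to match with certainty, and laboratory error is ignored (that is, "match" means a true match). The function name is hypothetical.

```python
def prob_not_source_given_match(n_others, p_match_given_not_source):
    """Return P(-H|E) for a matching suspect.

    Assumptions (illustrative only): the pool holds the true source plus
    n_others people who are not the source, all equally likely a priori;
    the true source always matches; no laboratory error.
    """
    # Expected number of people who match although they are not the source.
    expected_coincidental = n_others * p_match_given_not_source
    # Exactly one person (the true source) matches for the right reason.
    return expected_coincidental / (expected_coincidental + 1.0)


# Hypothetical value of P(E|-H): 1 in 100,000.
P_E_GIVEN_NOT_H = 1e-5

for n_others in (1_000, 100_000, 1_000_000):
    p_not_h_given_e = prob_not_source_given_match(n_others, P_E_GIVEN_NOT_H)
    print(f"others in pool={n_others:>9,}  P(E|-H)={P_E_GIVEN_NOT_H:.5f}  "
          f"P(-H|E)={p_not_h_given_e:.3f}")
```

Under these assumptions P(E|-H) is held fixed while P(-H|E) ranges from about 1 in 100 to about 10 in 11, depending only on how many other people could have left the trace. That dependence is where nonforensic factors enter.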

11. Ayton discussed a recent paper by Kahneman and Lovallo (1993) in which they argue that people have a tendency to see problems as unique ("inside view") when they should be viewed as instances of a broader class ("outside view"). This distinction may be helpful in some cases where accurate judgment is the primary goal, though less helpful when process and policy considerations are paramount. But an important caveat remains: even where goals other than accuracy do not exist, decision makers may need to contend with the problem of multiple reference classes. Who's to say which of several competing reference classes people ought to use? Suppose you are the manager of a Major League baseball team. Your team is up to bat with two outs and the bases loaded in the bottom of the ninth inning and the score is tied. You must decide which of two pinch hitters to bring to bat. You know that Spangler is the better hitter, although neither he nor Hickman has ever faced a good forkball pitcher similar to the one on the mound. You also suspect that Hickman is a slightly better hitter in pressure-packed situations, although your supportive data are limited. Can the outside view help you make the right decision? The development of sound prescriptive guidelines in such cases is the type of research that deserves far more attention.

12. Gregson wondered whether decision makers are doing something that is not easily captured by "the simple form" of Bayes' theorem. Perhaps their probabilistic reasoning strategies are more complex. However, Hamm suggested that people in general, and medical doctors in particular, don't reason probabilistically at all. Instead, doctors rely on rule-based "mental scripts" to address both common and uncommon medical problems. According to Hamm, they make no calculations, and ignore probabilistic principles except insofar as those principles are already incorporated into the mental scripts. Hamm gave an example in which a mental script pertaining to interpretations of positive AIDS tests appeared to be altered by a base rate argument. But in this example it is hard to know whether the script was altered by the base rate argument (abstract) or by the outcome feedback that supported the base rate argument (concrete). If the feedback is the salient factor, then this may be a variation of the implicit learning Spellman described.

13. In the target article, I argued that the existing research program must be replaced by one that confronts base rate usage in real world tasks and embraces more realistic and flexible performance standards. This program should place greater emphasis on the development of descriptive theory and prescriptive recommendations than on the discovery of inconsistencies with Bayes' theorem in narrow tasks. As Koonce reminded us, the features that define the environments of real world decision makers (e.g., expertise, time-pressure, accountability) can cause their judgments to look quite different from those of laboratory subjects. Moreover, even where it can be demonstrated that people violate normative canons, McKenzie (in press) explains that many of the non-normative strategies people follow perform quite well across a variety of environmental conditions. The apparent failure of these strategies in a laboratory experiment may tell us little about the strategy's success in the real world.

14. Finally, Ayton mused about the possibility of a final cognitive illusion, one he called "the illusion illusion." In a more serious vein, this commentator suggested that psychologists ought to be more concerned with how people think than with how well they think. I do not entirely agree. Many business students take courses from decision theorists to become more effective decision makers. Descriptive theory is important in such courses, but so too are the normative models and the prescriptive guidelines that spring from them. Questioning the relevance of those models and guidelines for real world decision making is an appropriate scientific activity. It is also an activity that can lead to constructive changes in our research programs.


[ENDNOTE 1] By "match," I mean a "true match" (i.e., the suspect and the trace truly match), as opposed to a "reported match" (i.e., a laboratory technician SAYS that the suspect and trace truly match). This important difference -- a difference many statisticians recognize and can explain -- has been overlooked by the courts (see Koehler, in press).


Ayton, P. (1993) Base Rate Neglect: An Insider View of Judgment? PSYCOLOQUY 4(63) base-rate.5.ayton.

Christensen-Szalanski, J.J.J. & Beach, L.R. (1982) Experience and the Base-rate Fallacy. Organizational Behavior and Human Performance 29:270-278.

Christensen-Szalanski, J.J.J. & Bushyhead, J.B. (1981) Physicians' Use of Probabilistic Information in a Real Clinical Setting. Journal of Experimental Psychology: Human Perception and Performance 7:928-935.

Cosmides, L. & Tooby, J. (in press) Are Humans Good Intuitive Statisticians After All? Rethinking Some Conclusions from the Literature on Judgment Under Uncertainty. Cognition.

Gigerenzer, G. (1991) How to Make Cognitive Illusions Disappear: Beyond "Heuristics and Biases." European Review of Social Psychology 2:83-115.

Gigerenzer, G. (November, 1992) Where Do We Go From Here: After Heuristics and Biases. Paper presented at the meeting of the Society for Judgment and Decision Making, St. Louis, MO.

Gigerenzer, G. (in press) Why the Distinction Between Single-event Probabilities and Frequencies is Important for Psychology (and Vice Versa). To appear in G. Wright & P. Ayton (Eds.), Subjective Probability. New York: Wiley.

Gigerenzer, G., Hell, W. & Blank, H. (1988) Presentation and Content: The Use of Base Rates as a Continuous Variable. Journal of Experimental Psychology: Human Perception and Performance 14:513-525.

Gigerenzer, G. & Murray, D.J. (1987) Cognition as Intuitive Statistics. Hillsdale, NJ: Erlbaum.

Gregson, R.A.M. (1993) Which Bayesian Theorem Could Be Compared With Real Behaviour. PSYCOLOQUY 4(50) base-rate.2.gregson.

Hamm, R.M. (1994) Underweighting of Base-rate Information Reflects Important Difficulties People Have With Probabilistic Inference. PSYCOLOQUY 5(3) base-rate.7.hamm.

Kahneman, D. & Lovallo, D. (1993) Timid Choice and Bold Forecasts: A Cognitive Perspective on Risk Taking. Management Science 39:17-31.

Kahneman, D. & Tversky, A. (1973) On the Psychology of Prediction. Psychological Review 80:237-251.

Koehler, J.J. (1993a) The Base Rate Fallacy Myth. PSYCOLOQUY 4(49) base-rate.1.koehler.

Koehler, J.J. (1993b) The Base Rate Fallacy Reconsidered: Normative, descriptive and methodological challenges. Unpublished manuscript.

Koehler, J.J. (in press) Error and Exaggeration in the Presentation of DNA Evidence at Trial. Jurimetrics.

Koonce, L.L. (1993) Base Rate Usage in Accounting. PSYCOLOQUY 4(51) base-rate.3.koonce.

Lindeman, S.T., Van Den Brink, W.P. & Hoogstraten, J. (1988) Effect of Feedback on Base-rate Utilization. Perceptual and Motor Skills 67:343-350.

Locksley, A., Borgida, E., Brekke, N. & Hepburn, C. (1980) Sex Stereotypes and Social Judgment. Journal of Personality and Social Psychology 39:821-831.

Macchi, L. (November, 1991) The Base-rate Fallacy and the Discourse Structure of Problems. Paper presented at the meeting of the Society for Judgment and Decision Making, San Francisco, CA.

Manis, M., Dovalina, I., Avis, N.E. & Cardoze, S. (1980) Base Rates Can Affect Individual Predictions. Journal of Personality and Social Psychology 38:231-248.

McCauley, C. (1994) Stereotypes as Base Rate Predictions. PSYCOLOQUY 5(5) base-rate.8.mccauley.

McKenzie, C.R.M. (1994) Base Rates Versus Prior Beliefs in Bayesian Inference. PSYCOLOQUY 5(1) base-rate.6.mckenzie.

McKenzie, C.R.M. (in press) The Accuracy of Intuitive Judgment Strategies: Covariation Assessment and Bayesian Inference. Cognitive Psychology.

Medin, D.L. & Edelson, S.M. (1988) Problem Structure and the Use of Base-rate Information from Experience. Journal of Experimental Psychology: General 117:68-85.

Rasinski, K.A., Crocker, J. & Hastie, R. (1985) Another Look at Sex Stereotypes and Social Judgments: An Analysis of the Social Perceiver's Use of Subjective Probabilities. Journal of Personality and Social Psychology 49:317-326.

Spellman, B.A. (1993) Implicit Learning of Base Rates. PSYCOLOQUY 4(61) base-rate.4.spellman.

Wells, G.L. & Harvey, J.H. (1978) Naive Attributors' Attributions and Predictions: What Is Informative and When Is an Effect an Effect? Journal of Personality and Social Psychology 36:483-490.
