Jonathan J. Koehler (1994) Fallacy Under Fire: Round 2. Psycoloquy: 5(21) Base Rate (13)



PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).


FALLACY UNDER FIRE: ROUND 2
Reply to Fletcher, Funder and Macchi on Base-rate

Jonathan J. Koehler
Department of Management Science &
Information Systems
University of Texas at Austin
Austin TX 78712-1175

Koehler@utxvm.cc.utexas.edu

Abstract

Commentators Fletcher, Funder and Macchi question the significance of laboratory studies that purportedly demonstrate failings in human probabilistic judgment. Fletcher and Funder challenge the usefulness of Bayes' theorem for assessing human judgment and Funder offers alternative standards. Macchi does not challenge the standard, but argues that the alleged base rate fallacy arises from ambiguous information transmission in the laboratory problems. These commentaries bolster conclusions drawn in the target article and suggest that we do not yet understand how well people reason in real-world base rate tasks.

Keywords

Base rate fallacy, Bayes' theorem, decision making, ecological validity, ethics, fallacy, judgment, probability.

I. INTRODUCTION

1. In many cases, a response to commentators includes claims of misinterpretation, counterattacks, or, at the very least, clarifications and analyses of major points of disagreement. But I do not seriously dispute the primary theses in the commentaries by Fletcher (1994), Funder (1994) and Macchi (1994). Indeed, their comments increase my confidence in many of the positions taken in the target article.

II. ON THE SILENCE OF LABS

2. In my response to an earlier set of commentators, I expressed surprise at the degree to which the target article's central arguments had gone unchallenged. Surely, it cannot be that everyone agrees with a paper entitled "The base rate fallacy myth." David Funder actually does agree, but he is less surprised than I was by "The silence of the labs" (his clever phrase, not mine). Funder charges that "error theorists" failed to acknowledge similar criticisms in the past, and that their research continues to be informed only by like-minded others.

3. Although the silence of the labs remains unbroken here, I call attention to a recent exception in the judgment literature: the Gigerenzer-Kahneman exchanges on the heuristics and biases program (Bower, 1994; Gigerenzer, 1992; Kahneman, 1992). I suspect that the provocative exchange between these two skilled orators at the 1992 Judgment and Decision Making Society meeting inspired many others to think about the value and limits of the heuristics framework.

III. THE INSIGNIFICANCE OF LABORATORY ERRORS

4. Fletcher, Funder and Macchi are appropriately skeptical about the significance of certain laboratory demonstrations of error in base rate and other probabilistic reasoning tasks.

III.1 Fletcher

5. Fletcher writes that Bayes' Theorem may be a flawed normative model. One example Fletcher gives is a case in which experimenter assumptions about the correspondence of base rates with prior beliefs have been violated (1994, par. 6). In my opinion, such situations are best thought of not as indictments of Bayes' theorem, but as instances in which the failure to consider the subject's representation of the problem can violate the experimenter's assumptions and produce an inappropriate accuracy criterion.

6. Fletcher also tries to disabuse us of the notion that deviations from normative models in experimental tasks reflect poor thinking. Such a conclusion requires knowledge not only of a subject's task representation, but also of the subject's goals. Strategies that appear to be suboptimal in the specified task (e.g., conservatism) may be quite sensible when broader goals are considered. As Fletcher indicates, conservative information processing may be a useful general strategy for maintaining belief stability in a world where conflicting data are abundant. Whether or not adherence to such meta-criteria is defensible (and I believe it is), there is evidence that these criteria do influence decisions. For example, Josephs, Larrick, Steele & Nisbett (1992) showed that people base certain risky decisions not only on the objective attributes of choice alternatives, but also on their perceptions of the damage to self-esteem that they will incur from bad outcomes.

III.2 Funder

7. In his thoughtful commentary, Funder objects to comparing human judgment against a Bayesian standard and proposes two alternatives. For applied contexts, he recommends a direct comparison approach in which real-world criteria are used instead of the numerical answers given by Bayes. A similar proposal was made in my target article (see section 5.0).

8. Funder also recommends a "construct validity" approach in which accuracy is measured by the extent to which judgments (a) agree with other judgments from other sources, and (b) predict behavior. This would seem to be a fine approach in some cases, although I am not convinced that high agreement and high predictive power should be sufficient to conclude that judgments are "accurate." If you visit me on Saturday afternoon and I give you a piece of cake, you may think there is a high probability that I am "extremely nice." Others who observe this exchange may also think that I am extremely nice. But do we wish to conclude that this judgment is probably accurate when (a) a single, moderately diagnostic act was observed, (b) the base rate for "extreme niceness" is low, and (c) the high observed correlation between your judgment and the judgment of others is due to a reliance on the same cue? As I see it, Funder's approach tells us more about consistency, which is often, but not always, predictive of accuracy.
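The cake example can be made concrete with a short Bayesian calculation. The sketch below is illustrative only: the base rate and the two conditional probabilities are hypothetical numbers chosen to represent a low base rate and a moderately diagnostic act, and the function name `posterior` is likewise an assumption, not something taken from the commentaries.

```python
def posterior(prior, p_cue_given_h, p_cue_given_not_h):
    """Bayes' theorem for a binary hypothesis H after observing one cue."""
    joint_h = prior * p_cue_given_h
    joint_not_h = (1.0 - prior) * p_cue_given_not_h
    return joint_h / (joint_h + joint_not_h)


# Hypothetical values, assumed purely for illustration.
base_rate_extremely_nice = 0.05   # low base rate for "extreme niceness"
p_cake_given_nice = 0.90          # extremely nice hosts usually offer cake
p_cake_given_not_nice = 0.30      # but many other hosts offer cake too

p_nice_given_cake = posterior(
    base_rate_extremely_nice, p_cake_given_nice, p_cake_given_not_nice
)
print(f"P(extremely nice | cake) = {p_nice_given_cake:.3f}")
```

Under these assumed numbers the posterior stays well below one half, even if every observer who saw the same act reached the same confident conclusion. Agreement among judges who rely on one shared cue leaves this figure unchanged.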

9. As for Funder's predictive power test -- "the usual gold standard" (1994, par. 7) -- it will be most useful in situations where judges are concerned primarily with judgmental accuracy. But in cases where error costs are asymmetric and loom large, prediction may be beside the point. A physician may choose to treat a patient who probably does not have a serious disease as if he believed the disease were present when failure to treat the disease could prove fatal. Here, the doctor's judgment would be inaccurate, but appropriate.
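The physician example amounts to an expected-loss comparison rather than a prediction. The following sketch assumes a simple two-action, two-state setup; the probability, the cost values, and the function names are hypothetical and serve only to show how a judgment can be "inaccurate, but appropriate."

```python
def treatment_threshold(cost_missed_disease, cost_unneeded_treatment):
    """Disease probability above which treating has lower expected loss."""
    return cost_unneeded_treatment / (cost_unneeded_treatment + cost_missed_disease)


def should_treat(p_disease, cost_missed_disease, cost_unneeded_treatment):
    """Treat when the expected loss of treating is below that of not treating."""
    expected_loss_treat = (1.0 - p_disease) * cost_unneeded_treatment
    expected_loss_no_treat = p_disease * cost_missed_disease
    return expected_loss_treat < expected_loss_no_treat


# Hypothetical values: the disease is unlikely, but missing it is very costly.
p_disease = 0.10
treat = should_treat(p_disease, cost_missed_disease=100.0, cost_unneeded_treatment=1.0)
threshold = treatment_threshold(100.0, 1.0)
print(f"treat = {treat}, threshold = {threshold:.4f}")
```

With symmetric costs the same 10% probability would not justify treatment; it is the assumed asymmetry in error costs, not a belief that the disease is likely, that drives the decision.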

III.3 Macchi

10. Macchi's skepticism about the base rate fallacy arises from a belief that the available probabilistic information is not communicated effectively to subjects. Macchi argues that because subjects often confuse Bayesian likelihoods with posteriors in abstract base rate tasks, greater care should be taken to clarify the meaning of the relevant probabilistic information. When this is done, Macchi finds that the relative base rate neglect observed in certain classic problems "decreased or disappeared."

11. If Macchi is right, evidence that purportedly supports a base rate fallacy is more appropriately described as evidence of linguistic confusion. What Macchi is reluctant to discuss, however, is the practical importance of this confusion. Are real-world problems structured in ways that more closely resemble, say, Kahneman & Tversky's cab problem or the reworded cab problems in Macchi's studies?
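The distinction between a likelihood and a posterior can be illustrated with the cab problem. The sketch below uses the figures commonly quoted for that problem (15% of cabs are Blue, the witness is correct 80% of the time, and the witness reports "Blue"); these figures are supplied here as an assumption and do not appear in this reply or in the commentaries as cited.

```python
# Commonly quoted cab problem figures, assumed here for illustration.
p_blue = 0.15                   # base rate of Blue cabs
p_green = 1.0 - p_blue          # base rate of Green cabs
p_says_blue_given_blue = 0.80   # likelihood: witness is right about a Blue cab
p_says_blue_given_green = 0.20  # witness mistakes a Green cab for Blue

# Posterior: probability the cab was Blue given the witness said "Blue".
joint_blue = p_blue * p_says_blue_given_blue
joint_green = p_green * p_says_blue_given_green
p_blue_given_says_blue = joint_blue / (joint_blue + joint_green)

print(f"likelihood P(says Blue | Blue) = {p_says_blue_given_blue:.2f}")
print(f"posterior  P(Blue | says Blue) = {p_blue_given_says_blue:.2f}")
```

A subject who reads the 80% figure as the probability that the cab was Blue has reported the likelihood in place of the posterior, which is the kind of confusion Macchi attributes to the wording of the problem.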

12. Macchi admits that the answer is unclear, but she suggests that "the way the information is experienced [in the real world] will usually disambiguate its meaning" (1994, par. 7). This is an empirical claim that bears some similarity to my own prediction that real-world cue redundancy may reduce the importance of base rate neglect if and when it occurs (target article, par. 5.3). Macchi is less interested in performing the necessary studies than in pointing out that alternative phrasings of the probabilistic information in base rate problems can increase reliance on base rate information. I would prefer, however, to assess the importance of Macchi's findings by identifying the real-world tasks for which performance may be improved by clarifying the available information along the lines that she recommends.

IV. CONCLUSION

13. Based on the commentaries received thus far, it would appear that the idea that people cannot reason probabilistically is losing support. Fletcher tells us that the problem lies with the inference that false beliefs reflect faulty thinking. Funder tells us that the problem lies with our focus on the results of artificial tasks that show what people cannot do. Macchi tells us that the problem lies with the way critical information is conveyed to subjects. Though a consensus has not yet emerged, it seems clear that the base rate fallacy paradigm that produced all those wonderful classroom demonstrations cannot provide us with the answers to the questions that are now being asked.

REFERENCES

Bower, B. (1994) Roots of Reasons: Our Daily Deliberations Provoke Scientific Debate. Science News, 145, 72-75.

Fletcher, G.J.O. (1994) Assessing Error in Social Judgment: Commentary on Koehler on Base-rate. PSYCOLOQUY (5)10 base-rate.10.fletcher.

Funder, D.C. (1994) Judgmental Process and Content: Commentary on Koehler on Base-rate. PSYCOLOQUY (5)17 base-rate.12.funder.

Gigerenzer, G. (Nov, 1992) Where Do We Go From Here? After Heuristics and Biases. Paper presented at the Judgment/Decision Making Society meeting, St. Louis.

Josephs, R.A., Larrick, R.P., Steele, C.M. & Nisbett, R.E. (1992) Protecting the Self from the Negative Consequences of Risky Decisions. Journal of Personality and Social Psychology, 62, 26-37.

Kahneman, D. (Nov, 1992) Commentary on Gigerenzer's paper. Judgment/Decision Making Society meeting, St. Louis.

Koehler, J.J. (1993) The Base Rate Fallacy Myth. PSYCOLOQUY (4)49 base-rate.1.koehler.

Macchi, L. (1994) On the Communication and Comprehension of Probabilistic Information: Commentary on Koehler on Base-rate. PSYCOLOQUY (5)11 base-rate.11.macchi.
