The silence of those who might defend the conventional approach to error is not unusual. Judgments can be evaluated against relevant data in terms of their content, as well as in terms of the normativeness of the process by which they were made.
1. In his Response to the first seven commentators, Koehler (1994) points out that they all basically agree that the implications of the putative base-rate fallacy have been drastically oversold; he then pointedly asks, "Where are those who will defend the oft-repeated conclusions about the base rate fallacy?" He is not the first to encounter this problem. The normative models and overwhelmingly pessimistic -- even contemptuous -- evaluations of human judgment that continue to emanate from the literature on judgmental error have undergone increasing criticism in recent years. The usual response from the laboratories where error is studied, and from the error researchers who inhabit them, has been silence.
2. This was certainly true in my own case: I published a critique of the literature on errors in personality judgment in Psychological Bulletin (Funder, 1987), then awaited, with both trepidation and eagerness, the resulting rebuttals. None ever came. And it's not just me. A significant recent book on judgmental error does not cite, much less respond to, that paper, nor any other paper critical of the error literature (see Funder, 1992). As a result, the research field is not having the kind of intellectual exchange it ought to be having. The critics cite the error literature profusely (of course), and assault it on many fronts. The error theorists ignore all this, and address their work to, and apparently are informed in their work only by, each other.
3. The practice of ignoring one's critics might be effective academic politics, but it seems to be a less than ideal style of scientific interchange. Unless and until the practitioners and originators of error research begin to address their critics, the literature as a whole will suffer the kind of intellectual deficit that Koehler notes in the other side's failure to be adequately represented in the commentaries on his target article.
4. A theme that seems to run through this exchange so far is that the way to improve research on the accuracy of human judgment is to come up with new and improved normative models. Thus, we see detailed discussions here about whether a particular approach is or is not normative, how to tighten definitions (e.g., of "base rate"), or how to try alternative computations to allow more precise testing of models' normativeness. An implicit assumption seems to be that once the proper normative model has at last been identified, judgments can be evaluated as erroneous to the degree that they depart from its prescriptions. A slightly deeper implicit assumption seems to be that the way to improve judgment will be to train judges to imitate normative models.
5. There is another way to address accuracy issues empirically, an approach that has not so far been considered in any detail in this exchange. The approach is to appraise accuracy not through comparisons between the process by which a judgment was made and a normative model, but by assessing the judgment's outcome and validity under realistic circumstances. This can be done in two ways, depending on the researcher's interest.
6. If the researcher is interested in applied decision making, then the approach to take is pragmatic. Examine judgments made by different judges, or on the basis of differing information, or according to differing strategies. See which judgments are the most and least likely to be accurate in predicting the criterion of interest, whether it be job performance (e.g., Nilsen, 1991) or the weather (e.g., Lusk & Hammond, 1991). For some time, Hammond (Hammond, Hamm, Grassia & Pearson, 1987) has called this kind of research "direct comparison." A general conclusion from this work is that lessening the use of the judgmental heuristics commonly called "errors" (such as the halo effect) does not improve accuracy according to direct comparison with predictive outcome (e.g., Bernardin & Pence, 1980).
7. A second approach is more theoretical. My own research interest is in personality psychology; I investigate the judgments people make of their own and each other's personality traits. I am not so interested in any particular prediction for any applied purpose as I am in the overall "construct validity" of the inferences that are being made about the traits that characterize the targets of judgment. This leads me to follow a research strategy that evaluates the accuracy of a human judgment of a trait in exactly the same way one would evaluate the accuracy of a new scale alleged to measure that same trait. Routinely, one would ask, does this scale correlate with the other measures with which it should correlate, and not correlate with the other measures with which it should not? And, the usual gold standard: does it predict behavior? Similarly, in my own research (and, increasingly, in the research of others in this area) I assess whether a given acquaintance's judgment of your personality accords with judgments by other acquaintances, with your own self-judgment, and with measures of your behavior that range from videotaped observation in seven laboratory situations to diary and beeper reports of daily activities.
8. From this work, I have come up with a four-fold classification scheme for moderators of accuracy in personality judgment. A wide range of studies has shown that accuracy can be a function of properties of the (1) judge, (2) target, (3) trait that is judged, and (4) information on which the judgment is based (see Funder, 1993, for a review). These four moderators yield six unique interactions. I have recently formulated a process model that tries to explain these moderators and their interactions (Funder, 1994).
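The count of "six unique interactions" is simply the number of unordered pairs that can be formed from the four moderators, i.e., C(4, 2) = 6. A quick check (the variable names are mine, for illustration only):

```python
from itertools import combinations

# The four moderators of accuracy named above.
moderators = ["judge", "target", "trait", "information"]

# Two-way interactions: every unordered pair of distinct moderators.
pairs = list(combinations(moderators, 2))
print(len(pairs))  # → 6, i.e., C(4, 2) = 4! / (2! * 2!)
```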
9. In conclusion, it is possible to investigate judgmental accuracy without getting too bogged down in the fine details of normative models that are difficult to map onto real life, and without defining as erroneous any judgments that diverge from these always-questionable normative models. The way to do this is to examine what people can do rather than to search for what they cannot do, and to see when they can best do it. There are ways to use data to see whether or not a judgment is correct. If this were not true, none of us could do research at all. Evaluating judgments by examining data relevant to their content, rather than by comparing how they were made with a putatively normative model, can shed useful light on their accuracy whether the researcher's ultimate concern is pragmatic -- e.g., how can I better predict next quarter's sales? -- or more theoretical -- e.g., when is my view of an individual's personality most and least likely to be correct?
Bernardin, H.J. & Pence, E.C. (1980) Effects of Rater Training: Creating New Response Sets and Decreasing Accuracy. Journal of Applied Psychology, 65, 60-66.
Funder, D.C. (1987) Errors and Mistakes: Evaluating the Accuracy of Social Judgment. Psychological Bulletin, 101, 75-90.
Funder, D.C. (1992) Everything You Know Is Wrong [Review of "The Person and the Situation: Perspectives of Social Psychology"]. Contemporary Psychology, 37, 319-320.
Funder, D.C. (1993) Judgments as Data for Personality and Developmental Psychology: Error vs. Accuracy. In D. Funder, R. Parke, C. Tomlinson-Keasey & K. Widaman (Eds.), Studying Lives Through Time: Approaches to Personality and Development (pp. 121-146). Washington, DC: American Psychological Association.
Funder, D.C. (1994) Accuracy Theory: A General Framework for Research on Personality Judgment. Unpublished ms., University of California, Riverside.
Hammond, K.R., Hamm, R.M., Grassia, J. & Pearson, T. (1987) Direct Comparison of the Efficiency of Intuitive and Analytical Cognition in Expert Judgment. IEEE Transactions on Systems, Man and Cybernetics, 17, 753-770.
Koehler, J.J. (1993) The Base Rate Fallacy Myth. PSYCOLOQUY 4(49) base-rate.1.koehler.
Koehler, J.J. (1994) Base Rates and the "Illusion Illusion." PSYCOLOQUY 5(9) base-rate.9.koehler.
Lusk, C. M. & Hammond, K.R. (1991) Judgment in a Dynamic Task: Microburst Forecasting. Journal of Behavioral Decision Making, 4, 55-73.
Nilsen, D. (1991, August) Understanding Self vs. Observer Discrepancies in Multi-rater Assessment Systems. Paper presented at the Annual Meetings of the American Psychological Association, San Francisco.