Robert A. M. Gregson (1993) Which Bayesian Theorem Could be Compared With Real Behaviour?. Psycoloquy: 4(50) Base Rate (2)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).
Psycoloquy 4(50): Which Bayesian Theorem Could be Compared With Real Behaviour?

Commentary on Koehler on Base-Rate

Robert A. M. Gregson
Department of Psychology,
Australian National University,
Canberra, A C T 0200 Australia


ABSTRACT: The identifiability of strategies used by subjects in assessing inverse probabilities is low, and it is not adequately supported by using only the simplest form of Bayes' Theorem. If subjects are using more complex strategies, coherently or incoherently, then we cannot readily deduce how they use base rate information, or whether they substitute other information for it.


KEYWORDS: base rate fallacy, Bayes' theorem, decision making, ecological validity, ethics, fallacy, judgment, probability.


1. Koehler's (1993) target article on the base rate fallacy seems to be a storm in a teacup. I agree that psychologists have written some strange things about inverse probability, and Koehler's review is exhaustive and definitive. I get the impression that three distinct questions have become confounded, however, and ought to be disentangled.

2. In the legal context, there are questions, at least in Australian courts, as to whether arguments about the worth of circumstantial evidence are valid uses of inverse probability. In a celebrated case in which a convicted murderer was acquitted on appeal, a statistician called as an expert witness suggested that the prosecution had argued invalidly in a situation where the four probabilities p(H), p(E|H), p(not-H) and p(E|not-H) all have to be considered. The case involved forensic evidence based on physical measures; there was no issue of subjective probabilities. It is still germane to point out that legal methods of inference and statistical methods of inference are not necessarily compatible.

3. Setting the law aside, the data summarised by Koehler indicate that if we compare human judgments, which are elicited to obtain inverse probabilities, with Bayesian inference using the simplest form of Bayes' Theorem as he gives it, then the judgments are suboptimal in the long run. The question is, should the Theorem have been written like that, and if we expanded it into some alternative forms, would it then make better sense to use it as a normative baseline against which to assess what quantitative information, if any, humans can efficiently use?

4. There is another point, however; statistical theory does not stand still, and modern developments extend Bayesian analyses and need consideration. It is now possible (Walley, 1991) to use rigorous and extensive theory to cover the case where the observer has no precise estimates of either prior or conditional probabilities, but can at most trap them within upper and lower bounds. For an experimental psychologist this statistical approach is something like the psychophysical method of limits, where a hypothetical point of subjective equality is approached in turn from below and above. Admittedly these new developments postdate the studies Koehler cites, which peaked in the 1970s, but in looking at the problem now we should get our statistical theory updated.
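The bounded-probability idea can be illustrated with a minimal sketch. For two exclusive hypotheses with fixed likelihoods, the posterior is monotone in the prior, so an observer who can only trap the base rate between a lower and an upper bound can still trap the posterior between the values obtained at the two endpoints. (This is only a toy instance of the Walley-style approach; the function names and the numbers are my own illustration, not taken from Walley, 1991.)

```python
def posterior(prior_h, lik_e_h, lik_e_not_h):
    # Simplest two-hypothesis Bayes: p(H|E).
    num = lik_e_h * prior_h
    return num / (num + lik_e_not_h * (1.0 - prior_h))

def posterior_bounds(prior_lo, prior_hi, lik_e_h, lik_e_not_h):
    # p(H|E) is monotone increasing in the prior for fixed
    # likelihoods, so the endpoints of the prior interval yield
    # the endpoints of the posterior interval.
    return (posterior(prior_lo, lik_e_h, lik_e_not_h),
            posterior(prior_hi, lik_e_h, lik_e_not_h))

# An observer who can only say the base rate lies between 0.1 and 0.3:
lo, hi = posterior_bounds(0.1, 0.3, lik_e_h=0.8, lik_e_not_h=0.2)
```

The observer ends up committed to a posterior interval rather than a point estimate, which is closer to the method-of-limits analogy drawn above than the usual point-valued Bayes.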


5. I get two points from Koehler: that subjects can and do bring additional information to bear which was not in the experimental protocol, and that they weight probabilities in different ways, which can render the probability calculus, which is implicit in their behaviour, technically incoherent. Their personal probabilities would not add up to one.

6. Let me distinguish two extended forms of Bayes' Theorem. Using E1&E2 to mean the conjunction of two events from the set {E1,E2,...,Ek} of k mutually distinguishable alternatives, using c1, c2, ... to represent scalar multipliers, and writing · for ordinary multiplication, the numerator of the Bayes expression for two mutually exclusive and exhaustive hypotheses, H1 and H2, is originally:

    (a)     p(E|H1) · p(H1)

now becomes either:

    (b)     p(E1&E2...|H1) · p(H1)

if subjects add in an indeterminate amount of extra data which the experimenter did not introduce, or:

    (c)     p(E1|H1) · c1 · p(H1)

if subjects skew the estimate of the base rate. Obviously, if subjects do both these things -- and float from trial to trial in how they do it -- then fitting their data to the simplest sort of Bayesian expression, where terms are defined exclusively as those given by the experimenter, will suggest that the base rates are not being used, or are not being used appropriately. However, if c2·p(H2) = 1 - c1·p(H1), then the subject is still using a sort of Bayesian behaviour within each trial. (Note that ci operates on p(Hi).)
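Forms (a) and (c) can be made concrete with a small numerical sketch. The coherence condition c2·p(H2) = 1 - c1·p(H1) means the subject's effective priors still sum to one, so the skewed form is an internally consistent Bayes rule with reweighted base rates. (Function names and the illustrative numbers are mine, not from the target article.)

```python
def posterior_simple(p_e_h1, p_h1, p_e_h2):
    # Form (a): textbook Bayes for two exclusive, exhaustive hypotheses.
    p_h2 = 1.0 - p_h1
    num = p_e_h1 * p_h1
    return num / (num + p_e_h2 * p_h2)

def posterior_skewed(p_e_h1, p_h1, p_e_h2, c1):
    # Form (c): the subject multiplies the base rate by c1; the
    # coherence condition c2*p(H2) = 1 - c1*p(H1) fixes the weight
    # on H2, so the effective priors still sum to one.
    w1 = c1 * p_h1
    w2 = 1.0 - w1
    num = p_e_h1 * w1
    return num / (num + p_e_h2 * w2)

# With c1 = 1 the two forms coincide; c1 > 1 overweights the base rate,
# c1 < 1 underweights it, yet each trial remains coherently Bayesian.
```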

7. In addition, rewrite (b) as:

    (d)     p(E1|H1 & E1|[E2&H1]) · p(H1|E2)

and we heighten the role that the extraneous data can play. What is being formalised here is that the hypothesis H1 is only tenable to the subject if the additional information E2 is incorporated in the scenario. If the alternative hypothesis H2 is only tenable if E3, say, also comes into the picture, then we can be on the way to lexicographic judgments (Gregson, 1963). Given these alternatives, the question is, can a subject who is using (b) quite consistently (but unknown to the experimenter) be misidentified as one who gets base rates wrong? Alternatively, is it possible to leave base rates out altogether and make inferences based solely on expressions such as:

    (e)     p(E1|H1 & E1|[E2&H1] & E1|[E3&H1] ...)

instead of (a)? I think one still needs a non-null p(H1) in the expression.


8. If we want an ecologically valid use of theories of inverse probability and probabilistic inference, in order to derive statements about the degree to which human subjects fall short of a normative optimum, then we must define the normative strategy in much more detail than the simplest idealised Bayes form for two hypotheses provides. Without this, we cannot identify precisely what subjects are doing. What I do not know is whether it is possible to design experiments which, from the position of an outside observer, can separate (b), (c), and (d).
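The identifiability worry can be shown in a few lines: a subject who applies form (c) with a fixed c1 on every trial, when analysed under form (a), appears to have used the "wrong" base rate on every trial -- the implied prior recovered by inverting (a) is c1·p(H1), not p(H1). (This is my own simulation sketch; the likelihoods and c1 value are arbitrary.)

```python
def respond_skewed(p_e_h1, p_h1, p_e_h2, c1):
    # Subject's response under form (c): base rate scaled by c1,
    # coherently renormalised so the effective priors sum to one.
    w1 = c1 * p_h1
    return p_e_h1 * w1 / (p_e_h1 * w1 + p_e_h2 * (1.0 - w1))

def implied_prior(resp, p_e_h1, p_e_h2):
    # Invert form (a): the base rate a simple-Bayes observer would
    # have needed in order to produce this response.
    odds = resp / (1.0 - resp)
    prior_odds = odds / (p_e_h1 / p_e_h2)
    return prior_odds / (1.0 + prior_odds)

# Across stated base rates, the implied prior is always c1 * p(H1):
# the experimenter sees lawful "misuse" of the base rate, although the
# subject is perfectly consistent within each trial.
for p_h1 in (0.1, 0.2, 0.3):
    resp = respond_skewed(0.8, p_h1, 0.2, c1=1.5)
    assert abs(implied_prior(resp, 0.8, 0.2) - 1.5 * p_h1) < 1e-9
```

From responses alone, then, strategy (c) with a stable c1 is indistinguishable from simple Bayes with a mistaken base rate, which is exactly the separation problem raised above.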


REFERENCES

Gregson, R. A. M. (1963) Some possible forms of lexicographic evaluation. Psychometrika, 28, 173-183.

Koehler, Jonathan J. (1993) The Base Rate Fallacy Myth. PSYCOLOQUY 4(49) base-rate.1

Walley, P. (1991) Statistical Reasoning with Imprecise Probabilities. London: Chapman and Hall.
