Popple (1996) argues that the Turing test should be seen as a replicable scientific experiment. I suggest that while the research method this implies may be appropriate, the Turing test itself should not be seen as a scientific test.
2. I am grateful to Popple for a chance to clarify the legal analogy (Watt, 1996, par. 6; see also Dennett, 1985, p. 122). She suggests that the Turing test might be better read "as a scientific experiment requiring replication" (Popple, 1996, par. 5). I accept her criticism of the "one-shot" test. A single failure should not necessarily lead to rejecting the general hypothesis that the system is intelligent, although by definition, failing the Turing test means that in one specific case that hypothesis has indeed been rejected. Unfortunately, that is all a failure means; no further scientific conclusions about the system can be drawn (Dennett, 1985). That is, the hypothesis "X can think" belongs to a theory of mind, not to a theory within a science such as psychology, and the two are different kinds of theory (Clark, 1987).
3. The same goes for replication: scientific replication is usually intended to confirm or reject scientific hypotheses, and for this the Turing test is inappropriate. Interpreting replication in a weaker sense, and using many test trials (rather than just one) to gather more inductive evidence, is probably a much better way to run the test, as Popple suggests. I therefore accept the scientific experiment analogy, but not the scientific interpretation that usually accompanies it.
4. I did not want to claim that the inverted Turing test is more exclusive than the Turing test. If there is any difference at all, I'd expect it to be slightly weaker. I proposed the inverted Turing test to suggest that, over many runs of the test, different judges and sets of questions might systematically change the outcome, so by studying and modelling the judge we can indirectly study how people actually ascribe intelligence. This, then, is targeted at a key weakness of the original Turing test, that "as a practical matter the test is of little value in guiding research" (Moor, 1976, p. 256).
5. Popple's last point (1996, par. 6) is about the "alien intelligence hypothesis" at the heart of the paper. Separating fact from science fiction here is difficult. I am suggesting that what we know about the origin and structure of a system affects our ascription of intelligence. Even if a system's intelligence is judged without evaluating its naive psychology, our own naive psychology is still used in making that judgement. Human naive psychology, then, is still implicated in our recognition of alien intelligence. We cannot be objective about intelligence.
6. So while I am happy to accept that the scientific experiment analogy might be more meaningful than the legal one, especially regarding replication, I am not convinced that the Turing test is, or could be, in any sense a scientific test "for" intelligence.
Clark, A. (1987) From Folk Psychology to Naive Psychology. Cognitive Science 11:139-154.
Dennett, D.C. (1985) Can Machines Think? In: How We Know, ed. M. Shafto, Harper and Row.
Moor, J.H. (1976) An Analysis of the Turing Test. Philosophical Studies 30:249-257.
Popple, A.V. (1996) The Turing Test as a Scientific Experiment. PSYCOLOQUY 7(15) turing-test.2.popple.
Watt, S.N.K. (1996) Naive Psychology and the Inverted Turing Test. PSYCOLOQUY 7(14) turing-test.1.watt.