Stuart Watt (1996) Naive Psychology and the Inverted Turing Test. Psycoloquy: 7(14) Turing Test (1)


NAIVE PSYCHOLOGY AND THE INVERTED TURING TEST
Target Article by Watt on the Turing Test

Stuart Watt
Department of Psychology
The Open University
Walton Hall, Milton Keynes, UK. MK7 6AA

s.n.k.watt@open.ac.uk

Abstract

This target article argues that the Turing test implicitly rests on a "naive psychology," a naturally evolved psychological faculty which is used to predict and understand the behaviour of others in complex societies. This natural faculty is an important and implicit bias in the observer's tendency to ascribe mentality to the system in the test. The paper analyses the effects of this naive psychology on the Turing test, both from the side of the system and the side of the observer, and then proposes and justifies an inverted version of the test which allows the processes of ascription to be analysed more directly than in the standard version.

Keywords

False belief tests, folk psychology, naive psychology, the "other minds" problem, theory of mind, the Turing test.

I. INTRODUCTION

1. In 1950, Alan Turing considered the question "Can machines think?" but almost immediately threw it away as "too meaningless to deserve discussion" and proposed to replace it with a more empirical test -- the test that has since become known as the "Turing test."

2. Turing derived his test from a party game called the "imitation game," in which a human observer tries to guess the sex of two players, one a man and the other a woman, while screened from any cues of voice or appearance. One of the players may help the observer by being truthful, while the other may try to deceive the observer by pretending to be of the other sex. Turing suggested putting a machine in the place of one of the humans and essentially playing the same game. If the observer can't tell which is the machine and which the human, this can be taken as strong evidence that the machine can think.

3. Ever since Turing's original paper, the test and its criticisms and countercriticisms have surfaced from time to time (e.g., Block, 1981; Harnad, 1991; Hauser, 1993; Moor, 1976; Searle, 1980; Weizenbaum, 1976), but for all the criticisms, the Turing test hasn't gone away. This is because, as Dennett (1985) says, "there are real world problems that are revealed by considering the strengths and weaknesses of the Turing test."

4. This paper will look at some of these real world issues, relate the test to the human psychological faculties that underpin it, and show that these strengths and weaknesses can provide some deep insights into the nature of intelligence. The next section reviews some of the background issues raised by the Turing test and, with sections III and IV, shows how deeply it is influenced by human psychology. Sections V and VI use this to bring out the main theme of the paper -- to propose and justify a modified test which overcomes some of the problems with the original. Section VII draws some conclusions about the utility of these results.

II. WHAT, IF ANYTHING, IS WRONG WITH THE TURING TEST?

5. One of the main reasons for the occasional resurfacing of the Turing test is that nobody really agrees on what it actually means; there are, therefore, a number of illuminating criticisms of the test, which hint at fundamental aspects of how we assess mental states in people and non-human systems. But in general I'll assume a "no tricks" interpretation, with no restriction on the domain, a reasonable duration, and an observer with the time and skills needed to discriminate accurately. This is in line with Collins's (1990) protocol for the "Ultimate Turing test," and with Harnad's (1992) and Dennett's (1985) interpretations. By this protocol, no system has yet passed the Turing test.

6. Perhaps the most familiar criticism of the Turing test is that it is "unashamedly behaviouristic and operationalistic" (Searle, 1980), but an operational interpretation, by defining intelligence, fails to tell us anything about what we originally wanted the test to assess. It is much better to take it as a "source of good inductive evidence" (Moor, 1976). Dennett suggests an analogy with legal practice: "any computer that can regularly or often fool a discerning judge in this game would be intelligent -- would be a computer that thinks -- beyond any reasonable doubt" (Dennett, 1985). Because the Turing test constitutes a legal kind of proof rather than a logical one, it is simple and practical, and yet it is fallible.

7. So perhaps the test is too easy. Block (1981) and Weizenbaum (1976) both argue that the test can be passed by "fooling" the judges sufficiently well. Clearly a test which relies on faking the evidence would be dangerously misleading, but this is against the spirit of Turing's proposal, if not the letter (Harnad, 1992). But this criticism does highlight the ease with which judges can sometimes be fooled, and this is an important potential flaw in the test (Caporael, 1986). Something can make the observer ascribe intelligence even to mindless machines -- something which can bias the test to false positives. But, curiously enough, the test can also be too hard. False negatives can happen too: people, even without pretending to be machines, have been known to fail the test (Hofstadter, 1985) -- and this cannot be put down to limitations in the technology.

8. A third point of disagreement is about the length of the test. Some interpret the test as Turing originally -- and almost casually -- suggested, lasting about five minutes. Dennett (1985) defends this "quick-probe assumption," saying, "nothing could pass the Turing test by winning the imitation game without being able to perform indefinitely many clearly intelligent actions." He argues that because the cost of passing even a relatively quick test by cheating is prohibitive, the test rests on "a very unrisky gamble." Others suggest a much longer duration, in Harnad's (1992) case up to a lifetime, but in all cases the duration is intended to be comparable to normal human interaction.

9. This is a big hint that the Turing test is closely connected to the philosophical "other minds" problem. Harnad (1991), for one, points out the "very close connection between the philosopher's old 'other minds' problem and the modern computer scientist's problem of determining whether artificial devices have minds." So, despite Searle's (1980) rather casual attempts to brush other minds under the carpet, there really is an affinity between the problem of deciding whether another human has a mind and that of deciding whether a machine has a mind. Put simply, a system passes the Turing test if the observer believes that it has a mind.

10. This is where naive psychology comes in. "Naive psychology" is the term given by Clark (1987), following Hayes (1979), to the natural human tendency and ability to ascribe mental states to others and to themselves -- in short, to recognise and understand other minds; the psychological solution to the philosophical problem. Naive psychology belongs in a category of psychological concepts rooted in an evolved "natural psychology" (Humphrey, 1976), a real psychological faculty that offers an evolutionary advantage to animals living in complex societies. It is closely related to "folk psychology," but Clark uses the term both to relate it to Hayes's (1979) "naive physics" and to distance it from being taken, as folk psychology sometimes is, to be a false protoscientific theory (Churchland, 1981; Stich, 1983).

11. Our interest in naive psychology stems from this. If there is a real natural faculty involved in understanding other minds, then this natural faculty is strongly connected to the Turing test. There are two sides to this connection: first, the Turing test needs to be strong enough to assay naive psychology in a system; and second, biases in the test due to the observer's naive psychology need to be carefully monitored and controlled (Caporael, 1986; Collins, 1990).

III. IS NAIVE PSYCHOLOGY REQUIRED TO PASS THE TURING TEST?

12. It is certainly logically possible for a system to pass the Turing test without having naive psychology. This is a variation on a standard objection to the Turing test: that it is logically possible for a mindless machine to pass it by generating the appearance of intelligence sufficiently well. Block's (1981) argument, for one, denies the necessity of naive psychology just as it denies the necessity of any other functional part of a real intelligence.

13. Block's argument is based on a hypothetical system using a finite but very large table containing all the reasonable next sentences for all possible conversations up to that point. Even for short tests, this implementation suffers a combinatorial explosion that makes it physically impossible, although the argument remains logically correct. But since the table is vastly bigger than the human brain could possibly store, we can take it that there must be better ways to pass the test.
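
To make the combinatorial explosion concrete, a back-of-the-envelope calculation can be sketched in a few lines of Python. The figures below (vocabulary size, sentence length, number of exchanges) are illustrative assumptions rather than Block's own; any remotely realistic values give the same qualitative result.

    # Rough estimate of the size of Block's conversation table. All of
    # the constants are illustrative assumptions, not Block's figures.
    import math

    VOCABULARY = 10_000      # distinct words a speaker might use
    SENTENCE_LENGTH = 20     # words per contribution (a generous bound)
    EXCHANGES = 10           # contributions in a short, five-minute test

    # Each contribution can be any of VOCABULARY ** SENTENCE_LENGTH word
    # strings, and a conversation chains EXCHANGES of them in sequence.
    log10_per_turn = SENTENCE_LENGTH * math.log10(VOCABULARY)
    log10_conversations = EXCHANGES * log10_per_turn

    print(f"~10^{log10_per_turn:.0f} possible contributions per turn")
    print(f"~10^{log10_conversations:.0f} possible short conversations")
    print("for comparison: ~10^80 atoms in the observable universe")

Even with far smaller constants, the table dwarfs anything a brain or any physical machine could store, which is the force of the concession that Block's system is only logically, not physically, realisable.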

14. A less formal argument is that simply understanding a sonnet like "Shall I compare thee to a summer's day?" calls for naive psychology, let alone writing one. As Haugeland (1985) comments on Turing's (1950) imaginary dialogue: "the student has displayed not only competence with the English language, but also a passable understanding of poetry, the seasons, people's feelings, and so on." I would argue, then, that the Turing test, even in its original form, does touch on naive psychology.

15. So while we can't state unequivocally that naive psychology is logically necessary for passing the test, we can state that the appearance of naive psychology is necessary: without it the system would show an inability to perceive, recognise, and respond to human mental states that would make it trivially distinguishable from any real human.

16. Leaving the observer's part of the test aside for the moment, we are left with two possibilities. First, it may be possible for a system to have an "alien intelligence" which doesn't necessarily include any naive psychology, or at least any naive psychology which is human or human equivalent. A system with alien intelligence could have no competence with human mental states. The second possibility is that intelligence implicitly means human intelligence, and that a human equivalent naive psychology is required by a system if it is to be called intelligent.

17. There are actually two questions hidden in the possibility of alien intelligence. First, there is the question of the logical or empirical possibility of the existence of an alien intelligence, and second, there is the question of whether it could be recognised by us humans even if it did exist. Alien intelligence certainly seems to be a logical possibility: "may not machines carry out something which ought to be described as thinking but which is very different from what a man does?" (Turing, 1950). But alien intelligence isn't identifiable unless it can be recognised as intelligence, and it may be that the only kind of intelligence that is recognisable by humans is human intelligence. If we admit alien intelligence then computers could already be fully conscious, intelligent, and thinking beings, but we might be unable to recognise them as such.

18. A second problem is that every intelligence is to an extent an alien intelligence, with the (possible) exception of our own: a solipsist could claim that it is impossible for one person to decide whether or not a second is intelligent, being unable to really know what it is like to be that person. In practice what humans tend to do is guess that because others appear physically and behaviourally to be human, they are probably intelligent along the same lines.

19. French (1990) puts forward a convincing argument that the Turing test itself is a test "not of intelligence, but of culturally-oriented human intelligence." His argument is based on an "essential inseparability of the subcognitive and cognitive levels" -- because there isn't a clear frame around the competences that are evaluated by the test, any aspect of human society, psychology, physiology, or behaviour can be implicitly touched on by the test. If this is true, then the Turing test at least would not be sufficient to recognise alien intelligence as intelligence.

20. It is, then, entirely possible that the only kind of intelligence that counts is the kind of intelligence that can be recognised as such by us humans. Perhaps there is no such thing as "intelligence in general" (French, 1990) because we only accept intelligence if we can recognise it as such, and if we humans can recognise it, it has in a sense become part of human intelligence. As Moor (1976) puts it: "I believe that another human being thinks because his ability to think is part of a theory I have to explain his actions." The ascription of intelligence depends on the observer as well as the behaviour of the system. Intelligence, like the "beauty" Turing tried to screen out of his test, may truly be in the eye -- or the mind -- of the beholder. And if it is bound into the psychology of the observer, we need to look at this in any complete understanding of the Turing test.

IV. SOME EFFECTS OF THE OBSERVER'S NAIVE PSYCHOLOGY IN THE TURING TEST

21. When naive psychology is taken into account, the observer's role in the Turing test is not quite as passive as it might have seemed. The psychological baggage of the observer plays an important role in the Turing test, implicitly if not explicitly. Often this shows up as a disposition to see systems as intelligent even when they aren't (Caporael, 1986; Collins, 1990), a phenomenon that was particularly striking in the evaluations of ELIZA (Weizenbaum, 1976) and PARRY (Colby, 1981). Another surprising example is that of Garfinkel (1967), whose advice system actually behaved completely at random, but whose users were nevertheless quick to see significance in the answers; even when an answer contradicted a previous one "the underlying pattern was elaborated and compounded over the series of exchanges and was accommodated to each present 'answer' so as to maintain the 'course of advice'" (Garfinkel, 1967).

22. There are many factors involved in deciding whether something else has a mind or not, but perhaps the two most influential are physical similarity and familiarity (Eddy et al., 1993). Among other things, people quickly probe for a system's similarity to themselves and use this as a predictor. This is the Cartesian argument from analogy, but as a mechanism, not as an argument: instead of being a reply to the philosophical "other minds" problem, it is a solution to the psychological "other minds" problem -- in the form of a mechanism for ascribing mental states to others who look similar. The logical invalidity of the philosophical argument cannot be doubted (Hauser, 1993), but logical validity isn't the point: psychologically, this is part of what people do -- not deliberately or consciously -- it is just the natural behaviour of naive psychology.

23. If this is a psychological phenomenon, then it ought to be possible to study it empirically. It would show itself as a tendency to ascribe mentality and mental states to others in proportion to their similarity to the ascriber. One likely candidate phenomenon is anthropomorphism. Very few psychological studies of anthropomorphism have been made, but one (Eddy et al., 1993) shows a strong tendency to preferentially ascribe mental states to some forms over others in a way that broadly correlates with the (false) "evolutionary ladder." If we imagine this a bit like a very short Turing test, so short that no proper interaction is possible, it seems the physical form does have an effect, but no studies have been carried out on the extent to which these prejudices carry forward into assessments of behaviour over longer periods of interaction.

24. To overcome these prejudices, Turing suggested using a teleprinter to mediate communication between the participants. He used this as "a screen that would let through only a sample of what really mattered" (Dennett, 1985), making the actual physical form of the system inaccessible to the observer: "we do not wish to penalise the machine for its inability to shine in beauty competitions" (Turing, 1950). As Harnad (1992) puts it: "neither the appearance of the candidate nor any facts about biology play any role in my judgement about my human pen-pal, so there is no reason the same should not be true of my [Turing test]-indistinguishable machine pen-pal."

25. But a screen like this has its costs and needs to be justified. If form acts as a cue to the observer, a cue whose substitution has a significant effect on the efficacy of the test, then a better screen may be needed. It seems probable that there are biological factors which do play a role in exactly this judgement. If the pen-pal is known (or assumed) to be human, that matters for the test. Using a teletype link as a screen doesn't just mask the form, it changes the cues to the observer, and this can fundamentally affect the patterns of interaction.

26. In nature, if not in principle, form is bound in with behaviour and it does affect our ascription of mental states. Hofstadter (1985) describes a reversed version where a student over a teletype simulated an artificial intelligence but "simply had acted himself, without trying to feign mechanicalness in any way." Despite this, none of the observers seemed to suspect that they were interacting with a human rather than a program. The group was drawn together by Zamir Bavel, who afterwards summarised this "by saying that his class was willing to view anything on a video terminal as mechanically produced, no matter how sophisticated, insightful, or poetic an utterance it might be" (Hofstadter, 1985). Hofstadter's views were different: "although I don't think it matters for the Turing test in any fundamental sense, I do think that which type of 'window' you view another language-using being through has a definite bearing on how quickly you can make inferences about that being" (Hofstadter, 1985).

27. So using a teletype connection isn't just a screen, it changes the interaction modality fundamentally. One hypothesis could be that it sets a kind of cultural context -- the subjects in this experiment were computer literate and would mostly have used teletypes to interact with computer programs. This context could then affect the judgement. With different subjects -- and technically similar but culturally different modalities such as electronic mail -- the result could have been very different.

28. What is it that makes it so easy to ascribe mental qualities to machines? Perhaps it isn't truly a property of the system at all! Perhaps it is a joint property of the system and the observer and their interaction, and is, in some sense, measured by the extent to which the observer's naive psychology is activated by the behaviour and appearance of the system through the medium of interaction. Could this really be what we mean by "intelligence"?

V. TESTING FOR NAIVE PSYCHOLOGY

29. These two sides of the Turing test, its sensitivity to naive psychology both as an aspect of the system's behaviour and as an aspect of the observer's decision, show how important the connection between the test and naive psychology really is. But the Turing test is undiscriminating as far as naive psychology is concerned. We need to be able to focus more precisely on the effects of naive psychology before we can fully clarify its importance to the test.

30. There are already tests for naive psychology. Building on the work of Premack and Woodruff (1978), Wimmer and Perner (1983) and then Baron-Cohen et al. (1985) designed false belief tests to evaluate children's ability to ascribe mental states to others. In these tests puppets are used to act out a story in front of a child subject, so that at the end of the story, one of the characters represented by a puppet should believe something the child knows to be false. False belief tests allow the mental states the child ascribes to others to be distinguished from the child's own mental states.
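
The logic of a false belief test can be captured in a short sketch, given here in Python. The toy story follows the shape of the scenario used by Baron-Cohen et al. (1985); the characters, event strings, and matching rules are invented for illustration and stand in for the puppet show.

    # A toy rendering of the logic of a false belief test. The story and
    # the event strings are illustrative, not a real test protocol.

    def run_false_belief_test(subject):
        """Act out the story, then ask where the character will look."""
        # Sally puts the marble in the basket, in full view of everyone.
        subject.observe("sally sees marble placed in basket")
        # Sally leaves; Anne moves the marble. Sally sees none of this.
        subject.observe("marble moved to box while sally is away")
        # Pass = reporting Sally's (false) belief rather than reality.
        return subject.ask("where will sally look?") == "basket"

    class BeliefTracker:
        """Tracks what Sally has seen -- a toy naive psychology."""
        def __init__(self):
            self.sally_believes = None
        def observe(self, event):
            if "sally sees" in event:
                self.sally_believes = "basket"  # fixed by what she saw
        def ask(self, question):
            return self.sally_believes          # answers from her view

    class RealityReporter:
        """Tracks only the true state of the world -- no naive psychology."""
        def __init__(self):
            self.marble_location = None
        def observe(self, event):
            if "placed in basket" in event:
                self.marble_location = "basket"
            if "moved to box" in event:
                self.marble_location = "box"
        def ask(self, question):
            return self.marble_location         # answers from its own view

    print(run_false_belief_test(BeliefTracker()))    # True: passes
    print(run_false_belief_test(RealityReporter()))  # False: fails

The point of the format is visible in the last two lines: both subjects agree about the world, but only one can represent another's belief about it.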

31. This distinction is analogous to the core problem in the Turing test -- the apparent leakage between the observer's naive psychology and those mental phenomena the test is supposed to be evaluating. Although the analogy is not perfect, there does seem to be an affinity between the development of tests for naive psychology and the development of the Turing test. Initially the tests for naive psychology depended on deception (Premack & Woodruff, 1978), but this proved susceptible to the methodological problem of distinguishing between the experimenter's naive psychology and the phenomena that were being evaluated. This was overcome by using the false belief tests which separate the two more fully, by "setting traps" (Caporael, 1986) for leaks from the subject's naive psychology.

32. This analogy gives us a hint of how we might overcome the problem of biases in the ascription of intelligence in the Turing test. We can make the same transition. Instead of evaluating a system's ability to deceive people, we should test to see if a system ascribes intelligence to others in the same way that people do. This is what I mean by the "inverted Turing test."

VI. REVERSING THE ROLES: AN INVERTED TURING TEST

33. As this shows, much of the Turing test's power rests on the observer's naive psychology, and this natural faculty biases the test, showing up as false positives or negatives. The real power of the test is in the role of the observer (Collins, 1990), not in the behaviour of the system under test. We can either accept this, and the weakening of the test that it entails (French, 1990), or we can overcome it by building a test that puts the system in the role of the observer. This is our proposed inverted Turing test, which checks whether the system's powers of discrimination are equivalent to those of an expert human judge. That is, a system passes if it is itself unable to distinguish between two humans, or between a human and a machine that can pass the normal Turing test, but can discriminate between a human and a machine that a human observer could tell apart in a normal Turing test.
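
The pass criterion can be put schematically: the system passes just in case its discriminations coincide with those of an expert human judge over every kind of pair. A minimal sketch in Python, with judges reduced to boolean functions for clarity (in a real test the verdicts would come from live dialogue, not labelled participants):

    # Schematic form of the inverted Turing test's pass criterion. The
    # Judge type and its use here are assumptions made for illustration.
    from typing import Callable, Iterable, Tuple

    # A judge examines a pair of participants and returns True if it
    # can tell that one of them is a machine.
    Judge = Callable[[str, str], bool]

    def passes_inverted_test(system: Judge, expert: Judge,
                             pairs: Iterable[Tuple[str, str]]) -> bool:
        """Pass iff the system's verdict matches the expert human's on
        every pair: pairs the expert cannot tell apart (two humans, or
        a human and a Turing-test-passing machine) must remain
        indistinguishable to the system, and pairs the expert can tell
        apart must remain distinguishable."""
        return all(system(a, b) == expert(a, b) for a, b in pairs)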

34. This variation on the Turing test borrows from the false belief tests for naive psychology by combining the false belief format of testing the ascription of mental states to others with the standard Turing test. Instead of evaluating a congruence between the linguistic behaviour of a system and a person, it evaluates a congruence between a system's and a person's ascription of mental states. To "pass" the inverted Turing test, a system must show the same regularities and anomalies in the ascription of mental states that a person would -- regularities and anomalies that can be investigated psychologically for comparison. This will include, for example, anthropomorphism, as well as the pattern of false positives and negatives shown by human judges in the standard Turing test.

35. Just like every other examination of the Turing test, however, this is open to criticism, and there are some immediate problems that need to be attended to. First, as with the normal Turing test, the expected behaviour could in principle be simulated without any guarantee of validity. Simulating a total inability to discriminate is trivial, but simulating an ability to discriminate which is equivalent to that of a human is a far from trivial problem, and seems to require all the same background and common sense knowledge, and all the same skills, that passing the normal Turing test does. It is, therefore, a very stringent test in its own right.

36. The second problem is one of identity. If the system is asked to discriminate between a human and a system that is identical to itself, it has special access which flaws the test. This is not usually addressed in the standard Turing test, which potentially has similar problems. If the observer has advance knowledge of a participant, it can use this knowledge to bypass the discrimination with questions like "what is your birthday?" If we assume the Turing test prohibits this as unfair -- as we normally do -- we can also assume the inverted version prohibits it. Unfortunately, there is a second case of identity to tackle. Even if the observer doesn't have advance knowledge of either system under test as an individual, it can have indirect knowledge through identity of form. Even in the standard Turing test we shouldn't be asking "are you physically like me?," except in so far as we normally ask it implicitly when ascribing mental states to others. With respect to identity, the problems of the inverted Turing test are the same as those of the standard Turing test, so the participants should be chosen to prevent the test being biased in this way.

37. A third, and far more serious, argument is that the inverted Turing test is redundant because all its power of discrimination is available in the standard Turing test. Many other stronger variations of the Turing test are also open to this criticism (Hauser, 1993). For instance, Dennett (1985) anticipates and rejects a rough and ready version of Harnad's (1991) Total Turing test. Harnad's proposed extension was to add robotic capacities into the Turing test, claiming that for "total performance indistinguishability, however, one needs total, not partial, performance capacity" (Harnad, 1992). Dennett (1985) comments: "Turing could reply, I am asserting, that this is an utterly unnecessary addition to the test." Dennett follows Harnad's argument but doesn't feel the need to change the Turing test even though robotic capabilities may be required by any system which is to pass it. Dennett's argument is that the Turing test is already strong enough to detect robotic capabilities, when taken in its "unadulterated" form. Dennett, or Turing for that matter, could make the same criticism of this inverted version: an inverted Turing test would be an unnecessary extension if naive psychology can be tested for within the framework of the standard Turing test.

38. This is probably true with respect to the Turing test: a critically evaluated standard Turing test without a time limit would be sufficient to detect the presence of naive psychology. However, given that humans have all these psychological biases in their ascription of mental states, I doubt whether a truly critical version of the Turing test is psychologically possible without some variation in the test. In overcoming this bias, the inverted Turing test does go some way towards compensating for this apparent problem in the standard Turing test. I would argue, then, that the issues raised by the inverted Turing test should be addressed explicitly when looking at the efficacy of different approaches to assessing mentality in other systems.

39. Another possible objection to the inverted Turing test is that it is implicitly recursive; it defines systems indirectly in terms of the ability to recognise their own behaviour. This is true; the test is indeed recursive, but it is not an infinite regress kind of recursion, but a transactional, or temporal regress, kind of recursion, going back through social and biological evolution rather than through logical forms. Because the participants are playing transactional games, guessing at one another's mental states, there is an inevitable mutual recursion between their actions. The point to note is that exactly the same is happening both in the standard Turing test and in normal human interaction in the real world.

40. What would the inverted Turing test look like? One possibility would be a format similar to that of the standard Turing test, but systematically and statistically comparing standard and inverted Turing test judgements of different pairs of humans and machines to ensure that they showed the same regularities. But I propose the inverted Turing test more as a thought experiment than as a serious artificial intelligence development programme. Again, neither the inverted Turing test nor the standard Turing test should be interpreted as defining a system as intelligent. Instead, they provide a format for gathering evidence about whether a system is intelligent in the inductive interpretation (Moor, 1976). Even if systems cannot be distinguished in a Turing test, the real acceptance of a system's being intelligent will be cultural rather than technical. The role of the inverted Turing test is that it offers a new source of inductive evidence, this time on the key criteria involved in ascribing intelligence to others. On that basis, I believe it adds something that is well hidden in the standard Turing test.
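
As a sketch of how such a statistical comparison might be run: collect verdicts from human judges in standard tests and from the system in inverted tests over the same pairs, then measure chance-corrected agreement. Cohen's kappa is used below as one reasonable statistic, and the verdicts are invented; nothing in the proposal mandates either choice.

    # Sketch of a statistical comparison of standard and inverted test
    # judgements. The statistic and the verdicts are illustrative only.

    def cohens_kappa(human_verdicts, system_verdicts):
        """Chance-corrected agreement between two lists of boolean
        verdicts ("could tell the machine from the human" per trial)."""
        n = len(human_verdicts)
        observed = sum(h == s for h, s in
                       zip(human_verdicts, system_verdicts)) / n
        p_h = sum(human_verdicts) / n    # human "machine" verdict rate
        p_s = sum(system_verdicts) / n   # system "machine" verdict rate
        expected = p_h * p_s + (1 - p_h) * (1 - p_s)
        return (observed - expected) / (1 - expected)

    # The system should reproduce the human pattern of verdicts, false
    # positives and false negatives included, over the same six pairs.
    human  = [True, True, False, False, True, False]
    system = [True, True, False, True,  True, False]
    print(f"kappa = {cohens_kappa(human, system):.2f}")   # kappa = 0.67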

41. It might be possible, with the current state of the art, to use a simple set of linguistic metrics that would unambiguously distinguish between people and computer systems. I would regard this as cheating. By comparison with the normal Turing test, even if a system which uses no human psychological principles (a next generation ELIZA, for instance) were to pass, we wouldn't have learnt anything useful. The point of the inverted Turing test is to throw emphasis on a study of how people distinguish between things with and without minds. An important issue here is that if we know how a system works, this significantly (and mostly negatively) affects our ability to ascribe mentality to it. This effect requires and deserves further study -- and indeed, it is exactly the kind of effect that can be studied using the inverted Turing test.
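
To make the objection concrete, the following Python fragment shows the kind of shortcut that would count as cheating: a judge built from a single surface statistic. The metric (type-token ratio) and its threshold are invented for illustration; the point is that no naive psychology enters at any stage.

    # A "cheating" judge built from surface statistics alone. The metric
    # and the threshold are invented; a judge like this never models the
    # other party's mental states, which is why it teaches us nothing.

    def looks_human(transcript: str, threshold: float = 0.6) -> bool:
        """Guess "human" from vocabulary variety alone: people tend to
        vary their wording; template-driven programs repeat themselves."""
        words = transcript.lower().split()
        if not words:
            return False
        type_token_ratio = len(set(words)) / len(words)
        return type_token_ratio > threshold

    print(looks_human("the cat and the dog and the cat and the dog"))  # False
    print(looks_human("a short but reasonably varied human reply"))    # True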

42. The inverted Turing test at first seems to be counterintuitive, but in practice it is a simple test of the ability of a computer to apply naive psychology in the same way that a human can. It also gains an elegance by building on the implicit observer/subject directional bias present in the original and making sure interaction is evaluated in both directions, rather than just the one. This removes what Collins (1990) calls "interpretative asymmetry" from the test. Collins uses this term to describe "the one-way process in which we repair the defects of machines' interactions while machines cannot repair the defects in ours" (Collins, 1990). Collins argues that this ability to repair interactions is an important human social skill, and one that is all too easily ignored in the standard Turing test. The inverted Turing test ensures that this skill is fully exercised.

43. Putting the Turing test through these inversions allows the phenomenon of naive psychology to be seen from both sides, from the side of the observer as well as that of the system. Of course, the role of the observer was implicit from the beginning, but making it explicit does highlight this important and natural aspect of human psychology, and the inverted test does offer a good format for gathering inductive evidence about this. The inverted test should be seen in this light; not as a serious proposal to dismantle the critical interpretations of Turing's original version, but as an intuitive approach to the proper evaluation of those essential behaviours that the standard test was originally intended to assess.

VII. CONCLUSIONS

44. Challenging the Turing test is easy, but it doesn't necessarily move us forward in the right directions. As French puts it: "perhaps what philosophers in the field of artificial intelligence need is not simply a test for intelligence but rather a theory of intelligence" (French, 1990). I suggest that a general theory of "alien intelligence" is probably impossible in principle, and that any theory of human intelligence is at least methodologically dependent on how we humans recognise intelligence. It is here that the Turing test can play an important role. We can use it as a tool to look at how people recognise intelligence -- how people distinguish between things which have minds and things which don't.

45. The challenges and criticisms of the Turing test have several points to make. First, naive psychology is a deeply ingrained and very natural human faculty, and it is an essential and intrinsic part of the behaviours that are evaluated in the test. Second, the active role of the observer in the Turing test is rarely stated explicitly, but it must never be ignored: without understanding this side of the test as well as the system's, the test is fundamentally flawed.

46. Because the observer has this natural tendency to ascribe mental states to systems, sometimes without regard to their actual behaviour, an inverted version of the Turing test has been proposed in which the emphasis is no longer on the observer's ability to discriminate between different systems, but on a system's ability to discriminate in its own right. This test evaluates the system's ability to ascribe mentality to others for compatibility with that of skilled human judges by putting it in the role of observer in the test instead of the usual role of the observed. This modified test seems as strong as Turing's original, but it throws a different emphasis on the behaviours that the test is intended to evaluate.

47. I don't want to be interpreted as claiming that the Turing test is obsolete or should be replaced, but I do suggest that many of the criticisms and problems of the test are not mere philosophical quibbles. Instead, they are hints at the "other minds" problem -- hints at the deep issues involved in people's ascription of mental states to one another. Instead of rejecting the test on these grounds, we can use these issues to build a better understanding of the other minds problem and of the Turing test. Through the Turing test, and perhaps the inverted Turing test, we have an excellent way to study these aspects of human psychology. On this basis, the dialogue surrounding the test should be a welcome one.

REFERENCES

Baron-Cohen, S., Leslie, A. M., & Frith, U. (1985) Does the autistic child have a "theory of mind"? Cognition 21:37-46.

Block, N. (1981) Psychologism and behaviourism. Philosophical Review 90:5-43.

Caporael, L. R. (1986) Anthropomorphism and mechanomorphism: two faces of the human machine. Computers in Human Behavior 2(3):215-234.

Churchland, P. M. (1981) Eliminative materialism and the propositional attitudes. Journal of Philosophy 78:67-90.

Clark, A. (1987) From folk psychology to naive psychology. Cognitive Science 11:139-154.

Colby, K. M. (1981) Modeling a paranoid mind. Behavioural and Brain Sciences 4:515-560.

Collins, H. M. (1990) Artificial experts: social knowledge and intelligent machines. MIT Press.

Dennett, D. C. (1985) Can machines think? In: How we know, ed. M. Shafto, Harper and Row.

Eddy, T. J., Gallup, G. G., & Povinelli, D. J. (1993) Attribution of cognitive states to animals: anthropomorphism in comparative perspective. Journal of Social Issues 49(1):87-101.

French, R. M. (1990) Subcognition and the limits of the Turing test. Mind 99:53-65.

Garfinkel, H. (1967) Studies in ethnomethodology. Prentice-Hall.

Harnad, S. (1991) Other bodies, other minds. Minds and Machines 1(1):43-54.

Harnad, S. (1992) The Turing test is not a trick: Turing indistinguishability is a scientific criterion. SIGART Bulletin 3(4):9-10.

Haugeland, J. (1985) Artificial intelligence: the very idea. MIT Press.

Hauser, L. (1993) Reaping the whirlwind: reply to Harnad's "Other bodies, other minds." Minds and Machines 3(2):219-237.

Hayes, P. J. (1979) The naive physics manifesto. In: Expert systems in the microelectronic age, ed. D. Michie, Edinburgh University Press.

Hofstadter, D. R. (1985) Metamagical themas: questing for the essence of mind and pattern. Basic Books.

Humphrey, N. K. (1976) The social function of intellect. In: Growing points in ethology, eds. P. P. G. Bateson and R. A. Hinde, Cambridge University Press.

Moor, J. H. (1976) An analysis of the Turing test. Philosophical Studies 30:249-257.

Premack, D. & Woodruff, G. (1978) Does the chimpanzee have a theory of mind? Behavioural and Brain Sciences 1:515-526.

Searle, J. R. (1980) Minds, brains, and programs. Behavioural and Brain Sciences 3:417-424.

Stich, S. (1983) From folk psychology to cognitive science. MIT Press.

Turing, A. M. (1950) Computing machinery and intelligence. Mind LIX(236):433-460.

Weizenbaum, J. (1976) Computer power and human reason. W. H. Freeman.

Wimmer, H. & Perner, J. (1983) Beliefs about beliefs: representation and constraining function of wrong beliefs in young children's understanding of deception. Cognition 13:103-128.

