Michael Thomas (1998) Connectionism is a Progressive Research Programme: "Cognitive" Connectionist Models are Just Models. Psycoloquy: 9(36) Connectionist Explanation (29)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).

CONNECTIONISM IS A PROGRESSIVE RESEARCH PROGRAMME:
"COGNITIVE" CONNECTIONIST MODELS ARE JUST MODELS
Commentary on Green on Connectionist-Explanation

Michael Thomas
Department of Psychology
King Alfred's University College
Winchester
Hants SO22 4NR

Tony Stone
Division of Psychology
South Bank University
London SE1 0AA

michael.thomas@psy.ox.ac.uk stonea@sbu.ac.uk

Abstract

Connectionist models are cognitive models which can serve two functions. They can demonstrate the computational feasibility of a cognitive theory (in this sense they model cognitive theories), or they can suggest new ways of conceiving the functional structure of the cognitive system. The latter leads to connectionist theories with new theoretical constructs, such as stable attractors, or soft constraint satisfaction. A number of examples of connectionist models and theories demonstrate the fertility of connectionism, a progressive research programme in Lakatos's (1970) sense. Green's (1998a) specificity argument against connectionist theoretical constructs fails because it relies upon a simplistic view of theoretical constructs that would undermine even the "gene" construct, Green's paradigmatic example of a theoretical entity in good standing. This view of theoretical entities is based upon a simplistic Popperian picture of science.

Keywords

artificial intelligence, cognition, computer modelling, connectionism, epistemology, explanation, methodology, neural nets, philosophy of science, theory.

1. In this commentary we defend the claim that connectionism makes a significant contribution to COGNITIVE theory. We will not address its utility as a neural model (Hoffman, 1998; Orbach, 1998), although we will, at the end, touch on the question of the relationship between cognitive connectionist models and neural models.

2. Green's (1998a) original argument runs as follows. He first outlines what he takes to be the requirements of a scientific theory, particularly a SPECIFICITY requirement on the acceptability of theoretical constructs, and then sets out reasons why connectionist models fail to fulfill these requirements, specificity in particular, concluding that connectionist models are not scientific theories.

3. In responding to this argument, it is first necessary to distinguish TWO roles that connectionist models can play in cognitive theory. They can play the particular role of testing the computational feasibility of a cognitive theory. In this role, the nodes and connections from which connectionist models are built are not theoretical constructs. (This point is also made by Goldsmith, 1998, para. 3, and Raftopoulos, 1998, paras. 4 and 5.)

4. But connectionist models can also play the more general role of providing new ways of thinking about the functional structure of cognition, ways that involve, for example, the primitives listed by Goldsmith (1998, para. 10). These primitives are theoretical constructs. We provide a wide range of examples to demonstrate the value of this new way of thinking. We argue that Green's argument against these theoretical constructs, which depends upon his specificity requirement, fails. This requirement will prove to depend upon a naive view of science and scientific procedure.

5. To start with an example of the use of connectionist models to test the computational feasibility of a cognitive theory, consider their use in the investigation of the acquisition of the English past tense. English verbs have either regular past tense forms (stem + ed) or irregular past tense forms (e.g. catch -> caught, bite -> bit, hit -> hit). When children learning English as their native language first learn the past tense of verbs, they appear to follow a U-shaped path of development. Initially, children learn to form the past tense of high frequency irregular verbs (e.g. go -> went) as well as some high frequency regular verbs. However, as learning proceeds and more verbs are learned (including more regular verbs), children begin to regularise the past tense of verbs they had previously formed correctly (e.g. go -> goed).

6. One explanation of this phenomenon is that the past tense production system has two routes. One route is rule-governed and enables a speaker to form the past tense of regular verbs. A second route involves a memory store of irregular past tense forms. An irregular past tense form can only be produced if the appropriate past tense form has been learned and stored in memory and if it can be retrieved before the regular route, operating in parallel, produces an incorrect regularisation. The dip in children's performance is hypothesised to occur when the irregular route is insufficiently strong to prevent the rule-governed regular route from overriding it (Pinker, 1985).
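The logic of such a dual-route account can be sketched in a few lines of code. The following is only a toy illustration, not Pinker's model: the lexicon, the strength values, and the function names are all hypothetical, and a simple comparison of strengths stands in for the race between the two routes.

```python
# Toy dual-route past tense producer. All names and numbers are hypothetical
# and chosen for illustration; this is not a fitted model of acquisition.

IRREGULARS = {"go": "went", "catch": "caught", "hit": "hit"}


def regular_route(stem):
    """Rule-governed route: add the regular suffix to the stem."""
    return stem + "ed"


def past_tense(stem, memory_strength, rule_strength):
    """Return the form produced by whichever route wins.

    memory_strength: dict mapping stems to the strength of the stored
        irregular form (missing stems count as 0.0).
    rule_strength: strength of the regular route (0.0 = rule not yet learned).
    Comparing strengths is an assumed stand-in for the race between routes.
    """
    stored = IRREGULARS.get(stem)
    if stored is not None and memory_strength.get(stem, 0.0) > rule_strength:
        return stored                   # irregular route wins
    if rule_strength > 0.0:
        return regular_route(stem)      # regular route supplies the output
    return None                         # neither route can respond yet


# Three hypothetical developmental stages for the verb "go".
stages = [
    ("early", {"go": 0.4}, 0.0),    # rote form known, rule not yet learned
    ("middle", {"go": 0.4}, 0.6),   # rule now stronger than the stored form
    ("late", {"go": 0.9}, 0.6),     # stored form strengthened by experience
]
for name, memory, rule in stages:
    print(name, past_tense("go", memory, rule))
```

With these assumed values the three stages produce "went", "goed", and "went" again, which is the U-shaped pattern the dual-route account is meant to capture.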

7. We take it that this dual-route model of past tense learning and production is a little theory. Now suppose someone suggests that a single route would be more parsimonious [1]. The first problem such a counterproposal faces is how it is going to explain the U-shaped path of past tense learning. A supporter of the dual-route model might say: "a single route just does not have the resources to explain the data: HOW ELSE could the U-shaped pattern of past tense learning be explained?" Such how-else arguments are prevalent and powerful in cognitive science, because if a given theory has no competitor then that theory is ipso facto the theory to hold by inference to the best (in this case, only) explanation.

8. Rumelhart and McClelland's (1986) single route connectionist model of past tense learning demonstrates that single route models cannot be ruled out by the how-else argument. Now, Rumelhart and McClelland's model turned out to face massive and deadly objections. But other connectionist models were then proposed (e.g. Plunkett and Marchman, 1993) that addressed many of the objections. Moreover, these models also inspired new empirical investigations, and generated testable predictions (Marchman and Bates, 1994). This in turn led to further modelling. It is hard to see why anyone would want to rule out this data -> model -> investigation -> more-data dialectic as unscientific.

9. Green (1998a, b, c, d) worries about just what it is that connectionist models model. In the case we have just discussed, it is clear that a connectionist model models a simple single route theory of past tense acquisition. Connectionist models -- in their particular role -- model theories. If this is the case, then the question of the value of cognitive connectionism devolves to the question of the value of connectionist theories: This is what we referred to above as the general role that connectionist models can play.

10. There is considerable debate on the nature and value of connectionist theories of cognition (for a good overview see Horgan, 1997), but consider this scenario: There is some information processing device, Z, that has an input-output function that models some cognitive system in extension. It might then be suggested that the cognitive system works a bit the way Z works. Of course, it is necessary to specify precisely how Z works, but perhaps Z turns out to be a connectionist system that contains structures that are analogous to the mechanisms that implement the human cognitive system, i.e., the brain [2]. A connectionist will now wonder whether we might gain a better understanding of the human mind-brain by constructing theories with concepts that are useful in explaining how Z works. Notice that this is not to construe connectionist theories as theories at the level of the neural hardware. It is simply to claim that cognitive theories formulated in terms of mechanisms analogous to brain mechanisms have more chance of being correct. (Roughly, this is to take McCloskey's (1991) line that connectionist models at the cognitive level can be seen as similar to animal models: informative, but analogous.)

11. Green (1998d, para. 6) rightly points out that "[o]nly if we have reason to believe that connectionist networks have something deeply in common with 'real'... cognitive processes (or their underpinnings) -- something that goes beyond a mere similarity in superficial behaviour -- do we have reason to believe that the one may play a significant role in explaining the other." It does behove the connectionist to be as precise as possible about the similarities between the mechanisms of Z and human cognitive mechanisms.

12. Although there is a lot of work for connectionists to do to satisfy this requirement, we can already begin to list the connectionist mechanisms that might form a part of better cognitive theories. Both connectionist models and the human cognitive system might use sub-symbolic representations. Both might be characterised by nonlinear and interactive processing. Both might exhibit graded representational states created by multiple soft constraint satisfaction operating in parallel, and be sensitive to the statistical structure of the problem space to be negotiated. (See McClelland, 1993, for a fuller characterisation of the key elements of cognitive connectionist models.)

13. It is worth noting here that cognitive connectionists are not instrumentalists. They really DO believe that there is something in the cognitive system which has the functional role of, say, an attractor, or a state space trajectory, which explains some piece of behaviour, and which they discovered through analysing their models of the domain. Although it is open to connectionists to be instrumentalists, as Green (1998e, para. 1) suspects, they are generally realists.

14. Green (1998d, para. 2) also suggests that connectionist models are misnamed: "If the nodes and connections are not to be taken seriously [as theoretical entities], then in what sense is the person who so abandons them still to be considered a connectionist?... If the 'real' theoretical entities are higher 'functional properties' then those properties... are the real core of the theory, which is now only incidentally connectionist." Our response is: What's in a name? Consider Z again. We could have obtained the same cognitively relevant higher functional properties if we had construed Z as composed merely of vectors and matrices (the standard implementation of connectionist networks on serial computers). Following Green's logic, we should then refer to any resulting theories as "vector-matrix" theories. It is not clear how this affects their scientific worth.
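The equivalence of the two descriptions can be made concrete with a small sketch. The weights, the input values, and the choice of a logistic activation function below are assumptions made purely for illustration; the point is only that a "nodes and connections" description and a "vectors and matrices" description compute the same input-output function.

```python
import math


def logistic(x):
    """Standard logistic squashing function."""
    return 1.0 / (1.0 + math.exp(-x))


def forward_nodes(inputs, connections, n_out):
    """Nodes-and-connections view.

    connections is a list of (source_index, target_index, weight) triples.
    """
    net = [0.0] * n_out
    for src, tgt, weight in connections:
        net[tgt] += weight * inputs[src]
    return [logistic(v) for v in net]


def forward_matrix(inputs, W):
    """Vectors-and-matrices view: W[t][s] is the weight from input s to output t."""
    return [logistic(sum(w * x for w, x in zip(row, inputs))) for row in W]


# Hypothetical two-input, two-output network described in both ways.
W = [[0.5, -1.0],
     [2.0, 0.25]]
connections = [(s, t, W[t][s]) for t in range(2) for s in range(2)]
x = [1.0, 0.5]

out_nodes = forward_nodes(x, connections, n_out=2)
out_matrix = forward_matrix(x, W)
print(all(math.isclose(a, b) for a, b in zip(out_nodes, out_matrix)))
```

Nothing in the output distinguishes the two descriptions, which is why the choice of name carries no theoretical weight.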

15. The appeal of cognitive connectionism, then, is that it might lead to a different form of cognitive theory to that which has been dominant since the "cognitivist" revolution (see Gardner, 1985). This "dominant paradigm" (see Stich and Nichols, 1995) in cognitive science has seen the human mind as rule-governed, with the rules specified over language-like structures. However, such models face difficulties with certain phenomena that can be neatly and economically dealt with by connectionist theories.

16. Consider, for example, the issue of graceful degradation. Many neuropsychologists have pointed out that acquired cognitive disorders are rarely all-or-nothing affairs. Whilst there are spectacular cases of total loss of specific cognitive functions (see Ellis and Young, 1998 for a review), there are many cases where loss of cognitive function does not show such specificity and where the functions lost are graded. Classical theories have problems theorising graceful degradation in a principled way, whereas the distributed representations characteristic of many connectionist models can account for this phenomenon comfortably.

17. We will shortly see that Green wishes to eliminate distributed representations from cognitive models in principle [3]. Before we address that argument, let us consider some of the conceptual contributions of distributed connectionist models in exploring aspects of cognition (a number of them in response to how-else arguments; see para. 7 above). In traditional theories of deep dyslexia, in order to account for three of the characteristic types of word reading errors found in patients, it was necessary either to postulate a number of simultaneous lesions to different functional parts of the reading system (Morton and Patterson, 1980), or to postulate an entirely separate right hemisphere reading system (Coltheart, 1980). Hinton and Shallice (1991) showed that a distributed attractor network mapping between orthographic and semantic representations could demonstrate all three error types following a lesion at a single site. On Green's view, the explanation based on this model (of pointers to attractors) should be ruled out.

18. According to Chomskian accounts of language processing and language acquisition, grammatical constraints, such as long range dependencies, cannot be learned from (and indeed do not operate over) sequences of words. Grammatical processing operates over the deep structure of sentences. Elman (1990, 1993) demonstrated that a recurrent network could process long range dependencies even when its sole information source was training on sequences of words. On Green's view, we should ignore both what this distributed model showed to be possible, and the explanation that the model provides for its capabilities (in terms of nested trajectories through state space). Presumably we should allow Chomskians to go on claiming that word sequences are too impoverished to support grammatical inferences.

19. Chomskian theorists argue that grammar is too complicated to be learned merely from exposure to samples of language. They couch this argument in formal terms, invoking Gold's (1967) theorem, for example. On the other hand, children take some years to acquire competent language skills, and pass through increasing levels of proficiency. Hyams (1991) suggests that the gradual acquisition of grammar in no way eliminates the logical problem of whether it may be acquired from samples of language. Elman (1993) demonstrated that his recurrent network was able to learn more complex syntactic structures when it was initially trained on simple sentences. Gradual acquisition in this model quite clearly affected the difficulty of the "logical problem" of learning the syntax. On Green's view, the explanation based on this model (of generating initial representations that may later be nested inside more complex trajectories) should not be allowed to interfere with a priori arguments about the difficulty of syntax acquisition.

20. Children appear to demonstrate qualitatively different levels of performance during development. Many developmentalists argue that such "stages" reflect the emergence of qualitatively different underlying cognitive mechanisms. McClelland's (1989) distributed model of the balance beam problem and Plunkett, Sinha, Moller, and Strandsby's (1992) distributed model of early lexical acquisition demonstrate that qualitatively different levels of performance may be generated from a single system undergoing gradual changes through an incremental learning algorithm. On Green's view, the explanations of each phenomenon (for the balance beam model: the need to integrate separate channels processing distance and weight information, allied with early selective attention for weight cues over distance cues; for lexical acquisition: the requirement that representations for concepts be formed before word names may be associated with them) should be disregarded because they are based on distributed models.

21. Double dissociations in neuropsychology have frequently been taken as evidence for independent underlying mechanisms. Plaut and Shallice (1993) demonstrated in a distributed reading model that double dissociations, in this case between concrete and abstract words, were possible following a lesion to a single unified processing mechanism (see Plaut, 1995). On Green's view, the idea that double dissociations may arise in such a way (in this model, due to the differential reliance of abstract words on direct pathways to semantic outputs and concrete words on falling into attractor basins created by the "clean-up" units at output) should be ignored, because distributed models are theoretically inadequate.

22. Green argues that the notions we have taken to be characteristic of connectionist theories, especially distributed representation, are not good theoretical entities. Our argument that connectionist models are models and not theories leaves this criticism intact, as we have agreed with Green that the idea of distributed representation is a theoretical notion that is fundamental to connectionist theorising. But Green would need very compelling arguments, given the above demonstration of the theoretical fertility of connectionist theorising.

23. Green's argument against distributed representations is that they have no specific role to play. He diagnoses whether or not a theoretical entity has a specific role by seeing whether "if it were removed, not only would the performance of the model as a whole suffer, but it would suffer in predictable ways, viz., the particular feature of the model's performance for which the theoretical entity in question was responsible -- i.e., that which it represented -- would no longer obtain" (Green, 1998a, para. 10).

24. This specificity requirement seems to call for a simple one-to-one correspondence between a theoretical construct and some piece of actual or possible data. But this requirement is rarely going to be met. Indeed, Green's own best example, that of the gene, does not satisfy it. It is well-known that genes rarely map in a one-to-one fashion onto phenotypic traits. As Rose (1997, p. 101) states, when Mendel introduced genes as explanatory constructs, "[l]ike all good experimenters ... [he] was lucky." He was lucky because "[t]he characters he studied seemed discrete." But matters were much more complex when nondiscrete characteristics were studied, for example, when Galton studied human features "such as height, or strength of grip, or head circumference, or intelligence." Yet the failure of the simple Mendelian notions did not lead to the rejection of genes as good theoretical entities, so it is clear that biologists did not follow Green's specificity requirement. Instead, as Rose points out (op. cit., p. 104), they argued that "other factors had to be obscuring the proper functioning of the genes" and so "[g]enes were said to be partially dominant or to show incomplete penetrance." Green's specificity argument would seem to rule out "gene" as a good theoretical entity. We take this to be a reductio of Green's position.

25. Green is right that theoretical entities have to have empirical warrant, but he is wrong that there can be a priori criteria for such warrant. Whether or not some data are evidence for some theoretical construct is, as they say, a matter of judgement. The difficulties Green gets into here stem from a simplistic picture of science. Green appears to rely upon a Popperian conception where a theory can, in principle, receive a straightforward knock-out blow from a crucial experiment. But the Popperian view has come under sustained attack over the past three decades (e.g., Kuhn, 1970; Putnam, 1974).

26. Lakatos's (1970) idea of competing research programmes seems more accurate for the situation in which cognitive scientists find themselves. Lakatos argues that any research programme will have a "hard core" which is essentially irrefutable. In the connectionist research programme, this hard core will include the notions we mentioned in para. 12 above (subsymbolic representation, nonlinear and interactive processing, distributed representation, soft constraint satisfaction, etc.). As far as connectionists are concerned, the idea is to see how far we can get in understanding the mind by using these and similar notions. What matters is whether a new research programme introduces a PROGRESSIVE or DEGENERATE problem shift. Initially a new research programme will be "submerged in an ocean of 'anomalies' (or if you wish, 'counterexamples'), and opposed by the observational theories supporting these anomalies" (Lakatos, op. cit. p. 133). The task of those who wish to take the new research programme forward is to turn counterexamples into "corroborating instances," and then to produce new counterexamples that can be resolved by the research programme. Lakatos suggests that research programmes can be assessed in terms of their "heuristic power" (Lakatos, op. cit. p. 137), that is, the number of new facts they produce and how well they do in responding to data that is anomalous.

27. The examples we have provided show that connectionism is currently a progressive research programme. Thus, when connectionist models first appeared, double dissociations appeared to be decisive counterexamples. As we have tried to show, however, connectionists have gradually shown that matters are not nearly so straightforward. Moreover, as pointed out in para. 8, single route connectionist models of past tense learning have led to new data about how children actually learn the past tense. Moreover, it is an active question whether the kind of box and arrow cognitive modelling that Green supports (Green, 1998a, para. 4) can be considered progressive. Seidenberg (1988), for example, has argued that the continual fractionation of the cognitive system into smaller and smaller components (boxes) renders these models mere redescriptions of the data, lacking any explanatory power.

28. There are two further issues to address: The first concerns the neural plausibility of connectionist models; the second the question of the unconstrained degrees of freedom that connectionist models enjoy.

29. We have been at pains to point out that connectionist models operate at the COGNITIVE level of explanation [4]. However, many theorists think that connectionists care about the details of implementation (e.g., O'Brien, 1998). Cognitive connectionists are not committed to this, but it is unclear why anyone would want to ignore the possibilities opened up by a dialogue between connectionism and neuroscience. Surely it would be nice if one's functional account were consistent with the machinery that implements it. Only under the unrealistic assumptions of the Turing machine are all representations equivalent. Some computations are easier to perform on some sorts of machinery in real time, and such constraints may well "ripple up" to the functional level (Pinker, 1997, p.26). Wouldn't it be nice (perhaps even parsimonious) if our low level and high level accounts shared the same characteristics of computation?

30. Two examples illustrate how we may be encouraged that such a project is feasible. Each example is based on a pair of models, one neural, one cognitive, that use essentially identical connectionist computational mechanisms. Rolls (1995) uses detailed evidence about neural circuitry in a computational model of the storage and recall of episodic memories in the hippocampus. At the heart of this model is a recurrent autoassociator which re-instantiates full memories from partial cues. Kawamoto's (1993) cognitive connectionist model explains a wide range of behavioural empirical data on the recognition of ambiguous words. At its heart there is again a recurrent autoassociator which re-instantiates full memories (of the word's meaning, grammatical class, pronunciation, etc.) from a partial cue (the word's visual form).
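The computational mechanism shared by these two models, a recurrent autoassociator that completes a stored pattern from a partial cue, can be illustrated with a minimal Hopfield-style sketch. This is a generic stand-in and not a reconstruction of either Rolls's or Kawamoto's model; the two stored patterns, the Hebbian outer-product storage rule, and the synchronous update are all assumptions chosen for brevity.

```python
def train(patterns):
    """Hebbian outer-product storage of +1/-1 patterns, with no self-connections."""
    n = len(patterns[0])
    W = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j]
    return W


def recall(W, cue, max_steps=10):
    """Update all units repeatedly until the state stops changing.

    Units set to 0 in the cue are 'unknown'; a unit with zero net input
    keeps its previous value.
    """
    state = list(cue)
    for _ in range(max_steps):
        new_state = []
        for i, row in enumerate(W):
            h = sum(w * s for w, s in zip(row, state))
            new_state.append(1 if h > 0 else -1 if h < 0 else state[i])
        if new_state == state:
            break
        state = new_state
    return state


# Two hypothetical stored "memories" (orthogonal +1/-1 patterns).
memory_a = [1, 1, 1, 1, -1, -1, -1, -1]
memory_b = [1, -1, 1, -1, 1, -1, 1, -1]
W = train([memory_a, memory_b])

# Partial cue: the first half of memory_a, with the second half unknown.
partial_cue = [1, 1, 1, 1, 0, 0, 0, 0]
print(recall(W, partial_cue) == memory_a)
```

In this toy case the network settles on the full stored pattern from half of it. Whether the cue is read as a fragment of an episodic memory or as the visual form of a word is a matter of interpretation; the mechanism is the same.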

31. Miller, Keller, and Stryker (1989) demonstrate how a computational model of ocular dominance columns in the visual cortex of mammals can explain the patterns of development under a range of conditions. At the heart of the model is a self-organising network of which specific areas specialise in processing certain visual inputs. Schyns (1991) uses a self-organising cognitive connectionist network to explain a range of empirical effects in children's acquisition of concepts (e.g. prototype effects and mutual exclusivity in early lexical categories). At the heart of the model there is likewise a self-organising network in which specific areas of the network specialise in processing certain (in this case, conceptual) inputs. We would suggest that it is a positive feature of cognitive connectionist models that this sort of parallelism with neural models is in the offing.
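The idea that units in a self-organising network come to specialise on particular inputs can be sketched with simple winner-take-all competitive learning. This is a deliberate simplification: it has no neighbourhood function and is not the algorithm of Miller et al. or of Schyns. The initial weights, learning rate, and input vectors are hypothetical.

```python
def winner(weights, x):
    """Index of the unit whose weight vector is closest to input x."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda k: sq_dist(weights[k]))


def train_competitive(weights, inputs, epochs=5, lr=0.5):
    """Move only the winning unit's weights towards each input (no teacher signal)."""
    weights = [list(w) for w in weights]
    for _ in range(epochs):
        for x in inputs:
            k = winner(weights, x)
            weights[k] = [wi + lr * (xi - wi) for wi, xi in zip(weights[k], x)]
    return weights


# Two hypothetical kinds of input: type A lies near (1, 0), type B near (0, 1).
inputs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
initial = [[0.6, 0.4], [0.4, 0.6]]   # two units, only weakly biased at the start

trained = train_competitive(initial, inputs)
print(winner(trained, [1.0, 0.0]), winner(trained, [0.0, 1.0]))
```

After training, unit 0 responds to inputs of type A and unit 1 to inputs of type B, although no unit was ever told which inputs to respond to. The same scheme could be read as a model of cortical specialisation or of concept formation, depending on what the inputs are taken to encode.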

32. Last, we wish to comment on a useful point that Green makes. He suggests that cognitive connectionist models as they stand have a vast number of degrees of freedom, without even an appeal to brain structure to justify them (Green, 1998f). In this regard, he is quite right. Models must have constraints if they are not to be mere curve-fitters. Massaro (1988) alerts us to the fact that connectionist networks are in danger of this, if left to their own devices. Now all cognitive connectionists will agree that simulation alone is not explanation. Connectionist models must have constraints, and those constraints must be supported by empirical data. There should be justifications for input representations, output representations, training regimes, learning rules, and architecture. In all cases, the justifications may be problematic, but for the first three, at least, we may collect empirical evidence concerning behaviour and the structure of the learning environment. The last two are trickier.

33. Green alludes to the last element, architecture. It is true that architecture, in terms of number of internal units, is often unconstrained. The arrangement of layers and connectivity is usually determined by an estimation of the minimum that the task requires. Eventually we will want a better account than this. In terms of the number of processing units within each layer, it is important for connectionists to constrain their models. The interim approach of using a rule of thumb to maximise both performance on the training set and generalisation beyond the training set is surely living on borrowed time.

34. For cognitive connectionism, there is hope that modellers will be able to give networks the ability to determine their own number of units and connections, depending on characteristics of the task domain. This may be done by adding units as they are required (Mareschal and Shultz, 1996), or by starting with too many units and connections and then either pruning weights (Hanson, 1990) or encouraging the network to use as few units as possible (Thomas, 1997). Alternatively, cognitive connectionists from the nativist school may appeal to evolution to determine a given domain-specific network architecture in the cognitive system. (Pinker's (1994) associative network in the exception route of his past tense model might be an example.) Elman et al. (1996) explore how nativist constraints may lead directly or indirectly to specification of network architectures on different scales. As long as the principles of resource allocation are justified, we have a purchase on the degrees of freedom problem in connectionist modelling.

35. We have argued that connectionist models are models not theories, but that the theories they lead (or will lead) to are useful. It is wrong to mount a priori arguments against distributed representations at this stage. Perhaps when symbolic and connectionist models are tied in a dead heat, this issue can be revisited. Green is right, however, to point out that the number of degrees of freedom in a connectionist model is an important issue, and that connectionist modellers must seek constraints for the representational resources they assign to their models.

FOOTNOTES.

[1] Of course, a single route model could turn out on investigation to be less parsimonious than a dual route connectionist model, in terms of the components required to implement it, but we leave this complication aside and make some more general comments on connectionism and parsimony later.

[2] It could, of course, have turned out that Z was a symbolic device in which operations occurred over syntactic structures.

[3] This is evidenced by the commentaries, which argue that localist connectionist models should be exempted from Green's criticisms (Grainger and Jacobs, 1998; Watters, 1998).

[4] In this regard, Smolensky's (1988) postulation of a subsymbolic level of description in between the symbolic and neural level is misleading. Subsymbolic models operate at the same (algorithmic) level of description as symbolic models (see Goldsmith, 1998, para. 12) but offer cognitive theories of a finer granularity.

REFERENCES

Ellis, A.W. and Young, A.W. (1998). Human Cognitive Neuropsychology: A Textbook with readings.

Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179-211.

Elman, J. L. (1993). Learning and development in neural networks: The importance of starting small. Cognition, 48, 71-99.

Elman, J. L., Bates, E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., and Plunkett, K. (1996). Rethinking Innateness: A connectionist perspective on development. MIT Press.

Gardner, H. (1985). The Mind's New Science: A History of the Cognitive Revolution. Basic Books.

Gold, E. M. (1967). Language identification in the limit. Information and Control, 10, 447-474.

Goldsmith, M. (1998). Connectionist modeling and theorizing: Who does the explaining and how? PSYCOLOQUY 9(18) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/psyc.98.9.18.connectionist-explanation.15.green.

Grainger, J. and Jacobs, A. M. (1998). Localist connectionism fits the bill. PSYCOLOQUY 9(10) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/psyc.98.9.10.connectionist-explanation.7.green.

Green, C. D. (1998a). Are connectionist models theories of cognition? PSYCOLOQUY 9(4) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/psyc.98.9.04.connectionist-explanation.1.green

Green, C. D. (1998b). Connectionist nets are only good models if we know what they model: Reply to Lee, Van Heuveln, Morrison, and Dietrich. PSYCOLOQUY 9(23) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/psyc.98.9.23.connectionist-explanation.20.green

Green, C. D. (1998c). Statistical analyses do not solve connectionism's problem: Reply to Medler and Dawson on Connectionist-Explanation. PSYCOLOQUY 9(15) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/psyc.98.9.15.connectionist-explanation.12.green

Green, C. D. (1998d). Higher functional properties do not solve connectionism's problems: Reply to Goldsmith on Connectionist-Explanation. PSYCOLOQUY 9(25) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/psyc.98.9.25.connectionist-explanation.22.green.

Green, C. D. (1998e). Realism, instrumentalism, and connectionism. Reply to Young on Connectionist-Explanation. PSYCOLOQUY 9(13) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/psyc.98.9.13.connectionist-explanation.10.green.

Green, C. D. (1998f). The degrees of freedom would be tolerable if nodes were neural. Reply to Lamm on Connectionist-Explanation. PSYCOLOQUY 9(26) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/psyc.98.9.26.connectionist-explanation.23.green.

Hanson, S. J. (1990). Meiosis networks. In D. S. Touretzky (Ed.) Advances in neural information processing systems II. San Mateo: Morgan Kaufmann. Pp. 533-542.

Hinton, G. E. and Shallice, T. (1991). Lesioning an attractor network: Investigations of acquired dyslexia. Psychological Review, 98, 74-95.

Hoffman, W. C. (1998). Are neural nets a valid model of cognition? PSYCOLOQUY 9(12) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/psyc.98.9.12.connectionist-explanation.9.green.

Horgan, T. (1997). Connectionism and the Philosophical Foundations of Cognitive Science. Metaphilosophy, 28, 1-30.

Hyams, N. (1991). Seven not-so-trivial trivia of language acquisition: Comments on Wolfgang Klein. In Eubank, L. (Ed.) Point counterpoint: Universal Grammar in the second language. Amsterdam: John Benjamins.

Kawamoto, A. H. (1993). Non-linear dynamics in the resolution of lexical ambiguity: A parallel distributed processing account. Journal of Memory and Language, 32, 474-516.

Kuhn, T. S. (1970). The Structure of Scientific Revolutions (2nd ed.). Chicago: University of Chicago Press.

Lakatos, I. (1970). Falsification and the methodology of scientific research programmes. In I. Lakatos and A. Musgrave (Eds.) Criticism and the Growth of Knowledge. Cambridge: Cambridge University Press.

Marchman, V. and Bates, E. (1994). Continuity in lexical and morphological development: A test of the critical mass hypothesis. Journal of Child Language, 21(2), 339-366.

Massaro, D. W. (1988). Some criticisms of connectionist models of human performance. Journal of Memory and Language, 27, 213-234.

McClelland, J. L. (1989). Parallel distributed processing: Implications for cognition and development. In R. Morris (Ed.) Parallel distributed processing: Implications for psychology and neurobiology. Oxford: Clarendon Press.

McClelland, J. L. (1993). The GRAIN model: A framework for modelling the dynamics of information processing. In D. E. Meyer and S. Kornblum (Eds.) Attention and performance XIV: Synergies in experimental psychology, artificial intelligence, and cognitive neuroscience. Hillsdale, NJ: Lawrence Erlbaum Associates. Pp. 655-688.

McCloskey, M. (1991). Networks and theories: The place of connectionism in cognitive science. Psychological Science, 2, 387-395.

Medler, D. A. and Dawson, M. R. W. (1998). Connectionism and cognitive theories. PSYCOLOQUY 9(11) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/ psyc.98.9.11.connectionist-explanation.8.green.

Miller, K. D., Keller, J. B., and Stryker, M. P. (1989). Ocular dominance column development: Analysis and simulation. Science, Vol. 245, 605-615.

O'Brien, G. J. (1998). The role of implementation in connectionist explanation. PSYCOLOQUY 9(06) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/ psyc.98.9.06.connectionist-explanation.3.green.

Orbach, J. (1998). Do wires model neurons? PSYCOLOQUY 9(05) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/ psyc.98.9.05.connectionist-explanation.2.green.

Pinker, S. (1985). Why the child holded the baby rabbits. In Gleitman, L. and Liberman, M. (Eds.) An Invitation to Cognitive Science, Volume 1, Language. MIT Press.

Pinker, S. (1994). The Language Instinct. Penguin.

Pinker, S. (1997). How the Mind Works. Allen Lane, The Penguin Press.

Plaut, D. C. (1995). Double dissociation without modularity: Evidence from connectionist neuropsychology. Journal of Clinical and Experimental Neuropsychology, 17(2), 291-321.

Plaut, D. C. and Shallice, T. (1993). Deep dyslexia: A case study of connectionist neuropsychology. Cognitive Neuropsychology, 10, 377-500.

Plunkett, K. and Marchman, V. (1993). From rote learning to system building: Acquiring verb morphology in children and connectionist nets. Cognition, 48, 21-69.

Plunkett, K., Sinha, C., Moller, M. F., and Strandsby, O. (1992). Symbol grounding or the emergence of symbols? Vocabulary growth in children and a connectionist net. Connection Science, 4, 293-312.

Putnam, H. (1974). The 'corroboration' of theories. In P. A. Schilpp (Ed.) The Philosophy of Karl Popper. La Salle, Illinois: Open Court.

Raftopoulos, A. (1998). Can connectionist theories illuminate cognition? PSYCOLOQUY 9(24) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/ psyc.98.9.24.connectionist-explanation.21.green.

Rolls, E. T. (1995). A model of the operation of the hippocampus and entorhinal cortex in memory. International Journal of Neural Systems, 6 (supplement), 51-70.

Rumelhart, D. E. and McClelland, J. L. (1986). On learning the past tense of English verbs. In J. L. McClelland, D. E. Rumelhart, and the PDP Research Group (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 2: Psychological and Biological models. Cambridge, MA: MIT Press, pp. 216-271.

Mareschal, D. and Shultz, T. R. (1996). Generative connectionist networks and constructivist cognitive development. Cognitive Development, 11(4), 571-603.

Rose, S. (1997). Lifelines: Biology, Freedom, and Determinism. Allen Lane.

Seidenberg, M. (1988). Cognitive neuropsychology and language: The state of the art. Cognitive Neuropsychology, 5, 403-426.

Smolensky, P. (1988). On the proper treatment of connectionism. Behavioral and Brain Sciences, 11, 1-74.

Stich, S. and Nichols, S. (1995). Folk psychology: Simulation or tacit theory? In M. Davies and T. Stone (Eds.) Folk Psychology. Blackwell.

Thomas, M. S. C. (1997). Connectionist networks and knowledge representation: The case of bilingual lexical processing. Oxford D.Phil. Thesis.

Watters, P. A. (1998). Cognitive theory and neural model: The role of local representations. PSYCOLOQUY 9(20) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/ psyc.98.9.20.connectionist-explanation.17.green.