In this commentary I attempt to show in what sense connectionist theory can be said to illuminate cognition. It is usually argued that distributed connectionist networks do not explain brain function because they do not use the appropriate explanatory vocabulary of propositional attitudes, and because their basic terms, being theoretical, do not refer to anything. There is a level of analysis, however, at which the propositional attitude vocabulary can be reconstructed and used to explain the performance of networks; and the basic terms of networks are not theoretical but refer to observable entities that purport to model the neurons of the brain.
2. Green's point seems clear and quite plausible. The units of neural networks that use distributed representations do not have semantic content, hence none of them can be said to perform a certain function with a specific contribution toward the implementation of the task at issue. It follows that the success of the network does not shed any light on how the mind works, in so far as explanations of cognition must be in terms of propositional attitudes, such as beliefs, expectations, and so forth, and the units in distributed networks do not represent propositional attitudes (Green, 1998).
3. This commentary will argue that connectionist theory can provide explanations of cognitive phenomena. The explanation, however, is not to be found in the units and connections of the networks but in higher-level descriptions of the way these units and connections function when a network is trained to learn a cognitive domain. To this end, it will first be shown how connectionist networks can be used to model the cognitive activities of the brain. The units in a network are not unobservable entities represented by theoretical terms; they are observable in an acceptable philosophical sense and purport to model the neurons in the brain. In view of the objections to this modelling, the plausibility of this parallelism will be discussed. Last, the view that connectionist theory can provide explanations of cognition embedded in classical cognitivist terms (beliefs, concepts, dispositions, etc.) will be defended.
4. Green's point is that PDP networks that simulate cognitive abilities fail as explanations of those abilities. Green is quite right. But he is beating a dead horse, because no one in the connectionist camp has ever claimed that the networks themselves provide real insight into cognition. The PDP networks are, like the cognitive abilities themselves, phenomena to be explained. In other words, a neural network plays the role of the explanandum and not the explanans. It requires as much analysis as the cognitive ability it purports to simulate. As Smolensky (1995, 363) has indicated: "the model [the connectionist network which purports to simulate the brain] constitutes an object of analysis, not an analysis in and of itself." But then what theory could explain the behavior, and eventually the success, of a PDP network? The answer, of course, is connectionist theory. Though this may sound trivial, it is very important, because it indicates where the explanation of PDP networks should be sought: connectionist theory is the appropriate place. This means that the explanatory terms will be formulated by using the resources of connectionist theory rather than the terminology of connectionist networks.
5. Researchers in the connectionist camp use networks to simulate cognitive abilities. They build networks that are trained to acquire linguistic capacities, to solve problems, recognize faces, answer specific questions, learn patterns, retrieve images from degraded input, etc. In this sense, these networks are construed as models of cognitive mechanisms that perform these same tasks. Consequently, if certain conditions are met, a theory that sheds light on the way the networks work when they learn some tasks may also shed light on the way the modelled cognitive mechanisms work. To use the terminology of model theory: the networks constitute the base and the cognitive mechanisms are the targets to which the conclusions from the base will be transferred. Analogy is widely construed as structure mapping; the condition that must be met in order to apply the analogy is that the structure of the two domains must be similar enough to allow knowledge transfer.
6. The structure of a connectionist network can be analyzed at three different levels, that of the units, the local architecture, and the global architecture (Elman et al. 1996, 27-29). At the unit level, the architectural characteristics of a network include traits such as node activation functions, learning rules, momentum, etc. At the local level, these characteristics include the layering of the network, the kind of network (as determined by the way the signal propagates; recurrent or feedforward networks), etc. At the global level, we have the means by which the various subnetworks of which a network eventually consists are linked. The same analysis can be applied to the brain. Thus, at the first level, the structural characteristics of the brain include the specification of the types of neurons, the response characteristics of the neurons, the kind of neurotransmitter, whether it is inhibitory or excitatory, the nature of synaptic changes, etc. At the local architectural level, the structural characteristics of the brain include the differences in the number of layers, the packing density of the cells, the pattern of interconnectivity, and the nature of this interconnectivity. Finally, at the global level, we have the ways in which the various modules of the brain are interconnected. The structural similarity between the brain and network architecture seems obvious enough not to require an argument in its support.
7. Is structural similarity enough to warrant transfer of knowledge from one domain to the other? After all, the structure may be similar but the terms or entities to which this structure applies may be totally different. This is not a problem for analogical reasoning, since philosophers of science have often stressed that knowledge transfer from one domain to another supplies only an initial tentative meaning to the target-terms, one which usually changes upon further research. "Analogies are used to give temporary meaning to a vague, unarticulated conception, and they are also used to assist in the construction of its meaning" (Nersessian 1984, 147). Thus, as Braithwaite (1953, 93) warns us, we should avoid identifying the entities of the base with those of the target. But if connectionists want to argue, as most do, that connectionist simulations shed light on cognition, then they need something stronger than simple structural similarity. They need to ground both target and base on the same observable basis, that is, they need roughly the same observable entities in both domains. In this case, the networks are not mere models of the brain. They are not just one among several possible interpretations of brain functioning "but a filling out of the original interpretation" (Spector, 1965). Connectionists do this by claiming that the networks' units are the analogues of neurons in the brain (the reader should note the word "analogue").
8. This claim is highly controversial, but before analyzing its merit, some philosophical considerations: As just noted, the units (and the neurons) are the observable entities of the connectionist theory. This goes against Green's (1998) claim to the effect that "[e]ach of the units, connections, and rules in a connectionist network is a theoretical entity." Note that, first, there are no theoretical entities, only theoretical terms which may or may not refer to unobservable entities. Second, the theory/observation distinction is a notoriously thorny issue in the philosophy of science; we will not dwell upon it further, simply accepting Green's "scientific common sense" conception of the issue.
9. Green's favourite examples of theoretical terms are the "long term" and "short term" memory stores. These are theoretical because we cannot observe them and we can only infer their existence from clinical observations and experiments. Other theoretical terms include the rules that control the building and interpretation of grammatical sentences. We do not observe them anywhere, but they are terms used by a linguistic theory in its attempt to explain linguistic competence. All these theoretical terms nevertheless represent something in particular, an unobservable entity. In contrast, Green argues, the units in a connectionist system and their connections do not represent anything known to exist in the cognitive domain the network models.
10. Now, it goes against this last thesis of Green's that the structure of connectionist networks seems similar enough to that of the brain to warrant the use of the former as a model, albeit a very simplified one, of the latter. In a few paragraphs we will discuss why, in addition to this similarity, the units of networks can be construed as the neurons in the brain, a move which grounds both domains on the same observational basis. In the end I will argue that connectionist theory, and not connectionist models, can propose explanations in terms of concepts, beliefs and all the variety of classical cognitivism. First, we will examine why the units and their connections are observable entities.
11. If one looks carefully for Green's argument regarding the theoretical character of the terms referring to the units and their connections, one finds none. Although Green points out several times that these theoretical terms refer to nothing in the cognitive domain that the networks purport to model, he does not justify his claim regarding the theoretical character of these terms. Extrapolating from his discussion regarding genes and memory store, one can infer that units and their connections are unobservable because one cannot observe them directly. They are just descriptive terms in a theory. Now, one can ask why the units are unobservable. It is not clear exactly what Green has in mind, but one possible reason may be the fact that we do not even have real connectionist networks: We build them by using simulations in serial computers. The units and their connections are just parts of certain algorithms; they are not real entities at all. As such, they are unobservable.
12. There are two rebuttals to this line of argument. First, researchers can do quite a number of things with these unreal entities. They can assign biases and threshold values, they can add or remove units, they can even build networks, called cascade correlation networks (Shultz and Schmidt, 1991; Shultz et al., 1995), that enlist units by themselves depending on the demands of the task, or nets that prune their units by themselves (so-called meiosis networks; Hanson, 1990). They can manipulate these units by changing their parameters to observe differences in performance in the resulting networks, and they can do all these things directly. As Hacking (1983) argues, if a scientist manipulates an entity, then this entity is an observational one in his system. Second, real connectionist systems have been developed. There are chips that contain silicon units that the designer designed and the manufacturer constructed. This means that the units are observational entities.
13. It is at this point that Green's seemingly common sense distinction between observable and unobservable fails him. For it is now widely accepted in philosophy of science (e.g., Achinstein, 1968; Shapere, 1982; Hacking, 1983; Fine, 1984) that this distinction is not absolute but depends upon current theory; that is, an entity may be observable in one theoretical context but not in another. To use Shapere's (1982) example, scientists talk of "observing" the interior of the sun by using neutrinos emitted by solar fusion processes. According to scientific practice, a law is observational if it is used to describe rather than explain the behavior of a system. A law which purports to explain is called theoretical or fundamental (Cartwright, 1983).
14. Let us reconsider connectionist theory in this light. A network's description includes the number and types of units, biases and thresholds, the type of connectivity and learning rule, as well as information regarding its training. All the statements rendering these facts are observational in the context of connectionist theory. The theoretical sentences include attempts at understanding the behavior of the system and the analysis of the behavior and transformation of its units (principal component analysis, cluster analysis, learning analysis). As we shall see, these methods provide higher level constructs that purport to elucidate the network's function.
15. This concludes the argument that the units and their connections are treated in connectionist theory as observable, not unobservable, entities. This supports the claim that, by using units to simulate neurons, the connectionist model of the brain and the brain being modelled are grounded on the same observational basis, which, taken together with the structural similarity of the two domains, allows the transfer of knowledge from one domain to the other. We will now turn to the plausibility of simulating neurons by using units in a network.
16. The reasons underlying the appeal of using units in connectionist networks to simulate the neurons in the brain are well known and will not be rehearsed here. We will proceed directly to the reasons that prompt many researchers to doubt the plausibility and validity of such a move. The most widely known objection is that in artificial networks the units both inhibit and excite one another. Crick and Asanuma (1986) note that such neurons are not found in the human brain, in which the neurons either inhibit or excite other neurons, but not both. Another problem is that in the cortex the initial segment of axons may receive synapses, and dendrites may form synapses onto other dendrites. This trait is not found in the networks that use the standard propagation rule (the reader should recall the distinction between the activation and propagation rule), according to which the connections multiply their weights by the outputs of units and the signals from each connection are summed up. Finally, a major problem besetting not the simulation of neurons by units, but the simulation of real learning by learning in networks, is that the backpropagation rule, which is widespread in the connectionist literature, does not seem to be implemented in the brain, since we do not have evidence for the existence of either adjustment signals or the real pathways to convey them backward. The discussion here will be restricted to these problems.
17. It is true that connectionist networks rely heavily on units sending both inhibitory and excitatory signals to other units. This does not mean, however, that the absence of such connectivity in the brain poses an insurmountable obstacle to connectionism. As Crick and Asanuma themselves point out, inhibition can be construed as being the consequence of excitation. More specifically, a neuron which sends only excitatory signals may excite and activate an inhibitory neuron, that is, a neuron which inhibits only other neurons. Thus, the original neuron excites other neurons directly and inhibits them indirectly, through the inhibitory neuron. There is some neurophysiological evidence that this may be the case (Milner, 1957).
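The indirect-inhibition idea can be sketched numerically. In the toy fragment below (an illustration of the point only, not a model drawn from the literature; all unit names, weights, and thresholds are invented), a purely excitatory unit A drives a target B directly while also exciting an inhibitory interneuron I, so that A's net effect on B is inhibitory:

```python
def step(x):
    """Simple threshold activation: fire (1.0) if input exceeds 0.5."""
    return 1.0 if x > 0.5 else 0.0

def net_input_to_b(a_output):
    # A -> B: direct excitatory connection (positive weight).
    direct_excitation = 0.6 * a_output
    # A -> I: excitatory connection activating the inhibitory interneuron.
    i_output = step(0.9 * a_output)
    # I -> B: the interneuron's outgoing weight is negative (inhibitory).
    indirect_inhibition = -0.8 * i_output
    # Net input to B combines both pathways.
    return direct_excitation + indirect_inhibition

# When A fires, the inhibitory pathway outweighs the direct excitation
# (0.6 - 0.8, i.e. a net negative input), so A inhibits B overall even
# though A itself sends only excitatory signals.
print(net_input_to_b(1.0))
print(net_input_to_b(0.0))  # A silent: no effect on B
```

With different weights the same wiring yields net excitation, which is why the Crick and Asanuma observation constrains architectures rather than refuting them.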
18. Neurobiological details of course come at a price for connectionism, which must abandon simple, general, all-purpose kinds of architecture in which units neatly distribute into separated layers that interact starting with random weights and random propagation of excitatory or inhibitory activity in all directions. The constraints on the actual types of propagation in the brain impose restrictions on the architecture of artificial networks, as does the fact that in the brain there are qualitatively different neurons which perform different functions. The presence in the brain of purely excitatory and inhibitory neurons forces connectionists to construct much more complex architectures that will incorporate this constraint. This is not an obstacle in principle; future research will simply have to deal with this increasing complexity. Indeed, increasingly complex networks with modular architectures are already appearing in the literature (e.g., Murre 1992).
19. The second problem identified above -- namely, that propagation rules in the brain seem to differ from the standard rule used in connectionist networks, whereby connections multiply their weights by the outputs of units and the signals from each connection are summed up -- can be addressed by means of the sigma pi units that are currently used in many simulations (Feldman and Ballard, 1982; Rumelhart, Hinton and McClelland, 1986; Mel, 1990; Poggio and Girosi, 1990). These units allow the signal gating in the network to be carried out as is sometimes done in the cerebral cortex, in so far as they use multiplicative synapses.
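The contrast between the two propagation rules can be made concrete. The sketch below (illustrative weights and input values only) computes a unit's net input under the standard summed-products rule and under a sigma-pi rule, in which a weight multiplies the product of a group of inputs, so that one input can multiplicatively gate another:

```python
def standard_net_input(weights, inputs):
    """Standard rule: each connection multiplies its weight by one
    unit's output, and the products are summed."""
    return sum(w * x for w, x in zip(weights, inputs))

def sigma_pi_net_input(weights, input_groups):
    """Sigma-pi rule: each weight multiplies the *product* of a group
    of inputs (a multiplicative synapse)."""
    net = 0.0
    for w, group in zip(weights, input_groups):
        prod = 1.0
        for x in group:
            prod *= x
        net += w * prod
    return net

x1, x2, gate = 0.8, 0.5, 0.0
# Standard unit: the gate signal just adds in linearly.
print(standard_net_input([1.0, 1.0, 1.0], [x1, x2, gate]))
# Sigma-pi unit: the gate multiplies x1, switching that pathway off
# entirely when gate = 0 -- the kind of signal gating the commentary
# attributes to some cortical synapses.
print(sigma_pi_net_input([1.0, 1.0], [[x1, gate], [x2]]))
```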
20. The last problem concerns the biological plausibility of the backpropagation rule. Though there are other learning rules for which we have ample evidence as operating in the brain (Hebbian learning, for instance), as well as rules that do not require the presence of a global teacher, the implausibility of backpropagation does pose serious problems for connectionism. The situation is not desperate, however. There have been attempts to extend or modify the rule so that it becomes biologically plausible (Durbin and Rumelhart, 1989), and the brain abounds in re-entrant connections to allow for the descending pathways that are needed for the backward propagation of the signal. As with other problems concerning the biological plausibility of network units, the task of developing networks with units that have the characteristics of real neurons is still in its infancy. The important thing to note is that as research progresses, networks of increasing biological plausibility are being developed.
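The contrast with backpropagation is worth making explicit: a Hebbian update is purely local, needing only the activities of the two units a connection joins, with no teacher signal and no backward pathway. A minimal sketch, with an invented learning rate and activity values:

```python
def hebbian_update(weight, pre, post, lr=0.1):
    """Strengthen a connection in proportion to the correlated activity
    of its presynaptic and postsynaptic units. Everything needed is
    locally available at the synapse -- no global error signal."""
    return weight + lr * pre * post

w = 0.0
# Repeated co-activation of the two units strengthens the connection...
for _ in range(5):
    w = hebbian_update(w, pre=1.0, post=1.0)
print(w)

# ...while uncorrelated activity (postsynaptic unit silent) leaves it
# unchanged, since the product of activities is zero.
print(hebbian_update(w, pre=1.0, post=0.0))
```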
21. Let us grant that the units in networks satisfactorily simulate the neurons in the brain. What kind of explanation does connectionist theory, and not the networks, offer for cognition? More specifically, if one looks for explanations embedded in the vocabulary of folk psychology, could connectionist theory provide accounts of performance in a cognitive task in terms of beliefs, concepts, and so on? At first glance, it would seem that no such explanation is forthcoming. Networks using distributed representations are not semantically transparent. According to Clark (1991, 18), a system is said to be semantically transparent if there exists a mapping between a symbolic semantic description "of the system's behavior and some projectible semantic interpretation of the internally represented objects of its formal computational activity." In other words, in a semantically transparent system there must be a correspondence between the syntactic constituents of the system and parts of natural-language propositions. In connectionist networks this means that each unit, or set of units, should correspond to a concept in such a way that each appearance of the unit (set of units) is accompanied by the appearance of the concept and each occurrence of the concept is accompanied by the same activation, or pattern of activation, of the unit, or set of units.
22. Now, it is precisely this property that distributed representations lack, and thus units or sets of units cannot be said to stand for concepts, or for any propositional attitudes, in the way symbols stand for concepts in classical cognitivism. Propositional attitudes exhibit "propositional modularity," that is, they are discrete, semantically interpretable, and play a causal role in inducing other states (Ramsey, Stich and Garon, 1991). Distributed connectionist networks do not seem to be able to afford representations (if any) that have these properties. Consequently, it seems as if the connectionist networks lack structure, in that their function cannot be described in a compositional and systematic manner (which is exactly the criticism of connectionism made by Fodor and Pylyshyn (1988) and by Fodor and McLaughlin (1995)). But things are not so catastrophic. As Clark remarks (1995, 345), "connectionist models are actually more structured... and hence visibly compatible with the requirements of propositional modularity." But where does this structure come from? To discuss this, one must first try to establish a meaningful correlation between something in connectionist-style representations and propositional attitudes (concepts, or beliefs, for example), for without this correspondence the networks are bound to remain black boxes defying any attempts at understanding how they work.
23. Smolensky (1995, 360) claims that "connectionist theory requires a concept like belief, and that indeed such a concept can be constructed using existing theory"; to account for this correlation he proposes the Semantic Level Principle, according to which "[s]emantically interpretable aspects of distributed connectionist models reside at the higher level defined by activation patterns or vectors, and weight vectors -- not at the lower level of individual units and connections. That is, semantic elements are non-local: individual semantic elements are defined over shared, spatially distributed, regions of the network" [p. 359]. The higher level at which one must look for the connectionist version of beliefs is the space of weights of the connections between units.
24. To analyze this space, Smolensky proposes weight analysis, a technique which performs a simple mathematical analysis of the activations of the units and of the weights and their evolution when a network learns some task. The result of this analysis is the establishment of the concepts of C- and L-belief (Smolensky 1995, 359), which can be used to characterize the state of a network which answers whether some sentences are true or false. A network able to answer such questions has acquired some knowledge. This result of learning can be described in terms of C- and L-beliefs, which are defined as regions in a weight space. These beliefs play the functional role of the beliefs of folk psychology and can be said to have the property of propositional modularity, in so far as they are semantically interpretable and functionally discrete.
25. A similar result can be obtained if one looks closely at the results of cluster analysis of the activation space of trained networks, a kind of statistical analysis of the activation patterns of the units, and especially the hidden units of a network, which is where networks are thought to build their representations. This method examines the way the weights of the hidden units partition the activation space defined by the hidden units in a way suited to the task at hand. Particularly interesting results have been found with respect to the NETtalk network of Sejnowski and Rosenberg (1986), the English past-tense network of Plunkett and Marchman (1991), and the mines and rocks network of Churchland (1989).
26. The cluster analysis of NETtalk reveals that when the network learns the text-to-sound transformations by acquiring the appropriate weights, the hidden units' activation space is partitioned into a tree structure in which items that are pronounced in similar ways are grouped together. In the case of the past-tense network we find the activation space to be partitioned into areas that correspond to the regular and irregular verbs, respectively, the irregular verb area itself being partitioned into subparts that correspond to the different ways that the past tense of irregular verbs is formed. The cluster analysis of the mines and rocks network, finally, reveals that the space is partitioned into two areas, one for the mines and the other for the rocks. When the input to the system is such that the activation pattern of the hidden units falls, say, within the mine area, then the mine output is activated. It is in this sense that we can say that the network has learned the concepts "mine" and "rock" by analyzing the features, some of which defy semantic description, present in the input signal of the training phase. Adopting Smolensky's analysis one could say that the network answers the question "is this a mine or a rock?" by analyzing the specific input and by forming the appropriate L-belief.
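The partitioning idea can be illustrated with a deliberately simplified sketch. The activation vectors below are invented, standing in for hidden-unit patterns recorded from a trained mines-and-rocks network; a novel input is assigned to whichever region's centroid its hidden pattern lies nearer, even though no single unit "means" mine or rock:

```python
def distance(a, b):
    """Euclidean distance between two activation vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def centroid(vectors):
    """Mean activation vector of a group of patterns."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

# Invented hidden-unit activation patterns recorded over the training set;
# mine inputs and rock inputs occupy separate regions of activation space.
mine_activations = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15]]
rock_activations = [[0.1, 0.9], [0.2, 0.8], [0.15, 0.85]]

mine_center = centroid(mine_activations)
rock_center = centroid(rock_activations)

def classify(hidden_pattern):
    """Assign a new activation pattern to the nearer region -- the sense
    in which the partition of the space encodes the two concepts."""
    if distance(hidden_pattern, mine_center) < distance(hidden_pattern, rock_center):
        return "mine"
    return "rock"

# A novel input whose hidden pattern falls in the mine region activates
# the mine classification.
print(classify([0.7, 0.3]))
```

A real cluster analysis is hierarchical rather than two-centroid, but the principle is the same: concepts live in regions of activation space, not in individual units.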
27. It must be noted, however, that the concepts embedded in networks are not equivalent to their classical cousins. They exhibit context sensitivity, meaning that a concept does not correspond to a unique activation or weight vector but to an area in the respective spaces. Thus, the same coffee mug will evoke different activation patterns in different contexts, but these fall within the same space partition. Hence distributed networks still defy semantic transparency, which presupposes a one-to-one correspondence between patterns of activation and semantic states.
28. I have argued that distributed parallel processing displays, at the right level of analysis, those characteristics that are deemed necessary for any adequate explanation of cognition. More specifically, such networks can be assigned states that are propositionally modular in a restricted sense, that is, they are discrete and semantically interpretable, but do not play the causal role assigned to them by classical cognitivism. Now, although it is true they cannot play the exact causal role that classical propositional attitudes play, it is doubtful whether cognition actually exemplifies the kind of causality typically attributed to it. (This issue cannot be pursued further here; see Smolensky, 1990; Clark, 1991; and Ramsey, Stich, Garon, 1991, for relevant discussions.)
29. Fodor and Pylyshyn (1988), in their criticism of connectionism, cite another reason why connectionist networks cannot be used to model and explain cognition. Cognition is compositional, that is, the content of a complex representation is a function of the content of its atomic constituents and of the way they are combined. Connectionist networks, on the other hand, cannot account for the compositionality of our representations, since compositionality relies on combinatorial syntactic and semantic structure, two properties that connectionist networks lack.
30. It has just been argued that semantic structure can be found in networks at a higher level of analysis. This leaves open the issue of combinatorial syntax. What this means is roughly what we all know to happen when we combine simple sentences to build complex ones. There are rules that dictate what combinations are allowed and how the complex sentence is built out of the simpler constituents. In symbolic logic, for instance, from p, q, and r you can build the sentence [(p and q) or r], following the appropriate rules. This kind of compositionality is called concatenative (Van Gelder 1990; Clark, 1993), because the atomic constituents concatenate to form the complex sentence and are present in it. According to cognitivism, the only possible representations of structure-dependency are those that are syntactically structured, that is, those that contain tokens of their constituents. In other words, according to cognitivism, compositional representations are only those in which all atomic elements are explicitly preserved.
31. It is well known that the distributed nature of connectionist representations precludes the possibility of the higher level representations being syntactically structured, but this does not imply that connectionist representations are not compositional. Pollack (1990) shows how connectionist networks can represent recursive data structures, such as trees and lists. His network develops compact distributed representations for such compositional structures, which can in turn be analyzed back to their constituents. The new complex representations, however, do not contain tokens of their constituents. Instead, they combine "apparently immiscible aspects of features, pointers, and symbol structures." (Pollack 1990, 77). The system incorporates an effective procedure which allows the coding and decoding from one level of representation (the constituents' representations) to the other (the complex representation). This is called functional compositionality in the literature and is distinguished from the concatenative compositionality of cognitivism. Connectionist representations, thus, exhibit compositionality, albeit a functional kind.
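The concatenative/functional distinction can be illustrated with a toy encoding that has nothing to do with Pollack's actual network: here integer codes stand in for distributed vectors, and a Gödel-style pairing plays the role of the learned compression. The functionally composed code contains no token of its constituents, yet an effective procedure recovers them:

```python
def concatenative_compose(a, b):
    """Concatenative composition: the complex representation literally
    contains tokens of its constituents."""
    return (a, b)

def functional_compose(a, b):
    """Functional composition: the constituents are merged into a single
    code that contains no token of either -- yet remains decodable."""
    return (2 ** a) * (3 ** b)

def functional_decompose(code):
    """The effective decoding procedure: recover the constituents by
    counting factors of 2 and 3."""
    a = 0
    while code % 2 == 0:
        code //= 2
        a += 1
    b = 0
    while code % 3 == 0:
        code //= 3
        b += 1
    return a, b

complex_rep = functional_compose(4, 7)
# The single number 34992 is not a concatenation of "4" and "7", but
# the constituents can still be extracted from it.
print(functional_decompose(complex_rep))
```

Pollack's network learns its encoding and decoding maps rather than being handed them, and its codes are approximate vectors rather than exact integers, but the structural point carries over: compositionality without explicit constituent tokens.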
32. It must be emphasized that although there are higher level explanations of cognition involving a form of propositional attitudes, this must not be taken to imply that connectionism is not that different from cognitivism after all and that it just provides a way of implementing classical computational systems. As researchers in the connectionist camp have been stressing all along (Churchland, 1989; Clark, 1991; 1993; 1995; 1997; Smolensky 1995 to mention just a few), one should resist the temptation to think of the concepts thus described as long-term syntactic items in the network (as they are in cognitivism). Such symbolic items simply do not exist, since connectionist theory has no place for symbols (concepts) that are bearers of some fixed content and that are preserved when higher level representations are built by computational processes. As has been noted, connectionist encodings are context-sensitive and exhibit the connectionist properties of graceful degradation, interference, memory content addressability, etc. And this is as it should be, so long as we seek to explain these properties that cognition has and that classical cognitivist computational systems lack.
I would like to thank Dr. Debbie-Brown Kazazis for her assistance and comments, and Prof. Demetriou for giving me the opportunity to write this paper.
Achinstein, P. (1968). Concepts of Science. Baltimore, MD: The Johns Hopkins University Press.
Braithwaite, R. B. (1953). Scientific Explanation. Cambridge: Cambridge University Press.
Cartwright, N. (1983). How the Laws of Physics Lie. Oxford: Oxford University Press.
Churchland, P. (1989). A Neurocomputational Perspective: The Nature of Mind and the Structure of Science. Cambridge, MA: The MIT Press.
Clark, A. (1991). Microcognition: Philosophy, Cognitive Science, and Parallel Distributed Processing. Cambridge, MA: The MIT Press.
Clark, A. (1993). Associative Engines: Connectionism, Concepts, and Representational Change. Cambridge, MA: The MIT Press.
Clark, A. (1995). Connectionist minds. In: Debates on Psychological Explanation, eds. MacDonald, C., and MacDonald, G., Oxford: Blackwell, 339-356.
Clark, A. (1997). From text to process: Connectionism's contribution to the future of cognitive science. In: The Future of Cognitive Revolution, eds. Johnson, D. M., and Erneling, C. E., Oxford: Oxford University Press, 169-186.
Crick, F. H. C., and Asanuma, C. (1986). Certain aspects of the anatomy and physiology of the cerebral cortex. In: Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 2: Psychological and Biological Models, eds., McClelland, J. L., Rumelhart, D. E., and the PDP Research Group. Cambridge, MA: The MIT Press, 333-371.
Durbin, R., and Rumelhart, D. E. (1989). Product Units: A computationally powerful and biologically plausible extension to backpropagation networks. Neural Computation, 1: 133-142.
Elman, J. L., Bates E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., and Plunkett K. (1996). Rethinking Innateness: A Connectionist Perspective on Development. Cambridge, MA: MIT Press.
Feldman, J., and Ballard, D. (1982). Connectionist models and their properties. Cognitive Science, 6: 205-254.
Fine, A. (1984). And not anti-realism either. Nous, 18: 51-66.
Fodor, J. A., and Pylyshyn, Z. W. (1988). Connectionism and cognitive architecture: A critical analysis. Cognition, 28: 3-71.
Fodor, J. A., and McLaughlin, B. P. (1995). Connectionism and the problem of systematicity: Why Smolensky's solution doesn't work. In: Debates on Psychological Explanation, eds. MacDonald, C., and MacDonald, G., Oxford: Blackwell, 199-222.
Green, C. D. (1998). Are connectionist models theories of cognition? PSYCOLOQUY 9(4) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/psyc.98.9.04.connectionist-explanation.1.green
Hacking, I. (1983). Representing and Intervening. Cambridge: Cambridge University Press.
Hanson, S. J. (1990). Meiosis networks. In: Advances in Neural Information Processing Systems II, ed., Touretzky, D. S., San Mateo: Morgan Kaufmann, 533-542.
Mel, B. W. (1990). The sigma-pi column: A model for associative learning in cerebral neocortex. Pasadena, CA: California Institute of Technology, Computational and Neural Systems Program.
Milner, P. M. (1957). The cell assembly: Mark II. Psychological Review, 64: 242-252.
Murre, J. M. J. (1992). Precis of: Learning and categorization in modular neural networks. PSYCOLOQUY 3(68) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1992.volume.3/psycoloquy.92.3.68.categorization.1.murre
Nersessian, N. J. (1984). Faraday to Einstein: Constructing Meaning in Scientific Theories. Dordrecht: Kluwer Academic Publishers.
Plunkett, K., and Marchman, V. (1991). U-shaped learning and frequency effects in a multilayered perceptron: Implications for child language acquisition. Cognition, 38: 43-102.
Poggio, T., and Girosi, F. (1990). Regularizing algorithms for learning that are equivalent to multilayer networks. Science, 247: 978-982.
Pollack, J. B. (1990). Recursive distributed representations. In: Connectionist Symbol Processing, ed., Hinton, G., Cambridge, MA: The MIT Press, 77-105.
Ramsey, W., Stich, S. P., and Garon, J. (1991). Connectionism, eliminativism and the future of folk psychology. In: Philosophy and Connectionist Theory, eds., Ramsey, W., Stich, S. P., and Rumelhart, D. E., Hillsdale, NJ: Erlbaum, 199-228.
Rumelhart, D. E., Hinton, G. E., and McClelland, J. L. (1986). A general framework for parallel distributed processing. In: Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 1: Foundations, eds., Rumelhart, D. E., McClelland, J. L., and the PDP Research Group. Cambridge, MA: The MIT Press, 45-76.
Sejnowski, T., and Rosenberg, C. (1986). NETtalk: A parallel network that learns to read aloud. Johns Hopkins University, Technical Report JHU/EEC-86/01.
Shapere, D. (1982). The concept of observation in science and philosophy. Philosophy of Science, 49: 231-267.
Shultz, T. R., and Schmidt, W. C. (1991). A Cascade-Correlation model of balance scale phenomena. Proceedings of the Thirteenth Annual Conference of the Cognitive Science Society. Hillsdale, NJ: Lawrence Erlbaum, 635-640.
Shultz, T. R., Schmidt, W. C., Buckingham, D., and Mareschal, D. (1995). Modeling cognitive development with a generative connectionist algorithm. In: Developing Cognitive Competence: New Approaches to Process Modeling, eds., Simon, T. J., and Halford, G. S., Hillsdale, NJ: Lawrence Erlbaum, 205-262.
Smolensky, P. (1995). On the projectible predicates of connectionist psychology: A case for belief. In: Connectionism: Debates on Psychological Explanation, eds. MacDonald, C., and MacDonald, G., Oxford: Blackwell, 357-394.
Spector, M. (1965). Models and theories. The British Journal for the Philosophy of Science, reprinted in: Readings in the Philosophy of Science, eds. Brody, B. A., and Grandy, R. E., Englewood Cliffs, NJ: Prentice Hall, 1989, 44-57.
Van Gelder, T. (1990). Compositionality: A connectionist variation on a classical theme. Cognitive Science, 14: 355-384.