Risto Miikkulainen (1995) Symbolic and Subsymbolic Cognitive Science. Psycoloquy: 6(04) Language Network (13)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).

SYMBOLIC AND SUBSYMBOLIC COGNITIVE SCIENCE
Reply to Dror & Young on Language-Network

Risto Miikkulainen
Department of Computer Sciences
The University of Texas at Austin
Austin, TX 78712

risto@cs.utexas.edu

Abstract

Symbolic and subsymbolic cognitive science can be seen not as competing but as complementary approaches, serving different roles. Even though they are perhaps based on incompatible foundations, symbolic research can serve as a guideline for developing subsymbolic models, pointing out ways in which a large cognitive process could be broken apart and made tractable with current techniques.

Keywords

computational modeling, connectionism, distributed neural networks, episodic memory, lexicon, natural language processing, scripts.

I. INTRODUCTION

1. By their very nature, the symbolic and subsymbolic approaches to cognitive science appear to be incompatible. The main difference is that symbolic representations, such as Lisp structures, are concatenative: it is possible to access and change them part by part. On the other hand, distributed representations, such as associations stored in the weights of a backpropagation network, cannot be modified without affecting all other information in the network (see also van Gelder, 1990). This leads to very different learning and performance properties for the two approaches. Symbolic systems tend to be better at processing structure and building abstractions, whereas neural networks naturally discover surface-level regularities and perform robustly under minor variations.
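The contrast can be illustrated with a minimal sketch. It rests on assumptions that are not part of the original discussion: a nested Python list stands in for a Lisp structure, and a small single-layer linear associator trained with the delta rule stands in for the backpropagation network; the names recall and delta_step and the toy patterns are hypothetical.

```python
# Symbolic, concatenative structure: each part can be reached and
# replaced on its own (a Python list stands in for a Lisp structure).
event = ["ate", ["agent", "john"], ["patient", "apple"]]
event[2][1] = "pear"           # change only the patient filler
agent_after_edit = event[1]    # the agent part is untouched


def recall(weights, x):
    """Output of a linear associator: one weighted sum per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def delta_step(weights, x, target, rate=0.5):
    """One delta-rule update; every weight on an active input changes."""
    out = recall(weights, x)
    for i, row in enumerate(weights):
        for j in range(len(row)):
            row[j] += rate * (target[i] - out[i]) * x[j]


# Distributed storage: two associations share one weight matrix.
weights = [[0.0, 0.0, 0.0] for _ in range(2)]
x_a, t_a = [1.0, 1.0, 0.0], [1.0, 0.0]   # association A
x_b, t_b = [0.0, 1.0, 1.0], [0.0, 1.0]   # association B (overlaps with A)

delta_step(weights, x_a, t_a)
recall_a_before = recall(weights, x_a)   # [1.0, 0.0]: A is stored exactly

delta_step(weights, x_b, t_b)            # store B in the same weights
recall_a_after = recall(weights, x_a)    # [0.75, 0.5]: A has been disturbed

print(agent_after_edit, recall_a_before, recall_a_after)
```

Editing the patient filler leaves the agent part exactly as it was, whereas storing association B shifts the recall of association A, because both are superimposed on the same weights.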

2. It may be that eventually all of cognition can be understood in terms of neural processes operating at the subsymbolic level in the brain. However, this would by no means render the symbolic approach irrelevant at this point. I agree with the possibility Dror and Young (1994) outline in their review of Subsymbolic Natural Language Processing (Miikkulainen, 1993; 1994), namely that the two approaches may co-exist for a long time in cognitive science, serving distinctly different roles. An often-used analogy is that of Newtonian physics and relativity: It is sometimes necessary to take into account the low-level neural mechanisms in explaining a particular phenomenon, whereas in other cases a higher-level symbolic description is a sufficient approximation and a more elegant and clear way of describing the process.

II. SYMBOLIC AND SUBSYMBOLIC PROCESSES

3. Humans appear to have two types of processes at their disposal for performing cognitive tasks (see also Smolensky, 1988). It is possible to solve problems by conscious reasoning, following rules and algorithms. On the other hand, much of everyday processing appears to be immediate and intuitive, based on past experience. Often while a person is learning a task, such as driving a car, he goes through a phase of conscious application of rules. After the skill has been perfected, the rules disappear and performance becomes intuitive.

4. When the cognitive process is based on conscious rule application, the symbolic approach is an elegant way to model it. The knowledge structures and algorithms can be made explicit, and perhaps even used to teach such skills to people. It may well be that these tasks turn out to be implemented by subsymbolic processes when we really look, but the subsymbolic properties do not play a significant role in these behaviors. As long as our goal is to understand the behavior, or perhaps replicate it on a computer or in other people, symbolic descriptions are more appropriate than neural ones.

5. However, there are also processes whose foundations are clearly subsymbolic. Most of everyday language processing appears to be this way: it is based very strongly on associations that are opaque and immediate. Processing appears effortless and intuitive. These properties are very difficult to capture in the symbolic framework. It seems that a complete account of linguistic performance will have to include subsymbolic mechanisms.

6. Script processing is a case in point. Although scripts can be approximated with symbolic structures such as causal chains (Schank and Abelson, 1977), such systems cannot capture the full power of the idea. Scripts are simply regularities in experience; they can be of any length in time, and of varying strength. Any co-occurrence of events forms a basis for a script. Some of these associations are reinforced more than others and become stronger, while other, accidental associations die away. Such a generalization of the script idea can be implemented very naturally in neural networks, and was the main motivation for building the DISCERN system.
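The idea of scripts as reinforced and decaying regularities can be made concrete with a toy sketch. This is not the DISCERN mechanism, which uses distributed neural networks; it is a simple table of pairwise association strengths, and the function update_associations, the gain and decay values, and the restaurant episodes are all assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def update_associations(strengths, episode, gain=1.0, decay=0.5):
    """Decay every stored association, then reinforce each ordered
    pair of events that co-occurs in the current episode."""
    for pair in list(strengths):
        strengths[pair] *= decay
    for earlier, later in combinations(episode, 2):
        strengths[(earlier, later)] += gain


strengths = defaultdict(float)
episodes = [
    ["enter", "sneeze", "order", "eat", "pay"],  # "sneeze" is accidental
    ["enter", "order", "eat", "pay"],
    ["enter", "order", "eat", "pay"],
    ["enter", "order", "eat", "pay"],
]
for episode in episodes:
    update_associations(strengths, episode)

regular = strengths[("enter", "order")]      # 1.875: reinforced every time
accidental = strengths[("enter", "sneeze")]  # 0.125: seen once, then decays
print(regular, accidental)
```

Under these assumptions, the recurring sequence of events grows into a strong association while the one-off co-occurrence fades, without any script structure being specified in advance.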

III. ARE SYMBOLIC RESULTS USEFUL IN SUBSYMBOLIC RESEARCH?

7. Dror and Young claim, however, that any idea that comes from the symbolic approach, such as scripts or the lexicon, even when implemented in terms of subsymbolic processes, limits the exploration of a new foundation for cognitive science based on neural networks. I would like to argue that although this may be true philosophically, in the practice of cognitive science, using the results of the symbolic approach to guide subsymbolic research is an effective way to make progress.

8. Completely abandoning our old ideas about how cognition is put together and rebuilding everything on connectionist foundations is a mighty task. Computationally, several small experiments, such as learning past tense forms (Rumelhart and McClelland, 1986) and pronunciation (Sejnowski and Rosenberg, 1987), indicate that a revolution might indeed be under way, but it is far from clear whether these unstructured, completely distributed systems will scale up. When it comes to building models of high-level cognitive processes, the "connectoplasm" approach breaks down. We simply do not at present have the techniques for learning and self-organization that would give us a working model of, for example, sentence processing without building in at least some constraints.

9. The "pure" subsymbolic approach might work if we were able to build a model of the entire cortex, including just the right mechanisms and structures, and feed it the entire human learning experience. Even then the computational task would be prohibitive; it is probably no accident that it takes years for a human to learn a language. Therefore, we need shortcuts. We need ideas on how to build a cognitive architecture, how to break it down into manageable parts.

10. This is where the symbolic approach comes in. Much of the symbolic research has concentrated on building taxonomies and outlining the components of the cognitive system, and it could give valuable insight into how a connectionist model should be structured. Many of the symbolic models are strongly supported by psychological research, especially at the high level. For example, scripts have been studied extensively in experiments. It would seem ill-advised to throw that idea away in a connectionist language processing system.

11. However, I do agree with Dror and Young that we should not go too far in imitating symbolic models. Blindly reimplementing symbolic architectures is unlikely to lead to new insights about cognition (Touretzky and Hinton, 1988). Concepts from symbolic research should be used in terms natural to subsymbolic systems. For example, the distinction between syntactic and semantic constraints would be very difficult to maintain in a connectionist system, where all constraints are represented as regularities in the co-occurrence of words. It would therefore be a bad idea to try to enforce that distinction in the neural network architecture.

12. Also, as stronger learning algorithms become available, it may be possible to build models with fewer assumptions. For example, the current DISCERN model is very strongly influenced by the idea of a script as a slot-filler structure. The correct slots and fillers are enforced by the teacher during training. Perhaps in the future it will be possible to reimplement DISCERN with stronger self-organization algorithms that allow the system itself to discover what the useful slots are, and in this way gradually make progress towards models that are less and less constrained by symbolic concepts. In this sense, I think of the results of symbolic research as crutches rather than obstacles in the exploration of connectionist cognitive models.

REFERENCES

Dror, I.E. and Young, M. (1994) The Role of Neural Networks in Cognitive Science: Evolution or Revolution? PSYCOLOQUY 5(79) language-network.6.dror.

Miikkulainen, R. (1993) Subsymbolic Natural Language Processing: An Integrated Model of Scripts, Lexicon, and Memory. Cambridge, MA: MIT Press.

Miikkulainen, R. (1994) Precis of: Subsymbolic Natural Language Processing: An Integrated Model of Scripts, Lexicon and Memory. PSYCOLOQUY 5(46) language-network.1.miikkulainen.

Rumelhart, D.E. and McClelland, J.L. (1986) On learning past tenses of English verbs. In D.E. Rumelhart and J.L. McClelland, editors, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 2: Psychological and Biological Models, 216-271. Cambridge, MA: MIT Press.

Schank, R.C. and Abelson, R.P. (1977) Scripts, Plans, Goals, and Understanding: An Inquiry into Human Knowledge Structures. Hillsdale, NJ: Erlbaum.

Sejnowski, T.J. and Rosenberg, C.R. (1987) Parallel networks that learn to pronounce English text. Complex Systems, 1:145-168.

Smolensky, P. (1988) On the Proper Treatment of Connectionism. Behavioral and Brain Sciences, 11:1-74.

van Gelder, T. (1990) Compositionality: A Connectionist Variation on a Classical Theme. Cognitive Science, 14:355-384.

Touretzky, D.S. and Hinton, G.E. (1988) A distributed connectionist production system. Cognitive Science, 12:423-466.
