Risto Miikkulainen (1995) Computational Constraints and the Role of Scripts in Story Understanding. Psycoloquy: 6(03) Language Network (12)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).

COMPUTATIONAL CONSTRAINTS AND THE
ROLE OF SCRIPTS IN STORY UNDERSTANDING
Reply to Reilly on Language-Network

Risto Miikkulainen
Department of Computer Sciences
The University of Texas at Austin
Austin, TX 78712

risto@cs.utexas.edu

Abstract

The computational constraints obtained from symbolic models (as well as large, integrated connectionist models) can serve to guide research in subsymbolic cognitive science where neuroscience and psycholinguistics do not provide enough data. One such idea is that of scripts, which turns out to have a natural and powerful implementation in the subsymbolic framework as regularities in experience. Scripts serve an important role in story processing in that they allow us to leave out details and focus on the important events. Story processing beyond scripts, however, seems to involve mechanisms whose connectionist implementation would require a high-level monitor on top of the statistical components.

Keywords

computational modeling, connectionism, distributed neural networks, episodic memory, lexicon, natural language processing, scripts.

I. INTRODUCTION

1. One of the goals of DISCERN (Miikkulainen, 1993; 1994) was to show the AI community that connectionist models are capable of large-scale natural language processing (NLP) at the level of symbolic models. Reilly (1994) argues that this goal has directed the model away from cognitive neuroscience and psycholinguistics, which constitute a better guideline for the connectionist approach, and in that sense DISCERN "represents a missed opportunity".

II. NEURAL AND COMPUTATIONAL CONSTRAINTS

2. I do agree that the primary motivation and inspiration for connectionist models has to originate from neuroscience and psycholinguistics. However, the constraints that can be obtained that way are very sporadic. They may be enough to build connectionist models for isolated, low-level tasks (such as recognition of words vs. nonwords, learning past tense forms or pronunciation, and storage and retrieval of random items from memory). But if the goal is to understand how larger sections of the cognitive system work together in a task such as processing multi-sentence text, such constraints are hardly enough. Once the psycholinguistically best-understood subprocesses such as lexical access and regularity-based inference have been built, several computational issues still remain that need to be resolved before we have a working model.

3. It is here that the ideas from symbolic Artificial Intelligence (AI) are most useful. They should not always be taken literally, but as a guideline, a source of ideas about how to bridge the gaps in our knowledge about the computational processes in the brain. Implementing some of these ideas may take some work, but if they turn out to be successful, they lead to hypotheses and predictions that can be tested in experimental cognitive neuroscience and psycholinguistics, and thereby result in a better understanding of the cognitive system. Therefore, I would like to argue that it is precisely in these areas, where the ideas come not from neuroscience but from AI, that connectionist modeling can potentially have the largest impact.

III. CONSTRAINTS FROM INTEGRATED MODELS

4. One can argue that perhaps we should not be building models of such large scale if there are not enough solid constraints from neural and related sciences. However, it is very important to test the models of the individual components, such as the lexical system, in a larger context. If the model is based on only a few psycholinguistic phenomena, it may do a very good job in explaining certain idiosyncrasies of the task and do it in a plausible manner, but it may not be able to support the general task of language processing.

5. For example, the simple recurrent network (SRN) models that are trained to predict the next word in the sentence (e.g., Elman, 1990) are great at discovering regularities in the linguistic input, but they can hardly play a useful role in the parsing system. To test whether language understanding can be based on such processes, it is necessary to try to build a larger system of natural language understanding with the SRN as one component. This was one of the main sources of motivation for DISCERN.
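For readers unfamiliar with the SRN, the following minimal sketch shows its defining mechanism: the hidden layer is copied back as context and combined with the next input word, and the output is a distribution over the next word. It is an illustration only, not code from Elman (1990) or from DISCERN. The class name, layer size, toy vocabulary and random (untrained) weights are all assumptions made for the example, and training by backpropagation is omitted.

```python
import math
import random


class SimpleRecurrentNetwork:
    """Untrained Elman-style network (hypothetical illustration).

    The hidden layer is copied back as context for the next input word.
    """

    def __init__(self, vocab, hidden_size=8, seed=0):
        rng = random.Random(seed)
        self.vocab = list(vocab)
        self.hidden_size = hidden_size
        n = len(self.vocab)

        def matrix(rows, cols):
            # Small random weights; a real model would learn these.
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = matrix(hidden_size, n)             # input -> hidden
        self.w_ctx = matrix(hidden_size, hidden_size)  # context -> hidden
        self.w_out = matrix(n, hidden_size)            # hidden -> output
        self.context = [0.0] * hidden_size

    def reset(self):
        """Clear the context at the start of a new sentence."""
        self.context = [0.0] * self.hidden_size

    def step(self, word):
        """Consume one word; return a distribution over the next word."""
        idx = self.vocab.index(word)
        hidden = []
        for h in range(self.hidden_size):
            total = self.w_in[h][idx]  # one-hot input selects one column
            total += sum(w * c for w, c in zip(self.w_ctx[h], self.context))
            hidden.append(math.tanh(total))
        self.context = hidden  # copy-back: the network's only memory
        scores = [sum(w * h for w, h in zip(row, hidden))
                  for row in self.w_out]
        peak = max(scores)  # subtract the maximum for a stable softmax
        exps = [math.exp(s - peak) for s in scores]
        norm = sum(exps)
        return {w: e / norm for w, e in zip(self.vocab, exps)}


net = SimpleRecurrentNetwork(["the", "boy", "ate", "pizza"])
for word in ["the", "boy", "ate"]:
    prediction = net.step(word)  # next-word distribution after each word
```

Because the prediction depends on the context as well as the current word, the same word can yield different predictions at different points in a sentence. This sensitivity to sequential regularities is what the paragraph above credits to SRNs; whether it suffices for parsing is the separate question raised there.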

IV. SCRIPTS IN STORY UNDERSTANDING

6. Reilly also points out that script processing is only part of understanding stories, and does not capture what is really important in the story. What makes a story worth telling is a deviation from the ordinary; if the story consists of just one script sequence after another, it is not interesting.

7. Although this is certainly true, script processing is essential and fundamental in any understanding of natural language. It is the foundation on which the story representation is built, and it allows us to focus on the interesting events (this was indeed one of the principles of the original symbolic script processing systems; Schank and Abelson, 1977). Because we as readers share a lot of the same scripts, it is not necessary to repeat all the events every time a script occurs. They can be filled in automatically, and the actual text of the story can concentrate on the interesting parts. To understand how people process stories, we have to understand how they learn and use script-based knowledge.

8. There is a good amount of psychological evidence on script processing. People generally agree upon the characters and events and event order, and they tend to remember stories according to the script rather than what was actually said (Bower et al., 1979). The view of scripts as rather rigid chains of events, however, is an artifact of the symbolic implementation and does not seem to match psychological reality. People form new scripts all the time, based on regularities in experience. Any number of co-occurring events can form a script, and their associations can be of varying strengths.

9. It turns out that this more general view of scripts as regularities in experience is consistent with the neural network approach, and much more natural to implement than scripts as rigid causal chains. This is an important distinction: scripts in DISCERN, for example, are not just reimplementations of symbolic structures, but are based on a fundamentally different way of representing information. Scripts are a good example of the principle I put forth above: The results of symbolic research can be used as a guideline, especially where they are psychologically valid, but they do not have to limit our imagination or lead to implausible architectures and solutions.
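The idea of scripts as graded regularities, rather than fixed chains, can be made concrete with a small sketch. It is deliberately not how DISCERN works: DISCERN uses distributed representations and learned pattern transformations, whereas this example simply counts co-occurrences of event labels. The function names, the toy episodes and the 0.8 threshold are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def learn_associations(episodes):
    """Estimate P(event b occurs | event a occurs) from co-occurrence counts."""
    event_counts = Counter()
    pair_counts = Counter()
    for episode in episodes:
        events = set(episode)
        event_counts.update(events)
        for a, b in combinations(sorted(events), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    return {
        (a, b): count / event_counts[a]
        for (a, b), count in pair_counts.items()
    }


def fill_in(mentioned, strengths, threshold=0.8):
    """Return unmentioned events strongly associated with every mentioned one."""
    mentioned = set(mentioned)
    candidates = {b for (a, b) in strengths if a in mentioned} - mentioned
    return sorted(
        b for b in candidates
        if all(strengths.get((a, b), 0.0) >= threshold for a in mentioned)
    )


# Hypothetical experience: three restaurant visits and one shopping trip.
episodes = [
    ["enter", "order", "eat", "pay", "leave"],
    ["enter", "order", "eat", "pay", "tip", "leave"],
    ["enter", "order", "eat", "pay", "leave"],
    ["enter", "browse", "pay", "leave"],
]
strengths = learn_associations(episodes)
inferred = fill_in(["order", "eat"], strengths)
```

With these toy episodes, a story that mentions only ordering and eating is completed with entering, leaving and paying, while tipping is left out because its association with ordering is only 1/3. No script is declared anywhere: the restaurant pattern emerges from the counts, and its associations have varying strengths, which is the property the paragraph above attributes to scripts as regularities in experience.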

V. BEYOND SCRIPTS IN NEURAL NETWORKS

10. However, as Reilly points out, DISCERN does not go beyond scripts, and therefore is not a complete model of human story processing. How to extend the subsymbolic approach beyond scripts is currently an open question. It seems that the simple pattern transformation mechanisms that work so well with scripts may not be extensible to processing deviations and other less regular information. The main problem is that even though DISCERN is able to deal with high-level language, its processes still consist simply of a series of reflex responses.

11. It seems that language understanding at the level of interesting stories would require an internal control, or a "conscious" monitor that follows what the system is doing, detects deviations from the ordinary, and builds hypotheses about characters, plans, goals, failures, plot twists and so on. No such systems have been built in the subsymbolic framework so far (although the rudimentary control module of the SPEC parsing system (Miikkulainen, 1995) is a start in this direction). I believe the next generation of subsymbolic NLP models will have to address the issue of high-level control, although at the low level they may still be based on pattern transformation mechanisms and architectures such as those in DISCERN.

REFERENCES

Bower, G.H., Black, J.B. and Turner, T.J. (1979) Scripts in memory for text. Cognitive Psychology, 11:177-220.

Elman, J.L. (1990) Finding structure in time. Cognitive Science, 14:179-211.

Miikkulainen, R. (1993) Subsymbolic Natural Language Processing: An Integrated Model of Scripts, Lexicon, and Memory. Cambridge, MA: MIT Press.

Miikkulainen, R. (1994) Precis of: Subsymbolic Natural Language Processing: An Integrated Model of Scripts, Lexicon, and Memory. PSYCOLOQUY 5(46) language-network.1.miikkulainen.

Miikkulainen, R. (1995) Subsymbolic Processing of Embedded Structures. PSYCOLOQUY 6(02) language-network.11.miikkulainen.

Reilly, R. (1994) DISCERN as a Cognitive Model and Cognitive Modelling Framework. PSYCOLOQUY 5(78) language-network.5.reilly.

Schank, R.C. and Abelson, R.P. (1977) Scripts, Plans, Goals, and Understanding: An Inquiry into Human Knowledge Structures. Hillsdale, NJ: Erlbaum.
