Ronan Reilly (1994) DISCERN as a Cognitive Model and Cognitive Modelling Framework. Psycoloquy: 5(78) Language Network (5)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).


DISCERN AS A COGNITIVE MODEL AND COGNITIVE
MODELLING FRAMEWORK
Book Review of Miikkulainen on Language-Network

Ronan Reilly
Dept. of Computer Science
University College Dublin

rreilly@nova.ucd.ie

Abstract

In this review, I evaluate the degree to which I feel Miikkulainen has achieved the goals he set himself in developing DISCERN. On the positive side, I argue that DISCERN is a significant achievement in that it demonstrates the feasibility of building large-scale connectionist systems that exploit the capabilities of distributed representations. The techniques embodied in the FGREP component are important and should prove useful in other applications. On the negative side, however, I consider that DISCERN represents a missed opportunity. I do not believe that scripts tell us much that is useful about real natural language processing, nor do I believe that a system designed to process scripts can be an informative basis for cognitive modelling.

Keywords

computational modeling, connectionism, distributed neural networks, episodic memory, lexicon, natural language processing, scripts.

I. INTRODUCTION

1. Risto Miikkulainen's book, Subsymbolic Natural Language Processing (1993, 1994), describes the results of a research project that is impressive in both its scope and achievement. Miikkulainen sets out to address a number of complementary goals. In the first instance he is interested in building a connectionist NLP system of more than the usual toy size and with capabilities that exploit the properties of distributed connectionist representations. To achieve this he must address the problem of scaling, a problem that has proved particularly limiting for connectionists, though also one from which symbolic AI has not entirely escaped. A second goal that Miikkulainen sets himself is to explain a variety of high-level cognitive phenomena such as word learning, stereotypical or "scripted" language behaviour, and episodic memory, using the sub-symbolic properties of distributed representations. A third goal is to develop a methodology for the design and construction of sub-symbolic cognitive models. In this review I will discuss the degree to which I feel that Miikkulainen has achieved these three goals, concentrating primarily on those relating to cognitive modelling issues.

II. BUILDING A LARGE-SCALE CONNECTIONIST NLP SYSTEM

2. As a competitor of the conventional symbol-based script systems of, say, SAM (Cullingford, 1978) or FRUMP (de Jong, 1979), DISCERN comes in a not too distant third. Although Miikkulainen did not build a system that could out-perform its symbolic competitors, he demonstrates the feasibility of doing so by building one powerful enough for this to seem an attainable goal. DISCERN is capable of processing script-based stories, of paraphrasing them, and of answering questions about them. It has a number of advantages over conventional approaches, which are those that usually accrue to connectionist-based systems employing distributed representations, namely, an ability to generalise and to perform automatic inferencing based on the statistical regularities of the training data. These abilities are difficult to implement in symbolic systems, but "come for free" in connectionist systems. On the other hand, DISCERN, like all connectionist systems, cannot deal effectively with events that are statistically exceptional. Furthermore, DISCERN cannot deal with more than one script being active at a given time, or with the interaction and blending of scripts. While the latter are problems that can be dealt with more readily within a symbolic approach, Miikkulainen argues that DISCERN could be extended to deal with multiple, simultaneously active scripts (p.304). Overall, as a script-processing NLP system, DISCERN is comparable to symbolic-based approaches. However, as I will argue in a later section, this may not be an entirely desirable goal to have aimed for in the first place.

3. In my view, the significant breakthrough that DISCERN represents is the development of techniques for scaling up connectionist systems to a useful level. The overall strategy that Miikkulainen adopts, and one that seems to be biologically plausible, is that of modularity. The task of natural language processing is broken down into a set of autonomous, but interacting sub-tasks. The division of labour in the overall system is along traditional lines: lexicon, sentence parser, story parser.

4. As with all modular approaches, a mechanism is required to allow modules to communicate with each other. In symbolic systems, this is achieved by means of a blackboard or similar mechanism (e.g., Erman, Hayes-Roth, Lesser & Reddy, 1980). In a blackboard system, static representations generated by one module are written to the blackboard where they can be read by another, collaborating, module. In the case of DISCERN, the lexicon serves a purpose roughly equivalent to that of the blackboard. In conventional systems, sharing a lexicon would not take you very far, but when the lexicon comprises a set of distributed concept representations, it is possible to exploit some of the computational features specific to this type of representation. Because the distributed representations are simultaneously created in the service of each module's task, they simultaneously encode information about their role in sentence parsing, story parsing, generation, and so on. This is an elegant way of exploiting one of the most powerful features of distributed representations: their ability to superimpose multiple representations on a single, fixed-width vector of unit activations.

5. This technique of representation-sharing, or FGREP as it is called, is something that could usefully be exploited in a variety of different applications, and is an important result of the DISCERN project.
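The representation-sharing idea in paragraphs 4 and 5 can be illustrated with a minimal sketch. It is not Miikkulainen's implementation: the two tasks ("animate" and "edible"), the four-word vocabulary, the single linear unit per module, and all names and parameter values are assumptions made purely for illustration. What it shows is the general principle that each module, while learning its own task, also carries its error signal back to the input and adjusts the shared lexicon entry, so that one fixed-width vector per word ends up serving both tasks.

```python
import random

DIM = 4  # fixed width of every word vector (an illustrative choice)


class Module:
    """One linear unit standing in for a whole processing module (hypothetical)."""

    def __init__(self, rng):
        self.weights = [rng.uniform(-0.5, 0.5) for _ in range(DIM)]

    def predict(self, vec):
        return sum(w * x for w, x in zip(self.weights, vec))

    def train_step(self, lexicon, word, target, lr=0.1):
        vec = lexicon[word]  # shared entry, modified in place
        error = target - self.predict(vec)
        for i in range(DIM):
            old_weight = self.weights[i]
            # Ordinary delta-rule update of the module's own weight.
            self.weights[i] += lr * error * vec[i]
            # Carry the same error back to the input: change the shared vector.
            vec[i] += lr * error * old_weight


def run_demo(epochs=2000, seed=0):
    rng = random.Random(seed)
    words = ["waiter", "customer", "pasta", "menu"]
    # Every word starts near the same neutral vector.
    lexicon = {w: [rng.uniform(0.4, 0.6) for _ in range(DIM)] for w in words}
    # Two made-up tasks, each trained by a different module on the same lexicon.
    tasks = {
        "animate": (Module(rng),
                    {"waiter": 1.0, "customer": 1.0, "pasta": 0.0, "menu": 0.0}),
        "edible": (Module(rng),
                   {"waiter": 0.0, "customer": 0.0, "pasta": 1.0, "menu": 0.0}),
    }
    for _ in range(epochs):
        for module, targets in tasks.values():
            for word, target in targets.items():
                module.train_step(lexicon, word, target)
    return lexicon, tasks


if __name__ == "__main__":
    lexicon, tasks = run_demo()
    for name, (module, targets) in tasks.items():
        for word in targets:
            print(name, word, round(module.predict(lexicon[word]), 2))
```

Because both modules write to the same four vectors, each final vector carries information relevant to both tasks at once. In this toy setting the lexicon plays the role the review attributes to it: a common medium through which otherwise separate modules communicate.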

III. DISCERN AS A COGNITIVE MODEL

6. Where I have some difficulty is with the status of DISCERN as a cognitive model. There is always a problem in trying to satisfy a number of constituencies simultaneously. On the one hand, Miikkulainen wishes to present DISCERN as a model with a scope and capability on a par with conventional AI models. On the other, he wishes to set his model apart from the conventional symbolic models by claiming a degree of psychological plausibility. This can give rise to problems, and in Miikkulainen's case I believe it has.

7. The script-based approach is one devised and developed within the symbolic paradigm and, as such, carries with it a substantial number of implicit theoretical assumptions that may impede the full exploitation of connectionism, or involve a decomposition of the NLP problem into a form not amenable to a connectionist solution. In the first place, I don't believe that script-based language use is typical of language use in general. Scripts capture one important feature of language use: the way related knowledge is structured into packages and action sequences. However, it can be argued that the central thrust of language-based communication is to convey novelty or deviation from the expected (Miikkulainen makes this very point, p. 253). The kind of language processing that can be captured with scripts is the kind you get for free by virtue of the way information is organised in people's brains. Indeed, the strength of DISCERN is derived from approximating these brain representations. The combination of distributed representations and feature maps permits the filling-in of missing cases in sentences and scripts, without the need to invoke complex rule-based inferencing. Nonetheless, real language use, while built on this foundation, is necessarily more than this. For example, a visit to a restaurant is really only newsworthy if the food has been exceptionally good or bad for the type of restaurant involved, or some incident out of the ordinary occurred during the visit. While the restaurant script provides the background, the foreground information is what is important, and this, by definition, must deviate from the script.

IV. DISCERN AS A METHODOLOGY

8. At the more general level, there is the issue of the architecture of DISCERN. Miikkulainen (p. 271) states that his model is not a hybrid one, but I would argue that to a substantial degree it is. The basic representational elements of DISCERN may be more or less purely connectionist, but many of its architectural features are isomorphic with those of symbolic models. For example, DISCERN distinguishes between types and instances in much the same way a symbolic system does. Furthermore, the way that scripts are represented in DISCERN, as a hierarchy of scripts, tracks, and role-bindings, is similar to how they are handled in symbolic systems. A hybrid approach, even a relatively weak one as instanced in DISCERN, imports a view of language processing that may at best be distorted and at worst erroneous. This is primarily because conventional AI approaches to modelling language processing have serious problems in dealing with semantics and context (Lakoff, 1987; Shanon, 1993). It is by no means clear that a purer, more biologically grounded connectionist approach offers a better alternative, but I believe that it is a potentially more fruitful route to take given the inadequacies of conventional approaches. As part of this alternative strategy, we need to look to the theories and data of cognitive neuroscience and psycholinguistics rather than those of artificial intelligence and formal linguistics.

V. CONCLUSION

9. DISCERN is a significant achievement in that it demonstrates the feasibility of building large-scale connectionist systems that exploit the capabilities of distributed representations. The techniques embodied in FGREP are important and should prove useful in other applications. I also feel, however, that DISCERN represents a missed opportunity. I do not believe that scripts tell us much that is useful about real natural language processing, nor do I believe that a system designed to process scripts can be an informative basis for cognitive modelling.

REFERENCES

Cullingford, R.E. (1978). Script application: Computer understanding of newspaper stories. PhD Thesis, Dept. of Computer Science, Yale University, New Haven, CT. Technical Report 116.

de Jong, G.F. (1979). Prediction and substantiation: A new approach to natural language processing. Cognitive Science, 3, 251-273.

Erman, L.D., Hayes-Roth, F., Lesser, V., & Reddy, D. (1980). The HEARSAY-II speech-understanding system: Integrating knowledge to resolve uncertainty. Computing Surveys, 12, 213-253.

Lakoff, G. (1987). Women, fire, and dangerous things: What categories reveal about the mind. Chicago, IL: University of Chicago Press.

Miikkulainen, R. (1993). Subsymbolic Natural Language Processing: An Integrated Model of Scripts, Lexicon and Memory. Cambridge, MA: MIT Press.

Miikkulainen, R. (1994). Precis of: Subsymbolic Natural Language Processing: An Integrated Model of Scripts, Lexicon and Memory. PSYCOLOQUY 5(46) language-network.1.miikkulainen.

Shanon, B. (1993). The representational and presentational: An essay on cognition and the study of mind. Hemel Hempstead, UK: Harvester Wheatsheaf.
