Ben Goertzel (1994) Hierarchical Feature Maps and Beyond. Psycoloquy: 5(56) Language Network (2)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).

Book review of Miikkulainen on Language-Network

Ben Goertzel
Computer Science Department
Waikato University
Hamilton, New Zealand


Hierarchical self-organizing feature maps, as implemented in the context of DISCERN, represent an important step toward a computational model of human memory. However, as Miikkulainen (1993) points out, these maps lack the ability to incorporate novel information on an ongoing basis. It is suggested that in order to transcend this limitation the distinction between memory and processing will have to be eliminated in favor of a more system-theoretic point of view.


computational modeling, connectionism, distributed neural networks, episodic memory, lexicon, natural language processing, scripts.


1. Miikkulainen's (1993) DISCERN system is important for three reasons. First, it is a definitive proof that connectionist ideas are adequate for script processing in particular and natural language processing in general. Second, it is a strong piece of evidence in favor of modular design of connectionist networks. And third, it is a significant step toward a model of self-organizing associative memory that is adequate both psychologically and computationally. In this commentary I will focus mainly on the third issue. What does DISCERN tell us about memory? And which of DISCERN's shortcomings can be traced to limitations of its underlying memory model?


2. The DISCERN architecture consists of eight modules, each one implemented according to connectionist principles: a sentence parser and a story parser, a sentence generator and a story generator, a cue former, an answer producer, a lexicon and an episodic memory. These modules act in concert, exchanging information along predefined connections to answer simple questions about simple stories the system has been told. By far the most innovative of the modules is the episodic memory, which incorporates a variation of Teuvo Kohonen's (1988) self-organizing feature map called the hierarchical feature map.

3. Kohonen's feature map is a function from multidimensional memory item space into two-dimensional artificial neural network (ANN) space. It assigns each memory item a neuron in a 2-D ANN in such a way that related memory items will tend to be stored in nearby neurons. However, it does not explicitly incorporate the hierarchical structure that is so prominent both in human memory and in artificial databases. Miikkulainen's hierarchical feature map remedies this deficit; it consists of a hierarchy of coarser and coarser 2-D ANNs. Nodes of networks on higher levels correspond to more abstract concepts; and the children of a node represent specializations of that node's concept.
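The essence of Kohonen's algorithm can be sketched in a few lines. The following Python is an illustrative toy, not Miikkulainen's implementation; the grid size, learning-rate schedule and neighborhood schedule are arbitrary choices made for the sketch:

```python
import numpy as np

def train_som(items, grid=(6, 6), epochs=50, lr0=0.5, sigma0=2.0, seed=0):
    """Minimal 2-D self-organizing feature map in the spirit of
    Kohonen (1988): each grid cell holds a weight vector, and
    related memory items come to occupy nearby cells."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    dim = items.shape[1]
    weights = rng.random((rows, cols, dim))               # one vector per neuron
    coords = np.stack(np.mgrid[0:rows, 0:cols], axis=-1)  # grid positions
    for t in range(epochs):
        lr = lr0 * (1 - t / epochs)                  # decaying learning rate
        sigma = sigma0 * (1 - t / epochs) + 0.5      # shrinking neighborhood
        for x in items:
            # best-matching unit: the neuron whose weights are closest to x
            d = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(d), d.shape)
            # pull the BMU and its grid neighbours toward x
            g = np.exp(-np.sum((coords - bmu) ** 2, axis=-1) / (2 * sigma**2))
            weights += lr * g[..., None] * (x - weights)
    return weights

def locate(weights, x):
    """Map a memory item to its neuron (grid cell)."""
    d = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(d), d.shape)
```

The key property is that the function from item space to grid space is learned rather than designed: neighboring cells end up holding similar weight vectors, so associatively related items are stored in nearby neurons.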


4. It is interesting to compare Miikkulainen's memory design, which combines hierarchical and associative structure, with the "dual network" memory structure proposed by Goertzel (1993, 1994). A dual network is a network of processes which is connected both hierarchically and heterarchically, in the sense that nodes are connected to nodes with which they are associatively related, and also to nodes with which they are hierarchically related. The hierarchical feature map also has this dual structure. However, the two memory designs differ greatly in their dynamics. And these dynamical differences, I suggest, are pertinent to one of the greatest shortcomings of the DISCERN system, which is that, to use Miikkulainen's own words, "there is no mechanism for reorganization to incorporate novel traces" (p. 160).
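The dual structure described above can be made concrete with a small sketch. The names below (DualNode, associates, attach) are hypothetical illustrations, not terminology from Goertzel (1993, 1994); the point is only that each node carries both kinds of links at once:

```python
from dataclasses import dataclass, field

@dataclass
class DualNode:
    """One process in a dual network: linked heterarchically (to
    associatively related nodes) and hierarchically (to a parent
    abstraction and child specializations)."""
    concept: str
    associates: list = field(default_factory=list)   # heterarchical links
    parent: "DualNode | None" = None                 # hierarchical link up
    children: list = field(default_factory=list)     # hierarchical links down

def attach(parent, child):
    """Record a hierarchical (abstraction/specialization) relation."""
    child.parent = parent
    parent.children.append(child)

def associate(a, b):
    """Record a symmetric associative (heterarchical) relation."""
    a.associates.append(b)
    b.associates.append(a)
```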

5. The hierarchical feature map is trained in top-down fashion. That is, the top level map is formed first and the categories implicit in the top level map are used to guide the formation of the map one level down from the top... and so forth until the process reaches the bottom. This is a plausible psychological process, but it is obviously, at most, only half of the story. Very often higher level concepts are recognized as abstractions from lower level concepts. Miikkulainen's design makes no provision for this.
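The top-down construction can be sketched as follows. This is a schematic stand-in, not Miikkulainen's algorithm: simple competitive learning of k prototypes replaces each level's full feature map, and the population threshold for refining a category is an arbitrary choice:

```python
import numpy as np

def competitive_map(items, k, epochs=30, lr=0.2, seed=0):
    """Stand-in for one level's feature map: k prototype vectors
    trained by winner-take-all competitive learning."""
    rng = np.random.default_rng(seed)
    protos = items[rng.choice(len(items), size=k, replace=False)].copy()
    for _ in range(epochs):
        for x in items:
            w = np.argmin(np.linalg.norm(protos - x, axis=1))
            protos[w] += lr * (x - protos[w])   # winner moves toward x
    return protos

def build_hierarchy(items, k, depth):
    """Top-down construction: the top map is trained first, and the
    categories implicit in it guide the training of the maps one
    level down, and so forth until the bottom is reached."""
    protos = competitive_map(items, k)
    node = {"prototypes": protos, "children": []}
    if depth > 1:
        labels = np.argmin(
            np.linalg.norm(items[:, None] - protos[None], axis=2), axis=1)
        for j in range(k):
            subset = items[labels == j]
            if len(subset) > k:      # refine only well-populated categories
                node["children"].append(build_hierarchy(subset, k, depth - 1))
    return node
```

Note that information flows strictly downward: nothing in this construction lets a change among the low-level maps propagate back up to revise the categories above them, which is exactly the limitation at issue.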

6. Miikkulainen's memory training algorithm has the great benefit of being computationally tractable. However, it is precisely because this algorithm is so psychologically artificial that it does not connect naturally with any process of memory reorganization. In this way, it would seem, the inability to deal with novelty is built into the training algorithm.

7. In the dual network model, by contrast, each concept in memory is associated with its own pattern recognition process. When its children change, it changes as well, because it has and uses the capacity for abstraction. And reorganization is accomplished by provisional swapping of nodes with one another. If, after a certain period of time for adaptation has expired, the swap is unsuccessful, then the nodes are swapped back.
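The swap-and-revert dynamic can be sketched as follows. The coherence measure here (feature overlap with linked nodes) is an invented stand-in for a node's pattern-recognition process, and the "period of adaptation" is collapsed into a single before/after comparison:

```python
def coherence(net, node):
    """How well a node fits its neighbours: total feature overlap
    with the nodes it is linked to (a stand-in for the node's own
    pattern-recognition process)."""
    feats = net["features"]
    nbrs = net["links"][node]
    return sum(len(feats[node] & feats[n]) for n in nbrs) / max(len(nbrs), 1)

def provisional_swap(net, a, b):
    """Swap the contents of two nodes provisionally; keep the swap
    only if coherence does not drop, otherwise swap back."""
    feats = net["features"]
    before = coherence(net, a) + coherence(net, b)
    feats[a], feats[b] = feats[b], feats[a]      # provisional swap
    after = coherence(net, a) + coherence(net, b)
    if after < before:                           # unsuccessful: revert
        feats[a], feats[b] = feats[b], feats[a]
        return False
    return True
```

Training and reorganization are here one and the same operation, applied repeatedly; there is no separate, frozen training phase after which the structure can no longer move.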

8. The dual network model has not yet been completely implemented, and thus it cannot be fairly compared with the hierarchical feature map; but the point is that it is possible, at least theoretically, for training and reorganization to be done in a unified way. The key is to override the distinction between memory and processing -- a distinction which DISCERN, despite its connectionist methodology, rigidly upholds.


9. The centerpiece of the DISCERN architecture is the hierarchical feature map. Future implementations, it is hinted, may incorporate hierarchical feature maps in the lexicon as well as the episodic memory. One suspects, however, that these future implementations will still suffer from a fundamental inability to adapt and learn. The problem is the imposition of an artificial distinction between memory, perception and control.

10. This artificial distinction is something which Miikkulainen has borrowed from symbolic, rule-based AI, and imposed on top of lower-level connectionist algorithms. But, although computationally useful, this distinction is not psychologically sound. One suspects that in order to do truly effective natural language processing, one must allow self-organization on the level of networks of intelligent processes, rather than merely on the level of networks of neurons or neuron-like units (Minsky, 1986; Kampis, 1991). It is fair to conclude that, although DISCERN is a large step in the direction of "subsymbolic language processing," it has not completely overcome the psychological errors of symbolic AI.


Goertzel, Ben (1993) The Evolving Mind. New York: Gordon and Breach.

Goertzel, Ben (1994) Chaotic Logic: Language, Thought and Reality from the Perspective of Complex Systems Science. New York: Plenum.

Kampis, George (1991) Self-Modifying Systems in Biology and Cognitive Science. New York: Pergamon.

Kohonen, Teuvo (1988) Self-Organization and Associative Memory. New York: Springer-Verlag.

Miikkulainen, Risto (1993) Subsymbolic Natural Language Processing: An Integrated Model of Scripts, Lexicon and Memory. Cambridge, MA: MIT Press.

Miikkulainen, Risto (1994) Precis of: Subsymbolic Natural Language Processing: An Integrated Model of Scripts, Lexicon and Memory. PSYCOLOQUY 5(46) language-network.1.miikkulainen.

Minsky, Marvin (1986) The Society of Mind. New York: Simon and Schuster.
