In his book Learning and Categorization in Modular Neural Networks (1992a), Murre proposes a building block for neural networks based on unsupervised learning. In this commentary, I will present results of research which extends Murre's approach. It will be shown that this new work may solve some of the problems former reviewers have exposed.
1. In his book Learning and Categorization in Modular Neural Networks (1992a), Murre proposes a building block for neural networks which can solve several problems that hamper most artificial neural networks. These problems are lack of speed and stability, the inability to both learn and discriminate between and generalize over patterns, and retroactive interference. Murre gracefully adopts certain known neurobiological principles to design the so-called CALM (Categorization And Learning Module). A CALM module is characterized by a fixed internal structure in which a competitive process is induced by the use of inhibitory links and random activations. This module can then be connected to other modules with modifiable links. Learning thus happens outside of the modules.
2. Most reviewers are annoyed by CALM's shallow ties to biology. Indeed, CALM is not intended as a detailed model of processes and structure in the brain (Murre, 1992a). However, I agree that the motivation for the use of biological principles is rather weak. I think that the tendency of connectionist researchers to use these principles stems from disappointment with the results in the field of symbolic AI. Researchers hungrily looked for other methods in the modelling of cognition. The CALM approach is, I think, better motivated by arguing that serial computational techniques did not give us what was promised and that dynamic, chaotic, and parallel models may be a better option, because cognition usually deals with incomplete information and processes it mostly in a parallel fashion (to mention just a few features). And the brain itself is a parallel working system characterized by chaotic and dynamic processing (e.g., Gray & Singer, 1989; Sompolinsky, Golomb & Kleinfeld, 1991).
3. In this review I will not criticize the psychological theories Murre postulates in his book; rather, my aim is to extend CALM by showing other work on CALM that may remove the uneasy feeling most reviewers had when reading the book. One line of work comprises the design of several principles which may guide the development of multi-module architectures (Happel & Murre, 1994). Another line of work extends CALM to the so-called CALM Map, which is capable of self-organization (Phaf, Tijsseling & Lebert, submitted). This new work may be an improvement of the CALM approach and may prove useful in modelling some cognitive phenomena.
4. In his book, Murre devotes a paragraph (3.5) to self-organization. He extends CALM to CALSOM by introducing graded recurrent lateral inhibition in the cross-weights. Arranged in a one-dimensional array, the cross-weights have their inhibition increased by a constant value. Although there is self-organization in CALSOM, it is not optimal: the number of uncommitted nodes is quite large, including the boundary nodes, and there are also twists in the representations. Phaf, Lebert, and I improve upon this by using a convex inhibition gradient instead of the cross- and down-weights. This gradient is calculated using the following formula:
$$h_{ij} = A \, e^{-\frac{(i-j)^2}{2s^2}} - B$$
where h_ij is the inhibitory weight between the ith R-node and the jth V-node, and i-j is the distance between the two nodes. A > 0, B > 0 (with A < B), and s (sigma) are constants determining the form of the Gaussian. A and B are usually set to 8.8 and 10.0 respectively, and s is dependent on the size of the module according to square root of N divided by N-1 (N is the size of the module), truncated to the nearest half. To allow for a smooth distribution of weights, changes were made in the learning rate mu and the weight from mu to the E-node.
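The gradient can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the implementation used by Phaf et al.: the function name `inhibition_gradient` is hypothetical, nodes are simply indexed 0 to N-1, and s is passed in explicitly rather than derived from the module size.

```python
import math

def inhibition_gradient(n, a=8.8, b=10.0, s=1.0):
    """Return an n x n list of lists with h[i][j] = a * exp(-(i-j)^2 / (2 s^2)) - b.

    a and b default to the values quoted in the text; s is an assumed
    free parameter here (hypothetical default of 1.0).
    """
    if s <= 0:
        raise ValueError("s must be positive")
    return [
        [a * math.exp(-((i - j) ** 2) / (2.0 * s ** 2)) - b for j in range(n)]
        for i in range(n)
    ]

# Print the gradient for a small module of five nodes.
h = inhibition_gradient(5, s=1.0)
for row in h:
    print(" ".join(f"{w:8.3f}" for w in row))
```

Because A < B, every weight is negative: it equals A - B on the diagonal and approaches -B as the distance between the nodes grows, so inhibition is weakest between a node and itself and strongest between distant nodes.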
5. We also consider ring topologies, that is, in a module the first and the last R-nodes are treated as direct neighbours. Many simulations have been conducted to investigate the functioning of CALM Map. These simulations tested the influence of the Euclidean distance between patterns, the overlap between patterns, the size of the module, and the role of interference. Compared to CALM, the results were very promising. Overall, CALM Map performed better and it also exhibited additional features. One of these features is that stretching of representations continued with increasing presentations, whereas CALM would commit itself to a once obtained suboptimal categorization. Another feature is that when the size of the module is much larger (twice the number of patterns (Powers, 1993)), the representations are separated maximally over the entire module such that committed and uncommitted nodes alternate over the nodes of the module. The uncommitted nodes actually interpolated between the representations of the neighbouring nodes (this does not occur in CALM or in Kohonen maps).
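One way a ring topology could enter the inhibition gradient of paragraph 4 is by replacing the linear distance i-j with the shortest distance around the ring. The sketch below assumes exactly that; the names `ring_distance` and `ring_inhibition_gradient` are hypothetical, and s is again an assumed free parameter.

```python
import math

def ring_distance(i, j, n):
    """Shortest number of steps between nodes i and j on a ring of n nodes."""
    d = abs(i - j) % n
    return min(d, n - d)

def ring_inhibition_gradient(n, a=8.8, b=10.0, s=1.0):
    """Gaussian inhibition gradient using ring distance instead of |i - j|."""
    return [
        [a * math.exp(-(ring_distance(i, j, n) ** 2) / (2.0 * s ** 2)) - b
         for j in range(n)]
        for i in range(n)
    ]

# On a ring of eight nodes, node 0 is as close to node 7 as it is to node 1.
h = ring_inhibition_gradient(8)
print(ring_distance(0, 7, 8), h[0][7] == h[0][1])
```

Under this assumption there are no boundary nodes: every node sees the same inhibition profile, merely shifted around the ring.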
6. Finally, CALM Map does not suffer from catastrophic retroactive interference. If a pattern set is learned by a CALM Map network and an interpolated new set is presented afterwards, memory for the first set is not lost. The uncommitted nodes, after learning of the first set, interpolated between the representations of the neighbouring nodes. Patterns of the second set are then immediately committed to nodes with the interpolated representations.
7. CALM Map not only exhibits self-organization, it also solves some problems with CALM. One of these problems occurs when highly correlated patterns are presented as input. Separation of patterns in CALM is a complicated function of the distance between and overlap of different patterns as well as the size of the module and the dimension of the patterns. The problem particularly exists in larger modules and with larger patterns, where even a large Euclidean distance between patterns may not be sufficient for a stable distinct categorization. This property of CALM may be particularly damaging when modular networks are exposed to ecologically plausible stimuli, because such stimuli will tend to contain information on many different attribute dimensions (e.g., Phaf, van der Heijden & Hudson, 1990), of which only a few may differ sufficiently to discriminate stimulus objects. A second problem is the randomness of the search for new representations in CALM. If a new pattern differs sufficiently from the already represented patterns, a new representation is selected at random. Because the separation criterion more or less acts as an absolute threshold, after the initial comparison of all patterns, further presentation of the data set will not increase separation substantially in CALM. CALM Map thus improves upon CALM by offering a solution to these problems through the introduction of an inhibition gradient. More information, together with a description and the results of the simulations, can be found in Phaf et al. (submitted).
8. One of the main criticisms of Murre's book is that there is no mention of how to design architectures built with more modules (see e.g., Aitken, 1993). This is a very serious shortcoming because the main role of CALM is performed within multi-modular architectures. Using multiple modules also avoids the danger of having only localized representations. Within a multi-modular architecture, representations are semi-distributed because representations of input patterns are in fact distributed over as many nodes as the number of modules involved in processing the pattern (Murre, 1992a). Loss of a representation in one module therefore does not have to imply that the pattern is entirely forgotten by the network, because it may still reside in other modules.
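The idea of a semi-distributed code can be illustrated with a toy example. Everything in it is hypothetical: each pattern is represented by one winning R-node per module, the winner indices are made up, and the helper `still_distinct` merely checks whether the patterns remain distinguishable when one module's contribution is lost.

```python
# Hypothetical winner indices: one winning R-node per module, three modules.
codes = {
    "pattern_a": (0, 2, 1),
    "pattern_b": (0, 3, 1),
    "pattern_c": (1, 2, 0),
}

def still_distinct(codes, lost_module):
    """True if all patterns keep a unique code after dropping one module."""
    reduced = [
        tuple(w for m, w in enumerate(code) if m != lost_module)
        for code in codes.values()
    ]
    return len(set(reduced)) == len(reduced)

for m in range(3):
    print(f"module {m} lost -> patterns still distinct: {still_distinct(codes, m)}")
```

With these made-up codes, losing module 0 or module 2 leaves all three patterns distinguishable, whereas losing module 1 merges pattern_a and pattern_b. Loss in one module therefore need not, but can, remove the distinction between patterns.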
9. In Happel and Murre's article (1994), several design principles are given that have been shown to generate networks with a stronger categorization performance compared to single-module networks. These principles are based on those governing the architectural organization of the brain, such as the hierarchical layered structure, the presence of multiple processing pathways, and the formation of neural assemblies. The first principle, the principle of structural compatibility, states that learning and categorization improve as the induced categories are more compatible with the cluster structure of the task domain. For example, a coarse categorization in a small module can interactively facilitate a more fine-grained categorization in a larger module, when the input patterns are hierarchically organized in clusters containing smaller subclusters.
10. The replication-of-structure principle maintains that the use of multiple pathways, that is, the use of more equivalent subsystems, improves performance. This is because categorization in a subsystem that is compatible with the induced cluster structure proceeds faster than incompatible categorizations in other pathways. As a result, the compatible subsystems determine the overall categorization in converging architectures. According to the last principle, the principle of recurrence, even better processing is gained by the installation of recurrent, convergent connections. In this way, fast, optimal clustering in one pathway can correct slower (and initially suboptimal) clusterings in other pathways.
11. Happel and Murre (1994) explain these principles in terms of circuits of R-nodes. Each pattern presentation results in the network converging on a set of activated R-nodes that are connected by learning connections. Learning ensures the formation of stable circuits of associated R-nodes. They refer to Hebb's (1949) idea of neural assemblies, namely association circuits as major determinants of neural coding. The initial modular architecture of a network implements a set of assemblies that can induce a more optimal categorization structure and suppress the formation of undesirable structures (Happel & Murre, submitted). There is much more to these principles, but I refer to Happel and Murre (1994) for more information. Tijsseling (1994) also offers a deeper analysis.
12. There is another principle guiding learning and categorization in multi-modular architectures, developed by Happel and Murre (submitted). This principle of dynamic, chaotic activation focuses on the oscillatory behavior that arises when many interacting modules are used. The activation of V-nodes and the subsequent inhibition of the R-nodes are relatively slow processes and consequently implement an oscillatory mechanism. Happel and Murre show with a series of impressive analyses that these oscillations evoke chaotic regimes in the activity evolution of a network. The dynamic representations in these networks are represented as space-time patterns of activity and show a fractal boundary structure.
13. According to Happel and Murre, chaos in a neural network serves several purposes. Two functions are that of a novelty filter and of explorative deterministic noise. In response to new stimuli, chaotic activity arises in a network, whereas for familiar input, stable limit cycles occur. Chaotic activation may also serve as an autonomous search through previously learned information, and as a mechanism for the exploration and formation of new representations (Happel & Murre, 1994). Another role chaos plays is that of a fundamental form of neural activity that provides continuous, sequential access to memory patterns, independent of the initial state of the network. This enables the autonomous generation of a sequence of different activity patterns in response to a sequence of inputs, without the necessity to reset the initial state. Finally, chaos also serves as a mechanism that underlies the formation of complex representations. Oscillatory dynamics in the multi-modular network induce a winner-take-all competition resulting in the classification and expression of complex temporal activity patterns in terms of readily interpretable, unitary convergences. The representations arising from this must be a composition of the attractor basins of various stimulus-invoked oscillatory patterns, resulting in the formation of complex, fractal category structures.
14. In this review, I have presented new lines of research which extend the CALM paradigm described in Murre's book. I have not thoroughly described these extensions -- I refer to the original articles -- but I wished only to introduce them and to show that they may mitigate several criticisms put forward by other reviewers, one of which is the formation of multi-modular networks. The self-organizing version of CALM also exhibits several new features making it a more attractive means to model cognition. My hope is that it is at least clear that we have certainly not finished with CALM.
Aitken, A.M. (1993) Have module, need architecture! PSYCOLOQUY 4(47) categorization.10.
Gray, C.M. & Singer, W. (1989) Stimulus-specific neuronal oscillations in orientation columns of the cat visual cortex. Proceedings of the National Academy of Sciences, USA, 86, 1689-1702.
Happel, B.L.M. & Murre, J.M.J. (1994) Design and evolution of modular neural network architectures. Neural Networks, Vol. 7, No 6/7, 985-1004.
Happel, B.L.M. & Murre, J.M.J. (submitted) Evolving complex dynamics in modular interactive neural networks.
Hebb, D.O. (1949) The organization of behaviour. New York: Wiley.
Murre, J.M.J. (1992a) Learning and Categorization in Modular Neural Networks. UK: Harvester/Wheatsheaf; US: Erlbaum.
Murre, J.M.J. (1992b) Precis of: Learning and Categorization in Modular Neural Networks. PSYCOLOQUY 3(68) categorization.1.murre.
Powers, D.M.W. (1993) CALM, Chaos and surprise! PSYCOLOQUY 4(36) categorization.7.
Phaf, R.H., van der Heijden, A.H.C. & Hudson, P.T.W. (1990) SLAM: A connectionist model for attention in visual selection tasks. Cognitive Psychology, 22, 273-341.
Phaf, R.H., Tijsseling, A.G. & Lebert, E. (submitted) Self-organizing CALM Maps.
Sompolinsky, H., Golomb, D. & Kleinfeld, D. (1991) Cooperative dynamics in visual processing. Physical Review A, Vol. 43, No. 12, 6990-7011.
Tijsseling, A.G. (1994) A hybrid framework for categorization. Unpublished Master Thesis, Utrecht University, The Netherlands.