Nigrin's 1993 book provides a very readable and unusually broad perspective on neural network models. Unlike most neural network treatments, Nigrin's begins with a top down analysis of requirements. Throughout the rest of his book, he discusses how his SONNET networks attempt to meet those requirements. This review concentrates on the top down approach and on how realistic Nigrin's models are in meeting the requirements he has identified.
1. The traditional symbolic approach to Artificial Intelligence may be described as a top down design process. Specific objectives, typically related to thinking, are stated, and then through a process of repeated refinement an attempt is made to design an artificial implementation. You may not agree with the design, but at least you can see that there is one, and that it has some sense of organization and structure.
2. By contrast, connectionist designs have been bottom up. Roughly speaking, you throw some artificial neurons in a black box, shake them around, and hope that you finish up with cognition. Such a description of connectionism is, of course, excessively harsh. But it remains a problem with neural network models that they are largely based on a bottom up design process.
3. In a bottom up design, it is hard to see the big picture. You can see little bits and pieces, but the explanation is somehow not very satisfying. There is a lack of apparent overall organization. The Fodor and Pylyshyn critique of connectionism (1988) is concerned with this lack of organization. Fodor and Pylyshyn rightly point to the need for a systematic approach in the representation of knowledge. They can see a systematic approach in traditional AI, and in the use of linguistic representations suggested by some cognitive scientists. But no such systematic approach is immediately apparent in connectionist systems, with their bottom up origins.
4. Along comes Albert Nigrin with his new book (1993), and in it he attempts a top down analysis. In keeping with connectionist tradition, the major problem to be solved is not "thinking," but pattern recognition. Nigrin begins with the reasonable assumption that pattern recognition is an important component of cognition; he assumes that a neural network is the way to implement this component. In his first chapter, he discusses the biological origin of natural neural networks and uses this as a basis for a top down analysis of pattern recognition in order to establish the requirements for an adequate neural network. Later chapters are concerned with the details of a bottom up implementation; throughout, Nigrin compares the implementation results with the requirements he has established in his top down analysis. The requirements for perception (input patterns) and motor action (output patterns) are seen as the most urgent problems of cognition.
5. In this review I will examine the requirements Nigrin has established with his top down analysis. I will be concerned with the question of whether they are valid, and with how well his SONNET 1 and SONNET 2 networks meet them. A pattern recognizer could be important for cognitive modeling, and it could also be important for industrial applications. The requirements for industrial applications might be quite different from the requirements for cognitive modeling. I shall be concerned only with the suitability of Nigrin's work for cognitive modeling.
6. Nigrin lists his requirements in chapter 1. I list them here for reference. I have followed Nigrin's numbering, and will mainly refer to these requirements by number.
(1) Self organize using unsupervised learning.
(2) Form stable category codes.
(3) Operate in the presence of noise.
(4) Operate in real time.
(5) Perform fast and slow learning.
(6) Scale well to large problems.
(7) Use feedback expectancies to bias classifications.
(8) Create arbitrarily coarse or tight classifications that are distortion insensitive.
(9) Perform context sensitive recognition.
(10) Process multiple patterns simultaneously.
(11) Combine existing representations to create categories for novel patterns.
(12) Perform synonym processing.
(13) Unlearn or modify categories when necessary.
7. How realistic are these assumptions? It seems to me that Nigrin has made a very good start here. That is, in the large, his assumptions appear close to correct, but I do have some disagreements either with his assumptions or with the manner in which he has implemented them. Even where I am critical, however, we should recognize the value of Nigrin's work in providing requirements that can be criticized.
8. I start with the basic principles on which his SONNET systems rest. Nigrin considers spatial patterns and temporal patterns separately. Visual object recognition is a typical application of spatial pattern recognition, and word recognition in speech is an obvious application of temporal pattern recognition. Nigrin's approach is first to develop mechanisms for spatial pattern recognition and then to adapt these to the task of temporal recognition by converting a temporal pattern into a spatial pattern.
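The conversion of a temporal pattern into a spatial one can be illustrated with a minimal sketch. This is not Nigrin's mechanism: the function name temporal_to_spatial, the decay parameter and the fixed alphabet are all assumptions made here for illustration. Each incoming item is stored at full strength while all earlier traces fade by a constant factor, so that order is encoded as a gradient of activation across the item nodes.

```python
def temporal_to_spatial(sequence, alphabet, decay=0.5):
    """Map a sequence of symbols to one activation value per symbol in `alphabet`.

    Hypothetical illustration only: `decay` and the dictionary of traces
    are assumptions, not part of the SONNET equations.
    """
    activation = {symbol: 0.0 for symbol in alphabet}
    for symbol in sequence:
        # Every existing trace fades by the same factor at each time step.
        for key in activation:
            activation[key] *= decay
        # The newly arrived item is stored at full strength.
        activation[symbol] = 1.0
    # The spatial pattern is the vector of activations in alphabet order.
    return [activation[symbol] for symbol in alphabet]


print(temporal_to_spatial("AB", "ABC"))
print(temporal_to_spatial("BA", "ABC"))
print(temporal_to_spatial("ABA", "ABC"))
```

In this naive scheme "AB" and "BA" yield different spatial patterns, but a repeated item simply overwrites its earlier trace, so "ABA" becomes indistinguishable from "BA". This is one way to see why the recognition of ABA, which comes up again in connection with Chapter 6, is a genuine difficulty for any scheme of this kind.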
9. It is tempting to say that there is something wrong with basing temporal pattern recognition on spatial pattern recognition. Temporal patterns are associated with actions of an organism or a predator, and for our early vertebrate ancestors, actions must have been of primary importance. However, it is likely that Nigrin's description of temporal recognizers in terms of spatial recognizers is nothing more than a pedagogical choice. Indeed this is apparently confirmed in Chapter 6, where he instead discusses temporal recognition first, and then presents his description of spatial recognition in terms of temporal recognition.
10. Requirements (1), (2), (5) and (13) form a related set dealing with the manner and rate of learning. Requirement (1) is that the network should use unsupervised learning. There is no special training set of patterns to initialize the network, and learning is continuous throughout the network's operation. As Nigrin points out, given a system with unsupervised learning, supervised learning and reinforcement learning can be constructed on top of the basic learning system. Nigrin's emphasis on unsupervised learning is quite appropriate. It is difficult to understand the breadth of human knowledge on the basis of other approaches to learning. But it is important not to carry this principle too far. The human undergoes a period of supervised learning, which we refer to as childhood and school. During childhood, neural reorganization of the brain appears to occur at a particularly high rate. Likewise, we learn most rapidly when highly motivated, and our motivations could be considered to provide a form of reinforcement learning, where the source of reinforcement is internal.
11. This brings us to requirement (5), according to which both slow and rapid learning are possible. In Nigrin's models, the learning rate is a parameter in the learning equations. Certainly humans exhibit slow and rapid learning. But is this due to differential learning rates with unsupervised learning? Or is it that rapid learning is a form of reinforcement learning? I suggest that the latter is closer to the correct explanation, with internal motivations usually providing the reinforcement.
12. Requirement (2) is that the network should form stable category codes. This requirement is closely related to the problem of learning rates. If rapid learning is occurring, a pattern which had previously been learned could easily be unlearned. However, if learning is slow, then the rate at which unlearning occurs would also be slow, so categories would already be relatively stable, subject only to slow evolution.
13. Tied in with this is requirement (13), that categories be unlearned or modified as needed. This SONNET 1 fails to achieve. And no wonder, for requirement (13) contradicts requirement (2). By including support for the fixation of categories, Nigrin has apparently eliminated the possibility that they can be changed.
14. Let us consider an example from real life. When a child first learns to walk, he is quite small. In later years, as an adult, he is much heavier, and his weight distribution is quite different. His greater mass gives him greater inertia, and the rotational inertia of the swinging legs has also changed. Considerable relearning must occur in the patterns of motion and balance between childhood and adulthood. This cannot happen if patterns are prematurely fixed. What is needed is a rather slow but steady learning rate which can adapt to changing conditions.
15. When a child is first introduced to a cat, this is likely to be an exciting new experience. Reinforcement learning might well be present to generate a rapid learning rate. In the absence of the cat, there is no reinforcement, so the loss of learning will be at a slow rate. But when the child next sees a cat, some of the excitement may return, for rapid relearning. A combination of rapid learning or relearning under reinforcement, together with slow learning rates at other times, should be able to account for the apparent stability of category codes while at the same time permitting modification of learning where appropriate.
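The combination suggested here can be sketched numerically. The update rule, the two rate values and the function names below are my own assumptions for illustration, not Nigrin's learning equations: a weight moves toward its current target at a fast rate when reinforcement is present and at a slow rate otherwise.

```python
def update(weight, target, reinforced, fast_rate=0.5, slow_rate=0.01):
    """Move `weight` a fraction of the way toward `target`.

    Hypothetical rule: the fraction is `fast_rate` under reinforcement
    and `slow_rate` otherwise. Both values are arbitrary assumptions.
    """
    rate = fast_rate if reinforced else slow_rate
    return weight + rate * (target - weight)


def run(events, weight=0.0):
    """Apply a list of (target, reinforced) events in order."""
    for target, reinforced in events:
        weight = update(weight, target, reinforced)
    return weight


# Five exciting encounters with the cat (reinforced, target 1.0).
learned = run([(1.0, True)] * 5)
# Twenty time steps with no cat and no reinforcement (target 0.0).
faded = run([(0.0, False)] * 20, weight=learned)
# One further exciting encounter.
relearned = run([(1.0, True)], weight=faded)
print(learned, faded, relearned)
```

With these assumed rates, five reinforced exposures bring the weight close to 1, twenty unreinforced steps erode it only modestly, and a single reinforced re-exposure recovers much of the loss. The category is stable in practice, yet it is never fixed.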
16. Nigrin's requirements (7), (9) and (12) are all related in some manner to context sensitivity. Requirement (9) is the need for context sensitivity itself. Requirement (7) states that feedback should be used to bias recognitions. Nigrin gives as an example that after hearing "Jack and Jill went up the..." there should be an expectancy that the next word will be "hill", so the network should be biased toward such a recognition. I would call this a case of context sensitivity. Requirement (12) is synonym processing. Chapter 6 discusses the representation of synonyms, and in technical terminology this includes the problem of recognizing ABA, where the first A must be distinguished from the second A. Again, I would call this a case of context sensitivity, although part of what he discusses in Chapter 6 is more correctly called synonym processing.
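The "Jack and Jill" example can be made concrete with a small sketch of expectancy bias. Everything in it is assumed for illustration: the function name recognize, the additive combination of evidence and expectancy, the bias weight and the numerical scores are mine, and none of them is taken from the SONNET equations.

```python
def recognize(evidence, expectancy, bias=0.5):
    """Pick the candidate with the highest biased score.

    `evidence` maps each candidate word to bottom-up support from the input.
    `expectancy` maps words to top-down support from the preceding context.
    The additive rule and the `bias` weight are hypothetical choices.
    """
    scores = {
        word: support + bias * expectancy.get(word, 0.0)
        for word, support in evidence.items()
    }
    return max(scores, key=scores.get)


# A noisy input that slightly favors "mill" on bottom-up evidence alone.
evidence = {"hill": 0.48, "mill": 0.52}
# Context from "Jack and Jill went up the..." strongly expects "hill".
expectancy = {"hill": 0.9}

print(recognize(evidence, {}))          # no context
print(recognize(evidence, expectancy))  # with context
```

Without the expectancy the slightly stronger bottom-up candidate wins; with it, the contextually expected word is chosen. The point of the sketch is only that the same input is classified differently depending on what preceded it, which is why I group requirement (7) with context sensitivity.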
17. Nigrin's models have some difficulty with context sensitivity. You can already see this from the fact that he has had to deal with it in three different ways. He first introduces context sensitivity in his review of Adaptive Resonance Theory in Chapter 2. But he introduces it late in the chapter, well after he has established the basic mechanisms of recognition. In other words, he treats context sensitivity as an add on property to be introduced after everything else is working.
18. In his modular theory of the mind, Fodor (1983) attempts to downplay the role of context sensitivity in perceptual recognition. According to Fodor, the mind is organized into a number of modules which are informationally encapsulated. Because of this informational encapsulation, context sensitivity would be restricted to information within the same module. That makes context sensitivity a relatively local property. And this seems to be the version Nigrin is discussing. I believe this is a mistake. Neither Fodor's theory nor Nigrin's version of context sensitivity is persuasive. As Rock (1983) demonstrates convincingly, context sensitivity permeates perception through and through. Thus, context sensitivity needs to be a global organizing principle for the network.
19. Nigrin's Chapter 6 attempts to correct his problems with matters such as context sensitivity. It is a somewhat speculative chapter. But here Nigrin begins to look for the more global view that will be needed in an adequate design. His treatment is still bottom up, but because he allows himself to speculate, he is able to reach further up than in his description of the SONNET network. His thought experiment is not completely convincing, but it is certainly a good start. As Nigrin recognizes, if he can develop the correct global organization, he will go a long way toward providing the systematicity demanded by Fodor and Pylyshyn (1988) in their critique of connectionism.
20. Nigrin has provided us with an unusually broad perspective for neural network approaches to cognition. His attempt to combine a top down analysis of requirements with a bottom up implementation provides unusual insights into the difficulties that neural network implementors face. He maintains a high degree of readability, and manages to strip away some of the mystery that is all too common in the connectionist literature.
21. Regrettably, his top down analysis is too shallow, and his bottom up development does not reach high enough, so the top down and bottom up analyses fail to meet in the middle. But Nigrin has made a good start, and perhaps he and other researchers will be able to continue and extend this type of development.
22. I found Nigrin's self criticism a strong point of the book. At several places in his presentation he pauses and takes stock, carefully pointing out where he has failed to meet one of his criteria. Neural network designers who favor quite different approaches are likely to find this self criticism valuable as they develop their own designs.
23. The index was inadequate. For example, there were very few index entries for authors referenced in the bibliography.
Fodor, J. (1983). The Modularity of Mind: An Essay in Faculty Psychology. The MIT Press, Cambridge, MA.
Fodor, J. & Pylyshyn, Z. (1988). Connectionism and Cognitive Architecture: A Critical Analysis. Cognition, 28:3-71.
Nigrin, A. (1993). Neural Networks for Pattern Recognition. The MIT Press, Cambridge, MA.
Nigrin, A. (1994). Precis of Neural Networks for Pattern Recognition. PSYCOLOQUY 5(2) pattern-recognition.1.nigrin.
Rock, I. (1983). The Logic of Perception. The MIT Press, Cambridge, MA.