Jonathan A. Marshall (1994) Synonyms, Embedding, Segmentation, and the Banana Problem. Psycoloquy: 5(32) Pattern Recognition (5)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).

Book review of Nigrin on Pattern Recognition

Jonathan A. Marshall
Department of Computer Science
CB 3175, Sitterson Hall
University of North Carolina
Chapel Hill, NC 27599-3175, U.S.A.
Fax: 919-962-1799


Nigrin's Neural Networks for Pattern Recognition (1993) presents several very interesting advances in neural network methods. The book is well written, self-contained, and accessible both to experts in neural networks and to intelligent novices. It provides a brief but readily understandable introduction to self-organizing neural networks, including the ART networks. It then describes in depth Nigrin's own theories of self-organizing neural networks for pattern recognition. Nigrin's work addresses, in new ways, some fundamental questions, such as unsupervised context-sensitive segmentation of patterns in continuous input streams, fast but stable learning, and simultaneous learning of multiple patterns.


KEYWORDS: clustering, embedded pattern recognition, neural networks, segmentation, synonym representation, translation invariance.
1. Albert Nigrin's (1993) book is a cauldron of fresh, exciting ideas cast in a new foundation for the field of self-organizing neural networks, a foundation that can support the construction of neural networks for many applications, such as speech recognition, signal processing, and image processing.

2. Nigrin invents and applies several new methods to solve fundamental pattern recognition problems, problems on which other neural network methods fall short. His methods bear similarities to the work of Carpenter, Cohen, Grossberg, and others on the ART networks, masking fields, and serial order models (e.g., Carpenter & Grossberg 1987; Cohen & Grossberg 1987), to the work of Reggia et al. (e.g., 1992) on competitive distribution theory, and to that of Marshall (e.g., 1992) on the self-organization of global context-sensitive constraint satisfaction mechanisms.

3. Nigrin's approach is based on a novel neuron assembly design that can be viewed as a repeated module of four neurons or, equivalently, as a set of internal biochemical circuit pathways within a single neuron. To support the operation of the neuron assembly, Nigrin proposes some novel connection structures, novel learning rules that allow connections to learn to excite or inhibit other connections, and novel signaling rules that govern the way that connections carry information.


4. Many neural network models can be trained only gradually, using a large training set and long training times. Nigrin pays special attention to methods that reduce the need for gradual training and yet preserve the stability of the network's learned categories. Nigrin presents new ways to use feedback information and confidence measures to prevent previously learned categories from being eroded by new data, even when the network's learning rate is very high.


5. Nigrin shows how his network can segment, extract, learn, and recognize patterns that are never presented in isolation, such as the pattern AB in the inputs ABCD, EABF, and GHAB. This capability is widely useful in neural network applications, for example, segmenting words from a continuous speech stream. Again Nigrin uses feedback information and confidence measures for this capability.
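The embedding problem itself can be sketched in a few lines of Python. The sketch below is a statistical toy, not Nigrin's network: it merely shows that the chunk AB, though never presented in isolation, is recoverable as the adjacent pair that recurs across all three of the review's example inputs.

```python
from collections import Counter

def candidate_chunks(sequences, min_support):
    """Count adjacent pairs across input sequences; a pair that recurs
    in every context is a candidate for segmentation as a unit."""
    pairs = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pairs[(a, b)] += 1
    return [p for p, n in pairs.items() if n >= min_support]

# The example from the review: AB is embedded in every input,
# but is never presented by itself.
inputs = ["ABCD", "EABF", "GHAB"]
print(candidate_chunks(inputs, min_support=3))  # [('A', 'B')]
```

Nigrin's network accomplishes the analogous segmentation with self-organizing feedback dynamics rather than explicit counting over a stored corpus.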


6. Nigrin's methods address the issues of automatically regulated stability as well as plasticity, to allow the networks to learn new input patterns but prevent new inputs from erasing the information gathered from earlier inputs. The ART models were also designed to balance stability and plasticity, but Nigrin's method is novel and addresses some additional subtleties, such as learning multiple patterns simultaneously.


7. Probably the most useful and exciting part of Nigrin's book is the material in Chapter 6 on "synonyms" and "homonyms." Synonyms, in Nigrin's terminology, are different patterns that have the same meaning, like the words "small" and "little." Homonyms are single patterns that have multiple meanings, like the word "star" (movie or celestial). Nigrin shows some novel and interesting ways that a network can learn to represent synonyms and homonyms and can use them to accomplish several neural computation tasks, including exclusive allocation (or credit attribution), representation of repeated patterns (the "banana problem"), and translation- and scale-invariant pattern recognition.

8. The core of Nigrin's approach is the use of link competition and link cooperation: excitatory and inhibitory interactions between connections rather than between neurons. The link interactions represent constraints on the ways that input patterns can be segmented. Nigrin's link interactions regulate the short-term transmission of signals through links, and they are targeted to individual connections in a highly specific manner. They are thus fundamentally different both from long-term weight-competition rules and from the nonspecific or pooled link interactions used previously by several researchers.
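The distinction can be illustrated with a small sketch. The formulation below is hypothetical, not Nigrin's equations: it shows a signal rule in which the transmission along one specific connection is shunted down in proportion to the signals on the links that inhibit it, so that the interaction targets individual links rather than whole neurons.

```python
import numpy as np

def link_gated_input(x, W, G):
    """Net input to target neurons when links gate other links.

    x: source activations, shape (n,); W: connection weights, shape (m, n).
    G[p, q] is the inhibitory influence of link q on link p, with links
    indexed in row-major order over W.  G = 0 recovers ordinary weighted
    summation; a nonzero G[p, q] suppresses link p's signal in proportion
    to the signal carried by link q -- an interaction between specific
    connections, not between neurons.  (Illustrative only.)
    """
    signals = (W * x[None, :]).ravel()            # per-link signals
    inhibition = np.clip(G @ signals, 0.0, None)  # pooled per-link inhibition
    gated = signals / (1.0 + inhibition)          # shunting-style attenuation
    return gated.reshape(W.shape).sum(axis=1)     # sum onto each target

# Two links converge on one target; gating one of them changes the
# balance of the segmentation without touching the target neuron itself.
x = np.array([1.0, 2.0])
W = np.array([[0.5, 0.25]])
print(link_gated_input(x, W, np.zeros((2, 2))))  # ordinary sum: [1.0]
```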

9. Nigrin describes local, unsupervised learning rules that govern the strength of interlink interactions. Unlike Hebbian rules, which are bilocal, the learning rules for interlink interactions are at least trilocal. For instance, when two links converge on a target neuron, Nigrin's learning rules modify the interaction strengths between the links as a function of the activations of the target neuron and both source neurons. Because of these trilocal learning rules, the link interactions have enhanced expressive power (specificity) and adaptive power compared with most standard neural network models.
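A toy rule makes the trilocal structure concrete. The functional form below is illustrative, not Nigrin's: the point is only that the update to the interaction between two converging links reads three activations at once.

```python
def trilocal_update(g, a_src1, a_src2, a_tgt, rate=0.1):
    """Toy trilocal learning rule (illustrative, not Nigrin's equations).

    A Hebbian rule is bilocal: the weight change depends on one source
    and one target activation.  Here the interaction strength g between
    two links converging on the same target neuron is updated from three
    activations: both source neurons and the shared target.
    """
    return g + rate * a_tgt * a_src1 * a_src2

# The interlink interaction strengthens only when both sources and the
# target are co-active; if any of the three is silent, g is unchanged.
g = 0.0
g = trilocal_update(g, a_src1=1.0, a_src2=1.0, a_tgt=1.0)  # g -> 0.1
g = trilocal_update(g, a_src1=1.0, a_src2=0.0, a_tgt=1.0)  # unchanged
```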

10. Nigrin cleverly designs learning rules that cause synonyms to be learned simultaneously. For instance, when a network learns to represent the input pattern "Joe is a small boy," it can simultaneously learn to represent "Joe is a little boy," even though "little" was not part of the input. This capability is useful in invariant pattern recognition; when the network learns to recognize an object presented in one location, it simultaneously learns to recognize it in all locations.
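The location-invariance claim can be illustrated with a weight-sharing analogy familiar from modern convolutional models. This is an analogy only, not Nigrin's synonym mechanism: because one shared template is matched against every window of the input, learning the pattern at one position suffices to detect it at all positions.

```python
import numpy as np

def recognize_all_positions(template, signal):
    """Shared-template matching: a pattern learned once is detected at
    any shift, because the same template is correlated against every
    window of the input (weight-sharing analogy, not Nigrin's method)."""
    w = len(template)
    return [float(np.dot(template, signal[i:i + w]))
            for i in range(len(signal) - w + 1)]

# Learn the pattern at one "location"...
template = np.array([1.0, -1.0])
# ...and it is detected wherever it occurs in a longer input
# (the match score peaks at positions 1 and 4).
signal = np.array([0.0, 1.0, -1.0, 0.0, 1.0, -1.0])
print(recognize_all_positions(template, signal))
```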


11. Besides the major concepts above, numerous other interesting concepts are sprinkled throughout the book. Nigrin discusses new ways to represent and use temporal order information, temporal chunking, variable presentation rates and rhythms, confidence measures, pattern matches and mismatches, attentional vigilance, and scale sensitivity.


12. Because this is mainly an idea book, rather than a description of successful applications or formal proofs, it would be inappropriate to criticize it on the grounds that are normally applied to those other types of works. The book pioneers a new approach to the field of self-organizing neural networks for pattern recognition; it remains for researchers to apply Nigrin's concepts to specific computational applications and to derive formal analyses of the applicability and power of Nigrin's methods.

13. Some of the ideas in the book seem to have enormous potential, such as the interlink interactions and trilocal learning rules. Other ideas may fall by the wayside as further analyses are conducted. In fact, in the later portions of the book, Nigrin discards some of the ideas presented in the earlier portions, having used those ideas rhetorically to help introduce the reader to the complexities of some of the problems that he proposes. This approach helps the reader appreciate the difficulty of solving some of the problems.

14. Some weaknesses of Nigrin's approach are listed below. These weaknesses can be viewed either as areas that need further work or as interesting ideas that in the end are unlikely to survive. (1) Nigrin's representation of temporal sequences (by increasing neuronal activations over time) is awkward. It seems to cause more problems than it cures -- for instance, it requires a complex reset mechanism when the activations get too large. (2) There are too many free parameters in the model. Perhaps some simplifications can be obtained. (3) Some of the mechanisms in the model appear mutually contradictory; this is forgivable, because they are intended to balance contradictory goals, such as fast learning and generalization across multiple embeddings. (4) The model ultimately resolves the stability/plasticity dilemma by freezing some weights. It would be better to allow some form of unlearning for the occasions when new input should override older information. (Nigrin acknowledges this goal in the final chapter.)

15. Despite these specific problems, the book contains a richness of ideas, including several new concepts that should be given serious consideration by the mainstream of the field and that can be used to solve some currently popular problems, such as binding, invariant representations, generalization, and stability.


16. Nigrin has succeeded in achieving his aims. He has introduced some new open questions to the field of real-time self-organizing pattern recognition. He has clarified and extended the characterization of some older problems. He has proposed and implemented some thoroughly original and practically useful solutions to those problems. Nigrin's perspective is new yet solidly grounded in the prior work of the field. The work is highly original and exciting.


17. This work is a major contribution to the fields of pattern recognition and self-organizing neural networks. It is significant because it points out (through simple, understandable, gedanken experiments) some fundamental capabilities that self-organizing pattern recognition systems ought to have and then shows how to build neural network systems with those capabilities. Few (if any) other works exist that put all the key issues in self-organizing neural networks for pattern recognition into a unified perspective, and fewer still are accessible to novices in the field.


18. Although the book has an introductory chapter that covers the ART models, it is not a survey of the field. It covers Nigrin's theory in depth. The book is addressed to a general audience. It is self-contained and does not require any prior background knowledge in pattern recognition, self-organization, or neural networks. It can be understood by advanced undergraduates, graduate students, and even college professors. It is not purely an academic book, though; it will prove useful to practicing engineers and scientists as well. The book will be of great interest to neuroscientists, psychologists, engineers, and computer scientists.


19. The technical content of the book is exciting, and much of Nigrin's way of thinking should find a home in researchers' solutions to fundamental problems in self-organizing neural networks for pattern recognition. The book is highly recommended; the field needs more creative idea books like this one, and fewer rehash books.


Carpenter, G. & Grossberg, S. (1987) A Massively Parallel Architecture for a Self-Organizing Neural Pattern Recognition Machine. Computer Vision, Graphics, and Image Processing 37: 54-115.

Cohen, M. & Grossberg, S. (1987) Masking Fields: A Massively Parallel Neural Architecture for Learning, Recognizing, and Predicting Multiple Groupings of Data. Applied Optics 26: 1866-1891.

Marshall, J.A. (1992) Development of Perceptual Context-Sensitivity in Unsupervised Neural Networks: Parsing, Grouping, and Segmentation. Proceedings of the International Joint Conference on Neural Networks, Baltimore, MD, III: 315-320.

Nigrin, A. (1993) Neural Networks for Pattern Recognition. The MIT Press, Cambridge, MA.

Nigrin, A. (1994) Precis of Neural Networks for Pattern Recognition. PSYCOLOQUY 5(2) pattern-recognition.1.nigrin.

Reggia, J.A., Dautrechy, C.L., Sutton, G.G. & Weinrich, M. (1992) A Competitive Distribution Theory of Neocortical Dynamics. Neural Computation 4(3): 287-317.
