Nigrin's "Neural Networks for Pattern Recognition" (1993) is clearly written and presents a series of interesting insights and developments in a nonmystifying way. This commentary discusses specific issues such as the presynaptic paradigm and translation and size invariance.
1. Nigrin's "Neural Networks for Pattern Recognition" (1993, PSYCOLOQUY Precis: 1994) is an extension of the author's Ph.D. thesis, which was primarily concerned with enhancing some of the neural models developed by Grossberg and collaborators (e.g., Carpenter and Grossberg 1987a,b). The book has been conceived in a way that will certainly make it accessible to a large audience without diminishing its scientific relevance. The basic principles as well as new concepts and developments are clearly presented and illustrated with many carefully prepared diagrams and examples; mathematical developments are kept to a minimum. I am convinced that one of its principal accomplishments is its nonmystifying attitude to the treatment of artificial neural networks. The overall plan of the book as well as remarks on specific issues treated in it are presented in the following sections.
2. After a brief introduction to the principal aspects of Grossberg's neural models for pattern recognition, the book presents a simple-to-complex series of new neural system models (e.g., SONNET 1 and 2) that extend in various ways the properties of Grossberg's neural models and other systems based on multiple layers and attention. Then, in Chapter 6, new architectures are presented to address the problem of synonym representation; these are based on competition between the links of the network. Chapter 7 addresses the implementation of those architectures presented in the previous chapters and discusses mechanisms for translation and size invariant pattern recognition. Conclusions and perspectives for further developments are discussed in Chapter 8.
3. When I first saw the title of this book, I was sure it would deal with the full spectrum of neural network paradigms. Since this is obviously not the case, nor was it intended to be, a more specific title emphasizing the focus on new contributions would have been more appropriate. The present title seems a bit pretentious under the circumstances.
4. In Section 1.2.1 of Nigrin's book, an analogy is drawn between neural networks and an ordinary electric circuit containing wires, dimmers and bulbs. Although the analogy is interesting, it has the major flaw that the luminous signal produced by bulbs can neither be directly transmitted through the wires (links) nor is it of a pulsed nature (in incandescent bulbs). In fact, I would question the utility of such a basic analogy, considering that in the subsequent chapter difference equations are introduced with no similar introductory framework.
5. After commenting on the many previous definitions of neural networks, the author proceeds to make his own contribution, suggesting that neural networks should be composed of a very large number of elements which are neurally based and operate only on local information in a fully asynchronous fashion. It is not clear, however, to what degree a processing element should be "neurally based": to the level of ionic channels or just to a level that expresses the overall properties of neurons? Also, why shouldn't smaller systems be classified as (artificial) neural networks? There are plenty of biological examples of simple yet interesting neural networks that incorporate only a handful of neurons, such as the lobula plate structure in the domestic fly (Franceschini, 1985).
6. I am convinced that (artificial) neural networks should still be understood, at least in a general context, as artificial systems that attempt to emulate the remarkable properties of natural neural networks; such a definition seems to be implicit in their very name. The principal problem with more specific definitions is that there is much more to neural networks than just locality, weights, and nonlinear transfer functions. A series of remarkable recent findings from psychophysics and neurophysiology (e.g., Blakemore, 1991; Blasdel, 1992; Churchland and Sejnowski, 1988; Livingstone and Hubel, 1988; Zeki and Shipp, 1988) has indicated many new dimensions of neural processing and organization such as extensive modularity, the important role of topographical maps, the variety of neuron and neurotransmitter types, the effect of neuron morphology on function, and the importance of the interconnection topology, to name but a few. These all represent interesting possibilities for more effective performance and could (and perhaps should) have been adopted more widely as underlying principles in artificial neural networks in addition to (or instead of) the traditional hypotheses. Another possibility is not only to describe neural systems in terms of symbolic rules but also to combine the neural and symbolic approaches as complementary components in an overall integrative model.
7. Although Nigrin's book makes progress towards more effective artificial neural systems by adopting less conventional principles such as the presynaptic competition paradigm and by incorporating attention, I think it still relies too heavily on the traditional connectionist approach to neural networks.
8. Perhaps because they are extensions of previous work on the learning of temporal patterns, most of the examples in Nigrin's book are based on linear sequences of words. Although such an approach is fine for addressing many of the higher-level issues in pattern recognition (particularly natural language processing), it fails to confront multidimensional pattern recognition directly. Extending techniques from one to two or more dimensions is often a subtle and nonstraightforward matter.
9. The presynaptic neural mechanism described in Chapter 6 is presented as one of the most important contributions of the book. Although I find such an approach interesting, it should be noted that the presynaptic model can be understood in terms of conventional neural networks. As illustrated in Figure 1, the synaptic interconnections can be replaced by two additional neurons, which can be viewed as defining a new neural layer (F1.5). Concerning the hardware implementation of the structures illustrated in Figure 1(a) and (b), it is clear that similar amounts of resources will have to be used in either case. Although the presynaptic paradigm does provide a novel and elegant way of describing some neural models, it should be borne in mind that it has little or no potential for saving hardware resources or processing time.
[Figure 1. (a) The presynaptic network: input1 and input2 project to neuron A in layer F2 through links whose synapses (weights w1 and w2) interact directly with one another. (b) An equivalent conventional network: the synaptic interconnections are replaced by two additional neurons, B and C, which define a new layer (F1.5) interposed between the inputs and F2.]
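To make the equivalence concrete, here is a minimal Python sketch (my own illustration, not Nigrin's formulation; it assumes a simple multiplicative gating rule, whereas SONNET's actual dynamics are given by differential equations) showing that the presynaptic form and the layered form of Figure 1 compute the same quantity:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.random(2)             # activities of input1 and input2
    w = np.array([0.7, 0.4])      # link weights w1 and w2
    g = rng.random(2)             # hypothetical presynaptic gating signals

    # (a) Presynaptic form: each link's signal w[i]*x[i] is gated
    #     directly at the synapse before reaching neuron A in F2.
    a_presynaptic = np.sum(g * w * x)

    # (b) Layered form: each gated link is replaced by an F1.5 neuron
    #     (B or C) whose activity is the gated product; neuron A in F2
    #     merely sums the F1.5 activities.
    f15 = g * w * x               # activities of neurons B and C
    a_layered = f15.sum()

    assert np.isclose(a_presynaptic, a_layered)

Since the same products must be computed in either formulation, the sketch also suggests why neither version can be expected to save hardware resources or processing time over the other.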
10. There is little doubt that geometrical transformation invariance is one of the principal issues in multidimensional pattern recognition, because such transformations usually imply combinatorial orders of processing complexity. Concerning the review of previous work dealing with this issue, only Grossberg's (op. cit.) approach and Fukushima's (e.g., 1988) are mentioned, though there are alternative neural mechanisms, such as Widrow et al.'s (1991), that could have been briefly discussed as well. I also believe that Fukushima's approach is much more interesting than the way it is described by Nigrin, because it incorporates selective attention, feedback, and excitatory and inhibitory neurons, and it accomplishes progressive translation and size invariance through the simple and complex neuron models extracted from experimental data on the functional organization of the primate visual cortex. It should be stressed that Fukushima's approach is interesting not because of this biological background, but because, in addition to being the first to incorporate the above-mentioned mechanisms, it has led to interesting practical applications in the recognition of handwritten characters (e.g., Imagawa & Fukushima 1993). It should be observed that the main shortcoming identified by Nigrin in Fukushima's approach, namely, that it would require too many cells to be useful for practical purposes, can be completely overcome by incorporating additional mechanisms such as foveal vision and selective attention.
11. The proposed neural architecture for implementing pattern centralization seems a clever way to attack such a problem ASSUMING THAT ARTIFICIAL NEURAL NETWORKS MUST BE USED, since it is known that the positions of spatial patterns can be readily normalized by shifting the pattern elements according to the coordinates of their centre of mass (see, for example, Schalkoff, 1989). Translation invariance is a good example of a task to which artificial neural networks should NOT be applied, at least given the currently available digital hardware technology. Such tasks also afford interesting possibilities for hybrid pattern recognition systems (i.e., artificial neural networks PLUS other techniques such as statistical moments or symbolic rules). It should also be noted that a remarkable alternative solution to the problem of translation-invariant pattern recognition has been provided by nature, which accomplishes it precisely by not solving the problem at all (at least in the input space)! It usually comes as a surprise to verify that the human retina is much less tolerant to pattern translation than one might expect (Goldstein 1989); in fact, translation tolerance in the primate visual system is achieved through the coordinated movement of the eyes, which scan the image so as to centre each object to be analysed over the foveal region of the retina. (To be precise, it should be pointed out that the retina itself provides some rather limited translation invariance capability.)
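To illustrate just how simple the non-neural alternative is, the following minimal Python sketch (my own illustration of the standard centre-of-mass normalization discussed by, e.g., Schalkoff, 1989) centres a two-dimensional pattern by an integer shift:

    import numpy as np

    def centre_pattern(img):
        # Translate a 2-D intensity pattern so that its centre of mass
        # falls on the centre of the array (integer shift; np.roll wraps
        # around, which is acceptable for well-contained patterns).
        rows, cols = np.indices(img.shape)
        total = img.sum()
        if total == 0:
            return img            # empty pattern: nothing to normalize
        r_cm = (rows * img).sum() / total
        c_cm = (cols * img).sum() / total
        dr = int(round(img.shape[0] / 2 - r_cm))
        dc = int(round(img.shape[1] / 2 - c_cm))
        return np.roll(img, (dr, dc), axis=(0, 1))

    # Example: an off-centre blob is moved to the middle of the array.
    pattern = np.zeros((9, 9))
    pattern[0:2, 0:2] = 1.0
    print(centre_pattern(pattern))

A few lines of conventional preprocessing thus achieve what would otherwise require a dedicated neural architecture, which is precisely the argument for hybrid systems made above.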
12. Finally, I was disappointed to find that the important and usually difficult problem of rotation invariance was not discussed at all in a book focusing on pattern recognition. Of course, a solution similar to the one adopted for size invariance could be considered, that is, replicating neurons so as to have one for each possible rotation. The problem is that such a strategy leads to a combinatorial explosion in hardware resources.
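The scale of this explosion is easy to appreciate with some back-of-the-envelope arithmetic (the discretization steps below are illustrative assumptions of mine, not figures from the book):

    # Replicating one template neuron per discretized transformation:
    positions = 32 * 32    # one copy per translation on a 32x32 grid
    scales    = 8          # one copy per discretized size
    rotations = 36         # one copy per 10-degree rotation step
    print(positions * scales * rotations)   # 294912 copies per template

Even for this coarse discretization, nearly three hundred thousand copies of every template would be needed, which is why replication alone cannot be the whole answer.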
13. Nigrin's book is clearly written and presents an interesting and nonmystifying approach to artificial neural networks for pattern recognition that should be accessible to a broad audience. It describes important developments towards more versatile and powerful neural systems (e.g., SONNET 1 and SONNET 2) and includes interesting discussions and insights. I believe that future editions of this book would benefit from incorporating more biological paradigms, a more appropriate title, and a more comprehensive treatment of geometrical transformations (including rotation).
Blakemore, C. (1991) Understanding Images in the Brain. In Image and Understanding, Blakemore, C., Barlow, H. and Weston-Smith, M. (eds) 257-283, Cambridge University Press.
Blasdel, G. (1992) Orientation Selectivity, Preference, and Continuity in Monkey Striate Cortex. The Journal of Neuroscience 12: 3139-3161.
Carpenter, G. and Grossberg, S. (1987a) A Massively Parallel Architecture for a Self-organizing Neural Pattern Recognition Machine. Computer Vision, Graphics, and Image Processing 37: 54-115.
Carpenter, G. and Grossberg, S. (1987b) ART 2: Self-organization of Stable Category Recognition Codes for Analog Input Patterns. Applied Optics 26: 4919-4930.
Churchland, P.S. and Sejnowski, T. (1988) Perspectives on Cognitive Neuroscience. Science 242: 106-115.
Franceschini, N. (1985) Early Processing of Colour and Motion in a Mosaic Visual System. Neuroscience Research, Supplement 2: S17-S49.
Fukushima, K. (1988) A Neural Network for Visual Pattern Recognition. Computer 21: 65-75.
Goldstein, B. (1989) Sensation and Perception. Wadsworth Publishing Company.
Hubel, D.H. and Wiesel, T.N. (1962) Receptive Fields, Binocular Interaction, and Functional Architecture in the Cat's Visual Cortex. Journal of Physiology (London) 160: 106-154.
Imagawa, T. and Fukushima, K. (1993) Character Recognition in Cursive Handwriting with the Mechanism of Selective Attention. Systems and Computers in Japan 24: 89-97.
Livingstone, M. and Hubel, D.H. (1988) Segregation of Form, Color, Movement, and Depth: Anatomy, Physiology and Perception. Science 240: 740-749.
Nigrin, A. (1993) Neural Networks for Pattern Recognition. The MIT Press, Cambridge MA.
Nigrin, A. (1994) Precis of Neural Networks for Pattern Recognition. PSYCOLOQUY 5(2) pattern-recognition.1.nigrin.
Schalkoff, R.J. (1989) Digital Image Processing and Computer Vision. John Wiley and Sons.
Widrow, B., Winter, R.G. and Baxter, R.A. (1991) Layered Neural Networks for Pattern Recognition. IEEE Transactions on Acoustics, Speech and Signal Processing 36: 1109-1118.
Zeki, S. and Shipp, S. (1988) The Functional Logic of Cortical Connections. Nature 335: 311-317.