Albert Nigrin (1994) Context Sensitivity and Reinforcement Learning. Psycoloquy: 5(42) Pattern Recognition (7)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).

CONTEXT SENSITIVITY AND REINFORCEMENT LEARNING
Reply to Rickert on Pattern Recognition

Albert Nigrin
Department of Computer Science & Information Systems
The American University
4400 Massachusetts Avenue NW
Washington DC USA 20016-8116
(202) 885-3145

nigrin@american.edu

Abstract

This article attempts to answer some of the concerns raised by Rickert (1994) in his review of the book Neural Networks for Pattern Recognition (Nigrin, 1993). Topics that are discussed include: fast learning and stability, reinforcement learning, and context sensitivity.

Keywords

context sensitivity, learning rates, neural networks, pattern recognition

I. INTRODUCTION

1. Rickert (1994) examines the properties that should be achieved by a pattern recognizer, focusing on fast learning, stability, and context sensitivity. Although I agree with much of his commentary, I must clarify the relationships between different properties and also discuss how the network can be embedded within a larger framework.

II. LEARNING RATES AND REINFORCEMENT LEARNING

2. The book discusses thirteen properties that are desirable for a pattern recognizer to satisfy. Rickert (1994) examines four of these properties in one section of his commentary:

    (1) Self organize using unsupervised learning.
    (2) Form stable category codes.
    (5) Perform fast and slow learning.
    (13) Unlearn or modify categories when necessary.

3. In SONNET, the learning rate is adjusted by a single parameter. When that parameter is large, unsupervised learning takes place rapidly and categories can form in as little as a single trial. When the parameter is small, categories form more slowly, allowing the network to generalize over multiple examples. Rickert questions this approach, stating: "Certainly humans exhibit slow and rapid learning. But is this due to differential learning rates with unsupervised learning? Or is it that rapid learning is a form of reinforcement learning? I suggest that the latter is closer to the correct explanation, with internal motivations usually providing the reinforcement."
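To make the role of this single parameter concrete, the sketch below uses a minimal instar-style update rule (the rule and the parameter name epsilon are illustrative simplifications for this reply, not SONNET's actual equations). With epsilon near 1, a category commits to an input in one trial; with a small epsilon, the same rule slowly averages over many exemplars:

    import numpy as np

    def update_category(w, x, epsilon):
        # Move the winning category's weight vector w toward input x.
        return w + epsilon * (x - w)

    # Fast learning: a single trial commits the category.
    w = np.zeros(3)
    w = update_category(w, np.array([1.0, 1.0, 0.0]), epsilon=1.0)
    print(w)                          # [1. 1. 0.]

    # Slow learning: the same rule generalizes over noisy exemplars.
    w = np.zeros(3)
    for _ in range(500):
        x = np.array([1.0, 1.0, 0.0]) + 0.1 * np.random.randn(3)
        w = update_category(w, x, epsilon=0.05)
    print(np.round(w, 1))             # approximately [1. 1. 0.]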

4. I am entirely in agreement with Rickert's statement; however, learning in SONNET is not at odds with it. As Rickert notes: "given a system with unsupervised learning, supervised learning and reinforced learning can be constructed on top of the basic learning system." Thus, if SONNET is used as a module within a larger reinforcement learning system, that system would control the learning rate in SONNET to decide whether fast or slow learning was appropriate. What makes this possible is precisely that the SONNET module IS ABLE to accommodate both fast and slow learning; without this ability, such an embedding might not be feasible. (For an overview of one possible way to perform this embedding, see Grossberg 1987, 1988.)
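In outline, such an embedding could look like the following sketch (the class and method names are hypothetical, invented for this reply; Grossberg 1987, 1988 describe biologically grounded versions). The point is only that the outer reinforcement system needs nothing from the module beyond an adjustable learning rate:

    class UnsupervisedModule:
        """Stand-in for a SONNET-like module whose only external control
        knob is its learning rate."""
        def __init__(self):
            self.learning_rate = 0.05            # slow learning by default

        def present(self, pattern):
            pass                                 # categorize/learn at current rate

    class ReinforcementWrapper:
        """Outer system that owns the reinforcement signal and uses it
        to gate the inner module's learning rate."""
        def __init__(self, module, slow=0.05, fast=1.0):
            self.module, self.slow, self.fast = module, slow, fast

        def present(self, pattern, salience):
            # A motivationally significant event (reward, punishment,
            # surprise) demands one-trial learning; routine input is
            # averaged in slowly.
            self.module.learning_rate = self.fast if salience > 0.5 else self.slow
            self.module.present(pattern)

    system = ReinforcementWrapper(UnsupervisedModule())
    system.present(pattern=[1, 0, 1], salience=0.9)   # fast, one-trial learning
    system.present(pattern=[1, 1, 0], salience=0.0)   # slow, incremental learning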

5. This brings us to the next two properties: stable category codes and unlearning. Rickert states that these two properties are contradictory, since unlearning will erode existing categories and undermine stability. I prefer to view the two properties as a tradeoff. Though stability is a desirable goal, categories should also be allowed to evolve slowly so that they can continue to represent objects whose appearances change gradually over time. (I believe that slow unlearning differs from fast unlearning in that slow unlearning can be accomplished purely through unsupervised learning; fast unlearning, in contrast, may require reinforcement learning to occur properly.) As Rickert notes, the inability to achieve slow unlearning was one of the deficiencies of SONNET 1. SONNET 1 did not achieve this property because it seemed considerably easier to achieve within the framework of SONNET 2, so its satisfaction was postponed until SONNET 2 was implemented. (The basic mechanism for achieving slow unlearning is already in place, even in SONNET 1. A minor modification to the parameter rji should allow SONNET 2 to regulate unlearning properly.)
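A minimal sketch of how slow unlearning can coexist with stability, under the assumption that a committed category drifts toward its current exemplars at a rate mu far below the fast-learning rate (again an illustrative rule written for this reply, not SONNET's equations):

    import numpy as np

    def track_slowly(w, x, mu=0.01):
        # Same form as the fast-learning rule, but mu is so small that
        # the category is stable over any short horizon.
        return w + mu * (x - w)

    w = np.array([1.0, 0.0])                   # committed category
    for t in range(1000):
        # The represented object changes its appearance very gradually.
        x = np.array([1.0 - t / 2000.0, t / 2000.0])
        w = track_slowly(w, x)
    print(np.round(w, 2))                      # w has followed the drift

Because mu is tiny, no single trial can erode the category; over long stretches of experience, however, the category keeps pace with the object it represents.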

III. CONTEXT SENSITIVITY

6. I agree with Rickert's statement that "context sensitivity permeates perception through and through. Thus, context sensitivity needs to be a global organizing principle for the network." Rickert also states: "Nigrin's models have some difficulty with context sensitivity. You can already see this from the fact that he has had to deal with it in three different ways. He first introduces context sensitivity in his review of Adaptive Resonance Theory in Chapter 2. But he introduces it late in the chapter, well after he has established the basic mechanisms of recognition. In other words, he treats context sensitivity as an add on property to be introduced after everything else is working."

7. However, the fact that the mechanisms for handling these properties are introduced relatively late does not mean I believe that context sensitivity is an add-on property. Quite the contrary. These mechanisms (nonhomogeneous cell parameters and nonuniform inhibitory connections) form one of the primary pillars of SONNET; without them the network could not operate at all. They are introduced relatively late only because a sufficient foundation must be in place before they can be presented. Furthermore, context sensitivity is dealt with in different ways because there are different aspects to the property. For example, in the sentence "Jack and Jill went up the...", there is an expectancy that the next word will be "hill". This type of context sensitivity is different from the kind that results from parsing the phonemes in the phrase "All turn". Humans segment this utterance into the words "All" and "Turn", rather than using the embedded word "Alter", whose phonemes are also contained within the utterance. Since expectation cannot account for this type of parsing, the mechanisms that account for it are different from those that account for the context in the sentence "Jack and Jill went up the hill".

8. Unfortunately, in discussing SONNET's weaknesses with context sensitivity, Rickert provides no concrete examples that I can either concede or refute. I will speculate that one of the problems he may be referring to is that SONNET 1 did not form reliable segmentations when input had to be represented by multiple categories (as in the "All turn" example above). Fortunately, this problem has been remedied in the SONNET 2 network. SONNET 2 represents input patterns using the criterion that as much of the input as possible should be exclusively accounted for and that as few categories as possible should be activated. This is demonstrated by the simulation shown below, where the network used fast learning to learn the 10 binary patterns AB, ABC, ABCDE, BC, CD, CDE, CDEF, CE, DE, and FG. After learning was completed, the STM response of the network was tested by setting the learning rate to zero (to prevent the formation of new categories) and presenting the patterns shown in the Table below.

TABLE

    Input Pattern | Active Categories || Input Pattern | Active Categories
    --------------|-------------------||---------------|------------------
       ABHIJK     |      AB           ||   ABCDE       |   ABCDE
       A          |      AB           ||   ABCDEF      |   AB, CDEF
       AB         |      AB           ||   ABCDEFG     |   ABCDE, FG
       ABC        |      ABC          ||   BCDEFG      |   BC, DE, FG
       ABCD       |      AB, CD       ||   IJBCDEFGK   |   BC, DE, FG

9. As can be seen from the Table, the network used a minimum number of categories to represent as much of each input pattern as possible. For example, ABCDEFG was parsed as ABCDE and FG rather than as AB, CDE, and FG. The differences between winning and losing categories were clear cut: winning categories reached activities of approximately 1.5, while losing categories had activities of less than 0.1. Notice also that the segmentations operated correctly across categories of different sizes and across parses requiring different numbers of categories, demonstrating that the parameters were not tuned to a single situation. (Although simulations have not yet been performed on larger patterns, mathematical analysis indicates that the network will continue to operate correctly.)
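The selection criterion of paragraph 8 can be restated combinatorially, and doing so reproduces the complete-pattern rows of the Table. The sketch below (written for this reply; it is not SONNET 2's actual dynamics, which reach the same selections through competitive STM interactions) searches for the set of learned categories that exclusively covers the most input items while activating the fewest categories. A discrete search of this kind cannot capture the network's graded, partial-match behavior, such as input A activating category AB.

    from itertools import combinations

    CATEGORIES = ["AB", "ABC", "ABCDE", "BC", "CD", "CDE", "CDEF", "CE", "DE", "FG"]

    def parse(inp, categories=CATEGORIES):
        """Choose active categories for input inp: maximize the number
        of input items covered, with no two active categories claiming
        the same item, then minimize the number of active categories."""
        items = set(inp)
        fits = [c for c in categories if set(c) <= items]   # complete matches only
        best, best_key = [], (0, 0)
        for r in range(1, len(fits) + 1):
            for combo in combinations(fits, r):
                covered = set().union(*(set(c) for c in combo))
                if sum(len(c) for c in combo) != len(covered):
                    continue                  # categories overlap: not exclusive
                key = (len(covered), -len(combo))
                if key > best_key:
                    best, best_key = list(combo), key
        return best

    for inp in ["ABHIJK", "ABCD", "ABCDEF", "ABCDEFG", "BCDEFG"]:
        print(inp, "->", parse(inp))
    # ABHIJK  -> ['AB']          ABCDEF  -> ['AB', 'CDEF']
    # ABCD    -> ['AB', 'CD']    ABCDEFG -> ['ABCDE', 'FG']
    # BCDEFG  -> ['BC', 'DE', 'FG']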

IV. SUMMARY

10. Rickert presents a well-thought-out analysis of the properties that should be achieved by a classification network. Although I may quibble with different parts of his analysis, it seems that much of our disagreement may arise from different uses of terminology or from misunderstandings. I believe that for the most part we are in agreement about both the properties that a network should satisfy and the general manner in which they can be satisfied.

REFERENCES

Grossberg, S. (1987) The Adaptive Brain, I: Cognition, Learning, Reinforcement, and Rhythm. Elsevier/North-Holland, Amsterdam.

Grossberg, S. (1988) Neural Networks and Natural Intelligence. MIT Press, Cambridge, MA.

Nigrin, A. (1993) Neural Networks for Pattern Recognition. MIT Press, Cambridge, MA.

Nigrin, A. (1994) Precis of Neural Networks for Pattern Recognition. PSYCOLOQUY 5(2) pattern-recognition.1.nigrin.

Rickert, N. (1994) A Broader Perspective to Neural Networks. PSYCOLOQUY 5(29) pattern-recognition.4.rickert.

