David C. Krakauer & Alasdair I. Houston (1993) Evolution, Learning & Categorization. Psycoloquy: 4(28) Categorization (4)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).
Psycoloquy 4(28): Evolution, Learning & Categorization

EVOLUTION, LEARNING & CATEGORIZATION
Book Review of Murre on Categorization

David C. Krakauer & Alasdair I. Houston
NERC Unit of Behavioural Ecology,
Department of Zoology,
University of Oxford,
South Parks Road,
Oxford OX1 3PS, UK

krakauer@vax.ox.ac.uk houston@vax.oxford.ac.uk

Abstract

Murre's model is a conceptual level model of cognition intended to address psychological phenomena. Thus we are challenged by a possible confluence of two levels, the neural substrate and the psychological data.

Keywords

Neural networks, neurobiology, psychology, engineering, CALM, Categorizing And Learning Module, neurocomputers, catastrophic interference, genetic algorithms.

    "Learning, that Cobweb of the Brain" (Samuel Butler, 1612-1680)

1. In "Learning and Categorization in Modular Neural Networks" Murre (1992) has presented a connectionist model based on a competitive learning rule that takes as its exemplar the neocortical minicolumns. Whilst the basis of the model is a schematization of anatomical observations, the model remains symbolic: a conceptual level model of cognition intended to address psychological phenomena. Thus we are immediately challenged by a possible confluence of two levels, the neural substrate and the psychological data. By avoiding the claim that the architecture of the model is a veridical representation of cortical microcircuits and simultaneously stressing the primacy of the behaviour, Murre defuses what might have been a source of confusion. The minicolumn structure pre-eminently provides an approach to modularity, where modularity is understood as a limitation on connectivity. By approaching psychological phenomena from the bottom (the level of connectionism), Murre suggests that he has constructed a "strong" generative model as opposed to a "weak" model where descriptions remain purely phenomenological. As Maki and Abunawass (1991) have argued, a model can be more neurally inspired than neurobiologically faithful.

2. One ambiguity we should like to see made more explicit is the relationship between the "learning rules" used to establish the connection weights in a network and the learning rules discussed by psychologists to describe the behaviour of animals. Whilst it remains possible that the higher order features of associative learning are themselves grounded in associative rules at a network level (Sutton and Barto, 1981), this is in no sense certain or necessary. While the Hebb rule can be interpreted as a simple form of classical conditioning, psychological classical conditioning need not be a linear projection of the Hebb rule to the behavioural level. A latent danger in Murre's approach is that of turning a "strong" model into a "weak" one by understating the dichotomy between these two uses of "learning."
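
The network-level sense of "learning rule" can be made concrete with a minimal sketch of a plain Hebbian update on a single connection. Everything here is an illustrative assumption rather than anything taken from Murre's model: the function name hebbian_pairing, the learning rate eta, the fixed weight w_us from the unconditioned stimulus, and the linear response unit are all hypothetical.

```python
def hebbian_pairing(n_trials, eta=0.1, w_us=1.0):
    """Pair a conditioned stimulus (CS) with an unconditioned stimulus (US).

    Hypothetical toy unit: the response is a linear sum of its inputs, and
    only the CS weight is plastic. The update is the plain Hebb rule,
    delta_w = eta * presynaptic activity * postsynaptic activity.
    """
    w_cs = 0.0  # the CS initially elicits no response
    for _ in range(n_trials):
        cs, us = 1.0, 1.0                  # both stimuli present on each trial
        response = w_cs * cs + w_us * us   # postsynaptic activity
        w_cs += eta * cs * response        # Hebbian strengthening of the CS link
    return w_cs


if __name__ == "__main__":
    for n in (0, 1, 2, 10):
        print(n, hebbian_pairing(n))
```

In this sketch the CS weight grows with every pairing and never levels off, whereas behavioural conditioning curves typically approach an asymptote. The toy rule therefore reproduces acquisition in only the loosest sense, which is one way of seeing why the two uses of "learning" should be kept apart.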

3. Murre has gratifyingly included a chapter on genetic algorithms (GAs). He describes the analogies between nets and GAs as optimisation strategies and characterises where their respective strengths lie (e.g., coarse versus fine search). We see utility in an approach that combines elements of connectionism with GAs, because it suggests how evolutionary function and proximate mechanism might be integrated. Murre has identified how a network can transform a "weak" hypothesis into a "strong" one, but he does not fully credit GAs with a similar power. GAs not only provide a means of solving a problem, but through their analogy to biological evolution they give a measure of where selective pressures are brought to bear (Sumida et al., 1990). A function might fail to converge to the optimum solution because parameters do not contribute to a fitness function. High variability in a given trait might reflect a weak selective influence. If we are unable to evolve a net to perform a given task we are led to ask how we can modify our constraints so as to expose the net to a greater selection pressure. It is with this in mind that Hinton and Nowlan (1987) introduce "learning" (Murre, p. 106). For a given fitness function, the constraint of learning a proportion of the network's weights through a series of guesses (where fitness is inversely related to the number of guesses) provides continuity of solution. In other words, the network becomes subject to a graded selection pressure which allows an incremental approach to the solution. While this approach expedites a solution, it is questionable whether we are entitled to call it learning. Hinton and Nowlan's model eventually evolves to a solution involving maximum constraint where learning is effectively excluded. We shall discuss this conception of learning and the desirability of restricting what we could call the network's "phenotypic plasticity."
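
The way guessing turns an all-or-nothing fitness function into a graded one can be illustrated with a short sketch. It is not a reproduction of Hinton and Nowlan's simulation: the function name lifetime_fitness, the encoding of learnable connections as None, the limit of 1000 guesses, and the particular linear scaling of fitness are assumptions chosen for illustration. The only property carried over from the description above is that fitness falls as more guesses are needed.

```python
import random


def lifetime_fitness(genotype, target, max_trials=1000, rng=None):
    """Fitness of one individual under guess-based "learning".

    genotype: list of 0, 1, or None (None marks a learnable connection).
    target:   list of 0/1 values that define the single correct network.
    The scaling below is a hypothetical choice: fitness runs from 1.0
    (target never found) up to len(target) (correct with no guessing).
    """
    rng = rng or random.Random(0)
    n = len(target)
    # A wrongly fixed connection cannot be repaired by guessing.
    if any(g is not None and g != t for g, t in zip(genotype, target)):
        return 1.0
    free = [i for i, g in enumerate(genotype) if g is None]
    guesses = 0
    if free:
        guesses = max_trials  # default: target never found
        for trial in range(1, max_trials + 1):
            # One guess = a random setting of every learnable connection.
            if all(rng.randint(0, 1) == target[i] for i in free):
                guesses = trial
                break
    # Fewer guesses used leaves more of the lifetime at the correct setting.
    return 1.0 + (n - 1) * (max_trials - guesses) / max_trials


if __name__ == "__main__":
    target = [1] * 5
    print(lifetime_fitness([1, 1, 1, 1, 1], target))     # fully specified
    print(lifetime_fitness([1, 1, 1, None, None], target))  # partly learnable
    print(lifetime_fitness([0, 1, 1, 1, 1], target))     # wrongly fixed
```

Under these assumptions a genotype with fewer learnable connections tends to need fewer guesses and so scores higher, which is the selection gradient referred to above. It also shows why selection under this scheme favours replacing learnable connections with correctly fixed ones.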

4. According to Dickinson (1980), a learning mechanism is one that is able to detect and store information about causal relationships. On this definition, the environment should confer fitness in direct proportion to the assessment of causality by any network in which we would like to evolve the capacity to learn. Evolution should act to set up the machinery which will subsequently allow an association to be formed, and these associations should be formed within the lifetime of a single individual. The networks discussed by Murre use learning as a tactic to arrive at a solution which ultimately obviates the need to learn. These networks presuppose an unchanging environment, whereas learning might more appropriately be described as a mechanism to minimise the detrimental fitness consequences of variability. We may benefit here by introducing more formally the idea of phenotypic plasticity: "when the environment is heterogeneous, a single genotype can develop different phenotypes in different environments" (Stearns, 1992). Where we argue for the ability of a network to assess causality to provide a selection gradient, we might have said "heritability of plasticity should measure the additive genetic variation in plasticity" (Stearns, 1992). Learning will not always be advantageous and might well impose a cost in terms of wasted time (viz. Hinton and Nowlan XXXX). To quote Lynch and Gabriel (1987), "when the within generation component of environmental variation is much less than the between generation component, spatial heterogeneity can actually select for a high degree of specialisation." Hence any discussion of learning and its relation to evolution must distinguish between its role in allowing for a selection gradient and the contextual issue (subject to environmental variability) concerning the desirability of plasticity.

5. Finally, we would like to suggest that a CALM-like (Categorizing And Learning Module) network could form the basis of a model of spatial memory with a view to understanding cache recovery in food-storing birds. The problem faced by the bird is that of relocating scattered caches of food. Some parids are able to store hundreds of food items in a single day (Sherry et al., 1981). There is strong evidence that each spatial location is retained independently in memory. The properties of this memory are: recall of large numbers of items; long retention intervals; and rapid elaboration of new memories. The problems with the conventional fully connected networks discussed by Murre (pp. 4-14), involving catastrophic interference and a lack of autonomous programming, make them unlikely models for this type of memory, where thousands of potential patterns might be stored and where latent learning is important. The possibility of evolving a sparse or modular network which employs competitive learning would also enable us to address the issue of adaptive specialisation when learning involves a unique task.

REFERENCES

Dickinson, A. (1980). Contemporary animal learning theory. New York: Cambridge University Press.

Hinton, G. E. and Nowlan, S. J. (1987) How learning can guide evolution. Complex Systems 1: 495-502.

Lynch, M. J. & Gabriel, W. (1987) Environmental tolerance. Am. Nat. 129, 283-303.

Maki, W. S. & Abunawass, A. M. (1991). A connectionist approach to conditional discriminations: Learning, short-term memory, and attention. In Commons, M. L., Grossberg, S. & Staddon, J. E. R. (eds.) Neural Network Models of Conditioning and Action. Lawrence Erlbaum Associates, Hillsdale, New Jersey.

Murre, J.M.J. (1992) Learning and Categorization in Modular Neural Networks. UK: Harvester/Wheatsheaf; US: Erlbaum

Murre, J.M.J. (1992) Precis of: Learning and Categorization in Modular Neural Networks. PSYCOLOQUY 3(68) categorization.1

Sherry, D. F., Krebs, J. R. & Cowie, R. J. (1981) Memory for the location of stored food in marsh tits. Animal Behaviour, 29, 1260-1266.

Stearns, S. C. (1992) The Evolution of Life Histories. Oxford.

Sumida, B. H., Houston, A. I., McNamara, J. M. & Hamilton, W. D. (1990) Genetic algorithms and evolution. J. theor. Biol. 147, 59-84.

Sutton, R. S. & Barto, A. G. (1981). Toward a modern theory of adaptive networks: Expectation and prediction. Psychological Review, 88, 135-170.

