Alan D. Pickering (1993) Keeping Calm About Neural Networks. Psycoloquy: 4(46) Categorization (9)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).
Psycoloquy 4(46): Keeping Calm About Neural Networks

KEEPING CALM ABOUT NEURAL NETWORKS
Book Review of Murre on Categorization

Alan D. Pickering
Dept. of Psychology,
St George's Hospital Medical School,
London, UK.

a.pickering@sghms.lon.ac.uk

Abstract

Murre's (1992a) book appears at a critical time for neural network research and illustrates the difficulty that diversification brings in its attempts to keep Categorisation And Learning in Modular (CALM) neural networks synapsing with several subdomains. The test of success rests on whether a variety of specialists find CALM strong enough in their own domain, although this critical approach may do little for the integrative perspective of the book. A new and broader impetus could have been provided by links with Murre's other theme: implicit memory (IM) vs. explicit memory (EM). Catastrophic interference, when modelling certain kinds of IM behaviour, might actually be NECESSARY for the psychological validity of the neural network model. More demanding and interesting data from the IM literature, when simulated, could show CALM in a better light.

Keywords

Neural networks, neurobiology, psychology, engineering, CALM, Categorizing And Learning Module, neurocomputers, catastrophic interference, genetic algorithms.
1.1 Murre's (1992a) book appears at a critical time for neural network research. The field is firmly established but is now so large that it is difficult to keep abreast of all the emerging subdomains (neurobiological and behavioural modelling, implementation, applications, etc.). Murre's book illustrates the difficulty that this diversification brings in its attempts to keep Categorisation And Learning in Modular (CALM) neural networks synapsing with several subdomains. On the one hand, this could stress the general utility of CALM and sustain multidisciplinary interest; on the other hand, the parts of which the book is the sum may each tend to isolation and flimsiness. The test of flimsiness rests on whether a variety of specialists find CALM strong enough in their own domain, although this critical approach may do little for the integrative perspective of the book. I shall therefore concentrate on the psychological aspects of CALM, which are acknowledged (page x) to be critical.

2.1 Newcomers to neural networks are put off by the variety of networks used by different researchers; the networks appear very similar, and yet little justification is offered for using model X rather than model Y. This presumably reflects a desire, evident in Murre's book, to promulgate one's own network rather than someone else's. For reasons explored below, I would prefer an approach which concentrates on generalities across nets: are there modes of behaviour which no neural net can achieve? Are there core features of networks which determine broad aspects of their behaviour? (Etc.)

2.2 Why is the "my-net-right-or-wrong" approach unsound? First, Murre's arguments for preferring CALM to other networks are weak and ill-developed. More generally, there is the "universality class" assumption (Abbott, 1990) upon which neural net research is implicitly based. The networks are assumed to be members of a class of such systems which have highly similar behaviour in the (unspecified) limiting situation. The brain is assumed to be a member of the universality class also, thus justifying the net as a heuristic model of the brain. On grounds of neurobiological plausibility, rightly stressed by Murre, we know it is probable that all current models are deeply inadequate. Neural nets might be members of the appropriate universality class but we are likely to be still so far from the regions of that universe occupied by the human brain that current attachments to model X (or Y) seem overzealous.

3.1 Accepting that Murre has a different philosophy, one can still consider the manner and nature of CALM's simulation of behavioural data. I like Murre's decision to focus on the contrast between implicit and explicit memory: neural nets are natural models of memory and yet an unhealthily small part of research addresses questions about which kind(s) of memory they can simulate and how. Although there has been a recent massive proliferation in data related to implicit memory (IM), there are many contradictions. Thus, the choice of the to-be-simulated findings is tricky. The contradictions may in large part arise because there are several dissociable forms of IM, each sharing a different bundle of properties with the others and overlapping to varying extents with explicit memory (EM).

3.2 Murre, I would argue, has chosen a rather weak set of IM data for CALM to simulate. First, his choice was guided by the theoretical ELaboration and ActivatioN framework (ELAN; Graf & Mandler, 1984). According to this view, IM depends exclusively on the "automatic activation" occurring when an existing mental representation is accessed. EM requires elaborative processes involving "attention as an intervening variable (Murre, p. 65)." The framework has numerous limitations in its ability to handle IM data. For example, from the above it follows directly that IM performance should not be affected by restricting available attentional resources during encoding. The data on this issue are mixed. ELAN was contradicted by an early study which showed that word stem completion measures of IM were affected by attentional resources (Pickering, Mayes, & Shoqeirat, 1988), a finding later echoed by Murre's own colleagues (Wolters & Phaf, 1990). Later data, using other measures of IM, have been more consistent with ELAN (Smith & Oscar-Berman, 1990; Parkin & Russo, 1990).

3.3 Second, the main test of CALM is found in the simulation of the differential effect of word-frequency on IM (word stem completion) and EM (free recall). Although Murre was simulating some unpublished human data gathered by Dutch colleagues, word frequency effects do not form a strong basis for a critical simulation because they tend to be nonlinear (see Hintzman & Hartry, 1990, for references). This means that the results obtained are range-dependent (and hence variable across studies); this is serious because we have no idea of the correspondence between Murre's network "frequencies" and actual word-frequency. The published data for the effects of word-frequency in IM relate to fragment completion tasks (closely related to the stem completion task simulated by Murre) and bear out the lability of findings. Tenpenny and Shoben (1992) and Hintzman and Hartry (1990) found greater IM for high frequency words, the opposite of the result simulated by Murre. A further study (MacLeod, 1989) found the reverse pattern, in line with Murre's simulation.
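
The range-dependence argument can be made concrete with a small sketch. It assumes a purely hypothetical inverted-U relation between log word frequency and priming; the function `priming`, its parameters, and the frequency bands are illustrative inventions, not data from Murre or from any of the studies cited. The point is only that, under a nonlinear relation, the sign of the "high minus low" frequency difference depends on which part of the frequency range a study happens to sample.

```python
import math


def priming(log_freq, peak=2.0, width=1.5):
    """Hypothetical inverted-U priming score over log10 word frequency."""
    return math.exp(-((log_freq - peak) ** 2) / (2 * width ** 2))


def mean_priming(band):
    """Mean hypothetical priming score for a band of log frequencies."""
    return sum(priming(x) for x in band) / len(band)


def high_minus_low(low_band, high_band):
    """Priming advantage of the high-frequency band over the low one."""
    return mean_priming(high_band) - mean_priming(low_band)


# Study 1 samples the rising side of the assumed curve.
study_1 = high_minus_low(low_band=[0.0, 0.5], high_band=[1.5, 2.0])

# Study 2 samples the falling side of the same curve.
study_2 = high_minus_low(low_band=[2.0, 2.5], high_band=[3.5, 4.0])

print(f"study 1, high minus low: {study_1:+.3f}")
print(f"study 2, high minus low: {study_2:+.3f}")
```

Both hypothetical studies compare "high" with "low" frequency words on the same underlying function, yet they yield advantages of opposite sign. Without a mapping between network "frequencies" and real word frequency, one cannot tell which region a simulation corresponds to.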

3.4 Third, there are problems with Murre's use of context nodes in the simulations. In the implementation of CALM, these nodes critically differentiate the EM test (where the retrieval cue arose from stimulus layer input solely to these nodes) from the IM test (where they receive no input from the stimulus layer during retrieval). Contextual processing during encoding must frequently operate without intention because contextual features are often accorded little attention during learning (Mayes, Meudell, & Pickering, 1985). Specialised modules could exist in the brain to pick up contextual features automatically from the environment. What Murre fails to address is why this involuntary pickup of contextual information occurs during encoding but is switched off during retrieval.

3.5 Finally, and ignoring the fact that I find the simulation uncompelling, I would also suggest that Murre has not presented these results to best effect. On page 78, for example, he notes a difference between his word-frequency simulation and the human data and suggests a means for testing the importance of it. Instead of conducting this test, Murre tells us that the factor "would probably not change [the results] in any qualitative way (p. 78)." Furthermore, the section on the "Interpretation of the model's behaviour (pp. 80-81)" is, I suspect, too densely packed for some genuinely interesting ideas to be appreciated by most readers. Also, Murre appears to be working very hard to explain the low-frequency advantage in word stem completion (IM). What he says sounds quite plausible but one cannot possibly evaluate it without experience in CALM.

3.6 For me, in summary, CALM fails the flimsiness test in the chapter on modelling human memory. I suspect that when CALM is pitted against more demanding and interesting data, however, its strengths may be more apparent.

4.1 In evaluating CALM in his final chapter, Murre gives a detailed treatment of the issue of catastrophic interference. One might feel that here the emphasis was, as urged above, on a generality across networks because catastrophic interference is "one of the major weaknesses of MANY connectionist models (Murre, p. 135, emphasis added)." Murre reviews (pp. 135-136) the extremely long list of simulation features which influence the amount of interference. The typical approach has been to tinker with some of these features to reduce the amounts of interference and so preserve the particular network (usually backpropagation) as a valid model of human memory. Murre continues this approach with some backpropagation simulations and concludes (p. 152) that backpropagation cannot be a useful model because the problem of catastrophic interference cannot be fully overcome. Murre then states that CALM "reduces interference to zero" but, once again, fails to demonstrate this important point with a simulation.
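
For readers unfamiliar with the phenomenon, the following minimal sketch shows interference in its most extreme form. It is an illustration under stated assumptions, not a reproduction of Murre's simulations: it uses a single-layer linear network trained with the delta rule (neither backpropagation nor CALM), and the patterns `cue_a`, `resp_b` and `resp_c`, the learning rate and the number of epochs are all hypothetical. The network first learns A-B, then A-C with the same cue, and recall of B is measured before and after.

```python
def output(weights, pattern):
    """Linear output of a single-layer network for one input pattern."""
    return [sum(w * x for w, x in zip(row, pattern)) for row in weights]


def train(weights, pattern, target, rate=0.2, epochs=50):
    """Delta-rule training on a single association; weights change in place."""
    for _ in range(epochs):
        outs = output(weights, pattern)
        for j, (t, o) in enumerate(zip(target, outs)):
            for i, x in enumerate(pattern):
                weights[j][i] += rate * (t - o) * x
    return weights


def recall_error(weights, pattern, target):
    """Sum of squared errors between the network output and a target."""
    return sum((t - o) ** 2 for t, o in zip(target, output(weights, pattern)))


cue_a = [1.0, 0.0, 1.0, 0.0]   # hypothetical stimulus A
resp_b = [1.0, 0.0]            # hypothetical response B
resp_c = [0.0, 1.0]            # hypothetical response C

weights = [[0.0] * len(cue_a) for _ in resp_b]

train(weights, cue_a, resp_b)                       # learn A-B
error_b_after_ab = recall_error(weights, cue_a, resp_b)

train(weights, cue_a, resp_c)                       # then learn A-C
error_b_after_ac = recall_error(weights, cue_a, resp_b)

print(f"error on A-B after A-B training: {error_b_after_ab:.6f}")
print(f"error on A-B after A-C training: {error_b_after_ac:.6f}")
```

Because both lists share the same cue and the same weights, the second association simply overwrites the first. Networks with hidden units and distributed patterns show graded versions of this effect, which is where the long list of moderating simulation features comes in.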

4.2 With this concentration on minutiae, the debate on catastrophic interference is getting stale. A new and broader impetus could have been provided by links with Murre's other theme: IM vs. EM. The human data demonstrating a lack of catastrophic interference are based on EM experiments. A thought experiment shows that human IM, by contrast, may be subject to high levels of interference. Imagine two moderately related word pairs, A-B and A-C (e.g., ARMY-SOLDIER and ARMY-RIFLE). One can measure the IM produced by a recent exposure to A-B by the increased response probabilities in a free-association test ("say the first word which comes to mind when you see ARMY"). If this were followed by an exposure to A-C, then a subsequent free association test would be affected by unmodulated competition between the increased probabilities of B and C responses; high levels of interference should result. The competition should be modulated, and interference reduced, under EM instructions to use A as a cue to recall B or C. (In fact, this thought experiment has been confirmed -- Mayes, Pickering, & Fairbairn, 1987 -- with the additional finding that amnesic patients could not reduce their interference even with EM instructions.) This raises the intriguing possibility that catastrophic interference, when modelling certain kinds of IM behaviour, might actually be NECESSARY for the psychological validity of the neural network model. This might be an example of the "more demanding and interesting data" from the IM literature which, when simulated, would show CALM in a better light.
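
The thought experiment can be sketched numerically. What follows is an assumption-laden illustration, not a model taken from Murre's book or from Mayes et al. (1987): response probabilities follow a simple ratio choice rule over associative strengths, each study exposure adds a fixed hypothetical `BOOST`, and EM instructions are represented by a hypothetical `context_gain` that multiplies the strength of the response belonging to the cued study episode. All numerical values are invented.

```python
def response_probs(strengths):
    """Ratio choice rule: probability proportional to associative strength."""
    total = sum(strengths.values())
    return {word: s / total for word, s in strengths.items()}


def free_association(strengths, target):
    """IM test: responses compete with no modulation."""
    return response_probs(strengths)[target]


def cued_recall(strengths, target, context_gain=5.0):
    """EM test: the cued episode's response is boosted (hypothetical gain)."""
    modulated = dict(strengths)
    modulated[target] *= context_gain
    return response_probs(modulated)[target]


# Hypothetical baseline strengths of associates to the cue ARMY.
baseline = {"SOLDIER": 1.0, "RIFLE": 1.0, "OTHER": 8.0}
BOOST = 3.0  # assumed strength increment from one study exposure

after_ab = dict(baseline, SOLDIER=baseline["SOLDIER"] + BOOST)
after_ab_ac = dict(after_ab, RIFLE=after_ab["RIFLE"] + BOOST)


def proportional_loss(test):
    """Fraction of B responding lost after the A-C exposure."""
    before = test(after_ab, "SOLDIER")
    after = test(after_ab_ac, "SOLDIER")
    return 1.0 - after / before


im_loss = proportional_loss(free_association)
em_loss = proportional_loss(cued_recall)

print(f"IM (free association) loss of B: {im_loss:.1%}")
print(f"EM (cued recall) loss of B:      {em_loss:.1%}")
```

Under these assumptions the same stored strengths produce more interference on the IM test than on the EM test, simply because nothing on the IM test favours one episode over the other. A network that never showed such interference could not, on this view, reproduce the IM pattern.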

REFERENCES

Abbott, L.F. (1990). Learning in neural network memories. Network, 1, 105-122.

Graf, P., & Mandler, G. (1984). Activation makes words more accessible, but not necessarily more retrievable. Journal of Verbal Learning and Verbal Behaviour, 23, 553-568.

Hintzman, D.L., & Hartry, A.L. (1990). Item effects in recognition and fragment completion: Contingency relations vary for different subsets of words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 955-969.

MacLeod, C. (1989). Word context during initial exposure influences degree of priming in word-fragment completion. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 398-406.

Mayes, A.R., Meudell, P.R., & Pickering, A. (1985). Is organic amnesia caused by a selective deficit in remembering contextual information? Cortex, 21, 167-202.

Mayes, A.R., Pickering, A., & Fairbairn, A. (1987). Amnesic sensitivity to interference: Its relationship to priming and the causes of amnesia. Neuropsychologia, 25, 211-220.

Murre, J.M.J. (1992a). Learning and categorisation in modular neural networks. Hemel Hempstead, UK: Harvester-Wheatsheaf.

Murre, J.M.J. (1992b). Precis of: Learning and categorisation in modular neural networks. PSYCOLOQUY 3(68) categorization.1

Parkin, A.J., & Russo, R. (1990). Implicit and explicit memory and the automatic/effortful distinction. European Journal of Cognitive Psychology, 2, 71-80.

Pickering, A.D., Mayes, A.R., & Shoqeirat, M. (1988). Priming tasks in normal subjects: What do they reveal about amnesia? In M.M. Gruneberg, P.E. Morris, & R.N. Sykes (Eds.), Practical aspects of memory. Vol. 1. Clinical and educational implications (pp. 58-63). Chichester, England: Wiley.

Smith, M.E., & Oscar-Berman, M. (1990). Repetition priming of words and pseudowords in divided attention and in amnesia. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 1033-1042.

Tenpenny, P.L., & Shoben, E.J. (1992). Component processes and the utility of the conceptually-driven/data-driven distinction. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 25-42.

Wolters, G., & Phaf, R.H. (1990). Explicit and implicit measures of memory: Evidence for two learning mechanisms. In B. Bonke, W. Fitch, & K. Millar (Eds.), Memory and awareness in anaesthesia (pp. 57-63). Amsterdam: Swets & Zeitlinger.

