Changsin Lee, Bram van Heuveln, (1998) Why Connectionist Nets are Good Models. Psycoloquy: 9(17) Connectionist Explanation (14)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).

WHY CONNECTIONIST NETS ARE GOOD MODELS
Commentary on Green on Connectionist-Explanation

Changsin Lee, Bram van Heuveln,
Clayton T. Morrison & Eric Dietrich
PACCS, Department of Philosophy
Binghamton University
Binghamton, NY 13902
http://www.paccs.binghamton.edu/chang/

chang@turing.paccs.binghamton.edu

Abstract

We agree with Green that some connectionists do not make it clear what their nets are modeling. However, connectionism is still a viable project because it provides a different ontology and different ways of modeling cognition by requiring us to consider implementational details. We also argue against Green's view of models in science and his characterization of connectionist networks.

Keywords

cognition, connectionism, explanation, model, ontology, theory

1. Green (1998) calls into question connectionist models' viability as scientific theories of cognition. According to Green, unlike other models in science, individual nodes and connections in a connectionist network do not represent any specific entity and it is difficult to see how positing these theoretical entities contributes in any way to our theoretical understanding of cognition. Instead of pronouncing a death sentence on connectionism, Green advises connectionists to restrict their research purely to modeling individual neurons as accurately as possible based on the results from neurological studies of the brain. Although Green is right that many connectionists have been unclear about what -- if anything -- they are modeling, this should not result in a wholesale rejection of the connectionist approach to cognition. On the contrary, we believe that connectionist models show many interesting and promising ways of explaining cognition that were not possible in the classical symbolic approach.

2. Let us state clearly where our agreements lie and what we are arguing against. We agree with Green that, for many scientific models, the meaning of a theoretical entity is relatively transparent, and that in these cases it is easy for us to see the role the entity plays in the theory: e.g., genes in Mendelian genetics. We also agree with Green that most connectionist models (except localist ones) do not have a clear picture of what they are modeling or what they explain because individual nodes and connections (which are "theoretical entities" according to Green) do not represent any specific function or entity required for a particular cognitive activity. Accepting these two pivotal claims of Green, however, does not lead to the conclusion that connectionist models are not suitable scientific models of cognition, because Green's assumptions about the nature of models in general and his characterization of connectionist models in particular appear to be in error.

3. Some researchers equate models and theories. Although this is sometimes a useful way to talk, it is not useful when talking about computer programs and theories of cognition. Let us distinguish between models and theories. A scientific model is an abstract representation of certain phenomena and it usually simplifies the phenomena under investigation. By its nature, a model does not and cannot render the reality perfectly accurately. The only perfect representation of a phenomenon is the phenomenon itself, but we cannot take the phenomenon itself as its own model because that would defeat the very purpose of building a model, namely, to explain the phenomenon. A scientist explains the phenomenon by proposing a theory which has the model as a crucial part. The theory specifies the way the model (which consists of phenomena we do understand to some extent) matches the phenomena one is trying to explain. More precisely, the theory specifies the similarity relations that obtain between the model and the phenomenon to be explained. Hence, a theory specifies an analogy between a model (a computer program in the case of a neural net) and the phenomena to be explained. A model often contains negative and neutral similarities as well as the intended positive similarities (Aronson, Harré, and Way, 1995, p. 59).

4. Consider the solar system model of atoms. Although we do not understand the solar system perfectly, we understand it well enough to make the analogy work. When Bohr first used the solar system to explain the structure and behavior of an atom, people still understood his analogical model correctly by filtering out negative and neutral similarities: e.g., nobody seriously asked whether certain electrons have moons or rings around them as some planets do. People knew that the point of the model was to highlight a common relational similarity between the solar system and an atom, namely, both are central force field systems.

5. If we are right about the analogical nature of theories and models, then a model does not have to be isomorphic to the phenomenon concerned in all respects. Instead, all that is required is to share relational similarities with the phenomenon only in certain RELEVANT respects. The question of what counts as a relevant respect and what counts as trivial similarities is a tough one. It is the job of the theory to specify which similarities are relevant and which are not. As Green has pointed out correctly, a connectionist has to show what similarities exist between connectionist networks and biological neural networks. All we want to say at this point is that from the fact that there are many differences between biological neural networks and connectionist networks it does not follow that connectionist models are not scientific models, nor does it follow that the theories in which they participate are not scientific theories. Connectionists should not wait for the results from neurological studies of the brain. If scientists had waited until everything was laid out clearly, we would still be living in caves.

6. Second, Green's characterization of connectionist models left out an essential ingredient of connectionism: post hoc analyses. Green is right that if you look at individual nodes and connections of an artificial neural network (e.g., NETtalk), you will have no idea what they are doing or what they are supposed to represent. However, hardly any connectionist would stop there. Most researchers would launch a long and arduous process of post hoc analysis of a network using various techniques such as principal components analysis or hierarchical cluster analysis. Sejnowski and Rosenberg (1987), for example, carried out a cluster analysis of NETtalk by taking 79 letter-to-phoneme patterns of hidden-unit activation and pairing each of them with its closest neighbor. The end result of the cluster analysis revealed that the network learned a hierarchically represented division of consonants and vowels. Although the network was never explicitly encoded with such information, it was able to navigate the cognitive domain and to chart a map for its own functioning. In other words, it is possible to understand and explain how a connectionist network works the way that it does. We have to conclude that Green's observation of connectionist networks (Green, 1998, para. 11: "Since none of the units correspond to ANY particular aspect of the performance of the network, there is no particular justification for any one of them.") is not accurate. Yes, connectionist networks require elaborate post hoc analyses, but these analyses are not, in general, ad hoc.
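
The kind of cluster analysis described in paragraph 6 can be illustrated with a minimal sketch. It is not Sejnowski and Rosenberg's procedure or data: the six activation vectors below are invented for illustration, the pattern labels are hypothetical, and average-linkage agglomerative clustering is assumed as one common way of repeatedly joining nearest neighbors.

```python
import math

# Hypothetical hidden-unit activation vectors, one per letter-to-phoneme
# pattern. These numbers are invented for illustration; they are NOT
# NETtalk data.
activations = {
    "a->/a/": [0.9, 0.8, 0.1, 0.2],
    "e->/e/": [0.8, 0.9, 0.2, 0.1],
    "o->/o/": [0.9, 0.7, 0.2, 0.2],
    "p->/p/": [0.1, 0.2, 0.9, 0.8],
    "t->/t/": [0.2, 0.1, 0.8, 0.9],
    "k->/k/": [0.1, 0.1, 0.9, 0.9],
}


def distance(u, v):
    """Euclidean distance between two activation vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(members_a, members_b):
    """Mean pairwise distance between the members of two clusters."""
    pairs = [(x, y) for x in members_a for y in members_b]
    total = sum(distance(activations[x], activations[y]) for x, y in pairs)
    return total / len(pairs)


def cluster(labels):
    """Repeatedly merge the two closest clusters; return a nested-tuple tree."""
    # Each entry is (tree, members): the tree records the merge history,
    # the members tuple lists the patterns contained in that cluster.
    clusters = [(label, (label,)) for label in labels]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i][1], clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


def leaves(tree):
    """Collect the pattern labels under a node of the tree."""
    if isinstance(tree, str):
        return {tree}
    left, right = tree
    return leaves(left) | leaves(right)


tree = cluster(list(activations))
top_left, top_right = tree
print(sorted(leaves(top_left)))
print(sorted(leaves(top_right)))
```

With these toy vectors the topmost split of the tree separates the vowel patterns from the consonant patterns, even though no vowel or consonant label is given to the clustering. The sketch shows only the form of the analysis; what such an analysis finds in a trained network is an empirical matter.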

7. Green's charge that connectionist models have too many degrees of freedom deserves careful consideration. The reason the solar system model of atoms is a better model than Thomson's plum-pudding model is that the solar system shares certain invariant causal structures with atoms. If there were no invariant, independent ontological and causal mechanisms common to the two compared systems, the analogy would not work. As Green contends, connectionist networks come in a variety of forms: localist networks, multi-layered feedforward networks, recurrent networks, Kohonen networks, etc. Since so many things are optional, it is natural to ask what the invariant binding principles for all connectionist networks are. To answer the charge, we would like to draw another analogy. Consider building a model airplane and testing it in a wind tunnel. A model airplane certainly has more degrees of freedom than an actual airplane (in Green's terminology, many features are OPTIONAL). Does this mean that the model airplane is of no use in understanding the behavior of an actual airplane? Not at all! In fact, all airplane constructions rely crucially on such model airplanes. Likewise, constructing many different types of connectionist models of cognition helps us understand what kind of cognitive terrains we are dealing with and what kind of cognitive functions are required to negotiate various cognitive tasks. Having a multitude of models does not necessarily entail too many degrees of freedom. Furthermore, as in the case of the atom and the solar system, there is something invariant between networks of neurons and all connectionist nets: nodes, arcs, and levels of activation. The theory states that it is these invariant properties that matter to cognition. This may be wrong, but it is certainly an interesting and important model and theory.

8. Another way to make our point is this: connectionism, properly practiced, does not have too many degrees of freedom. "Properly practiced" means appealing to the internal workings of connectionist nets to tell some explanatory story about how cognition emerges from distributed processes. Green is right that mere curve-fitting does have too many degrees of freedom, but mere curve-fitting is not connectionism properly practiced.

9. As Clark (1993) has shown, the classical symbolic approach is able to preserve a close relationship between Marr's (1977) level-1 competence theory and its level-2 implementation because it assumes a symbol-processing architecture for implementation. Connectionists, on the other hand, make different ontological assumptions: instead of context-invariant symbols, they work with nodes and their activations. The representational entities are supposed to emerge from these activated nodes. (Emergence is poorly understood, but no more poorly than representation itself.) If the network has mastered certain skills, then one goes on to analyze the patterns that it has learned. Thus, the connectionist paradigm offers a different way of modeling cognitive phenomena, requiring researchers to wade through thick implementation details of cognition. However, as many connectionist models have been showing, the results are fascinating and intriguing. As far as we can see, we have only begun to explore all that connectionist nets have to offer.

REFERENCES

Aronson, J.L., Harré, R., and Way, E.C. (1995) Realism Rescued: How scientific progress is possible. Chicago: Open Court.

Clark, A. (1993) Associative Engines. Cambridge: MIT Press.

Green, C.D. (1998) Are connectionist models theories of cognition? PSYCOLOQUY 9(4) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/psyc.98.9.04.connectionist-explanation.1.green

Marr, D.C. (1977) Artificial intelligence: A personal view. Artificial Intelligence 9: 37-48.

Sejnowski, T.J. and Rosenberg, C. (1987) Parallel networks that learn to pronounce English text. Complex Systems 1: 145-68.