Ute Schmid (1998) Bottom-up and Top-down Processes in Learning. Psycoloquy: 9(76) Efference Knowledge (4)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).
Psycoloquy 9(76): Bottom-up and Top-down Processes in Learning

BOTTOM-UP AND TOP-DOWN PROCESSES IN LEARNING
Commentary on Jarvilehto on Efference-Knowledge

Ute Schmid
Department of Computer Science
Technical University Berlin
Franklinstr. 28
D-10587 Berlin
+49 (0)30/314-23938 FAX -24941
http://ki.cs.tu-berlin.de/~schmid

schmid@cs.tu-berlin.de

Abstract

The physiological processes of afferent transmission of receptor activities to the central nervous system and of efferent influences from the central nervous system on receptors correspond roughly to the notion of bottom-up and top-down processing in cognitive and AI models. I will discuss the role of bottom-up and top-down processes in approaches to knowledge acquisition and (machine) learning and their relation to theory and findings in physiology.

Keywords

afference, artificial life, efference, epistemology, evolution, Gibson, knowledge, motor theory, movement, perception, receptors, robotics, sensation, sensorimotor systems, situatedness

I. INTRODUCTION

1. Jarvilehto (1998) proposes that efferent influences on sensory organs play a crucial role in knowledge formation: Efferent information "tunes" the receptors to provide the information from a given environment that a given organism needs. He reports findings from two experiments and a thought experiment, which he interprets in accordance with this postulate. In the following, I will comment (as a layman in the neurosciences) on some aspects of Jarvilehto's target article. Afterward, I will discuss Jarvilehto's suggestion in the context of higher-level processes, especially in relation to cognitive models of knowledge and skill acquisition and to approaches in machine learning.

II. KNOWLEDGE FORMATION WITHOUT THE HELP OF RECEPTORS?

2. Intuitively, I agree with Jarvilehto's (1998) postulate that knowledge acquisition cannot be purely data-driven. His arguments in paragraph 4 correspond with Piaget's theory (1954): To survive and function in a given environment, human beings must have the means to both assimilate and accommodate new information. Both processes are highly dependent on a person's knowledge. If information in the environment has no relation to existing knowledge, it cannot be processed. But if information deviates only to a limited degree, the current knowledge structure will be changed to accommodate the new information -- resulting in the ability to assimilate more (complex) knowledge.

3. Jarvilehto concludes that this kind of information processing can only be modelled by an inseparable organism-environment system. Here I cannot agree: It is possible to expose a given organism to different environments (for example, first to one with only horizontal bars and afterwards to one with only vertical bars) and study its nervous activity and its behavior relative to both; and it is possible to expose different organisms (for example, one with prism-glasses and another without) to a given environment and study their differences in nervous activity and behavior. If environmental and organismic characteristics can be manipulated independently, they can also be described independently. I believe that such an analytical separation is very useful for understanding the mutual influences between environment and organism. Actually, Jarvilehto does exactly this in his thought experiment!

4. In his first experiment (pars. 13-16), Jarvilehto shows that thresholds and response characteristics of mechano-receptors change depending on the task: the receptors are "more sensitive" if the task is related to tactile information than if it is related to auditory information. But the experimental design allows a different interpretation: when one is exposed to tactile stimuli only, the receptors are more sensitive than when one is exposed to stimulation of different channels (tactile and auditory). I do not propose this as a serious hypothesis, but as a factor which should be controlled.

5. In the second experiment (pars. 17-22) it could be shown that a rabbit learning the location of food with open and covered eyes later produced afferent activity of the optic nerve even when approaching the food location with covered eyes. This finding is interpreted as evidence against the attention hypothesis and as evidence against simple transduction of sensory information. The result concerns performance only. In the first phase of the study the rabbit learned to locate the food with open eyes, that is, it was exposed to both visual and motor information. The experimental setting allows for the following speculation: The rabbit learned a correlation between motor activities and visual stimuli in the sense of Hebbian learning (cf. Milner 1993), and during the test with covered eyes the motor signals could co-activate the visual ones. I would be interested to see what happens in the performance phase when the rabbit is presented with the visual information of nearing the food while its motor activity is blocked.
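
The speculation in paragraph 5 can be illustrated with a toy sketch. It is not a model of the rabbit experiment: the single weight, the learning rate, and the function names are assumptions made purely for illustration. One motor-to-visual weight is strengthened whenever both signals are active together, and the motor signal alone then produces activity in the visual unit.

```python
def train_association(trials, learning_rate=0.1):
    """Strengthen one motor-to-visual weight whenever both signals are active."""
    weight = 0.0
    for motor, visual in trials:
        weight += learning_rate * motor * visual  # plain Hebbian increment
    return weight


def visual_activity(visual_input, motor_input, weight):
    """Activity of the visual unit: direct input plus motor co-activation."""
    return visual_input + weight * motor_input


# Learning phase with open eyes: motor and visual signals occur together.
learned_weight = train_association([(1, 1)] * 10)

# Test phase with covered eyes: no visual input, motor signal only.
covered_after_learning = visual_activity(0, 1, learned_weight)
covered_without_learning = visual_activity(0, 1, 0.0)
```

In this sketch the visual unit is active with covered eyes only if the association was learned beforehand, which is the pattern the speculation would predict.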

6. Jarvilehto proposes a thought experiment to illustrate that knowledge formation does not depend (only) on afferent information. The heading of this section -- "Knowledge Formation Without Senses" -- is a little misleading: Even in the first part of the experiment the organism processes sensory information -- via absorption from an energy field (par. 26) -- and the author acknowledges this fact later (par. 31). The last part of the experiment is crucial for Jarvilehto's argument: only here do the organisms have efferent control of receptors. In the first two parts of the experiment the organisms can only react to the environment (i.e. only assimilate new information) and learning is restricted to building associations during trial-and-error behavior. The organisms provided with additional efferent control can adapt to their environment and the author argues that this ability is crucial for perception. If perception is defined as "interpreting environmental information in accordance with subjective experience" (needs, knowledge) -- in contrast to "representing some aspects of stimuli from the environment" (as input to hard-wired reflexes or simple associative rules) -- then I fully agree. Nevertheless, a system without the possibility of receptor tuning can also be adaptive in some sense: If we allow that associations built by a kind of Hebbian learning can weaken when there is no co-occurrence of stimuli over some time interval, then the organism can "forget" and new associations can simultaneously be learned. On the behavioral level such a system is obviously adaptive to its environment.
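
The last point of paragraph 6 can be made concrete with a minimal sketch of Hebbian learning with decay. The update rule, the decay term, and all parameter values are assumptions chosen for illustration, not a claim about any particular neural mechanism.

```python
def hebbian_step(weight, pre, post, learning_rate=0.2, decay=0.1):
    """One update of a single association strength.

    The weight grows when both units are active together and decays
    toward zero in every step (the decay term is an assumption).
    """
    return weight + learning_rate * pre * post - decay * weight


def run(weight, pairs):
    """Apply the update rule to a sequence of (pre, post) activity pairs."""
    for pre, post in pairs:
        weight = hebbian_step(weight, pre, post)
    return weight


# Phase 1: stimulus A co-occurs with stimulus B; the A-B association grows.
w_ab = run(0.0, [(1, 1)] * 20)

# Phase 2: the environment changes. A no longer co-occurs with B
# (the A-B association decays) but with C (a new association is built).
w_ab_after = run(w_ab, [(1, 0)] * 20)
w_ac_after = run(0.0, [(1, 1)] * 20)
```

After the second phase the old association has largely faded while the new one is strong, so the behavior of the system follows the changed environment without any tuning of its inputs.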

III. BOTTOM-UP AND TOP-DOWN PROCESSES IN LEARNING

7. In the machine learning context learning can be described as finding a hypothesis h: X -> Y which maps all possible input states X to an output Y with high accuracy. Only a subset of X is presented to the system during learning. That is, learning is an inductive process. The final hypothesis represents newly acquired knowledge or skill and the resulting behavior of the system should be improved with respect to some evaluation function (e.g. the number of errors when the system is confronted with new inputs from X). Models of learning can be characterized by the way in which the different components of the definition above are realized: X can be a set of discrete or real-valued vectors (i.e. a feature space) or a set of structures (e.g. graphs or logical clauses). Y can be a set of class labels or a set of actions. During learning the system can be presented with x-y pairs (supervised learning) or with inputs x only (unsupervised learning). A training sample of X can be presented to the learning system in batch mode or incrementally. The hypothesis language in which h is represented can be a decision tree, a (logical or functional) program or a matrix of weights. (For an overview of current approaches to machine learning, see Mitchell 1997.)
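
The definition in paragraph 7 can be sketched for a deliberately small case. Everything specific here is a hypothetical choice for illustration: X is the set of integers 0..9, Y is {0, 1}, the target concept is a threshold at 5, and the hypothesis language contains only threshold rules.

```python
def target(x):
    """Hypothetical unknown concept: inputs of at least 5 belong to class 1."""
    return 1 if x >= 5 else 0


def errors(h, sample):
    """Evaluation function: number of x-y pairs the hypothesis gets wrong."""
    return sum(1 for x, y in sample if h(x) != y)


def learn_threshold(sample):
    """Supervised batch learning: pick the threshold rule with the fewest
    errors on the training sample (thresholds are taken from the sample)."""
    candidates = sorted(x for x, _ in sample)
    best = min(
        candidates,
        key=lambda t: errors(lambda x: 1 if x >= t else 0, sample),
    )
    return lambda x: 1 if x >= best else 0


# Only a subset of X is presented during learning.
training = [(x, target(x)) for x in (1, 3, 6, 8)]
unseen = [(x, target(x)) for x in (0, 2, 4, 5, 7, 9)]

h = learn_threshold(training)
training_errors = errors(h, training)
unseen_errors = errors(h, unseen)
```

The learned hypothesis is consistent with the training sample but misclassifies one unseen input (x = 5), which shows why an inductively acquired hypothesis is evaluated on new inputs from X.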

8. All models of learning assume some kind of hypothesis language which restricts learning (the "language bias"). It is not possible to define a learning system without such built-in restrictions. For example, a simple linear classifier such as the Perceptron cannot learn XOR or its complement (that is, a mapping such as: if x1 and x2 occur together or if neither of them occurs then react with y1, otherwise with y2), because no single linear boundary separates the two classes. A system which is able to learn (the grammar of) a language must have the means to detect regularity in structures -- as in the case of Chomsky's (1968) Language Acquisition Device or algorithms for grammatical inference (Freivalds, Kinber & Smith 1997) and inductive program synthesis (Schmid & Wysotzki 1998a). So the first kind of non-afferent influence on learning is given by the architecture of a system itself.
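
The Perceptron example in paragraph 8 can be checked with a short sketch. It assumes a single threshold unit trained with the classical Perceptron rule; the integer weights, the epoch limit, and the encoding of y1 as 1 and y2 as 0 are choices made here for illustration.

```python
def predict(weights, bias, x):
    """Threshold unit: respond with 1 if the weighted sum is positive."""
    return 1 if weights[0] * x[0] + weights[1] * x[1] + bias > 0 else 0


def train_perceptron(samples, epochs=100):
    """Perceptron rule: adjust weights only when the output is wrong."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        mistakes = 0
        for x, y in samples:
            error = y - predict(weights, bias, x)
            if error != 0:
                mistakes += 1
                weights[0] += error * x[0]
                weights[1] += error * x[1]
                bias += error
        if mistakes == 0:  # every sample classified correctly
            break
    return weights, bias


def count_errors(weights, bias, samples):
    return sum(1 for x, y in samples if predict(weights, bias, x) != y)


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
# Linearly separable concept: respond only if x1 and x2 occur together.
and_samples = [(x, int(x[0] == 1 and x[1] == 1)) for x in inputs]
# Concept from paragraph 8: respond if both or neither occur.
both_or_neither = [(x, int(x[0] == x[1])) for x in inputs]

and_errors = count_errors(*train_perceptron(and_samples), and_samples)
hard_errors = count_errors(*train_perceptron(both_or_neither), both_or_neither)
```

The unit learns the conjunction without errors, whereas for the second concept at least one input stays misclassified however long training runs, because the restriction lies in the hypothesis language and not in the data.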

9. Furthermore, the current knowledge structure of a system determines what it will learn in the next step. For example, let knowledge be represented as a decision tree (Quinlan 1986): If a system is presented with "red and big means dangerous" when it has no current hypothesis available, it will infer "everything is dangerous"; but if the system already knows that "green and big means nice", it will acquire more differentiated knowledge. The same is true for cognitive models of knowledge acquisition such as schema theory (Rumelhart & Norman 1978).
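
The decision tree example in paragraph 9 can be reproduced with a strongly simplified learner. This sketch is not Quinlan's algorithm: it splits on the first attribute that distinguishes the examples instead of using an information measure, and the attribute names are assumptions.

```python
def learn_tree(examples, attributes):
    """Return a class label (leaf) or a pair (attribute, {value: subtree})."""
    labels = {label for _, label in examples}
    if len(labels) == 1:
        return labels.pop()  # all examples agree: a single leaf suffices
    for attribute in attributes:
        values = {features[attribute] for features, _ in examples}
        if len(values) > 1:  # this attribute distinguishes the examples
            remaining = [a for a in attributes if a != attribute]
            return (attribute, {
                value: learn_tree(
                    [e for e in examples if e[0][attribute] == value],
                    remaining,
                )
                for value in values
            })
    raise ValueError("examples are inconsistent")


attributes = ["size", "color"]
red_big = ({"color": "red", "size": "big"}, "dangerous")
green_big = ({"color": "green", "size": "big"}, "nice")

# No prior knowledge: one example yields "everything is dangerous".
tree_without_knowledge = learn_tree([red_big], attributes)

# With "green and big means nice" already known, the same new example
# leads to a more differentiated hypothesis.
tree_with_knowledge = learn_tree([green_big, red_big], attributes)
```

Here `tree_without_knowledge` is the single leaf "dangerous", while `tree_with_knowledge` tests the color attribute before deciding.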

10. Up to now we have discussed models usually associated with concept acquisition. Jarvilehto's experiments address another aspect of learning: skill acquisition (i.e. acquisition of procedural in contrast to declarative knowledge). On a symbolic level, skill acquisition is mostly modelled by chunking (Rosenbloom & Newell 1986) or compilation (Anderson 1986) of production rules. In machine learning, statistical, cybernetic or artificial neural net approaches are dominant (cf. reinforcement learning: Dean, Basye & Shewchuk 1993; see Schmid & Wysotzki 1998b). Skill acquisition is influenced by current knowledge in a different way than concept acquisition: The system does not receive new information from an independent source ("teacher") but provides it partially itself, by interaction with the environment. Each "perceived" state of the environment causes the system to perform an action which can produce a change in the environment. The current behavioral repertoire (production rules or state-action associations) determines the way in which the system interacts with the environment and therefore how the environment can be changed. That is, the system has influence on the kind of experience it has and hence on what it will learn.
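
The closing point of paragraph 10 can be illustrated without any learning rule at all. In this sketch the one-dimensional world, the two actions, and the two behavioral repertoires are hypothetical; the only purpose is to show that the states a system gets to experience depend on its current state-action associations.

```python
def step(state, action):
    """Hypothetical world with positions 0..4; actions move left or right."""
    move = 1 if action == "right" else -1
    return max(0, min(4, state + move))


def collect_experience(repertoire, start=2, steps=3):
    """Let the state-action associations drive the interaction and record
    which states the system actually encounters."""
    state = start
    visited = [state]
    for _ in range(steps):
        state = step(state, repertoire[state])
        visited.append(state)
    return visited


# Two systems in the same environment with different current repertoires.
always_right = {position: "right" for position in range(5)}
always_left = {position: "left" for position in range(5)}

experience_right = collect_experience(always_right)  # [2, 3, 4, 4]
experience_left = collect_experience(always_left)    # [2, 1, 0, 0]
```

Both systems start in the same state of the same environment, yet they never see the same part of it, so whatever they learn next is learned from different data.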

11. The top-down processes we have discussed so far -- architecture and current knowledge structure -- are only loosely connected with the claim of efferent influences on learning. The stronger claim Jarvilehto makes is that efferent influences on receptors are crucial for learning. The learning mechanisms discussed so far mostly do not rely on such a principle -- nevertheless, most of them are very powerful (see Mitchell 1997). There exists one class of learning mechanism for artificial systems which could be interpreted in accordance with Jarvilehto's postulate: the correction of weights by feedback, as for example in Perceptrons, backpropagation or reinforcement learning systems. If a system produces inadequate output, it gets negative feedback, which is distributed to the units according to the degree of their participation in the negative result. For the weighted connections which are directly associated with the input units such changes can be interpreted roughly as receptor tuning.
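
The rough analogy at the end of paragraph 11 can be sketched with a single linear unit trained by error feedback (the delta rule). The two input units, labelled here as a "tactile" and an "auditory" channel, the task, and all parameter values are assumptions for illustration only.

```python
def train_delta_rule(samples, weights, learning_rate=0.2, epochs=50):
    """Correct each input weight in proportion to the output error and to
    that input's participation in producing the output."""
    weights = list(weights)
    for _ in range(epochs):
        for inputs, desired in samples:
            output = sum(w * x for w, x in zip(weights, inputs))
            error = desired - output  # feedback signal
            for i, x in enumerate(inputs):
                weights[i] += learning_rate * error * x
    return weights


# Inputs are (tactile, auditory); in this hypothetical task only the
# tactile channel is relevant for the required response.
task = [
    ((1, 0), 1),
    ((0, 1), 0),
    ((1, 1), 1),
    ((0, 0), 0),
]

# Both input channels start out equally "sensitive".
tactile_weight, auditory_weight = train_delta_rule(task, [0.5, 0.5])
```

After training the weight of the task-relevant channel is close to 1 and that of the irrelevant channel close to 0. Whether such a change of input weights is an adequate counterpart of efferent receptor tuning is exactly the question the paragraph leaves open.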

IV. CONCLUSIONS

12. Jarvilehto's weak claim that learning cannot be explained by a purely data-driven process is not controversial in cognitive science and machine learning: of course learning is influenced by top-down processes. His strong claim that efferent influences on receptors are crucial for learning is challenging. From a machine learning point of view it would be interesting to replace Jarvilehto's thought experiment by a formal comparison of the set of hypotheses which are learnable by systems with and without the capacity to change the relevance of different aspects of the input information (cf. Hanson, Drastal & Rivest 1994).

REFERENCES

Anderson, J. R. (1986). Knowledge compilation: A general learning mechanism. In R. S. Michalski, J. G. Carbonell and T. M. Mitchell (Eds.), Machine Learning -- An Artificial Intelligence Approach (vol. 2, pp. 289-310). Morgan Kaufmann.

Chomsky, N. (1968). Language and Mind. New York: Harcourt, Brace Jovanovich.

Dean, T., Basye, K. & Shewchuk, J. (1993). Reinforcement learning for planning and control. In S. Minton (Ed.), Machine Learning Methods for Planning (pp. 67-92). Morgan Kaufmann.

Freivalds, R., Kinber, E. & Smith, C. H. (1997). The functions of finite support: a canonical learning problem. Proceedings of the 19th Annual Conference of the Cognitive Science Society (pp. 235-240). Hillsdale, NJ: Lawrence Erlbaum.

Hanson, S. J., Drastal, G. A. & Rivest, R. L. (1994). Computational Learning Theory and Natural Learning Systems. Cambridge, MA: MIT Press.

Jarvilehto, T. (1998d) Efferent influences on receptors in knowledge formation. Psycoloquy 9(41) http://www.cogsci.soton.ac.uk/cgi/psyc/newpsy?9.41 ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1998.volume.9/psyc.98.9.41.efference-knowledge.1.jarvilehto

Milner, P. (1993). The mind and Donald O. Hebb. Scientific American, 268 (1), 124-129.

Mitchell, T. (1997). Machine Learning. New York: McGraw Hill.

Piaget, J. (1954). The Child's Construction of Reality. New York: Basic Books.

Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1 (1), 81-106.

Rosenbloom, P. S. & Newell, A. (1986). The chunking of goal hierarchies: A generalized model of practice. In R. S. Michalski, J. G. Carbonell and T. M. Mitchell (Eds.), Machine Learning -- An Artificial Intelligence Approach (vol. 2, pp. 247-288). Morgan Kaufmann.

Rumelhart, D. E. & Norman, D. A. (1978). Accretion, tuning and restructuring: Three modes of learning. In J. W. Cotton and R. L. Klatzky (Eds.), Semantic Factors in Cognition (pp. 37-53). Hillsdale, NJ: Lawrence Erlbaum.

Schmid, U. & Wysotzki, F. (1998a). Induction of recursive program schemes. In C. Nedellec and C. Rouveirol (Eds.), Proceedings of the 10th European Conference on Machine Learning (pp. 228-240), LNAI 1398. Springer.

Schmid, U. & Wysotzki, F. (1998b). Skill acquisition can be regarded as program synthesis: An integrative approach to learning by doing and learning by analogy. In U. Schmid, J. Krems and F. Wysotzki (Eds.), Mind Modelling -- A Cognitive Science Approach to Reasoning, Learning and Discovery. Lengerich: Pabst Science Publishers.

