In "The Autonomous Brain," Milner (1999a,b) tries to shift the emphasis from stimuli to the internal plans that contribute importantly to the control of behaviour. With a strong focus on mechanism, he begins with the innate organization of motivational systems, which are able to activate certain response patterns and the representations of the stimuli that will satisfy the biological need giving rise to that motivation. He adds learning and memory as processes that also allow representations of novel stimuli to be activated by internal states and hence to capture attention. He focuses in later chapters on reward-related learning and the role of dopamine in the striatum. Dopamine may produce learning by activating D1-like receptors and the cAMP-PKA second messenger pathway, leading to phosphorylation of proteins involved in both short-term and long-term modifications of glutamatergic synapses in the striatum. Many studies have indicated a role for a variety of molecular mechanisms in learning. It remains a challenge to researchers to isolate the mechanisms involved in various types of learning.
2. The autonomy to which Milner alludes in his title and within the text is freedom from stimulus control, a reaction to the radical behaviourism of much of the twentieth century that drove discussions of the inner workings of the brain underground for so long. His thesis that behaviour is internally generated is reflected in his opening chapter in a section with the title, Why Did the Chicken Cross the Road? Milner states, "The notion that behavior is always a reaction to a stimulus is so ingrained that the name applied by psychologists to an element of behaviour is `response'" (p. 3). There is a lot of insight in this statement. With respect to the word "response" I had never realized the point that he makes so obvious here.
3. What are some of the mechanisms that underlie the behaviour of a rat that learns to press a lever in a Skinner box? Skinner himself, in one of his last publications, "Upon Further Reflection" (Prentice-Hall, Englewood Cliffs, NJ, 1987), compared the selection of responses by their consequences in operant learning to Darwin's concept of natural selection in the evolution of species. This is an interesting comparison because of its implications for the idea of purpose. Evolutionary biologists are quick to point out that there is no purpose in nature; random mutations simply serve up variations that provide the options upon which environmental circumstances act. By analogy, organisms simply emit responses and the responses sometimes succeed in producing a reward. Like the traits that confer advantages to individuals allowing them to survive, the responses that produce reward are selected.
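The selection-by-consequences analogy can be illustrated with a toy simulation. This is only a sketch under stated assumptions and is not a model proposed by Skinner or Milner: the response names, the initial weights, the size of the reward increment, and the number of trials are all hypothetical. Responses are emitted at random in proportion to their current weights, and only the response that happens to produce reward has its weight increased.

```python
import random


def simulate_operant_selection(n_trials=2000, increment=0.1, seed=0):
    """Toy 'selection by consequences': rewarded responses become more probable.

    All names and parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Hypothetical repertoire; every response starts out equally likely.
    weights = {"press_lever": 1.0, "groom": 1.0, "rear": 1.0, "sniff": 1.0}
    rewarded = "press_lever"  # assumption: only lever pressing yields reward

    for _ in range(n_trials):
        responses = list(weights)
        # The organism "emits" a response; no purpose or foresight is involved.
        emitted = rng.choices(responses, weights=[weights[r] for r in responses])[0]
        # The environment selects: only a rewarded response is strengthened.
        if emitted == rewarded:
            weights[emitted] += increment

    total = sum(weights.values())
    return {r: w / total for r, w in weights.items()}


probabilities = simulate_operant_selection()
for response, p in probabilities.items():
    print(f"{response}: {p:.3f}")
```

With these settings the rewarded response comes to dominate the repertoire while the unrewarded responses keep their original weights, which mirrors the point that variation is undirected and only the consequences do the selecting.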
4. Although there is no purpose in nature, there is a mechanism. The synthesis of genetics with evolutionary biology provided the mechanism underlying natural selection. There is also a mechanism for behaviour although Skinner never extended his analogy to include one. Milner provides a mechanism. Thus, a rat that presses a lever in a Skinner box has a plan. "Plan" does not refer to a mental state but rather to a mechanism that is responsible for directing attention and subsequent behaviour to a stimulus that has produced rewarding consequences in the past. "The Autonomous Brain" is an elaboration of the development and workings of the mechanism that leads organisms to direct their attention to certain stimuli in the environment. The book provides many details of the mechanism that underlies the behaviour of a rat that learns to press a lever in a Skinner box. The same mechanism may govern a wide range of behaviour, for example, the writing of book reviews.
5. Milner aptly titles his opening chapter, Where to Start? But this is a rhetorical question. He knows where to start and thus begins his treatment of the mechanism with innate goal seeking. The model is based upon neural circuits that have come under the control of genes through the process of natural selection. These circuits have the capacity to detect need states of the organism and to detect satisfiers of those needs. The circuits facilitate motor components of the selected responses and in so doing switch attention to the stimuli necessary for the execution of those responses. The next major addition to this automaton is its ability to learn, which provides it with memory for past successes and failures in satisfying needs. Learning can be of the simple, nonassociative type, such as habituation, or it can be more complex learning involving acquisition through response plans (including the neutral stimuli associated with them) of the ability to control the motivation to act, which is initially only controlled innately. This form of learning is called "incentive learning."
6. A number of chapters deal with aspects of perceptual learning including the nature of engrams. In keeping with the focus in the earlier chapters, Milner argues that ideas are associated with each other via the response system. He also emphasizes innate aspects of the organization of the nervous system, for example, by pointing out the number of different receptors in the olfactory bulb and indicating that their possible combinations exceed the number of available connections in the central nervous system. Clearly, there must be some innate organization of olfactory inputs.
7. At the beginning of Chapter 6, on memory, Milner briefly reviews synaptic plasticity, referring to metabotropic receptors and intracellular second messenger cascades. These include the cyclic 3',5'-adenosine monophosphate (cAMP) coupled dopamine (DA) receptor. The importance of calcium and its possible influence on protein phosphorylation and gene expression is also mentioned. Milner points out that enough is known to explain changes in synaptic effectiveness ranging from those that endure for less than a second to those that last a lifetime. Exactly which cascades are involved in any particular memory process is not known.
8. Milner reviews the role of DA in reward in Chapter 8. In Chapter 9 he states that increased DA, acting at synapses of the direct pathway through the basal ganglia, may promote synaptic change, producing long-lasting increases in the effectiveness of concurrent cortical input. This may provide the mechanism for incentive learning and may involve the second messengers mentioned above. When I was a student of Peter Milner's, he taught me many of these ideas about the mechanisms of learning, especially those for reward-related learning or incentive learning. In recent years, my own studies and those of others have shown many of his ideas to be correct and have provided details of the mechanisms he postulated.
9. Many studies have shown that both D1- and D2-like DA receptor antagonists can decrease responding for reward. Some studies have shown, however, that antagonists acting at the two receptor subtypes produce dissimilar effects. For example, Fowler and Liou (1994) showed that the D2-like receptor antagonist raclopride decreases responding for water reward at doses that produce microcatalepsy; the D1-like receptor antagonist, SCH 23390, on the other hand, decreases responding for food at doses that do not produce catalepsy. These results suggest that D1-like receptors may play a more important role in reward-related learning than D2-like receptors, a conclusion that is supported by related studies of the effects of D1- and D2-like receptor agonists on responding for conditioned reward (Sutton and Beninger, 1999).
10. In the striatum, at the interface of cortical afferents with medium spiny neurons of the direct pathway, DA released when reward occurs may modify synaptic strength, as suggested by Milner. This effect may be mediated by D1 receptors. Kelley (1999) has provided good evidence for a role of stimulatory G proteins in reward-related learning, results that further strengthen this idea. Recently, we have shown that cAMP-dependent protein kinase (PKA) is necessary for incentive learning in several paradigms (Beninger et al., 1996; Sutton et al., 2000); in one study, we have shown a critical role for Ca2+-dependent protein kinase (PKC) in reward-related learning (Aujla and Beninger, 1999). Thus, data continue to support the hypothesis that incentive learning is mediated by D1 receptors in the striatum through the activation of second messenger pathways.
11. The events that occur when reward produces learning may be as follows. Whenever there is a cortical input to the striatum, NMDA receptors are stimulated leading to an increase in Ca2+ concentration in the dendritic spines with which the inputs make contact. This event leads to the activation of PKC and Ca2+-calmodulin-dependent protein kinase (CaMK). As pointed out by Milner, these kinases will phosphorylate substrate proteins; some of these may translocate to the nucleus and others may act outside the nucleus. If reward occurs, the release of DA will stimulate D1 receptors and activate the cAMP-PKA second messenger pathway. However, as Milner points out, only cortical inputs that were most recently active should be modified.
12. A number of mechanisms are possible to explain how diffuse DA input leads to selective changes in corticostriatal synapses that were most recently active. One possibility is that the glutamatergic input leads to enhanced coupling of nearby monoamine receptors to adenylate cyclase, as has been shown in Aplysia (Kandel, 1991). Another possibility, not excluding the first, relates to the phosphorylating effect of PKC on existing proteins in the synaptic spine. This could be seen as a temporary signature of activity at that synapse. If a DA input occurs in close temporal contiguity with the glutamatergic input, activation of PKA by DA may lead to phosphorylation of DARPP-32, a protein that, when phosphorylated, inhibits protein phosphatase 1 (PP-1). Normally, PP-1 dephosphorylates phosphoproteins. If PP-1 were inhibited while the protein phosphorylating effect of the NMDA input was still in progress, the phosphoproteins might persist, forming a substrate for memory. For example, phosphorylated kainate or AMPA receptors could lead to enhanced effectiveness of the glutamatergic input (Wang et al., 1993). Thus DA, acting through D1-like receptors, may increase the effectiveness of cortical inputs representing environmental stimuli and responses (plans) that preceded the reward.
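The temporal-contiguity argument in the second possibility can be made concrete with a small numerical sketch. It is an illustration only, not a quantitative model from Milner or from the studies cited: the function name, the time step, the decay rate attributed to PP-1, the duration of the DA effect, and the degree of PP-1 inhibition are all hypothetical values chosen for clarity. A glutamatergic input sets a phosphoprotein "signature" at one synapse; PP-1 then removes it unless DA (standing in for the PKA and DARPP-32 steps) has inhibited PP-1 in the meantime.

```python
def phosphoprotein_trace(glu_time, da_time, n_steps=60,
                         decay=0.2, da_duration=50, inhibition=0.95):
    """Phosphoprotein level at one synapse over discrete time steps.

    glu_time: step at which glutamatergic (NMDA) input phosphorylates proteins.
    da_time:  step at which DA arrives, or None if no reward occurs.
    All parameter values are illustrative assumptions, not measured constants.
    """
    level = 0.0
    trace = []
    for t in range(n_steps):
        if t == glu_time:
            level = 1.0  # PKC/CaMK leave a temporary signature of activity
        # DA -> D1 -> PKA -> phospho-DARPP-32 -> PP-1 inhibited (collapsed to one flag)
        pp1_inhibited = da_time is not None and da_time <= t < da_time + da_duration
        rate = decay * (1.0 - inhibition) if pp1_inhibited else decay
        level *= 1.0 - rate  # PP-1 dephosphorylates at the current rate
        trace.append(level)
    return trace


contiguous = phosphoprotein_trace(glu_time=0, da_time=2)   # reward soon after input
delayed = phosphoprotein_trace(glu_time=0, da_time=30)     # reward long after input
no_reward = phosphoprotein_trace(glu_time=0, da_time=None)

for label, trace in [("contiguous", contiguous), ("delayed", delayed),
                     ("no reward", no_reward)]:
    print(f"{label}: level at step 40 = {trace[40]:.4f}")
```

Under these assumptions the signature survives only when DA arrives shortly after the cortical input; if DA is late or absent, PP-1 has already erased it. This is one way a diffuse DA signal could still select the most recently active synapses.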
13. Besides these relatively transient changes in the synapse associated with reward, there may be more enduring changes mediated by protein synthesis. The D1 receptor input, acting through the cAMP-PKA pathway, phosphorylates CREB, a protein involved in transcription. Through this pathway, new proteins may be synthesized. Milner (p. 71) mentions the need for newly synthesized proteins involved in memory to find their way back to the correct synapse. One possible mechanism, mentioned by Milner, is that molecules in the vicinity of the active synapse could be marked in some way, perhaps by phosphorylation, to attract the newly synthesized proteins. The phosphoproteins mentioned above could also serve that purpose.
14. Milner was a student of Hebb, who was a student of Lashley (Orbach, 1999). This distinguished lineage has produced some of the most influential ideas concerning the neuronal organization underlying behaviour. Milner's book contributes importantly to this tradition. No doubt Milner's ideas will contribute to continued experimentation on a wide range of topics on many species -- including, to use Milner's own words, "rats, mice and the occasional chicken".
Aujla, H.S. and Beninger, R.J. (1999). Intracellular signalling and reward-related learning: Inhibition of PKC in the nucleus accumbens blocks amphetamine-induced place conditioning in rats. Society for Neuroscience Abstracts, 25, 628.
Beninger, R.J., Nakonechny, P.L. and Todd, M.J. (1996). Inhibition of protein kinase A in the nucleus accumbens blocks amphetamine-produced conditioned place preference in rats. Society for Neuroscience Abstracts, 22, 1127.
Fowler, S.C. and Liou, J.-R. (1994). Microcatalepsy and disruption of forelimb usage during operant behavior: Differences between dopamine D1 (SCH-23390) and D2 (raclopride) antagonists. Psychopharmacology, 115, 24-30.
Kandel, E.R. (1991). Cellular mechanisms of learning and the biological basis of individuality. In E.R. Kandel, J.H. Schwartz & T.M. Jessell (Eds.). Principles of Neural Science 3rd ed. (pp. 1009-1031). Norwalk CT: Appleton & Lange.
Kelley, A.E. (1999). Neural integrative activities of nucleus accumbens subregions in relation to learning and motivation. Psychobiology, 27, 198-213.
Milner, P.M. (1999a). The Autonomous Brain. Erlbaum, Mahwah NJ.
Milner, P.M. (1999b). Precis of "The Autonomous Brain". PSYCOLOQUY 10(071) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1999.volume.10/psyc.99.10.071.autonomous-brain.1.milner http://www.cogsci.soton.ac.uk/cgi/psyc/newpsy?10.071
Orbach, J. (1999). Precis of: The Neuropsychological Theories of Lashley and Hebb. PSYCOLOQUY 10(029) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1999.volume.10/psyc.99.10.029.lashley-hebb.1.orbach http://www.cogsci.soton.ac.uk/psyc-bin/newpsy?10.029
Sutton, M.A. and Beninger, R.J. (1999). Psychopharmacology of conditioned reward: Evidence for a rewarding signal at D1-like dopamine receptors. Psychopharmacology, 144, 95-110.
Sutton, M.A., McGibney, K. and Beninger, R.J. (2000). Inhibition of protein kinase A in the nucleus accumbens: dissociable effects on unconditioned activity, locomotor sensitization, and conditioned activity produced by intra-accumbens amphetamine. Behavioural Pharmacology, in press.
Wang, L.Y., Taverna, F.A., Huang, X.-P., MacDonald, J.F., and Hampson, D.R. (1993). Phosphorylation and modulation of a kainate receptor (GluR6) by cAMP-dependent protein kinase. Science, 259, 1173-1175.