Worden's definition of optimality hides two components: efficacy in satisfying the needs of the organism, and effectiveness or robustness across a range of environmental contexts. There is, however, a third criterion in between, relating to the efficiency of the solution in terms of the organism's resources, which is closely related to the parsimony of the model employed. In addition, there is a question as to how significant and useful the bounds are which Worden proposes: he assumes that they are fairly tight, and his proof of monotonicity assumes a rather simplistic model. Part of the problem here is the failure to recognize efficiency directly, or to allow direct comparison of orthogonal mechanisms.
2. Worden next draws the conclusion that: "(2) For any aspect of cognition, we can calculate the optimum ... it uses internal representations and (sometimes) symbols." This turns out to be a simple (but not necessarily simplistic) notion of optimization of expected outcome (par. 12, eqn. 1). However, there is no argument, let alone proof, that internal representations or symbols are ever required to attain the optimum. Rather, solutions formulated using an implied symbolic representation are exemplified for specific scenarios, the prime example being the ant's summing of two-dimensional vectors (par. 22), which clearly solves the problem perfectly. However, it is not the only solution, and the optimality of equally accurate solutions depends on their resource requirements: the optimal solution uses the minimum resources (which would presumably be taken into account in the outcome O, in the sense of greater energy requirements, fewer resources for other tasks, etc.).
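The ant's vector-summing solution mentioned above amounts to path integration: accumulate the displacement of each outward step, and the negated total points home. A minimal sketch (the function name and step data are illustrative assumptions, not taken from Worden):

```python
import math

def home_vector(steps):
    """Path integration (dead reckoning): sum the displacement
    vector of each step; the negated total points back to the nest."""
    x = sum(d * math.cos(a) for d, a in steps)
    y = sum(d * math.sin(a) for d, a in steps)
    return (-x, -y)  # vector from current position back to the start

# Hypothetical outward journey: (distance, heading in radians) pairs.
outward = [(3.0, 0.0), (4.0, math.pi / 2)]
hx, hy = home_vector(outward)
homing_distance = math.hypot(hx, hy)  # straight-line distance home
```

The point in the commentary stands, though: this symbolic formulation is one way to solve the problem exactly, not a demonstration that the ant's mechanism is this one, nor that it is the cheapest in resources.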
3. In no case is the optimal solution demonstrated. Moreover, while the desert ant may use dead reckoning, other insects/animals appear to use other systems relating to magnetic fields, polarization, landmarks, etc. It may be that the environment determined that one mechanism was more appealing than another. It may be that the pre-empted, pre-existing mechanisms lent themselves to adaptation to one solution rather than another. The result of paragraph 59 notwithstanding, such a situation would seem to lead to a local maximum, and the effect of equation 4 is merely to shift his monotonicity requirement to the basis functions. If the original function of the commandeered mechanism was no longer required, we may have an equally satisfactory solution which is not optimally efficient; but small perturbations of this solution will lead to worse behaviour, and there would be no selection pressure towards the development of an alternate, more efficient mechanism.
4. Although I am sympathetic to a symbolic interpretation of cognitive representations which are not directly perceptual, I fail to see any argument to support this position in Worden's article. Indeed, I would see a more direct grounding being required, in the sense that the substrates for cognitive representation differentiate from those for perceptual representation, and that a neural path would persist between the emergent cognitive substrate and the extant perceptual substrate. Much abstract, general and linguistic cognition can be modelled by analogy with perceptual processing, which is suggestive of such a connection. These observations underlie entire fields, such as Cognitive Linguistics (Deane, 1992), whilst avoiding the Symbol Grounding Problem (Harnad, 1990) due to the postulated emergence from grounded perceptual representations.
5. Worden draws a third conclusion which states that: "(3) While evolution cannot exactly reach the optimum ... it seems to come very close." Again, I tend to agree that the solutions with which "the brains we see today" (par. 58) are endowed are excellent compromises, meeting the requirements of the problem well. Nonetheless, efficacy -- how well a task is performed -- is not the only requirement of optimality; efficiency -- how well resources are utilized -- is equally important. Taking, for example, energy/food requirements into account is difficult, and incompatible with the focus on a single part of the organism, the "brain", since resources are consumed by the entire organism.
6. Evolution actually has great difficulty finding optima, and there are several reasons for this.
7. First, the assumption/argument that there is an optimum is specious (par. 14). Even fixing "a given habitat with given sense organs and choices of action" does not assure this. The argument that there is an optimum appeals to the limited problem domain, and in general for us to be certain there is an optimum the domain must be finite or be subject to similarly tight constraints. But assuming a finite space-time domain assures us of the existence of an optimum irrespective of the Requirement Equation (1).
8. Even if there exists an optimum, Computability Theory (Manna, 1974) tells us that some problems are inherently impossible or intractable in the sense that there is no algorithm or mechanism which can guarantee correct (optimal) solutions. However, we can usually get arbitrarily close to a general solution -- losing certain special cases or just missing out on optimality. But the more resources we expend, the better we can expect to do.
9. More importantly, evolution has difficulties with optima because it faces a dynamic programming problem: the optimum "is a moving target" (par. 2), and it is constrained by the present configuration, the limited options for viable mutations, the limited scope of genetic breeding, and the fact that (contra par. 59) modifying a stable state will tend to produce worse performance. However, in the event that the environment changes, a matching compensation may be found. But we have no mechanisms for replacing an adequate but suboptimal and intrinsically limited system by a provably better system founded on totally different principles and mechanisms. It is not just a case of missing links, but a case of the putative links being neither viable nor motivated. We would expect to see an external cause of failure of the existing system (so it is not being driven by optimization in the general sense, but in the specific sense of robustness -- effectiveness across a broad spectrum of possible states, as captured in the model of Figure 1). This would be accompanied by a massive reduction in population (so whilst Worden's "speed limit" of paragraph 63 may be "independent of population size", population size is critical to whether the adaptation of the system can be accomplished before extinction in the context of exponential decimation), and correlated with the development of independent problem-avoidance strategies which support the failing system while the new problem-solving system is being evolved. (N.B. The problems avoided and solved will in general be different.)
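The trap described here -- small perturbations of an adequate solution are worse, so selection cannot reach a better solution built on different principles -- can be illustrated with a toy hill climber on a hypothetical two-peak fitness landscape (the landscape and parameters are entirely made up for illustration):

```python
import random

def fitness(x):
    # Hypothetical landscape: a local peak of height 1 at x = 1,
    # a higher global peak of height 2 at x = 4, a flat valley between.
    return max(0.0, 1 - (x - 1) ** 2) + max(0.0, 2 - 2 * (x - 4) ** 2)

def hill_climb(x, steps=10_000, sigma=0.1, seed=0):
    """Accept only small mutations that improve fitness: the regime in
    which perturbations of a stable state tend to make things worse."""
    rng = random.Random(seed)
    for _ in range(steps):
        candidate = x + rng.gauss(0, sigma)
        if fitness(candidate) > fitness(x):
            x = candidate
    return x

x_final = hill_climb(0.5)  # start in the basin of the lower peak
```

Started at 0.5, the climber settles near the local peak at 1 and never crosses the zero-fitness valley to the better peak at 4, since every intermediate step would be rejected. This is the sense in which the putative links are "neither viable nor motivated".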
10. All of this is totally consistent with being in a local optimum and needing a jolt to continue the optimization process. In general, evolutionary learning mechanisms -- which should be viewed as part of a broader arsenal of methods, all of which are employed in the context of Machine Learning -- are relatively slow, while neural learning mechanisms are rather fast. The many training examples required by back-propagation (par. 71) are somewhat irrelevant. Back-propagation has never been seriously proposed as a model of real neural networks, and there remains a considerable gap with respect to mapping it onto neurologically plausible structures. More importantly still, the supervised training paradigm which is normally adopted is too simplistic, and the requisite feedback is not usually available in the form required. It is not a good candidate for searching out or optimizing solutions. On the other hand, there is a wide variety of other connectionist and non-connectionist learning techniques (Langley, 1996) which have better properties in relation to optimization, or which structure data by self-organization (Kohonen, 1995). These do not have to wait for a specified number of training examples before they can be useful.
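The contrast with supervised back-propagation can be made concrete with a toy one-dimensional map in the spirit of Kohonen (1995) -- a deliberately simplified sketch, not Kohonen's full algorithm (no decaying learning rate or neighbourhood), with illustrative data and parameters:

```python
import random

def som_1d(data, n_units=5, lr=0.3, radius=1, epochs=20, seed=0):
    """Toy one-dimensional self-organizing map: each input nudges its
    best-matching unit and that unit's neighbours toward the input.
    No labels or error feedback are needed, and the map is usable
    (if rough) after any number of examples."""
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(n_units)]
    for _ in range(epochs):
        for x in data:
            bmu = min(range(n_units), key=lambda i: abs(weights[i] - x))
            for i in range(n_units):
                if abs(i - bmu) <= radius:
                    weights[i] += lr * (x - weights[i])
    return sorted(weights)

# Illustrative inputs clustered around 0.125, 0.525 and 0.925.
units = som_1d([0.1, 0.15, 0.5, 0.55, 0.9, 0.95])
```

After training, the units spread out to cover the range of the data, structuring it by self-organization alone -- no supervised target signal of the kind back-propagation requires.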
11. My final point is that the information-theoretic "speed limit" of paragraph 63, equation 5, is just that: a limit, or upper bound. But unlike the speed limits which we flout daily, there is no reason to expect that the speedometer will be hovering around that mark all the time. The formula simply relates probability of survival to the information which appears to be conveyed by survival, in a fairly straightforward way. But it says nothing about whether that survival is based on an accident of genetics or an accident of environment; whether particular abilities in the area of learning and adaptability were involved, or just the general variation in experience; or whether the distinguishing capabilities are specific to the situation or lie more in the area of general cognitive ability.
Deane, Paul D. (1992) Grammar in Mind and Brain: Explorations in Cognitive Syntax, Cognitive Linguistics Research 2, Berlin/New York: Mouton de Gruyter.
Harnad, Stevan (1990) "The Symbol Grounding Problem", Physica D 42: 335-346.
Kohonen, Teuvo (1995) Self-Organizing Maps, Berlin/New York: Springer.
Langley, Pat (1996) Elements of Machine Learning, San Francisco: Morgan Kaufmann.
Manna, Zohar (1974) Mathematical Theory of Computation, New York: McGraw-Hill.
Worden, R.P. (1996) An Optimal Yardstick for Cognition. PSYCOLOQUY 7(1) optimal-cognition.1.worden.