Paul Barrett (2000) Intelligence, Psychometrics, IQ, g, and Mental Abilities. Psycoloquy: 11(046) Intelligence g Factor (45)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).

INTELLIGENCE, PSYCHOMETRICS, IQ, G, AND MENTAL ABILITIES:
QUANTITATIVE METHODOLOGY DRESSED AS SCIENCE
Book Review of Jensen on Intelligence-g-Factor

Paul Barrett
The State Hospital and University of Liverpool
Carstairs
Lanark Scotland
ML11 8RP
United Kingdom
www.liv.ac.uk/~pbarrett/paulhome.htm

p.barrett@liv.ac.uk

Abstract

Jensen's book is a masterpiece of scholarship and careful reasoning. It is the definitive presentation of the outcome of thinking and empirical work carried out in a substantive psychological domain of interest, work that extends back to the beginning of the 20th century. The sheer breadth of knowledge, didactic material, and empirical facts contained within this book make it virtually unique. Yet, I find it disquieting. Not because of its socio-political or socio-genetic aspects, but because it appears to exemplify a truly fundamental mistake made by many psychologists, who assume that they are doing science when in fact they are merely observing and classifying phenomena with ever more complex quantitative statistical methodologies. Whilst I stand back in awe of Jensen's profound scholarship in this book, I feel he has inadvertently written the epitaph of g as a meaningful scientific construct.

Keywords

behavior genetics, cognitive modelling, evoked potentials, evolutionary psychology, factor analysis, g factor, heritability, individual differences, intelligence, IQ, neurometrics, psychometrics, psychophysiology, skills, Spearman, statistics

1. "The Science of Mental Ability." Just what does such a statement mean? To determine whether it possesses validity, we need to consider a definition of science, and the constituent properties of the three sciences that have given us the greatest knowledge of the natural world (physics, chemistry, and biology). A broad definition of science can be found in Collins English Dictionary (1991): "The systematic study of the nature and behaviour of the material and physical universe, based upon observation, experiment, and measurement, and the formulation of laws to describe these facts in general". So far so good. However, Michell (1997, 1999) has described two stages that are involved in natural science investigation. The first is determining whether an attribute possesses a quantitative structure; the second is devising procedures to measure magnitudes of that attribute. The first task is concerned with determining whether an isomorphism can be proposed between a putative unit of the construct and the numbers used to represent that unit. This isomorphism can be proposed either on a theoretical basis or on the basis of empirical investigation. All quantitative scientific measurement possesses such a "standard" unit and empirically verifiable unit-concatenation properties. Once this isomorphism is established, the second, "instrumental" task of creating magnitude measurement can be undertaken. There are no units of 'g', no units of IQ, and no units of 'intelligence'. However, maybe Michell's view of quantitative scientific measurement is simply too highly constrained, and unit-free measurement can be a valid property within a special science of psychology?

2. The answer is no, not without relinquishing the very level of understanding of causality to which scientific investigation aspires. It is of interest to consider Jackson and Maraun's (1996a, 1996b; Maraun, 1998) position regarding the meaning ascribed to scientific constructs. From Wittgenstein's and especially Ter Hark's (1990) propositions concerning the grammar of such meaning, it is clear that the temporal order of meaning assignment to a construct in science is crucial. Meaning is not an empirical phenomenon. You cannot assign meaning to your measurement on the basis of the outcome of any measurement process. In short, you cannot make a measurement unless you have a priori specified its meaning (even if this remains assumed or implicit). Maraun's contribution, relying heavily upon the Wittgensteinian view of meaning, shows from a different perspective that a Gödelian paradox exists in psychometrics. If we classify outcomes based upon item responses, can we determine from the empirical data (items) alone the correct meaning of what is being measured? For Maraun, meaning cannot be determined from the rules that govern the operations carried out on empirical item data. Furthermore, the existence of the "rules for instantiation of a construct" cannot be tested with empirical data, as the test itself implies the rules' existence. "[N]o empirical finding can refute or support a measurement claim. For example, the claim that 'these are measurements of IQ' cannot be shown to be correct or incorrect on the basis of the actual numbers recorded, nor the correlation of these numbers with other sets of numbers (e.g. measurements of school performance etc.). On the contrary, rules are constitutive for empirical evidence: these empirical findings are not about IQ at all unless they already are based upon numbers that have meaning as measurements of IQ" (Jackson and Maraun, 1996b, p. 115).

3. If we conjoin Maraun's arguments with those of Michell, we see that there is an initial temporal sequence of operations in scientific investigation. First, we establish the meaning and set out what we think are the rules for instantiating a proposed construct (note that we are not discovering such rules, merely trying to specify their existence and their operation). At this stage we may engage in phenomenon identification or qualitative research to refine our thinking. However, to progress in our understanding of causal processes, we have to link with Michell's first task of proposing/testing an isomorphism with a putative unit for a construct (or units and link functions between these units for its component processes). Finally, we attempt to create measurements of magnitudes that accord with the unit-concatenation properties hypothesized, and then determine whether our measures conform to prior deductions. It is this conformity (or lack of it), within the constraint of unit-specified axiomatic measurement, that permits an investigator to determine whether the meaning and rules for instantiation proposed for the construct have been accurately construed. Merely adopting quantitative procedures of analysis, without prior consideration and specification of meaning, and without any attempt to investigate the quantitative structure of a putative psychological attribute, is a "pretence" at science (Michell, 1997). Whilst such work may have pragmatic benefit, it imparts little causal understanding of what is being measured. As Anderson (2000) has most eloquently stated earlier in these reviews, this is largely operational measurement; its meaning is defined from its content.

4. So, returning to Spearman's creation of the construct of g, Jensen (1998, 1999) shows us in chapter 2 of his book that Spearman proposed a latent (unobservable) "entity" which he called g. One of its instantiation rules was that this g is common to all mental ability tests. However, although no concatenation unit of g was ever specified, additive, linear, numerical operations were specified and used throughout the investigation of g. As Jensen indicates, Spearman was successful in demonstrating that there was some empirical evidence in support of such a latent variable. At this early point in time, I believe that a greater adherence to the principles of science as espoused above, rather than just further identification of phenomena (creating IQ "measurement" and the associated "correlates" investigative approach it spawned), would have generated a fundamentally different research strategy from the 1940s onward than that currently set out in Jensen's book: a strategy to better specify the meaning and causal basis for g, rather than just describing where 'g' could be found as an "explanatory" variable (i.e., race research). It seems that the desire to use a measure for applied, classificatory, pragmatic purposes was confounded with the scientific task of elaborating the meaning and causal basis for the construct. I think Jensen is still promulgating this confusion when he tries in chapter 3 to dispose of a construct of intelligence in favour of one operationally defined by "mental abilities". There is no real advantage in doing this. However, it is very convenient, as we no longer have to grapple with the meaning of what it is we are measuring -- for we simply call it "mental ability", and go on to develop psychometrically sterile classifications of these abilities.

5. Jensen refers throughout his book to a theory of g. In chapters 3 and 4 he shows conclusively how g can be empirically determined using mathematical constructions of covariance. This is empirical confirmation of Spearman's model and one of the tenets of a theory of g. But we knew this 60 years ago! It seems differential psychologists have spent the last 50 years squabbling over abstract psychometry rather than substantive issues of causality. Jensen, on page 127, reporting Marshalek, Lohman, and Snow's (1983) non-metric analysis of ability tests, shows us that a change of analytic technique and measurement metric does not affect the substantive theoretical proposition that mental abilities have something in common. Maraun (1997) has done almost exactly the same with the five factor model, showing that the "factors" are not real in any sense of the word but merely a functional property of the measurement constraints of the particular quantitative analysis undertaken. Later work by Gustafsson and others on the psychometric "structure" of g seems irrelevant in this context -- a mere quantitative rearrangement of the psychometric furniture, when what the area was crying out for was substantive scientific investigation of the nature of g itself.
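To make the phrase "mathematical constructions of covariance" concrete, here is a minimal sketch in Python. It is not Spearman's procedure, nor any analysis reported by Jensen or by Marshalek, Lohman, and Snow. It assumes invented data (six hypothetical tests sharing one simulated common source) and uses the first principal component of the correlation matrix as a simple stand-in for a general factor. It then distorts the metric monotonically and reanalyses the ranks, to illustrate the sense in which a change of metric can leave the common component in place.

```python
import numpy as np

def first_component_loadings(corr):
    """Loadings on, and variance share of, the first principal component."""
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigenvalues in ascending order
    top = eigvecs[:, -1]                      # eigenvector of the largest eigenvalue
    if top.sum() < 0:                         # the sign is arbitrary; make it positive
        top = -top
    loadings = top * np.sqrt(eigvals[-1])
    share = eigvals[-1] / eigvals.sum()
    return loadings, share

def rank_columns(x):
    """Replace each column by its ranks (continuous data, so no ties)."""
    return np.argsort(np.argsort(x, axis=0), axis=0).astype(float)

# Invented data: all values below are assumptions made for illustration only.
rng = np.random.default_rng(0)
n_people, n_tests = 2000, 6
common = rng.normal(size=(n_people, 1))                    # simulated shared source
true_loadings = np.array([0.8, 0.7, 0.6, 0.6, 0.5, 0.4])   # hypothetical values
unique = rng.normal(size=(n_people, n_tests))
scores = common * true_loadings + unique * np.sqrt(1 - true_loadings**2)

# Metric analysis: Pearson correlations of the raw scores.
metric_loadings, metric_share = first_component_loadings(
    np.corrcoef(scores, rowvar=False))

# Non-metric variant: distort the metric monotonically, then analyse ranks only.
distorted = np.exp(scores)
rank_loadings, rank_share = first_component_loadings(
    np.corrcoef(rank_columns(distorted), rowvar=False))

print("metric loadings:", np.round(metric_loadings, 2))
print("rank loadings:  ", np.round(rank_loadings, 2))
```

The sketch shows only that a dominant common component can be extracted from positively correlated scores, and that it survives a monotone change of metric. It is silent on what, if anything, that component measures.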

6. At this point, Buckhalt's (1999) comment that g is akin to the concept of gravity or radiation is apposite. However, let us look carefully at how Newton approached his universal law of gravitation. First there were his own and others' observations of planetary bodies, especially the moon. He was curious as to the nature of the force that must act in order to keep the moon in its nearly circular orbit around the earth. Since falling bodies were seen to accelerate, he concluded that they must have a force exerted on them. Newton identified a prevalent latent force, much the way Spearman and others have identified a prevalent latent g. However, this observational/descriptive work was followed immediately by the specification of a unit of magnitude of this gravitational force (computed in terms of mass, distance, and a universal gravitational constant). Finally, precise mathematical predictions were made and subsequently followed up with empirical confirmation, albeit 100 years later by Henry Cavendish. Several hundred years later, we have seen the introduction of the concept of gravitational fields and a graviton in gauge particle theory. This is not what has happened with g. There are no precise predictions as we have no testable meaning of what g might be, and what its units of measurement are, let alone the concatenation operations for those units. Of course, it might be suggested that the g factor is a relatively "young" construct. However, the youth or otherwise of a construct is immaterial when the fundamental scientific process of construct investigation is flawed. I refer those who question whether this matters to Lykken's (1991) recent chapter and conclusions in answering the question "What's wrong with Psychology anyway?". I would also refer the reader to Kline (1998) and his arguments concerning unit-less measurement and its constraints upon subsequent knowledge acquisition.
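For reference, the law discussed above can be written in its standard modern form. The notation is the conventional textbook one and is not taken from Jensen or from any of the sources cited here: F is the force between two point masses m_1 and m_2 separated by a distance r, and G is the gravitational constant.

```latex
F = G\,\frac{m_1 m_2}{r^2},
\qquad
G \approx 6.674 \times 10^{-11}\ \mathrm{N\,m^2\,kg^{-2}}

% With m_2 and r held fixed, the force is additive in the first mass:
F(m_1 + m_1',\, m_2,\, r) = F(m_1,\, m_2,\, r) + F(m_1',\, m_2,\, r)
```

The second line is one way of reading the contrast drawn in this paragraph: mass has units that concatenate, and the law says exactly how the force responds when they do. Nothing comparable has been specified for g.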

7. Looking at the correlational data presented by Jensen from the areas of electrophysiology, chronometrics, and nuclear imaging (including brain size and other biological and biochemical correlates), at first glance it appears that this is an impressive body of evidence. That impression rests on the sheer quantity of work, and it fades when we ask "what knowledge does this give us of g, its processes, and its causal mechanisms?". Further, as I have shown elsewhere (Barrett, 1999a), with regard to the evoked potential, choice reaction time, and inspection time correlates, correlation scatterplots from real experimental data show how one can obtain correlations of up to about -0.3, yet simultaneously be faced with the fact that no consistent theory or law can explain certain results from a substantial number of cases in the scatterplot (and I don't mean outliers).
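A rough numerical illustration of how weak a constraint a correlation of about -0.3 places on individual cases follows. It does not reproduce the data in Barrett (1999a). It assumes simulated bivariate normal scores with a population correlation of -0.3 and an invented sample size, and it counts how many pairs of cases are ordered against the overall negative trend.

```python
import numpy as np

# Simulated data only: bivariate normality and n are assumptions made for
# illustration; rho = -0.3 is the magnitude mentioned in the text.
rng = np.random.default_rng(1)
n, rho = 1000, -0.3
x = rng.normal(size=n)
y = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=n)

r = np.corrcoef(x, y)[0, 1]

# For every pair of cases, check whether the case with the larger x also has
# the larger y, i.e. whether the pair runs against the negative trend.
dx = x[:, None] - x[None, :]
dy = y[:, None] - y[None, :]
upper = np.triu_indices(n, k=1)           # each unordered pair counted once
against_trend = np.mean(dx[upper] * dy[upper] > 0)

print(f"sample correlation: {r:.2f}")
print(f"share of pairs ordered against the trend: {against_trend:.2f}")
```

Under these assumptions roughly two pairs in five run the "wrong" way. This is a weaker point than the one made above about real scatterplots, but it shows why a correlation of this size cannot by itself stand in for a law that holds case by case.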

8. Returning to the gravity analogy, it is as if we varied the size of one object's mass, whilst keeping the mass of a second object and the inter-mass distance constant, and then observed a negative relation between gravitational force and some individually varying masses. From Newton's theory and the relationships proposed, we know this is impossible. Yet, we unquestioningly accept this kind of result in much of the biological correlates work. My other recent presentation on the string measure (Barrett, 1999b) demonstrates just how deeply flawed this particular work really is (20 studies are listed, 10 with results in the right direction, 10 with results in the opposite direction or with no correlation at all). The flaw is not in the lack of replication, but in the lack of any coherent theory of why this parameter should be used at all. The Hendrickson nerve transmission theory for the genesis of an evoked potential was discarded many years ago within neuroscience, not directly, but by the accumulation of evidence within neuroscience concerning nerve transmission. When we turn to brain size and IQ, we are again faced with a phenomenal correlation - but no coherent, testable theory of why such correlations exist. These phenomenal correlations are of interest, and perhaps suggestive of some common processes. But without a strong theory of why any of these should correlate with IQ (which is itself a poor measurement scale), we are required to attribute meaning to the correlation as a post-hoc exercise, which once again invites operational explanations and ad hoc speculation. In this context, I note Jensen's responses to Verleger (1999) and Burns (1999), who have also made significant points about some of the electrophysiological evidence.
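The thought experiment at the start of this paragraph can be checked directly. The sketch below assumes the standard textbook form of Newton's law, with arbitrary illustrative values for the fixed mass and the distance. It varies one mass and confirms that the predicted force can only rise, and rises in strict proportion.

```python
G = 6.674e-11  # gravitational constant in N m^2 kg^-2

def gravitational_force(m1, m2, r):
    """Newtonian force (N) between point masses m1, m2 (kg) at distance r (m)."""
    return G * m1 * m2 / r**2

# Arbitrary illustrative values: the second mass and the distance stay fixed.
m2, r = 5.0, 2.0
masses = [1.0, 2.0, 4.0, 8.0]

forces = [gravitational_force(m1, m2, r) for m1 in masses]

# The law permits only one outcome: force strictly increases with m1 ...
increasing = all(a < b for a, b in zip(forces, forces[1:]))
# ... and does so proportionally, so force / mass is the same for every case.
ratios = [f / m for f, m in zip(forces, masses)]

print("force increases with mass:", increasing)
print("force per unit mass:", ratios)
```

A negative relation for some of the masses would contradict the law outright. That is the kind of prohibited outcome which, on the argument above, the biological correlates work never specifies.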

9. As we move to the population genetics of g in chapters 11 and 12, again, the evidence is impressive. The evidence of racial differences in g is substantive and clear. However, it is of uncertain value. As we have no coherent scientific idea of what we are measuring (what g is, and what causes it, remain a mystery), of what scientific value is it to know that we are measuring crude differences between races (IQ is a poorly constructed measure of g)? The language of many is loose, with talk of "genes for IQ". Of course, these results have enormous socio-political import and, as has been said elsewhere, it is this section of the book that will capture the interest of many who read it. But, frankly, none of this is particularly useful from a scientific perspective. Jensen (and Rushton) point to the hypotheses they form and successfully test, but these hypotheses are at population levels, tested using measures for race and psychological attributes which are barely specified or defined (e.g., what are the key measurable variables that define a race, that can be causally attributed to those variables that define the constituents of g, as measured by IQ?). In short, both seem to avoid any serious attempt at a scientific explanation of what is causing such observed differences, in favour of demonstrating more and more phenomenal "population" effects. I do not deny the quantitative validity of any of these results concerning race differences; but without a proper scientific approach, with careful testing of hypothesised causal processes as its primary goal, I feel this work is doomed to fail scientifically.

10. In conclusion, I wonder whether I have just made explicit what seems implicit throughout Jensen's book. That is, Jensen himself is recognising that this remains a scientifically sterile area while hypotheses remain at the level of the phenomenal description of the hypothesised outcomes of g (i.e. IQ). In using the term g and arguing that we should drop the intelligence construct, Jensen is, I feel, still trying to evade the deeper issue of the meaning of g. If we stand outside the area of individual differences, and look into a world where computational scientists are trying to build "intelligence" (the Artificial Intelligence and Connectionist community), we see an entire field of endeavour without so much as a mention of g. It is here that the key question (in my opinion) is being asked - and practically answered in part - "what are the constituent properties of a system that are required for it to develop intelligent behaviour?". To answer this requires a completely different perspective on measuring intelligence as a within-species and cross-species construct. Conway, Kane, and Engle (1999) propose working memory capacity as the determiner of g - but why are they fixated on a single, factor-analytically determined latent entity as a causal variable? I find I am in total agreement with Anderson's (2000) last statement concerning the research ethos espoused in Jensen's book: "as a strategy for the scientific understanding of human intelligence, it threatens to lead us into a wilderness from which there will be no return".

REFERENCES

Anderson, M. (2000) An unassailable defense of g but a siren-song for theories of intelligence. PSYCOLOQUY 11(013) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/2000.volume.11/ psyc.00.11.013.intelligence-g-factor.28.anderson http://www.cogsci.soton.ac.uk/psyc-bin/newpsy?11.013

Barrett, P.T. (1999a) Individual Differences: the end of an era. Where do we go from here? Presentation available at: http://www.liv.ac.uk/~pbarrett/present.htm

Barrett, P.T. (1999b) The String Measure, Evoked Potential Correlate Research, and Psychometric IQ. Presentation available at: http://www.liv.ac.uk/~pbarrett/present.htm

Buckhalt, J.A. (1999) Defending the science of mental ability and its central dogma. PSYCOLOQUY 10(47) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1999.volume.10/ psyc.99.10.047.intelligence-g-factor.4.buckhalt http://www.cogsci.soton.ac.uk/psyc-bin/newpsy?10.047

Burns, N.R. (1999). Biological correlates of IQ scores do not necessarily mean that g exists. PSYCOLOQUY 10(73) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1999.volume.10/ psyc.99.10.073.intelligence-g-factor.15.burns http://www.cogsci.soton.ac.uk/cgi/psyc/newpsy?10.073

Conway, A.R., Kane, M.J., and Engle, R.W. (1999) Is Spearman's g determined by speed or working memory capacity? PSYCOLOQUY 10(74) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1999.volume.10/ Psyc.99.10.074.intelligence-g-factor.16.conway http://www.cogsci.soton.ac.uk/cgi/psyc/newpsy?10.074

Jackson, J.S.H., Maraun, M. (1996a) The conceptual validity of empirical scale construction: the case of the Sensation Seeking Scale. Personality and Individual Differences, 21, 1, 103-110

Jackson, J.S.H., Maraun, M. (1996b) Whereof one cannot speak, thereof one must remain silent. Personality and Individual Differences, 21, 1, 115-118

Jensen, A.R. (1998). The g factor: The science of mental ability. Westport, CT: Praeger.

Jensen, A. (1999). Precis of: The g Factor: The Science of Mental Ability. PSYCOLOQUY 10(23) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1999.volume.10/ psyc.99.10.023.intelligence-g-factor.1.jensen http://www.cogsci.soton.ac.uk/cgi/psyc/newpsy?10.023

Kline, P. (1998) The New Psychometrics: Science, psychology, and measurement. London & New York: Routledge

Lykken, D.T. (1991) What's Wrong with Psychology Anyway? In Dante Cicchetti and William Grove (eds.) Thinking Clearly about Psychology. Volume 1: Matters of Public Interest. pp. 2-39. University of Minnesota Press. http://www.psych.umn.edu/psyfac/emeritus_sr/Lykken/148.PDF

Makins, M. (1991) Collins English Dictionary, 3rd Edition. Glasgow, UK: Harper-Collins Publishers

Maraun, M.D. (1997) Appearance and Reality: Is the Big Five the Structure of Trait Descriptors?. Personality and Individual Differences, 22, 5, 629-647

Maraun, M.D. (1998) Measurement as a normative practice: Implications of Wittgenstein's philosophy for measurement in psychology. Theory and Psychology, Vol 8(4), 435-461

Marshalek, B., Lohman, D.F., and Snow, R.E. (1983) The complexity continuum in the radex and hierarchical models of intelligence. Intelligence, 7, 102-127

Michell, J. (1997) Quantitative science and the definition of measurement in Psychology. British Journal of Psychology, 88, 3, 355-383

Michell, J. (1999) Measurement in Psychology: Critical History of a Methodological Concept. Cambridge: Cambridge University Press.

Ter Hark, M. (1990) Beyond the Inner and the Outer. Berlin: Kluwer Academic Publishers.

Verleger, R. (1999). The g factor and event-related EEG potentials. PSYCOLOQUY 10(039) ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1999.volume.10/ psyc.99.10.039.intelligence-g-factor.2.verleger http://www.cogsci.soton.ac.uk/cgi/psyc/newpsy?10.039
