Arthur R. Jensen (2000) Psychometric Scepticism. Psycoloquy: 11(039) Intelligence g Factor (38)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).

PSYCHOMETRIC SCEPTICISM
Reply to Harrington on Jensen on Intelligence-g-Factor

Arthur R. Jensen
Educational Psychology
School of Education
University of California
Berkeley, CA 94720-1670

nesnejanda@aol.com

Abstract

Harrington (1999) denies the reality of psychometric g (as an artifact) and the heritability of individual differences in the level of g. This is an extreme minority stance entirely at odds with current views and findings in this field. The position may still be tenable in principle, but the arguments adduced in its support here either misrepresent what is actually said in "The g Factor" (Jensen, 1998; 1999) or have long since been refuted, both in the book under review and in the recent literature on human intelligence and behavioral genetics.

Keywords

behavior genetics, cognitive modelling, evoked potentials, evolutionary psychology, factor analysis, g factor, heritability, individual differences, intelligence, IQ, neurometrics, psychometrics, psychophysiology, skills, Spearman, statistics

1. Again, I should probably heed Theodosius Dobzhansky's advice that argumentation is otiose when the parties are not in considerable agreement on fundamentals at the outset. But as my commitment to Psycoloquy requires a response, I shall comment on several of the points that seem to be the dominant themes in Harrington's (1999) critique.

2. The core of what Harrington (par. 1) calls a "reasonable rebuttal" is that "g is not a psychological phenomenon and is simply a statistical artifact derived from assumptions of linearity in data arising from multiple interacting causes." Actually, my thesis is that g is a behavioral or psychological phenomenon, in the sense that it was discovered and measured at the behavioral level, and that its causal basis lies in some condition(s) of the brain -- although psychometric measurements of it, necessarily being behavioral, are naturally affected to some degree by environmental conditions throughout the individual's life span from the very moment of conception. A "statistical artifact" is a measurement or a quantity derived from a set of measurements that necessarily -- repeat, necessarily -- results solely from the mathematical manipulations performed to arrive at the result. This is no more the case with the g factor extracted in a hierarchical factor analysis than it is with the sum total of the money you have today in your several different bank accounts.

3. Another characteristic of a statistical artifact is that its result is not related to anything outside the mathematically derived variable. The g factor is not a necessary outcome of factor-analyzing a correlation matrix. A general factor is not found, for example, among various personality inventories. The g factor results from the fact that all tests in the mental abilities domain (as I have defined mental ability [Jensen, 1998, pp. 49-57]) are positively correlated. The fact that Pearsonian correlations reflect only the linear component of the relationship between two variables is no problem for factor analysis in the abilities domain, because it is rare that mental tests of any kind show any significant nonlinear components in their relationship, and the linear component is always by far the largest. Moreover, if one has reason to believe that some other form of relationship exists between two variables, such as a curvilinear or quadratic relation, these nonlinear components can be removed by some transformation of the variable or can be entered independently into the factor analysis, with the effect of increasing the positive correlations between the variables. A student with a computer could do a master's thesis using artificial data to examine how much g is affected by having anywhere from 1 to n nonlinearly related variables in a correlation matrix; but this would be a rather pointless exercise, because in reality it is hard to find any pairs of ability variables whose covariance has any appreciable nonlinear components.
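As a rough illustration of the kind of artificial-data exercise mentioned in paragraph 3, the following Python sketch simulates tests that share one common source of variance, passes some of them through a monotone nonlinear distortion, and reports the share of variance carried by the first principal component of the Pearson correlation matrix. Everything in it is an assumption made for illustration: the function name `first_component_share`, the uniform loading of 0.7, the exponential distortion, and the use of the first principal component as a crude stand-in for a hierarchical g. It is a toy model and says nothing about real test data.

```python
import numpy as np


def first_component_share(n_tests=8, n_nonlinear=0, n_people=5000,
                          loading=0.7, seed=0):
    """Simulate tests sharing one latent factor; distort some nonlinearly.

    Returns the Pearson correlation matrix and the proportion of total
    variance carried by its first principal component.
    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal(n_people)              # common factor
    noise = rng.standard_normal((n_people, n_tests))    # test-specific part
    scores = loading * latent[:, None] + np.sqrt(1 - loading**2) * noise

    # Apply a monotone but nonlinear distortion to the first n_nonlinear tests.
    scores[:, :n_nonlinear] = np.exp(scores[:, :n_nonlinear])

    corr = np.corrcoef(scores, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)               # ascending order
    share = eigenvalues[-1] / n_tests                    # trace of corr = n_tests
    return corr, share


if __name__ == "__main__":
    for k in range(0, 9):
        _, share = first_component_share(n_nonlinear=k)
        print(k, round(share, 3))
```

Running the loop with `n_nonlinear` from 0 to `n_tests` shows how the positive manifold and the first component's share respond as more of the simulated variables are distorted.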

4. Related to this issue, Harrington (par. 7) goes on to say that all psychometric tests are specifically constructed so as to yield a g factor when the matrix of their intercorrelations is factor analyzed. But he seems to have it backwards. Since it has proved impossible to construct a battery of mental tests that are not all positively correlated with one another, thereby yielding a general factor, g, test constructors often attempt to maximize the one feature of their test battery that makes for a high g loading of the tests' composite score. But many test constructors have completely ignored factor analysis and g in particular. Instead, they have emphasized the test's external validity, i.e., the test's correlation with practical nonpsychometric variables and criteria, such as described in my Chapter 9 (Jensen, 1998) on the practical validity of g-loaded tests.

5. Simply as a by-product of constructing tests so as to maximize their practical predictive validity, one also increases the test's g loading, whether intentionally or not. This happens even when the tests are constructed by those who have no use for factor analysis and don't believe there is a g factor. I have shown that, however they are constructed, the predictive validity of various tests is directly related to their g loadings.

6. There is no doubt that g is the chief active ingredient in mental tests' correlations with a host of nonpsychometric variables. If someone wants to prove that g is a psychometric artifact of the way tests are constructed, let them construct a battery of individually reliable mental tests whose matrix of intercorrelations does not yield a g factor. This has been tried assiduously, without success. Attempts merely to lower the g loading of a test have shown that the test's external validity is thereby seriously impaired.

7. Because population group mean differences are closely related to tests' g loadings, there have been innumerable attempts to reduce tests' g loadings in the hope of reducing the size of group differences, most notably the mean Black-White difference. This can be accomplished to some degree by including in the battery more subtests with only moderate g-loadings. The price paid for doing this is diminished external validity. We can easily determine the effects of a complete removal of g from a test battery by partialling g out of the test's validity coefficient, leaving non-g factors and test specificity as the sole predictors. The typical result is validity coefficients that are so close to zero as to have no practical value, and this is true regardless of the external criterion.
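The partialling step described in paragraph 7 can be sketched with the standard first-order partial correlation formula. The function name `validity_without_g` and the input values in the usage note are hypothetical; the sketch assumes the three correlations (test with criterion, test with g, criterion with g) are already known.

```python
import math


def validity_without_g(r_test_criterion, r_test_g, r_criterion_g):
    """Partial correlation between test and criterion with g held constant.

    Standard first-order partial correlation:
        (r_xy - r_xg * r_yg) / sqrt((1 - r_xg^2) * (1 - r_yg^2))
    """
    numerator = r_test_criterion - r_test_g * r_criterion_g
    denominator = math.sqrt((1 - r_test_g**2) * (1 - r_criterion_g**2))
    return numerator / denominator
```

With made-up inputs such as a validity of 0.5, a test g loading of 0.8, and a criterion-g correlation of 0.6, the function returns about 0.04, which shows how a sizable validity coefficient can shrink toward zero once g is removed.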

8. Harrington's notion (par. 5) that the heritability of g cannot be inferred from any correlational structure is incorrect. It is tantamount to saying that the heritability of any variable cannot be inferred from the pattern of various kinship correlations (from MZ twins to genetically unrelated children reared together) when the correlations are based on sets of kin reared together and reared apart from infancy. In light of the great many behavior-genetic studies of human mental abilities based on these statistical methods, which are commonly used in quantitative genetics, there is simply no basis for claiming that IQ and g do not show substantial heritability.
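To make concrete how kinship correlations can be turned into variance estimates, here is a minimal sketch using two textbook simplifications: Falconer's formula for twins reared together, and the direct reading of MZ-apart and unrelated-together correlations. The function names and the numbers in the usage note are hypothetical, and the sketch assumes a purely additive model with equal environments for MZ and DZ pairs, no assortative mating, and no selective placement. Published analyses use fuller model-fitting methods than this.

```python
def falconer_estimates(r_mz_together, r_dz_together):
    """Rough variance components from twins reared together (Falconer)."""
    h2 = 2 * (r_mz_together - r_dz_together)   # heritability
    c2 = 2 * r_dz_together - r_mz_together     # shared environment
    e2 = 1 - r_mz_together                     # nonshared environment + error
    return {"h2": h2, "c2": c2, "e2": e2}


def direct_estimates(r_mz_apart, r_unrelated_together):
    """Direct readings under the simplifying assumptions stated above.

    MZ twins reared apart share genes but not rearing environment, so their
    correlation estimates broad heritability; unrelated children reared
    together share only environment, so theirs estimates shared environment.
    """
    return {"h2_broad": r_mz_apart, "c2": r_unrelated_together}
```

For example, `falconer_estimates(0.85, 0.60)` splits the variance into roughly 0.50, 0.35, and 0.15 for the three components; these input correlations are invented for the example.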

9. The same methods for estimating heritability have been applied to many metric physical characteristics, such as height, weight, and finger-print ridge count, which are found to have fairly high heritability. Does Harrington deny the heritability of these characters as well? What I have shown is that in a number of studies, various tests' g loadings predict the broad heritability of those tests with validity coefficients (i.e., correlations between g loadings and heritability coefficients) between .60 and .80. No other factor independent of g that can be extracted from the tests shows a significant correlation.
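The "validity coefficient" in paragraph 9 is a correlation computed across tests, between each test's g loading and its heritability. A minimal sketch follows; the function name and the six pairs of values are made up purely to show the computation and are not taken from any study.

```python
import numpy as np


def loading_heritability_correlation(g_loadings, heritabilities):
    """Pearson correlation, across tests, of g loadings with heritabilities."""
    g = np.asarray(g_loadings, dtype=float)
    h = np.asarray(heritabilities, dtype=float)
    if g.ndim != 1 or g.shape != h.shape:
        raise ValueError("expected two 1-D vectors of equal length")
    return float(np.corrcoef(g, h)[0, 1])


# Hypothetical values for six imaginary tests (NOT real data).
example_loadings = [0.45, 0.55, 0.60, 0.70, 0.75, 0.80]
example_heritabilities = [0.30, 0.42, 0.40, 0.55, 0.52, 0.65]
example_r = loading_heritability_correlation(example_loadings,
                                             example_heritabilities)
```

The same function could be applied to the loadings of any non-g factor to check whether it shows a comparable relationship.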

10. Most studies of the heritability of individual differences in mental abilities are based on IQ batteries or on single tests; if they are based on g factor scores, we should expect slightly higher estimates of heritability, since the variance due to non-g factors and specificity would be minimized. Natural selection has apparently caused the heritable component of individual-difference variance in Homo sapiens' behavioral capacities to have rather similar influence across all mental abilities in each individual's overall cognitive development, causing these abilities to be positively correlated in the population. The g factor simply reflects this common causal factor, whatever its neural basis. It must of course be a product of natural selection. Although selection acts on the phenotypes, its effects are genetically transmitted across generations. Evolutionary psychologists have begun theorizing about the evolution of g itself, and this should be an interesting new development in g theory.

11. Harrington suggests that the heritability of the number of heads, hands, and fingers people have is "near zero," although these characteristics are certainly inherited. Not all characteristics have either genetic or phenotypic variance. Heritability is a coefficient, defined as genotypic variance divided by phenotypic variance. Since the phenotypic variance in the number of heads people have is effectively zero, and since the quotient of any quantity divided by zero is not defined (but it is not zero, either), the concept of heritability is altogether meaningless in the absence of reliable phenotypic variance. But there is plenty of phenotypic variance in IQ and g.
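The definition in paragraph 11 can be written as a small function that makes the undefined case explicit. The function name and the choice to return NaN (rather than raise an error) are assumptions of this sketch.

```python
import math


def heritability(genotypic_variance, phenotypic_variance):
    """Heritability as genotypic variance divided by phenotypic variance.

    When there is no phenotypic variance the ratio is undefined, so the
    function returns NaN rather than zero.
    """
    if phenotypic_variance == 0:
        return math.nan  # undefined, not "near zero"
    return genotypic_variance / phenotypic_variance
```

A trait with no phenotypic variance, such as number of heads, gives `heritability(0.0, 0.0)` as NaN, whereas a trait with variance gives an ordinary proportion, for example 0.6 for the illustrative inputs `heritability(6.0, 10.0)`.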

12. If Harrington doubts that the methods of quantitative genetics are applicable to establishing the heritability of IQ (or presumably any other metric variable in humans), what methods are indeed applicable? (I doubt that Harrington could believe that the heritability of human traits that show considerable variance is truly zero.) Would we have to resort to true breeding experiments, rather than rely on natural genetic experiments (e.g., MZ twins reared apart versus unrelated adopted children reared together), or on actually identifying the specific genes in the DNA that are correlated with IQ? Such DNA research is in fact underway now; one hopes it will yield benefits beyond that of convincing sceptics that the genes are involved in variation in human mental abilities, including their common factor, g.

13. Harrington (par. 11) fails to note that R.A. Fisher (1918), in one of the classic articles in the history of genetics, resolved the debate between Mendelians (who dealt with single-gene traits) and the biometricians (who dealt with polygenic traits) by inventing a "Mendelian algebra" for perfectly generalizing Mendel's principles of single-gene inheritance to polygenic traits such as height and intelligence. Each gene in the polygenic system acts according to Mendelian principles. Absent from Mendel's single-gene formulation, but present in Fisher's polygenic model, is the phenomenon called epistasis (i.e., the interaction of genes at different chromosomal loci). Although normal variation in intelligence is clearly polygenic, many of the major gene defects causing various types of mental deficiency show Mendelian inheritance and are nearly always double-recessives at the same chromosomal locus.

14. Guilford's (1954) "Structure of Intellect" theory, which posited 150 independent mental abilities, is now defunct in ability theory, as the tests he devised to measure these hypothetically independent abilities were found to be as positively correlated with one another as any other tests, and so they yield a good g when factor analyzed. Sir Godfrey Thomson's "sampling theory," to which Guilford referred, has remained merely a mathematical exercise without substantive content. As a theory, it has not generated any empirical research or advanced knowledge about the nature of intelligence. It is an empirically empty model for explaining correlations between mental test variables based on common "elements," without any indication of what these elements are. Since no way in principle can be suggested by which the theory could ever be empirically falsified, it fails the crucial criterion for a true scientific theory.

15. It is amazing to see a reference to Wissler's (1901) primitive study, which has been used in generations of psychology textbooks to discredit the Galtonian analytical-physical approach to the study of individual differences in mental ability. The great amount of fruitful research in recent years showing highly significant and theoretically important relationships between g and chronometric measures of information processing speed in various experimental tasks (Jensen, 1998, Chapter 8; Vernon, 1987) has completely contradicted the conclusions nearly every psychologist in the past (except Spearman, 1904) drew from Wissler's conspicuously flawed study.

16. Indeed, the uncritical acceptance of Wissler's findings inhibited research on mental chronometry for more than half a century. The gist of this story is that Wissler reported an utterly nonsignificant correlation of -.02 between measures of reaction time (RT) and "intelligence." There were no IQ tests at that time, so class grades in mathematics and classics were used as the measure of intelligence. The subjects were undergraduates in Columbia University, a group highly selected for scholastic aptitude (which is highly g-loaded). The reliability of the RT test is estimated at between .15 and .20. In those days virtually nothing was known about psychometrics, so the attenuation of the correlation coefficient by low reliability and severe restriction of the range-of-talent was not considered.
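The two effects named in paragraph 16 have standard textbook formulas, sketched below: Spearman's attenuation relation for unreliable measures and the direct range-restriction formula (Thorndike's Case 2, run in the forward direction). The function names are hypothetical, and the sketch assumes linear regression with constant residual variance and selection made directly on one of the two variables. It is not a reanalysis of Wissler's data.

```python
import math


def attenuated_r(true_r, reliability_x, reliability_y):
    """Observed correlation expected when both measures are unreliable."""
    return true_r * math.sqrt(reliability_x * reliability_y)


def range_restricted_r(r, sd_ratio):
    """Correlation expected after direct selection on one variable.

    sd_ratio is the restricted SD divided by the unrestricted SD of the
    variable on which selection occurred (1.0 means no restriction).
    """
    return r * sd_ratio / math.sqrt(1 - r**2 + (r * sd_ratio) ** 2)
```

With a reliability of .20 or less for the RT measure, the attenuation factor alone is below 0.45 even if the other measure were perfectly reliable, and range restriction then shrinks the expected correlation further.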

17. The failure to reject the null hypothesis in this famous study is an extreme and classic case of the statistician's Type II error, i.e., not rejecting the null hypothesis when it is false. The same was true of the other Galtonian tests used by Wissler, such as measures of visual, auditory, and haptic discrimination. In recent years, all these variables have been found to be correlated with g. These Galtonian kinds of measurements, implemented by modern technology, may afford more direct and pointed means of discovering the physiological basis of g than the much more complex mental tasks used to measure IQ. Psychology is now moving, I believe, in both a more analytical and a more biological direction and shows healthy signs of recovering from the scientifically unfortunate influences from its past history -- dualism, subjectivism, hollow-organism behaviorism, and environmentalism sans genetics.

REFERENCES

Fisher, R.A. (1918). The correlation between relatives on the supposition of Mendelian inheritance. Transactions of the Royal Society (Edinburgh), 52, 399-433.

Guilford, J.P. (1954). Psychometric methods (2nd ed.). New York: McGraw-Hill.

Harrington, G.M. (1999). Born before genes: The g legacy. PSYCOLOQUY 10(079). ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1999.volume.10/psyc.99.10.079.intelligence-g-factor.18.harrington http://www.cogsci.soton.ac.uk/cgi/psyc/newpsy?10.079

Jensen, A.R. (1998). The g factor: The science of mental ability. Westport, CT: Praeger.

Jensen, A.R. (1999). Precis of: "The g Factor: The Science of Mental Ability." PSYCOLOQUY 10(023). ftp://ftp.princeton.edu/pub/harnad/Psycoloquy/1999.volume.10/psyc.99.10.023.intelligence-g-factor.1.jensen http://www.cogsci.soton.ac.uk/cgi/psyc/newpsy?10.023

Spearman, C.E. (1904). "General intelligence" objectively determined and measured. American Journal of Psychology, 15, 201-292.

Vernon, P.A., Ed. (1987). Speed of information processing and intelligence. Norwood, NJ: Ablex.

Wissler, C. (1901). The correlation of mental and physical tests. New York: Columbia University.