Arthur R. Jensen (2000) Some Recent Overlooked Research on the. Psycoloquy: 11(106) Bell Curve (3)

PSYCOLOQUY (ISSN 1055-0143) is sponsored by the American Psychological Association (APA).
Psycoloquy 11(106): Some Recent Overlooked Research on the

Commentary on Reifman on Bell-Curve

Arthur R. Jensen
School of Education
University of California
Berkeley, CA 94720-1670


Reifman's (2000) method of literature search focuses so much on books and articles aimed specifically at criticizing "The Bell Curve" (TBC) by Herrnstein and Murray (1994) as to miss other recent publications that importantly advance the scientific underpinnings of the arguments involved in TBC. A few of these publications are noted here.


IQ, adoption studies, behavior genetics, bell curve, crime, education, intelligence, nature/nurture, poverty, twin studies, uterine environment.
1. Reifman's (2000) review gives a rather lopsided impression of the scientific, as opposed to the ideological, reactions in the years since the publication of Herrnstein & Murray's (1994) "The Bell Curve" (TBC). Searching the literature with little more than the keyword "Bell Curve", as Reifman did, was bound to turn up a preponderance of negative criticisms of TBC and to overlook research published in scholarly and scientific journals and books that is more relevant to understanding the scientific issues at the basis of TBC. Most empirical researchers in the relevant fields, as contrasted with a good many social philosophers, commentators, and ideological critics, have found little to disagree with scientifically in TBC and therefore have had no incentive to write critical commentaries aimed at putting down this important feature of TBC. However, specialized journals concerned with human variability in mental abilities, intelligence, and individual differences, such as INTELLIGENCE and PERSONALITY AND INDIVIDUAL DIFFERENCES, have published many studies since the appearance of TBC that extend and strengthen the body of evidence supporting the arguments of TBC. Those concerned with the issues raised by TBC will appreciate knowing of some of these recent additions to the literature. I will cite a few of them that seem to have the most far-reaching significance and are worthy of critical examination and further empirical research.

2. Reifman's most conspicuous omission is the research monograph by Charles Murray (1998), which I trust will be described in Murray's (2000) reply to Reifman's review. How could this important study have been overlooked? A follow-up analysis of the NLSY data, based on within-family measures of mental abilities and achievements, it deepens and amplifies the social concerns associated with the wide range of variation in these variables in the population. For one thing, it empirically opposes sociologists' long-favored theory that socio-economic status (SES) is a chief causal factor in individual and group differences in IQ and its important real-life correlates such as scholastic performance, job status, and income. It is surely an eye-opener and a 'must read' for all those who are concerned with the central issues of TBC.

3. Another unmentioned study, comparing the causal contributions of SES and genetic and/or biological factors to individual differences in IQ, is especially cogent because it is based on a full adoption design. This design yields an assessment of the separate effects of the SES of adopted children's biological parents (from whom they were separated in infancy, i.e., a genetic/biological effect) and the SES of their adoptive parents (who reared them from infancy, i.e., a cultural/environmental effect). The full adoption study consists of a 2 x 2 factorial design comparing the IQs of four evenly divided groups of subjects: school-age children who were born either to biological parents of high SES or to parents of low SES, and were adopted as infants either by parents of high SES or by parents of low SES. The original study (Capron & Duyme, 1989) is often cited to show that, indeed, both the biological and the environmental conditions influenced IQ scores based on the Wechsler Intelligence Scale for Children-Revised (WISC-R). But these data also allow a further analysis, which reveals a most crucial finding regarding the interpretation of these adoption data (Jensen, 1998a). The g factor scores, derived from the intercorrelations of the various WISC-R subtests and accounting for some 50 to 60 percent of their total variance, show virtually no effect of the SES-related environmental difference (i.e., the SES of the children's adoptive parents) but strongly and significantly reflect the SES-related genetic/biological effect (i.e., the SES of the children's biological parents).
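
The logic of the 2 x 2 design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cell means below are hypothetical and are NOT the values reported by Capron & Duyme (1989), and the function names are invented for this sketch. With four evenly divided groups, each main effect is simply the mean of the two "high" cells minus the mean of the two "low" cells.

```python
# Hypothetical cell means for illustration only (NOT data from Capron & Duyme, 1989).
# Keys: (SES of biological parents, SES of adoptive parents)
cell_means = {
    ("high", "high"): 110.0,
    ("high", "low"): 104.0,
    ("low", "high"): 102.0,
    ("low", "low"): 96.0,
}


def main_effect(means, factor_index):
    """Mean of the 'high' cells minus mean of the 'low' cells for one factor.

    factor_index 0 = biological parents' SES, 1 = adoptive parents' SES.
    Assumes equal group sizes, as in an evenly divided 2 x 2 design.
    """
    high = [m for key, m in means.items() if key[factor_index] == "high"]
    low = [m for key, m in means.items() if key[factor_index] == "low"]
    return sum(high) / len(high) - sum(low) / len(low)


def interaction(means):
    """Adoptive-SES effect at high biological SES minus that effect at low biological SES."""
    at_high = means[("high", "high")] - means[("high", "low")]
    at_low = means[("low", "high")] - means[("low", "low")]
    return at_high - at_low


biological_effect = main_effect(cell_means, 0)  # genetic/biological contrast
adoptive_effect = main_effect(cell_means, 1)    # cultural/environmental contrast
interaction_effect = interaction(cell_means)
```

The same two contrasts can be computed on any score derived from the test, for example on total IQ and separately on g factor scores, which is the kind of comparison the further analysis relies on.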

4. As amply reviewed in TBC and elsewhere (Gottfredson, 1997; Jensen, 1998b), the psychometric evidence shows that the g factor is by far the largest component of the practical validity of the Wechsler scales and most other general ability test batteries for predicting scholastic achievement, success in job training programs, job performance, and occupational status. It also shows larger correlations with more physical, brain-related variables, such as brain size, brain glucose metabolic rate, and evoked electrical brain potentials, than any other psychometric factors independent of g. Accounting for only about half of the total variance in all of a battery's diverse subtests, their common factor, g, is the component of variance that most accounts for mental test scores' correlations with so many educationally, socially, and economically significant variables. The remaining reliable variance in test scores consists of group factors (e.g., verbal, numerical, and spatial abilities independent of g) that constitute less general and more specialized abilities or skills, and test specificity -- items of informational content or skill that are unique to the particular test.

5. It is a serious mistake in the context of Reifman's review to confuse the specific item-information content of the tests with the construct or latent trait that IQ tests are intended to measure, which is g. Whether or not a person knows who wrote Hamlet, or can define the meaning of vindicate, or can recall a string of seven digits is itself trivial. What the test is intended to measure, and in fact what constitutes the largest proportion of the test's population variance, is the general factor that is common to all of the highly diverse items composing the test and accounts for the substantial correlations between verbal and nonverbal, numerical and spatial, and many other kinds of mental tests. In the Wechsler battery, for example, the g factor emerges from the substantial correlations between such diverse subtests as Vocabulary and Block Design, General Information and Object Assembly, Similarities and Picture Arrangement. These tests have no elements of information content or skills in common; but they do have g in common.
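
How a general factor "emerges from" subtest intercorrelations can be illustrated with a minimal sketch. It rests on several assumptions that should be kept in mind: the correlation matrix is hypothetical (two verbal-like and two spatial-like subtests, not actual Wechsler data), and the first principal component is used here as a simple stand-in for g, which is not necessarily the factor method used in the analyses cited in this commentary. The function name is invented for the example.

```python
import math


def first_principal_component(corr, start, iterations=200):
    """Largest eigenvalue and loadings of a correlation matrix by power iteration."""
    n = len(corr)
    vec = list(start)
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit-length vector gives the eigenvalue.
    eigenvalue = sum(
        vec[i] * sum(corr[i][j] * vec[j] for j in range(n)) for i in range(n)
    )
    # Loadings are the eigenvector scaled by the square root of the eigenvalue.
    loadings = [math.sqrt(eigenvalue) * v for v in vec]
    return eigenvalue, loadings


# Hypothetical intercorrelations: subtests 0-1 are verbal-like, 2-3 spatial-like.
corr = [
    [1.0, 0.6, 0.4, 0.4],
    [0.6, 1.0, 0.4, 0.4],
    [0.4, 0.4, 1.0, 0.6],
    [0.4, 0.4, 0.6, 1.0],
]

eigenvalue, loadings = first_principal_component(corr, start=[1.0, 0.5, 0.25, 0.125])
proportion_of_variance = eigenvalue / len(corr)
```

In this made-up matrix every subtest loads equally on the first component, which accounts for 60 percent of the total variance. The extra correlation within the verbal-like pair and within the spatial-like pair is what would show up as group factors independent of the general factor.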

6. It is clear that g does not reflect a property of the test itself, because many extremely different tests can measure one and the same g, which is a result of individual differences in the speed and efficiency of information processing, whatever the content of the information to which the individuals are exposed may be. An IQ test is a vehicle for assessing individuals' level of g. And as I have explained in detail elsewhere (Jensen, 1998b, Chapter 10), a vehicle carries some excess baggage besides the latent trait it is intended to measure. This excess baggage in the case of IQ tests consists of certain so-called group factors (verbal, spatial, numerical, memory) and specificity, or variance that is unique to particular subtests or items and therefore contributes nothing to the item intercorrelations or subtest intercorrelations. But in virtually all present-day IQ tests the group factors and specificity, when residualized from g, constitute only a minor portion of the total variance in IQ as measured in a representative sample of native-born, English-speaking Americans. In the previously cited adoption study it is this excess baggage rather than g that accounts for the effect of the SES of the adoptive parents on the adoptees' IQ. The SES difference in adoption experience affected only the non-g aspect of the test scores; the g factor scores showed no effect of the adoptive parents' SES at all but were clearly correlated with the biological parents' SES.

7. It should also be noted that the Armed Forces Qualification Test (AFQT) used in TBC as a measure of general intelligence is highly g-loaded and correlates as much with a variety of other IQ tests as they correlate with each other. Reifman's (par. 7) endorsement of the claim that the AFQT is really a test of schooling is therefore misleading and invalidates his conclusion that this claim strengthens the case against TBC.

8. The secular increase of approximately 3 IQ points per decade over the last few decades, a phenomenon now known as the Flynn effect, has been a favorite citation of those who wish to minimize the importance of IQ, their theory being that if the average IQ of the population can rise across time, it must not represent anything very important, or at least not anything genetic (see Neisser, 1998). Since the Flynn effect is discussed in TBC, it is surprising that Reifman does not bring it up. Probably the most interesting thing that has been discovered about the Flynn effect since the publication of TBC is similar to the finding of the adoption study previously mentioned. On the Wechsler tests the Flynn effect proves to be unrelated to the g factor; that is, the magnitudes of the secular gains on the various subtests are not at all related to the subtests' g loadings (Rushton, 1999). Apparently the secular gains in Wechsler IQ are attributable to gains in the non-g variance components in the subtests rather than to their common factor, g. What these non-g components consist of in terms of group factors or specificity is as yet undetermined. But Rushton's finding is made worrisome to TBC critics by the fact that the difference between the mean White IQ and the mean Black IQ on the Wechsler tests is strongly related to g; that is, the g loadings of the subtests significantly predict the magnitudes of the White-Black mean differences on each of the various subtests, an effect known as Spearman's hypothesis (Jensen, 1998b, Chapter 11). When the identical statistical analysis of White-Black differences on the Wechsler scales that displays the effect predicted by Spearman's hypothesis is applied to the subtest differences due to secular gains on the Wechsler scales, the result is completely different. Hence, as far as we can tell at present, the Flynn effect and the Spearman effect are entirely different phenomena. So it is unwarranted to use the Flynn effect to belittle the import of the Spearman effect or the Black-White IQ difference.
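
The kind of analysis referred to above relates a vector of subtest g loadings to a vector of subtest mean differences. A minimal sketch follows, assuming a plain Pearson correlation as the statistic (the cited analyses may use other or additional statistics) and using deliberately idealized, entirely hypothetical numbers that come from no published study. The names `effect_a` and `effect_b` are placeholders for any two sets of standardized subtest differences.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical g loadings of five subtests (illustration only).
g_loadings = [0.8, 0.7, 0.6, 0.5, 0.4]

# Hypothetical standardized subtest differences for two different effects.
effect_a = [0.9, 0.8, 0.7, 0.6, 0.5]  # idealized: tracks the g loadings exactly
effect_b = [0.3, 0.5, 0.2, 0.5, 0.3]  # idealized: unrelated to the g loadings

r_a = pearson(g_loadings, effect_a)
r_b = pearson(g_loadings, effect_b)
```

Here `r_a` is 1 and `r_b` is 0 (up to floating-point rounding). Two effects of similar overall size can thus stand in completely different relations to g, which is the sense in which one analysis can give completely different results for two sets of subtest differences.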

9. Heritability is most easily understood as the squared coefficient of correlation between phenotype and genotype for a given continuous polygenic trait in the general population. Most critics of TBC do their best to minimize the heritability of IQ. They would prefer to have it zero, but as that has proved wholly impossible they settle for values around 0.4, which is about the heritability of test scores obtained on very young children. The evidence indicates that the heritability of IQ increases with age from early childhood (with heritability around 0.4) to later maturity (over 0.7). Hence it is improper to speak of estimates of heritability without taking age into consideration, rather than viewing all the existing studies of IQ heritability as attempts to estimate one and the same true value of IQ heritability in the population.
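
As a small worked example of the definition given above: if heritability is the squared genotype-phenotype correlation, then that correlation is the square root of the heritability. The sketch below applies this to the two approximate values mentioned in the text (0.4 and 0.7); the function name is invented for the illustration.

```python
import math


def genotype_phenotype_correlation(heritability):
    """Correlation between genotype and phenotype implied by a heritability value."""
    if not 0.0 <= heritability <= 1.0:
        raise ValueError("heritability must lie between 0 and 1")
    return math.sqrt(heritability)


r_early_childhood = genotype_phenotype_correlation(0.4)  # about 0.63
r_later_maturity = genotype_phenotype_correlation(0.7)   # about 0.84
```

On this definition, the difference between heritabilities of 0.4 and 0.7 corresponds to a genotype-phenotype correlation of roughly 0.63 versus roughly 0.84.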

10. A more serious problem and less understood phenomenon, however, is the nature of the non-genetic variance in IQ, which may constitute anywhere from 25 to 50 percent of the total IQ variance, depending on the age of the subjects. The preponderance of evidence indicates that by late adolescence virtually none of the non-genetic variance in the population consists of between-families or shared environmental influences; it consists instead of within-family or unshared environmental influences. To social reformers who discount the message of TBC, this finding, which came as a surprise even to behavioral geneticists, is almost as unacceptable as the evidence for IQ heritability. The non-genetic within-family causes of IQ variation are NOT shared by children who are reared together in the same family, whether they are twins, full siblings, or genetically unrelated adopted children. But TBC critics who minimize heritability appear to believe that the environmental variance in IQ consists mostly of influences that affect all of the children reared together in the same family (i.e., shared environmental influences). These include the cultural and socio-economic factors on which well-intentioned social reformers have largely based their hopes of explaining, or even overcoming, the wide range of IQ variation in the population. But it is that between-families component of IQ variance that diminishes to near-zero by mid-adolescence. The only remaining non-genetic variance by that time exists almost entirely within families. The environmentally caused differences in IQ between children reared together are as large as those between children picked at random from different environments.
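
The distinction between shared and unshared environmental variance can be made concrete with a minimal sketch of the standard variance decomposition. It assumes a deliberately simplified model (purely additive genetic effects, no assortative mating, no gene-environment covariance, measurement error folded into the unshared component), and the particular splits of the non-genetic variance below are illustrative assumptions, not estimates from any study. The function name is invented for the example.

```python
import math


def expected_correlations(h2, c2, e2):
    """Expected IQ correlations for pairs reared together under a simple additive model.

    h2: genetic share of variance, c2: shared (between-families) environment,
    e2: unshared (within-family) environment. The three must sum to 1.
    """
    if not math.isclose(h2 + c2 + e2, 1.0, abs_tol=1e-9):
        raise ValueError("variance components must sum to 1")
    return {
        "identical_twins": h2 + c2,           # share all genes and the family environment
        "full_siblings": 0.5 * h2 + c2,       # share half of the additive genetic variance
        "unrelated_adoptees": c2,             # share only the family environment
    }


# Illustrative values only: heritability rising from about 0.4 to about 0.7,
# with the shared-environment component shrinking to zero.
early_childhood = expected_correlations(h2=0.4, c2=0.3, e2=0.3)
late_adolescence = expected_correlations(h2=0.7, c2=0.0, e2=0.3)
```

Under these assumptions the expected correlation between genetically unrelated children reared together equals the shared-environment component alone, so it falls to zero when that component vanishes, even though a substantial unshared component remains.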

11. What are the causes of this unsystematic environmental variation in IQ that constitutes all of its non-genetic variation? My analysis of within-family environmental variation affecting IQ, based on identical (i.e., monozygotic) twins reared together, suggests that the unshared or within-family variance is largely the result of a great number of purely random microenvironmental influences (Jensen, 1997). Any one of these influences has too small an effect to be reliably detected on the IQ scale, but their effects are normally distributed and, in concert, cause a considerable range of IQ variation. Similarly the game of roulette is purely random, and all the players who spend, say, an hour playing roulette leave the casino with widely differing amounts of winnings and losses. In the case of IQ, however, my analysis indicates that the IQ 'losses' due to the random microenvironment somewhat exceed the 'winnings.' That is, the microenvironmental factors unfavorable to mental development appear to be either more potent or more prevalent than those that are favorable.
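
The roulette analogy can be illustrated with a short simulation. It is only a sketch under stated assumptions: each individual receives a large number of independent microenvironmental influences, each of which shifts the score up or down by the same tiny amount with equal probability. The number of influences, their size, and the function name are all hypothetical, and the symmetric model deliberately leaves out the excess of 'losses' over 'winnings' described above.

```python
import random
import statistics


def simulate_microenvironment(n_people, n_effects, effect_size, seed=0):
    """Sum of many tiny, equally likely positive or negative influences per person."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    totals = []
    for _ in range(n_people):
        total = sum(
            rng.choice((-effect_size, effect_size)) for _ in range(n_effects)
        )
        totals.append(total)
    return totals


# Hypothetical settings: 400 influences of a quarter of a point each.
totals = simulate_microenvironment(n_people=2000, n_effects=400, effect_size=0.25)
mean_shift = statistics.mean(totals)
spread = statistics.pstdev(totals)
```

Although no single influence exceeds a quarter of a point, the theoretical standard deviation of their sum is 0.25 times the square root of 400, that is, 5 points, and the distribution of the sums is approximately normal. Individually undetectable influences can in this way add up to a considerable range of variation.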

12. Much of the random microenvironment effect on IQ is probably biological and originates from the moment of conception. It is related, for example, to maternal age, parity, immunological incompatibilities between mother and child, variation in birth weight, and obstetrical practices; other perinatal and postnatal effects, including traumas and childhood diseases, add to the microenvironment. Then there are other purely random non-genetic effects that might be called 'para-genetic,' as they are carried by genes that are 'tagged' or 'imprinted' by certain molecules; but the essential genes, though 'imprinted,' are not themselves a part of the individual's genotype for a given trait. Yet these 'tagged' genes cause differences in fetal size and birth weight independent of mother's age, size, or parity; low birth weight especially affects IQ unfavorably.

13. It is believed likely that there are a great many more randomly imprinted genes (perhaps hundreds) whose effects are not yet identified (Melton, 2000). The biological and random nature of the microenvironment probably accounts for the difficulty and meager results of attempts to raise children's IQ (especially its g component), substantially and durably, by manipulating only the psychological-educational and cultural-socioeconomic aspects of the child's environment. These are, alas, not the main locus of control of mental development (Jensen, 1989).


Capron, C. & Duyme, M. (1989). Assessment of effects of socioeconomic status on IQ in a full cross-fostering design. Nature, 340, 552-553.

Gottfredson, L.S. (Ed.) (1997). Intelligence and social policy [special issue]. Intelligence, 24(1).

Herrnstein, R. J., & Murray, C. (1994). The Bell Curve: Intelligence and Class Structure in American Life. New York: Free Press.

Jensen, A.R. (1989). Raising IQ without increasing g? A review of 'The Milwaukee Project: Preventing mental retardation in children at risk' by H.L. Garber. Developmental Review, 9, 234-258.

Jensen, A.R. (1997). The puzzle of nongenetic variance. In R.J. Sternberg & E.L. Grigorenko (Eds.) Heredity, intelligence, and environment (pp. 42-88). Cambridge: Cambridge University Press.

Jensen, A.R. (1998a). Adoption data and two g-related hypotheses. Intelligence, 25, 1-6.

Jensen, A.R. (1998b). The g factor: The science of mental ability. Westport, CT: Praeger.

Jensen, A.R. (1999). Precis of: "The g Factor: The Science of Mental Ability" PSYCOLOQUY 10(023) psyc.99.10.023.intelligence-g-factor.1.jensen

Melton, L. (2000). Womb wars. Scientific American, 283, 24-26.

Murray, C. (1998). Income inequality and IQ. Washington, D.C.: American Enterprise Institute.

Murray, C. (2000). Heritability and the Independent Causal Role of IQ in "The Bell Curve" (Herrnstein & Murray 1994). PSYCOLOQUY 11(105) psyc.00.11.105.bell-curve.2.murray

Neisser, U. (Ed.) (1998). The rising curve: Long term gains in IQ and related measures. Washington, DC: American Psychological Association.

Reifman, A. (2000). Revisiting The Bell Curve. PSYCOLOQUY 11(099) psyc.00.11.099.bell-curve.1.reifman

Rushton, J.P. (1999). Secular gains in IQ not related to the g factor and inbreeding depression unlike Black-White differences: A reply to Flynn. Personality and Individual Differences, 26, 381-389.
