The g factor is the highest-order common factor that can be extracted in a hierarchical factor analysis from a large battery of diverse tests of various cognitive abilities. It is the most important psychometric construct in the study of individual differences in human cognitive abilities. Since its discovery by Spearman in 1904, the g factor has become so firmly established as a major psychological construct in terms of psychometric and factor analytic criteria that further research along these lines is very unlikely either to disconfirm the construct validity of g or to add anything essentially new to our understanding of it. In fact, g, unlike any of the primary, or first-order, factors revealed by factor analysis, cannot be described in terms of the knowledge content of cognitive test items, or in terms of skills, or even in terms of theoretical cognitive processes. It is not essentially a psychological or behavioral variable, but a biological one, a property of the brain. But although not itself a cognitive ability, g is what causes positive correlations among individual differences in performance, even on cognitive tasks that differ greatly with respect to sensory motor modality, brain modularity, and learned cognitive skills and knowledge. The g factor derived from conventional nonspeeded psychometric tests shows higher correlations than any other factors independent of g with various measures of information-processing efficiency, such as working memory capacity, choice and discrimination reaction times, and perceptual speed. A test's g loading is the best predictor of its heritability and its sensitivity to inbreeding depression. Psychometric g also has more direct biological correlates than any other independent source of test variance, for example brain size, brain evoked potentials, nerve conduction velocity, and the brain's glucose metabolic rate during cognitive activity. 
The ultimate arbiter among various "theories of intelligence" must be the physical properties of the brain itself. The current frontier of g research is the investigation of the anatomical and physiological features of the brain that cause g. Research has reached the point at which the only direction left in which to go is that presaged by Spearman himself, who wrote that the final understanding of g must "come from the most profound and detailed direct study of the human brain in its purely physical and chemical aspects" (1927, p.403).
PSYCOLOQUY CALL FOR BOOK REVIEWERS
Below is the Precis of "The g Factor" by Arthur Jensen (905 lines). This book has been selected for multiple review in PSYCOLOQUY. If you wish to submit a formal book review please write to firstname.lastname@example.org indicating what expertise you would bring to bear on reviewing the book if you were selected to review it.
(If you have never reviewed for PSYCOLOQUY or Behavioral & Brain Sciences before, it would be helpful if you could also append a copy of your CV to your inquiry.) If you are selected as one of the reviewers and do not have a copy of the book, you will be sent a copy of the book directly by the publisher (please let us know if you have a copy already). Reviews may also be submitted without invitation, but all reviews will be refereed. The author will reply to all accepted reviews.
Full Psycoloquy book review instructions at:
Psycoloquy reviews are of the book not the Precis. Length should be about 200 lines [c. 1800 words], with a short abstract (about 50 words), an indexable title, and reviewer's full name and institutional address, email and Home Page URL. All references that are electronically accessible should also have URLs.
AUTHOR'S RATIONALE FOR SOLICITING COMMENTARY
The g factor arises from the empirical fact that scores on a large variety of independently designed tests of extremely diverse cognitive abilities all turn out to be positively correlated with one another. The g factor appears to be a biological property of the brain, highly correlated with measures of information-processing efficiency, such as working memory capacity, choice and discrimination reaction times, and perceptual speed. It is highly heritable and has many biological correlates, including brain size, evoked potentials, nerve conduction velocity, and cerebral glucose metabolic rate during cognitive activity. It remains to investigate and explain its neurobiological basis. Commentary is invited from psychometricians, statisticians, geneticists, neuropsychologists, psychophysiologists, cognitive modellers, evolutionary psychologists and other specialties concerned with cognitive abilities, their measurement, and their cognitive and neurobiological basis.
1. In the 2,000-year prehistory of psychology, which was dominated by Platonic philosophy and Christian theology, the cognitive aspect of mind was identified with the soul, and conceived of as a perfect, immaterial, universal attribute of humans. This vastly delayed the study of mental ability, or intelligence, as an attribute discernible in people's idiosyncratic behavior, and therefore as manifesting individual differences.
2. The formal recognition of individual differences in mental ability as a subject for study in its own right arose as an outgrowth of the idea of evolution in the mid-nineteenth century. For the first time in history, animals' behavioral capacities and humans' mental abilities were recognized as a product of the evolutionary process, as were the physical systems of organisms. Darwin's theory of natural selection as the mechanism of evolution implied that organisms' behavioral capacities, along with their anatomy and physiology, evolved as adaptations to particular environments. In Darwin's theory, hereditary variation is a necessary condition for the working of natural selection. From this insight, Herbert Spencer, the early philosopher of evolution, interpreted individual differences in intelligence as intrinsic to the human condition. He further introduced the notion that human intelligence evolved as a unitary attribute.
3. Individual differences in mental qualities, however, did not become a subject for empirical study until the latter half of the nineteenth century, with the pioneer efforts of Sir Francis Galton, who is generally regarded as the father of differential psychology (the study of individual and group differences in human traits, which includes behavioral genetics). Galton introduced the idea of objective measurement of human capacities, devised tests to measure simple sensory and motor functions, and invented many of the statistical concepts and methods still used in the study of individual differences. He was the first to apply empirical methods to studying the inheritance of mental ability. Galton's conclusions, or beliefs, are consistent with his empirical findings but are not at all adequately supported by them. They may be briefly summarized as follows:
4. Human mental ability has both general and specific components: the general component is the larger source of individual differences; it is predominantly a product of biological evolution, and is more strongly hereditary than are specific abilities, or special talents. Mental ability, which ranges widely in every large population, is normally distributed, and various human races differ, on average, in mental ability. General ability is best measured by a variety of fairly simple tests of sensory discrimination and reaction time.
5. In putting forth their ideas, which harmonized with the Darwinian revolution in biology, Spencer and Galton had, by the end of the nineteenth century, set the stage for nearly all the basic ideas and questions that have dominated research and theoretical controversy in twentieth-century differential psychology.
6. Spearman invented factor analysis, a method which permitted a rigorous statistical test of Spencer's and Galton's hypothesis that a general mental ability enters into every kind of activity requiring mental effort. The well-established empirical finding of positive correlations among measures of various mental abilities is putative evidence of a common factor in all of the measured abilities. Factor analysis makes it possible to determine the degree to which each of the variables is correlated (or loaded) with the factor that is common to all the variables in the analysis. Spearman gave the label "g" to this common factor, which is manifested in individual differences on all mental tests, however diverse.
7. Spearman's two-factor theory held that every mental test, however diverse in terms of the contents or skills called for, measures only two factors: g and s, a factor specific to each test. But later research based on larger numbers of tests than were available in Spearman's early studies showed that g alone could not account for all of the correlations between tests. So Spearman had to acknowledge that there are other factors besides g, called group factors, which different groups of tests, each with similar task demands (such as being either verbal, spatial, numerical, or mechanical), have in common.
8. By comparing tests with high and low g factor loadings, Spearman concluded that g is most strongly reflected in tests which call for the "eduction of relations and correlates," for example, reasoning to solve novel problems, as contrasted with recalling previously acquired knowledge or using already well learned skills.
9. Spearman thought of g metaphorically as "mental energy" that could be applied to any and every kind of mental task, and likened group factors and specificity to specialized "engines" for the performance of certain types of tasks. According to Spearman, individual differences in potential performance on any mental task result from two sources: differences in the amount of mental "energy" that can be delivered to the specific "engine" that mediates performance of the task, and differences in the efficiency of energy utilization by the "engine." The efficiency of the various "engines" differs independently within the same person.
10. Although Spearman remained agnostic concerning the biochemical and physiological basis of this energy, it was his fervent hope that scientists would eventually discover a physical basis for g.
11. The word "intelligence" as an intraspecies concept has proved to be either undefinable or arbitrarily defined without a scientifically acceptable degree of consensus. The suggested remedy for this unsatisfactory condition is to dispense with the term "intelligence" altogether when referring to intraspecies individual differences in the scientific context, and focus on specific mental abilities, which can be objectively defined and measured. The number of mental abilities, so defined, is unlimited, but the major sources of variance (i.e., individual differences) among myriad abilities are relatively few, because abilities are not independent, but have sources of variance in common.
12. The empirical fact that all mental abilities are positively correlated calls for an analytic taxonomy of mental abilities based on some form of correlation analysis. Factor analysis has proven to be the most suitable tool for this purpose. By means of factor analysis it is possible to describe the total variance of various abilities in terms of a smaller number of independent dimensions (i.e., factors), or components of variance, that differ in their degree of generality. "Generality" refers to the number of abilities that are correlated with a particular factor. The common factors in the abilities domain can be represented hierarchically in terms of their generality, with a large number of the least general factors (called first-order or primary factors) at the base of the hierarchy and the single, most general, factor at the apex.
13. Ability measurements can be represented geometrically and mathematically as vectors in space, with a common origin and with the angles between them related to their intercorrelations. Factors are the "reference axes" in this space, and the number of orthogonal axes, or independent dimensions, needed to represent the ability measurements, defines the number of factors. The dimensions found in the factor analysis of the correlations among a large variety of mental ability measurements can be arranged hierarchically according to their generality. This hierarchical structure typically has three tiers, or strata: a large number of narrow (i.e., least general) first-order factors, a relatively small number (six to eight) of broad (i.e., more general) second-order factors, and, at the apex, a single third-order factor, conventionally symbolized as g. The g factor is the most general of all and is common to all mental abilities.
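The geometric representation described above can be written compactly. As a sketch, assuming each ability measurement is standardized and represented as a unit-length vector in the full test space (the symbols are introduced here for illustration and are not the author's notation), the correlation between tests i and j equals the cosine of the angle between their vectors:

```latex
r_{ij} = \cos\theta_{ij}, \qquad
\theta_{ij} = 0^{\circ} \Rightarrow r_{ij} = 1, \qquad
\theta_{ij} = 90^{\circ} \Rightarrow r_{ij} = 0 .
```

On this reading, two tests separated by an angle of 60 degrees correlate .50, and orthogonal reference axes correspond to uncorrelated factors.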
14. The general factor, g, can be extracted from the correlation matrix of a battery of mental ability tests by a number of different methods of factor analysis and according to different models of the factor structure of abilities. Provided the number of tests in the analyzed battery is sufficiently large to yield reliable factors and the tests are sufficiently diverse in item types and information content to reflect more than a single narrow ability, a g factor always emerges. The only exception occurs when orthogonal rotation of the principal axes is employed. That method expressly precludes the appearance of a g factor. With orthogonal rotation, the g variance remains in the factor matrix, but is dispersed among all of the group (or primary) factors. This method of factor analysis (for which the most common factor rotation method is known as varimax) is not appropriate to any domain of variables, such as mental abilities, in which substantial positive correlations among all the variables reveal a large general factor.
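As a rough illustration of how a general factor can be read off a correlation matrix, the sketch below computes the first unrotated principal component of a small, entirely hypothetical correlation matrix using power iteration. This is a simplified stand-in, not the hierarchical analysis favored in the text, and the matrix values, function name, and variable names are assumptions made for the example.

```python
# Hypothetical correlation matrix for four tests (illustrative values only).
R = [
    [1.00, 0.60, 0.50, 0.40],
    [0.60, 1.00, 0.45, 0.35],
    [0.50, 0.45, 1.00, 0.30],
    [0.40, 0.35, 0.30, 1.00],
]


def first_principal_component(matrix, iterations=500):
    """Return (eigenvalue, loadings) for the largest principal component
    of a symmetric correlation matrix, found by power iteration."""
    n = len(matrix)
    v = [1.0 / n ** 0.5] * n  # start from a unit vector with equal weights
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # The Rayleigh quotient gives the eigenvalue for the converged unit vector.
    Rv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * Rv[i] for i in range(n))
    # Loadings are the eigenvector scaled by the square root of the eigenvalue.
    loadings = [x * eigenvalue ** 0.5 for x in v]
    return eigenvalue, loadings


eigenvalue, loadings = first_principal_component(R)
share = eigenvalue / len(R)  # proportion of total variance on this component
print([round(x, 2) for x in loadings], round(share, 2))
```

Because every correlation in the matrix is positive, all four loadings come out positive, which is the pattern the text describes as a general factor. A principal component also absorbs some test-specific variance, so its loadings are only an approximation to those of a common factor analysis.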
15. Among the various methods of factor analysis that do not mathematically preclude the appearance of g when it is actually latent in the correlation matrix, a hierarchical model is generally the most satisfactory, both theoretically and statistically. In a hierarchical analysis, a number of correlated group factors (first-order factors) are extracted first. The g factor then emerges as a second-order factor (or as a third-order factor in some very large and diverse batteries) from the correlations among the first-order factors (or among the second-order factors when g is at the third order).
16. The g factor is found to be remarkably invariant across all the various methods of factor analysis, except those that mathematically preclude the appearance of a general factor.
17. The g factor is found to be relatively invariant across different batteries of diverse tests of mental ability. This fact justifies the postulation of a true g (analogous to true score in classical measurement theory), of which the g obtained in any empirical study is an estimate.
18. The g factor is also found to be ubiquitous and relatively invariant across various racial and cultural groups.
19. The form of the population distribution of g is not known, because g cannot yet be measured on a ratio scale, but there are good theoretical reasons to assume that the distribution of g approximates the normal, or bell-shaped, curve.
20. The g factor is ubiquitous in all mental ability tests, and tests' g loadings are a continuous variable, ranging from values that are slightly greater than zero on some tests, to values that are near the reliability coefficient of other tests.
21. Although certain types of tests consistently show higher g loadings than others, it is conceptually incorrect to regard characteristics (e.g., relation eduction and abstract reasoning) of such tests as the "essence" or "defining characteristic" of g.
22. These features of tests may indicate the site of g, but not its nature. Unlike the group factors, the g factor cannot be described in terms of the item characteristics and information content of tests. Nor is g a measure of test difficulty: a test's g loading and its difficulty are conceptually separate.
23. It is wrong to regard g as a cognitive process, or as an operating principle of the mind, or as a design feature of the brain's neural circuitry. At the level of psychometrics, ideally, g may be thought of as a distillate of the common source of individual differences in all mental tests, completely stripped of their distinctive features of information content, skill, strategy, and the like. In this sense, g can be roughly likened to a computer's central processing unit. The knowledge and skills tapped by mental test performance merely provide a vehicle for the measurement of g. Therefore, we cannot begin to fathom the causal underpinning of g merely by examining the most highly g-loaded psychometric tests. At the level of causality, g is perhaps best regarded as a source of variance in performance associated with individual differences in the speed or efficiency of the neural processes that affect the kinds of behavior called mental abilities (as defined in section III).
24. Viewpoints and theories antithetical to, or in some cases mistakenly thought to be antithetical to, the large body of psychometric evidence supporting the presence of a predominant general factor, g, in the domain of mental abilities are reviewed below. The proponents of the specificity doctrine, which holds that mental tests measure only a collection of bits of knowledge and skills that happen to be valued by the dominant culture in a society, as well as those who hold that individual differences in mental abilities reflect only differences in opportunities for learning certain skills, largely of a scholastic nature, or the contextualists who claim that mental ability is not general but is entirely specific to particular tasks and circumstances, have not produced any empirical evidence that contradicts the existence of the ubiquitous g factor found in any large and diverse collection of mental tests. There are, however, more rigorous critiques of g.
25. Guilford's Structure-of-Intellect (SOI) model, which claims 150 separate abilities, is supported only by a type of factor analysis that mathematically forces a large number of narrow factors to be uncorrelated, even though all the various ability tests that are entered into the analysis are correlated with one another. Guilford's claim of zero correlations between ability tests is unsupported by evidence: the few zero and negative correlations that are found are attributable to sampling error and other statistical limitations.
26. Cattell's theory of fluid intelligence (Gf) and crystallized intelligence (Gc) is reflected as second-order factors in tests that are either highly culture-reduced (Gf) or highly culture-specific (Gc), and is particularly valid in culturally and educationally heterogeneous populations. The greater the homogeneity in the population, however, the higher is the correlation between Gf and Gc. The correlation between these second-order factors is represented in a hierarchical factor analysis as a single third-order factor, namely g. Typically there is a near-perfect correlation between Gf and g, so that when the second-order factors are residualized, thereby subsuming their common variance into g, the Gf factor vanishes. In other words, Cattell's Gf and the third-order factor, g, turn out to be one and the same.
27. Guttman's radex model, a multidimensional scaling method for spatially representing the relations between diverse mental tests, perfectly parallels the relationships shown in a hierarchical factor analysis. Tests' g loadings derived from factor analysis are displayed spatially in the radex model by the tests' proximity to the center of the circular array, with the most highly g-loaded tests being closest to the center.
28. Gardner's theory of seven independent "intelligences" is contradicted by the well established correlations between at least four of these "intelligences", verbal, logical-mathematical, spatial, and musical, all of which are substantially g loaded. The factorial structure of two of the "intelligences", interpersonal and intrapersonal, has not been determined, so their g loadings remain unknown, and one ability, kinesthetic, probably does not fall into the mental abilities domain as defined in Section III. There is no incompatibility between g and the existence of neural modules that control particular abilities.
29. Sternberg's componential and triarchic theories, which are sometimes mistakenly thought to be incompatible with g theory, are shown to be entirely consistent with it. Sternberg's theory explains the existence of g in terms of information processing components and metacomponents rather than in terms of any unitary process or property of the brain, a subject to be considered in Section VIII.
30. Virtually all present day researchers in psychometrics now accept as a well established fact that individual differences in all complex mental tests are positively correlated, and that a hierarchical factor model, consisting of a number of group factors dominated by g at the apex (or the highest level of generality), is the best representation of the correlational structure of mental abilities.
31. The fact that psychometric g has many physical correlates proves that g is not just a methodological artifact of the content and formal characteristics of mental tests or of the mathematical properties of factor analysis, but is a biological phenomenon. The correlations of g with physical variables can be functional (causal), or genetically pleiotropic (two or more different phenotypic effects attributable to the same gene), or genetically correlated through cross-assortative mating on both traits, or the nongenetic result of both being affected by some environmental factor (e.g., nutrition). The physical characteristics correlated with g that are empirically best established are stature, head size, brain size, frequency of alpha brain waves, latency and amplitude of evoked brain potentials, rate of brain glucose metabolism, and general health.
32. The general factor of learning and problem-solving tasks in infrahuman animals has some properties similar to the g factor in humans, and experimental brain lesion studies suggest that a task's loading on the general factor is directly related to task complexity and to the number of neural processes involved in task performance.
33. It is clear that since it is a product of human evolution, g is strongly enmeshed with many other organismic variables.
34. Individual differences in mental test scores have a substantial genetic component indexed by the coefficient of heritability (in the broad sense), that is, the proportion of the population variance in test scores attributable to all sources of genetic variability. The broad heritability of IQ is about .40 to .50 when measured in children, about .60 to .70 in adolescents and young adults, and approaches .80 in later maturity.
35. Environmental variance can be partitioned into two sources: (1) environmental influences that are shared by children reared in the same family but that differ between families, and (2) nonshared environmental influences that are specific to each child in the same family, and therefore differ within families. The shared environmental variance diminishes from about 35 percent of the total IQ variance in early childhood to near zero percent in late adolescence. The non-shared environmental variance remains nearly constant at around 20 to 30 percent from childhood to maturity. That is, virtually all of the nongenetic variance in adult IQs is attributable to within-family causes, while virtually none is attributable to the kinds of environmental variables that differ between families. The specific sources of much of the within-family environmental variance are still not entirely identified, but a large part of the specific environmental variance appears to be due to the additive effects of a large number of more or less random and largely physical events - developmental "noise" - with small, but variable positive and negative influences on the neurophysiological substrate of mental growth.
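The partition described in the two preceding paragraphs can be summarized in a single identity. This is a simplified sketch that assumes the three sources are independent and ignores gene-environment covariance, gene-environment interaction, and measurement error; the symbols are introduced here for illustration and are not the author's notation:

```latex
V_P = V_G + V_{ES} + V_{EN}, \qquad
h^2 = \frac{V_G}{V_P}, \quad
c^2 = \frac{V_{ES}}{V_P}, \quad
e^2 = \frac{V_{EN}}{V_P}, \qquad
h^2 + c^2 + e^2 = 1 .
```

Here V_P is the total variance in test scores, V_G the genetic variance, V_ES the shared (between-family) environmental variance, and V_EN the nonshared (within-family) environmental variance. With the approximate adult figures given above, h^2 near .80 and c^2 near zero leave roughly .20 for e^2.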
36. More of the genetic variance in test scores is associated with g than with any other common factor. Hence the relative g loadings of various tests predict their relative heritability coefficients (the proportion of genetic variance in the test scores).
37. Traits that show genetic dominance provide evidence that they have been subjected to natural selection as a Darwinian fitness character over the course of evolution. IQ, and particularly its g component, manifests the theoretically predictable effects of genetic dominance: inbreeding depression in the offspring of consanguineous parents, and the opposite effect, hybrid vigor (or heterosis), that shows up in the offspring when each parent has a different racial ancestry. Tests' relative g loadings significantly predict the degree to which various tests manifest both inbreeding depression and heterosis. These data support the hypothesis that the g factor of psychometric tests has arisen through natural selection over the course of human evolution and therefore can be regarded as a fitness character in the Darwinian sense.
38. Psychometric g can be studied more analytically by means of elementary cognitive tasks (ECTs) than is possible with the conventional IQ tests, with items based on past acquired knowledge, reasoning, and problem solving requiring the concerted action of a number of relatively complex cognitive processes. A particular ECT is intended to measure a few relatively simple cognitive processes, independently of specific knowledge or information content. Each ECT is devised to tap a somewhat different set of cognitive processes, and performance on two or more different ECTs yields data from which individual differences in distinct processes can be measured, such as stimulus apprehension, discrimination, choice, visual search, scanning of short term memory (STM), and retrieval of information from long term memory (LTM).
39. Typically ECTs involve no past learned information content, and in those that do, the content is so familiar and overlearned as to be common to all persons taking the ECT, as can be shown on a nonspeeded version of the ECT. Most ECTs are so simple that every person in the study can perform them easily, and individual differences in performance must be measured in terms of response time (RT). The theoretically most interesting ECTs are those with RTs of less than one second and with response error rates close to zero. The subject's median RT (over n number of trials) and the subject's intraindividual variability of RTs (measured as the standard deviation of RT, or RTSD, over n trials) are of particular interest. Another type of ECT, known as Inspection Time (IT), measures sheer speed of perceptual discrimination (visual or auditory) independently of RT.
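The two RT summary measures named above can be computed as in the following minimal sketch. The trial values are hypothetical, and the choice of the sample (n - 1) standard deviation is an assumption, since the text does not say which form of the standard deviation is used.

```python
import statistics

# Hypothetical response times in milliseconds for one subject over n = 10 trials.
rts_ms = [312, 298, 345, 301, 420, 289, 307, 330, 295, 318]

# Median RT: a measure of typical speed that is robust to occasional slow trials.
median_rt = statistics.median(rts_ms)

# RTSD: trial-to-trial variability of RT (sample standard deviation, an assumption).
rtsd = statistics.stdev(rts_ms)

print(median_rt, round(rtsd, 1))
```

The single slow trial (420 ms) barely moves the median but inflates the RTSD, which is one reason the two measures carry somewhat different information about a subject's performance.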
40. Measures of RT, RTSD, and IT derived from the various ECTs are correlated with IQ. For single ECTs, the correlations average about -.35, ranging from about -.10 to -.50, depending on the complexity or number of distinct processes involved in the ECT. Some processes are more strongly correlated with IQ than others. ECTs that strain the capacity of working memory generally have larger correlations with IQ. A composite score based on the RTs and RTSDs from several different ECTs, thereby sampling a greater number of different processes, typically correlates between .50 and .70 with IQ. (Recall that the average correlation between various standard IQ tests is about .80.) Factor analysis and the method of correlated vectors show that it is the g component of IQ (or of any other kind of cognitive test) that is almost entirely responsible for the correlations between ECTs and conventional psychometric tests. RT and RTSD show only negligible loadings on group factors independent of g.
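The core idea of the method of correlated vectors mentioned above can be sketched as follows: one vector holds each test's g loading, a second holds each test's correlation with an external variable (here an RT measure), and the two vectors are correlated across tests. All numbers below are hypothetical, the helper function is written for this example, and the method as applied in practice may involve further steps (for example, adjustments for test reliability) that are not shown.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical g loadings of five psychometric tests.
g_loadings = [0.80, 0.70, 0.60, 0.50, 0.40]

# Hypothetical correlations of the same five tests with an RT measure.
rt_correlations = [-0.40, -0.34, -0.31, -0.22, -0.18]

# Correlation between the two vectors, computed across tests (not across persons).
r_vectors = pearson(g_loadings, rt_correlations)
print(round(r_vectors, 2))
```

A strongly negative value here means that the more g-loaded a test is, the more negatively it correlates with RT, which is the pattern the text attributes to the g component of the tests.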
41. The RT X g correlation is not explained by speed-accuracy tradeoff, use of strategies, or motivation. Nor can the correlation be attributed to correlating RT with speeded psychometric tests. Most studies of the RT X g correlation are based on untimed or nonspeeded IQ tests. Moreover, the RTs of ECTs have near-zero loadings on the speed-of-test-taking factor that emerges from some factor analyses of test batteries that include speeded tests.
42. The RT X g correlation reflects individual differences in the speed and efficiency (i.e., trial-to-trial consistency of RT, as measured by RTSD) of information processing. As there is a general factor of speed of processing common to virtually all ECTs, and as this general speed-of-information-processing factor is highly loaded on psychometric g, it is hypothesized that g is explainable, at least in part, in terms of the speed and efficiency of information processing.
43. The physiological properties of the brain that might account for the speed-of-processing aspect of g are not yet known completely, but it seems safe to say that they would have to be properties that are common to all regions and modules of the brain that subserve cognitive functions in which there are reliable individual differences in the neurologically normal population. One obvious candidate is individual differences in nerve conduction velocity (NCV). Brain NCV increases along with measurements of mental growth from childhood to maturity and decreases along with mental decline in old age. NCV is also significantly correlated with IQ (Raven matrices) in college students. (The great theoretical importance of this finding, based on a single study, absolutely demands its replication.) It has been hypothesized that periodic oscillation of the synchronized action potentials of groups of neurons may account for intraindividual variability (RTSD) in ECTs. It has also been hypothesized that random biological "noise" in the neural transmission of information in the brain causes slower and less efficient information processing, individual differences in which constitute some part of g. NCV and "noise" in neural transmission are related to the degree of myelination of nerve fibers, which may be the major physiological variable underlying g. Considerable empirical evidence indicates a relationship between myelin and other physiological and behavioral phenomena that are correlated with g. Structural, neural net, or "design" features of the brain have scarcely been investigated in relation to g in normal persons and cannot be evaluated in this respect at present.
44. At the level of complex psychometric tests the g factor is unitary. But it now appears most unlikely that g is unitary at the level of its causal underpinnings, as indicated by the timed measurements of performance on various ECTs and by neurophysiological measurements of variables such as NCV, rate of glucose metabolism (PET scan), and degree of myelination (MRI) of nerve fibers.
45. Practical validity is indicated by a significant and predictively useful correlation of a measurement with some educational, economic, or social criterion that is deemed important by many people. The g factor (and highly g-loaded test scores, such as the IQ) shows a more far-reaching and universal practical validity than any other coherent psychological construct yet discovered. It predicts performance to some degree in every kind of behavior that calls for learning, decision, and judgment. Its validity is an increasing monotonic function of the level of cognitive complexity in the predicted criterion. Even at moderate levels of complexity of the criterion to be predicted, g is the sine qua non of test validity. The removal of g (by statistical regression) from any psychometric test or battery, leaving only group factors and specificity, absolutely destroys their practical validity when they are used in a population that ranges widely in general ability.
46. The validity of g is most conspicuous in scholastic performance, not because g-loaded tests measure specifically what is taught in school, but because g is intrinsic to learning novel material, grasping concepts, distinctions, and meanings. The pupil's most crucial tool for scholastic learning beyond the primary grades - reading comprehension - is probably the most highly g-loaded attainment in the course of elementary education.
47. In the world of work, g is the main cognitive correlate and best single predictor of success in job training and job performance. Its validity is not nullified or replaced by formal education (independent of g), nor is it decreased by increasing experience on the job.
48. Although g has ubiquitous validity as a predictor of job performance, tests that tap other ability factors in addition to g may improve the predictive validity for certain types of jobs, for example tests of spatial ability for mechanical jobs and tests of speed and accuracy for clerical and secretarial jobs.
49. Meta-analyses of hundreds of test validation studies have shown that the validity of a highly g-loaded test with demonstrated validity for a particular job in a particular organizational setting is generalizable to virtually all other jobs and settings, especially within broad job categories.
50. The g factor is also reflected in many broad social outcomes. Many social behavior problems, including dropping out of school, chronic welfare status, illegitimacy, child neglect, poverty, accident proneness, delinquency, and crime, are negatively correlated with g or IQ independently of social class of origin. These social pathologies have an inverse monotonic relation to IQ level in the population, and show, on average, nearly five times the percentage of occurrence in the lowest quartile (IQ below 90) of the total distribution of IQ as in the highest quartile (IQ above 110).
51. As a construct, the g factor can be represented with varying degrees of convenience, efficiency, and validity by a wide variety of vehicles (psychometric tests, laboratory techniques, physiological indices) which yield measurements that have different scale properties. These three key concepts (the construct, the vehicle, and the measurement) are related to one another, but do not all represent the same thing. It is important to recognize the distinctions between them when considering the nature of empirically observed changes in objective mental measurements. These may be spontaneous changes in test scores within an individual, or a secular trend in the mean of a population, or score gains induced by training or other interventions.
52. The critical question, then, is the locus of the change. Does it represent a change in the construct itself? Or is the change more attributable to properties of the vehicle, or to properties of the scale of measurement? The item content of the Stanford-Binet IQ tests, for example, differs from one age level to the next. Several different highly g-loaded tests (e.g., Stanford-Binet, Wechsler, Raven) differ in other factors unrelated to g. What exactly has changed, the level of g or the non-g sources of variance? Is a unit change in one range of the measuring scale equivalent to a unit change in another range, that is, are the measurements an interval scale throughout their range? A change in the measurement is not necessarily a change in the level of the construct; it could reflect any one (or a combination) of several different sources of variance in the measurements.
53. Evidence for an authentic change in the construct g requires broad transfer or generalizability across a wide variety of cognitive performance. Anything less implies changes in lower-order factors, or in test specificity, or in conditions peculiar to the tests, or the conditions of administration, or the measurement scales.
54. The practice effect from taking a given g-loaded test, as indicated by the amount of test-retest gain in score, appears to be unrelated to g. Test-retest gains probably reflect only the source of variance known as the test's specificity.
55. Some persons show large, apparently spontaneous changes in IQ from one testing to another. They are a small minority of all persons who have been tested. All but about 10 percent of this group showing large changes in IQ (or in other g-loaded test scores) can be accounted for by the normal distribution of measurement errors. The remaining 10 percent or so of cases, which measurement error does not account for, are not attributable to any specific systematic causes and are statistically unpredictable for any given individual. The kinds of events and life experiences typically invoked post hoc to explain large IQ changes are in fact not significantly correlated with IQ change but occur with the same frequency among persons who have shown little or no change in IQ.
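How much score change measurement error alone can produce follows from classical test theory. The sketch below is illustrative only: the reliability of 0.90 and the SD of 15 are assumed values, not figures from the text. It further assumes that true scores are stable between testings and that errors are independent and normally distributed.

```python
from math import sqrt
from statistics import NormalDist

def se_of_difference(sd, reliability):
    """Standard error of the difference between two scores on parallel
    forms, under classical test theory: sd * sqrt(2 * (1 - r_xx))."""
    return sd * sqrt(2 * (1 - reliability))

def share_changing_by(points, sd=15.0, reliability=0.90):
    """Expected share of examinees whose score shifts by at least `points`
    (in either direction) from measurement error alone.
    The default sd and reliability are hypothetical."""
    se = se_of_difference(sd, reliability)
    return 2 * (1 - NormalDist().cdf(points / se))

for pts in (10, 15, 20):
    print(pts, round(share_changing_by(pts), 4))
```

Under these assumptions a modest fraction of examinees will show a shift of a full standard deviation with no change in the underlying trait. The fraction is sensitive to the assumed reliability.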
56. Over the past half-century or so, a secular upward trend in IQ averaging three IQ points per decade has been observed in many developed countries. The gain has been greater on tests of fluid abilities (Gf) than of crystallized abilities (Gc), and it is generally greater in the lower than in the upper half of the IQ distribution. It is uncertain to what extent the rise in IQ represents a real change in g itself.
57. Several different theories have been propounded to account for the secular rise in IQ, involving changing attitudes (e.g., risk taking, guessing tendencies) toward mental tests, effects of extended schooling and more widespread education throughout all strata of society, and improvements in nutrition and medical and health care. That biological factors of this kind could be a major cause of the IQ gains is suggested by the fact that, over the same period of time, the average physical stature of the population has shown a comparable increase (measured in standard deviation units). Experimental attempts to raise IQ have not produced effects that are both large and lasting. The most intensive and extensive psychological interventions, beginning shortly after birth and continuing until five or six years of age, when the treated children enter regular schools, have produced gains of twenty to thirty or more IQ points over a control group at the peak of their effectiveness. But these large gains diminish greatly over time. Moreover, the almost negligible generalizability, or transfer, of the training effect to scholastic performance during the years following treatment suggests that it was not the level of g but only the test scores that were raised, and that most of the training effect resulted from "teaching to the test". The IQ gain is thus "hollow" with respect to g. However, the most recent and best-conducted intensive intervention experiment showed a lasting gain equivalent to about five IQ points (at age twelve), and a significant transfer to scholastic achievement and to unconventional g-loaded Piagetian tests, which suggest that the outcome is a real change in the level of g. The long-term persistence of this gain, which some experts question, could be established by a follow-up study, perhaps when the subjects are high school seniors.
58. Because IQ is strictly a phenotype, as is every observable or measurable human characteristic, it does not, by itself, support any inference concerning the cause of either individual or group differences in IQ. Whatever their cause, IQ differences are related to variables of immense practical consequence in the modern world. The substantial correlation of IQ with many educational, economic, and social criteria has been well established. Largely for this reason, there has been a long-standing interest in the IQ differences between various populations in the United States that markedly differ, on average, on these salient criteria. By far the most extensively researched group differences in IQ are those between the two largest populations in the United States: persons of European ancestry who are socially identified as "white" and persons of some African ancestry who are socially identified as "black" or African-American.
59. The approximately normal distribution of IQ, as measured by nationally standardized tests, shows that, on average, the American black population scores below the white population by about 1.2 standard deviations, equivalent to eighteen IQ points. Blacks in Sub-Saharan Africa score about two standard deviations (approximately thirty IQ points) below the mean of whites on nonverbal tests.
60. This statistical mean difference between the American black and white populations has scarcely changed over the past eighty years for which IQ data have been available. However, it varies across different regions of the country, being largest in the Southeast and decreasing in magnitude on a gradient running north and west. The mean difference, which is in evidence by about three years of age, increases slightly from early childhood to maturity. These are simply the phenotypic, psychometric, and statistical facts. The average difference, of course, is relatively small compared to the range of variation within either population and, in fact, is not much greater than the average difference between full siblings reared together in the same family.
61. The most visible educational, economic, and social consequences of the group difference in IQ arise largely from two effects: (i) the statistical characteristics of the normal curve, and (ii) the minimum probable threshold of the level of ability needed for certain socially valued attainments. When two normal distributions of IQ have different means, although the curves largely overlap one another, a given cut-score on the IQ scale can make a very large difference between the proportions of the lower-scoring group and the higher-scoring group that fall below (or above) the cut-score.
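The normal-curve arithmetic behind this point applies to any two overlapping normal distributions. The sketch below uses standardized units, equal variances, and a hypothetical mean separation `d` of 1.0. These values are illustrative assumptions and are not taken from the text.

```python
from statistics import NormalDist

def share_above(cut, mean, sd=1.0):
    """Proportion of a normal(mean, sd) distribution lying above `cut`."""
    return 1 - NormalDist(mean, sd).cdf(cut)

def compare_at_cut(cut, d=1.0):
    """Shares above `cut` for two unit-variance normal distributions with
    means 0 and -d (d is a hypothetical separation in SD units).
    Returns (higher, lower, difference, ratio)."""
    hi = share_above(cut, 0.0)
    lo = share_above(cut, -d)
    return hi, lo, hi - lo, hi / lo

for cut in (-0.5, 0.0, 1.0, 2.0):
    hi, lo, diff, ratio = compare_at_cut(cut)
    print(f"cut={cut:+.1f}  higher={hi:.3f}  lower={lo:.3f}  "
          f"diff={diff:.3f}  ratio={ratio:.2f}")
```

Two measures of disparity behave differently here. The absolute difference between the two shares is largest for a cut-score midway between the means. The ratio of the shares keeps growing as the cut-score moves further into the upper tail.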
62. The further the cut-score lies from the mean of the higher-scoring group, the larger is the relative disparity (the ratio) between the proportions of the two groups that fall above (or below) the cut-score. Cut-scores on the IQ scale that fall at critical thresholds (mental retardation, passing grades in regular classes, high school graduation, college admission, college degree, high-level occupation, and the like) result in conspicuous disparities between the proportions of the higher- and lower-scoring groups that fall into different social and occupational categories. It is reasonable, therefore, to enquire into the nature and causes of these group disparities. Only their strictly phenotypic or psychometric aspects are examined in this section. Extensive research on test bias has shown that no fraction of the white-black (W-B) IQ difference, at least in the United States, is attributable to any cultural bias in the tests. Nor is the magnitude of the difference a function of the formal characteristics of the tests, such as verbal, nonverbal, individual versus group administration, culture-loaded, or culture-reduced. For all of their legitimate, practical, and typical uses, present-day psychometric tests of mental ability have the same reliability and validity for native-born, English-speaking blacks (and American-born, English-speaking Hispanics and Asians) as they have for whites.
63. The magnitude of the mean black-white difference, however, varies considerably across tests that have different homogeneous item contents. This variation between tests in the size of the standardized mean W-B difference is not explainable in terms of test bias or in terms of differences in types of item content or other formal or superficial characteristics of the tests. Charles Spearman (1927) suggested that the different relative magnitudes of the W-B differences on various tests are a function of each test's g loading. This hypothesis (now called "Spearman's hypothesis") has since been tested in numerous studies based on large, representative samples of the American black and white populations. The hypothesis is strongly borne out in these studies. The degree to which a particular test is g loaded predicts the magnitude of the standardized mean W-B difference on that test better than any other psychometric factor yet identified. This implies that the W-B difference consists mainly of a difference in g. However, two other factors, independent of g, also show a W-B difference: blacks, on average, exceed whites on a short-term memory factor, while whites, on average, exceed blacks on a spatial visualization factor. The effects of these factors, however, show up only on tests that involve these factors, whereas the g factor enters into the W-B difference on every kind of cognitive test.
64. Spearman's hypothesis has also been studied using elementary cognitive tasks (ECTs) that measure the time it takes a person to process information presented in tasks which are so simple that all persons in the study sample are able to perform them correctly in only one or two seconds. The chronometric variables derived from such ECTs vary in their g loadings and show significant W-B differences. The extent to which the different ECT variables are g loaded predicts the relative magnitudes of the standardized mean W-B differences on the chronometric variables derived from the ECTs. Spearman's hypothesis is thus confirmed even for tasks that do not call upon previously acquired knowledge or skills and that scarcely resemble conventional psychometric tests.
65. The relationship of the g factor to a number of biological variables and its relationship to the size of the white-black differences on various cognitive tests (i.e., Spearman's hypothesis) suggests that the average white-black difference in g has a biological component. Human races are viewed not as discrete, or Platonic, categories, but rather as breeding populations that, as a result of natural selection, have come to differ statistically in the relative frequencies of many polymorphic genes.
66. The "genetic distances" between various populations form a continuous variable that can be measured in terms of differences in gene frequencies. Racial populations differ in many genetic characteristics, some of which, such as brain size, have behavioral and psychometric correlates, particularly g. What I term the default hypothesis states that the causes of the phenotypic differences between contemporary populations of recent African and European descent arise from the same genetic and environmental factors, and in approximately the same magnitudes, that account for individual differences within each population. Thus genetic and environmental variances between groups and within groups are viewed as essentially the same for both populations. The default hypothesis is able to account for the present evidence on the mean white-black difference in g. There is no need to invoke any ad hoc hypothesis, or a Factor X, that is unique to either the black or the white population. The environmental component of the average g difference between groups is primarily attributable to a host of microenvironmental factors that have biological effects. They result from nongenetic variation in prenatal, perinatal, and neonatal conditions and specific nutritional factors.
67. Past studies of a sex difference in general ability have often been confounded by improper definitions and measurements of "general ability" based on simple summation of subtest scores from a variety of batteries that differ in their group factors, by the use of unrepresentative groups selected from limited segments of the normal distribution of abilities, and by the interaction of sex differences with age-group differences in subtest performance. These conditions often yield a mean sex difference in the total score, but such results, in principle, are actually arbitrary, of limited generality, and are therefore of little scientific interest. The observed differences are typically small, inconsistent in direction across different batteries, and, in above average samples, usually favor males.
68. In this section, sex differences are specifically examined in terms of their loadings on the g factor for a number of test batteries administered to representative population samples. When the sex differences (expressed as a point-biserial correlation between sex and scores on each of a number of subtests) were included in the correlation matrix along with the various subtests, and the correlation matrix was subjected to a common factor analysis, sex had a negligible and inconsequential loading on the g factor, averaging about .01 over five test batteries. Applying the method of correlated vectors to these data shows that the magnitude of the sex difference on various subtests is unrelated to the tests' g loadings. Also, the male/female variance ratio on diverse subtests (generally indicating greater male variability in scores) is unrelated to the subtests' g loadings. Although no evidence was found for sex differences in the mean level of g or in the variability of g, there is clear evidence of marked sex differences in certain group factors and in test specificity. Males, on average, excel on some factors, females on others. The largest and most consistent sex difference is found on a spatial visualization factor that has its major factor loadings on tests requiring the mental rotation or manipulation of figures in an imaginary three-dimensional space. The difference is in favor of males, and within each sex is related to testosterone level. But the best available evidence fails to show a sex difference in g.
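The two statistics named in this paragraph can be sketched in a few lines. The code is a simplified illustration. The function names are my own, and the loadings and effect sizes are invented placeholder numbers, not results from any battery. The correlated-vectors step uses a plain Pearson correlation and leaves out refinements such as rank correlations or adjustments for subtest reliability.

```python
from statistics import fmean

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def point_biserial(group, scores):
    """Point-biserial correlation: the Pearson correlation between a
    dichotomous variable coded 0/1 and a continuous score."""
    if set(group) != {0, 1}:
        raise ValueError("group must be coded 0/1 with both values present")
    return pearson(group, scores)

def correlated_vectors(g_loadings, effects):
    """Method of correlated vectors (simplified): correlate each subtest's
    g loading with the size of the group effect on that subtest."""
    return pearson(g_loadings, effects)

# Hypothetical placeholder values for five subtests (not real data).
g_loadings = [0.75, 0.60, 0.82, 0.55, 0.68]
effects = [0.10, -0.05, 0.02, -0.12, 0.08]   # e.g. point-biserial r per subtest
print(round(correlated_vectors(g_loadings, effects), 3))
```

A correlation near zero between the two vectors would indicate that the size of the group difference on a subtest is unrelated to how strongly the subtest loads on g. With only a handful of subtests, any such coefficient has a wide margin of error.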
69. The g factor derives its broad significance from the fact that it is causally related to many real-life conditions, both personal and social. These relationships form a complex correlational network, or nexus, in which g is a major node. The totality of real-world variables composing the g nexus is not yet known, but a number of educationally, socially, and economically critical elements in the nexus have already been identified and are the subject of ongoing research. Complex statistical methods have been developed for analyzing correlational data to help determine the direction of causality among the elements of the g nexus. These elements include personally and socially significant variables, such as learning disabilities, level of educational attainment, illiteracy, poverty, employment and income, delinquency, crime, law abidance, and personal integrity.
70. The limitations of g as an explanatory variable in personal achievements have also been recognized. A person's level of g acts only as a threshold variable that specifies the essential minimum level required for different kinds of achievement. Other, non-g special abilities and talents, along with certain personality factors, such as zeal, conscientiousness, and persistence of effort, are also critical determinants of educational and vocational success. Since the psychometric basis of g is now well established, future g research will extend our knowledge in two directions. In the horizontal direction, it will identify new nodes in the g nexus, by studying the implications for future demographic trends, employment demands, and strategies for aiding economically developing countries. Research in the vertical direction will seek to discover the origins of g in terms of evolutionary biology and the causes of individual differences in terms of the neurophysiology of the brain.
Burt, C. (1955) 'The evidence for the concept of intelligence.' British Journal of Educational Psychology, 25, 159-177.
Carroll, J. B. (1993) Human cognitive abilities: A survey of factor analytic studies. Cambridge, U.K.: Cambridge University Press.
Cattell, R. B. (1971) Abilities: Their structure, growth, and action. Boston: Houghton-Mifflin.
Eysenck, H.J. (1967) 'Intelligence assessment: A theoretical and experimental approach.' British Journal of Educational Psychology, 37, 81-98.
Galton, F. (1869) Hereditary genius. London: Macmillan.
Jensen, A.R. (1987) 'The g beyond factor analysis.' In R. R. Ronning, J. A. Glover, J. C. Conoley, & J. C. Witt (Eds.), The influence of cognitive psychology on testing (pp. 87-142). Hillsdale, NJ: Erlbaum.
Jensen, A.R. (1997) 'The neurophysiology of g.' In C. Cooper & V. Varma (Eds.), Processes in individual differences (pp. 107-124). London: Routledge.
Jensen, A.R. (1998) The g factor: The science of mental ability. Westport, CT: Praeger.
Spearman, C. (1904) 'General intelligence, objectively determined and measured.' American Journal of Psychology, 15, 201-293.
Spearman, C. (1927) The abilities of man: Their nature and measurement. New York: Macmillan.
Vernon, P.A. (1987) (Ed.) Speed of information-processing and intelligence. Norwood, NJ: Ablex.
Vernon, P.A. (1993) (Ed.) Biological approaches to the study of human intelligence. Norwood, NJ: Ablex.