OK, so my mind isn’t as great as Stephen J. Gould’s was, but when The Bell Curve was first published, I remember looking at the data appendices, and thinking, “These data are crap.” A few years later, I found an essay by Gould in The Bell Curve Wars that made the same point, albeit more eloquently. So why bring this up?
I’ve discussed the recent resurgence of idiotic statements about IQ and genetics, but something Atrios wrote about Saletan’s recent missive bugged me (boldface mine):
You know, when Saletan went down his courageous racist road it at first didn’t even occur to me to bother to revisit the racism of his new patron saint, Rushton, because I thought this was something that everyone already knew about. I mean, not everyone, of course, but everyone who had spent a bit of time reading about this stuff. I’ve been around this block several times, from when the Bell Curve came out, to when it was sliced and diced by reputable economists, to the multiple times the “Teh Science Says Black People Are Stupid” conversation has erupted in the blogosphere.
I’ll get to why the highlighted part bugs me in a moment. One of the most basic statistical tests that biologists routinely use is called analysis of variance (ANOVA). Basically, this technique attempts to assign variance in a trait to different experimental factors. For example, suppose you perform an experiment where you want to figure out what effect high and low nutrient treatments and high and low light exposure have on the height of plants. ANOVA will allow you to determine whether either factor influences growth, and whether there are synergistic interactions between the two effects (e.g., plants in high light and high nutrients grow better than all other treatments).
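For the curious, here is a minimal sketch of what "assigning variance to factors" looks like for the plant example. The heights are made-up numbers, the function name two_way_ss is my own, and the design is assumed to be balanced (the same number of plants in every treatment). It only partitions the sums of squares; a real ANOVA would go on to compute F-tests from these pieces.

```python
from itertools import product
from statistics import mean

# Hypothetical plant heights (cm), keyed by (nutrient level, light level).
# These numbers are invented purely for illustration.
heights = {
    ("low", "low"): [10.0, 11.0, 9.0, 10.0],
    ("low", "high"): [12.0, 13.0, 11.0, 12.0],
    ("high", "low"): [12.0, 11.0, 13.0, 12.0],
    ("high", "high"): [19.0, 20.0, 18.0, 19.0],
}


def two_way_ss(cells):
    """Partition the total sum of squares for a balanced two-factor design."""
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # replicates per cell (balanced design assumed)

    grand = mean(x for obs in cells.values() for x in obs)
    cell_mean = {key: mean(obs) for key, obs in cells.items()}
    a_mean = {a: mean(cell_mean[(a, b)] for b in b_levels) for a in a_levels}
    b_mean = {b: mean(cell_mean[(a, b)] for a in a_levels) for b in b_levels}

    # Main effects: how far each factor's level means sit from the grand mean.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    # Interaction: what the cell means do beyond the two main effects added together.
    ss_ab = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a, b in product(a_levels, b_levels)
    )
    # Residual: plant-to-plant scatter within each treatment combination.
    ss_resid = sum((x - cell_mean[key]) ** 2 for key, obs in cells.items() for x in obs)

    return {"nutrient": ss_a, "light": ss_b, "interaction": ss_ab, "residual": ss_resid}


ss = two_way_ss(heights)
total = sum(ss.values())
for source, value in ss.items():
    print(f"{source:12s} SS = {value:7.2f}  share of total = {value / total:.3f}")
```

In this toy dataset the high-nutrient, high-light plants are taller than the two main effects alone would predict, and that excess is what lands in the interaction term.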
As you might imagine, ANOVA is fundamental to many biological analyses, so it bugs me when Atrios talks about economists ‘slicing and dicing’ The Bell Curve: there were biologists who did it too. Which brings me to The Bell Curve.
Leaving aside all of the other problems with the studies that falsely purport to demonstrate a racial effect on IQ, and those problems are legion, the data presented in the appendices of The Bell Curve are crap. First, the data are presented in a dishonest way. The factors in the analysis–the variables they’re testing–are presented unintelligibly. You can call “light level” “lile” in your datafile, but, in the published table, it should be called “light level.” Strong arguments can and will be presented clearly; weak ones cannot.
(An aside: This has always made me wonder if Murray even understands the analyses–his coauthor, Richard J. Herrnstein, died during the writing of the book. The appendices appear as if whoever included them did not understand the data, and therefore they were not presented clearly.*)
The other dishonest presentation–and this directly addresses the quality and overall significance of the data–has to do with R-squared values. R-squared assesses the extent to which the model as a whole can account for the data. For example, an R-squared of 0.5 means that 50% of the total variation in whatever you’re measuring (in this case, IQ) can be accounted for by the variables in the model. The Bell Curve, for each of the analyses, reports R, not R-squared (R multiplied by R). Not only isn’t this typically done, but it’s a pretty transparent attempt to make the results appear more significant than they truly are.
Most of the analyses have R-squared values of less than 0.1, and several have values close to 0.01. And this is the amount of variation accounted for by all variables in the model: the effect of race is often a small fraction of that percentage. Given the uncertainty surrounding IQ to begin with, there really is no there there.
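To see why the choice between R and R-squared matters, here is a small arithmetic sketch. The helper names are mine, and the inputs are just the round figures mentioned above (0.1 and 0.01), not values taken from any particular table in the book.

```python
import math


def variance_explained(r):
    """R-squared: the share of variation accounted for, given a correlation R."""
    return r * r


def r_from_r_squared(r2):
    """The (non-negative) R that corresponds to a given R-squared."""
    return math.sqrt(r2)


# The same weak fits look noticeably bigger when stated as R.
for r2 in (0.1, 0.01):
    print(f"R-squared = {r2:.2f}  ->  R = {r_from_r_squared(r2):.3f}")

# Going the other way: a correlation that sounds respectable explains little.
for r in (0.2, 0.4):
    print(f"R = {r:.1f}  ->  {variance_explained(r):.0%} of the variation accounted for")
```

Because the square root of any number between zero and one is larger than the number itself, quoting R always flatters a weak fit.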
At the time, not only was I a wee lil’ Mad Biologist, but the internets had barely been invented by Al Gore**, so I was basically left with telling this to a few people who cared (or pretended to, anyway). What was worse is that nobody else seemed to get this. The popular discussion always assumed that the basic findings of The Bell Curve were legit, when, in fact, they were not.
A few years later, I was talking about this with a colleague, and he mentioned that Gould had written an essay, “Curveball”, in The Bell Curve Wars, where he made the same points (along with many others–this is the definitive debunking of The Bell Curve as far as I’m concerned). About ANOVA, Gould writes (boldface mine):
The book is also suspect in its use of statistics. As I mentioned, virtually all its data derive from one analysis–a plotting, by a technique called multiple regression, of social behaviors that agitate us, such as crime, unemployment, and births out of wedlock (known as dependent variables), against both IQ and parental socioeconomic status (known as independent variables). The authors first hold IQ constant and consider the relationship of social behaviors to parental socioeconomic status. They then hold socioeconomic status constant and consider the relationship of the same social behaviors to IQ. In general, they find a higher correlation with IQ than with socioeconomic status; for example, people with low IQ are more likely to drop out of high school than people whose parents have low socioeconomic status.
But such analyses must engage two issues–the form and the strength of the relationship–and Herrnstein and Murray discuss only the issue that seems to support their viewpoint, while virtually ignoring (and in one key passage almost willfully hiding) the other. Their numerous graphs present only the form of the relationships; that is, they draw the regression curves of their variables against IQ and parental socioeconomic status. But, in violation of all statistical norms that I’ve ever learned, they plot only the regression curve and do not show the scatter of variation around the curve, so their graphs do not show anything about the strength of the relationships–that is, the amount of variation in social factors explained by IQ and socioeconomic status. Indeed, almost all their relationships are weak: very little of the variation in social factors is explained by either independent variable (though the form of this small amount of explanation does lie in their favored direction). In short, their own data indicate that IQ is not a major factor in determining variation in nearly all the social behaviors they study–and so their conclusions collapse, or at least become so greatly attenuated that their pessimism and conservative social agenda gain no significant support.
And about R-squared (here written as R2; boldface mine):
Herrnstein and Murray actually admit as much in one crucial passage, but then they hide the pattern. They write, “It [cognitive ability] almost always explains less than 20 percent of the variance, to use the statistician’s term, usually less than 10 percent and often less than 5 percent. What this means in English is that you cannot predict what a given person will do from his IQ score. . . . On the other hand, despite the low association at the individual level, large differences in social behavior separate groups of people when the groups differ intellectually on the average.” Despite this disclaimer, their remarkable next sentence makes a strong causal claim. “We will argue that intelligence itself, not just its correlation with socioeconomic status, is responsible for these group differences.” But a few percent of statistical determination is not causal explanation. And the case is even worse for their key genetic argument, since they claim a heritability of about 60 percent for IQ, so to isolate the strength of genetic determination by Herrnstein and Murray’s own criteria you must nearly halve even the few percent they claim to explain.
My charge of disingenuousness receives its strongest affirmation in a sentence tucked away on the first page of Appendix 4, page 593: the authors state, “In the text, we do not refer to the usual measure of goodness of fit for multiple regressions, R2, but they are presented here for the cross-sectional analyses.” Now, why would they exclude from the text, and relegate to an appendix that very few people will read, or even consult, a number that, by their own admission, is “the usual measure of goodness of fit”? I can only conclude that they did not choose to admit in the main text the extreme weakness of their vaunted relationships. Herrnstein and Murray’s correlation coefficients are generally low enough by themselves to inspire lack of confidence. (Correlation coefficients measure the strength of linear relationships between variables; the positive values run from 0.0 for no relationship to 1.0 for perfect linear relationship.) Although low figures are not atypical for large social-science surveys involving many variables, most of Herrnstein and Murray’s correlations are very weak–often in the 0.2 to 0.4 range. Now, 0.4 may sound respectably strong, but–and this is the key point–R2 is the square of the correlation coefficient, and the square of a number between zero and one is less than the number itself, so a 0.4 correlation yields an R-squared of only .16. In Appendix 4, then, one discovers that the vast majority of the conventional measures of R2, excluded from the main body of the text, are less than 0.1. These very low values of R2 expose the true weakness, in any meaningful vernacular sense, of nearly all the relationships that form the meat of The Bell Curve.
The point isn’t that I’m as smart as Gould (I’m not), but that anyone with basic statistical training could have figured this out had they bothered to actually read the appendices (or the book for that matter). This, of course, does not apply to the overwhelming majority of the Very Serious People….
*The other odd thing is that the chapters wherein the data are reported are very conservative (scientifically, not politically) in the assessment of what the data might mean, whereas the concluding chapters are unsubstantiated flights of fantasy. This also makes me wonder if there was substantial post-Herrnstein editing or writing.
**Yes, I know this is a false smear against Gore, but if he can joke about it, then I will too.