On David Brooks and the Real Challenge of (Inconvenient) Data

Someone urged me to read David Brooks’ recent column, “The Philosophy of Data.” My first reaction was, “The man has already engaged in ersatz sociology, I can’t even imagine the atrocities against statistics he’ll commit.” My second reaction was, well, pretty much along the lines of my first one. Despite the excitement about the age of Big Data (and here’s a much-needed antidote), the real issue is, regardless of size, will people choose to ignore it?

One of the themes that I continually flog like a rented mule (er, that I often return to) is the misuse of basic educational data. For example, the common claim that U.S. students have shown no improvement is both incorrect and widespread, to the point where even a smart datamonkey like Bill Gates made this mistake (though he has since backed off this claim). In fact, this misperception was so widely held that several years ago, the only prominent bloggers (and print commentators too) who were pointing out the error were Kevin Drum, Bob Somerby, and yours truly. And the only reason I consider including myself on the ‘prominent’ list is that there was no one else making this point; I win by default. If there were others, they were so few and far between that I, not being a professional education expert, never encountered them. So why were so few making this argument? Well, there were a lot of people ideologically, politically and economically invested in the idea that the U.S. educational system, across the board, was poor and making no gains*.

The reason I raise this example is that the revelatory data were not ‘big data.’ In fact, any NAEP dataset, by the standards of my day job (genomics), is quite puny (BIG DATAZ! I HAZ IT!). The NAEP dataset, in terms of memory, is about one-thousandth the size of a distillation, a summary, of one of my genomics datasets. For the educational data, no difficult statistical analysis (or any analysis at all) was required.

So the problem is the same as it ever was: people choose to ignore inconvenient data. And big datasets aren’t going to convince them.

*Unfortunately, those who were rightly skeptical about these claims mostly fell back on the dodge of claiming that the data weren’t a good measure (e.g., tests don’t tell us what we need to know) rather than doing the heavy lifting of looking at the data and realizing that the doomsayers were full of it. Tragically, they also missed some of the real educational crises.

  1. johnkrehbiel says:

    My own experience as a high school science teacher is rather mixed. In some ways students are doing better, but at the cost, in my opinion, of teaching them to think for themselves.

    For instance, if I give them a series of steps to follow and enough practice, they can look up the electronegativities of two elements and tell what kind of bond will form between them. But if I ask them “What is the difference between a highly polar covalent bond and an ionic bond?” I get confused looks and “How am I supposed to know that?”

    My impression is that they have become very good at reciting answers supplied by teachers in advance of the test, but not so good at figuring things out.

  4. Thanks for posting about this…it’s an important issue… I’m going to try to fish this out of the NAEP website, but it would have been nice if you could have summarized the data on progress in NAEP test scores… in a post on data, it would be nice to see some!
    johnkrehbiel’s point is important too (that testing isn’t necessarily a reliable measure of how well students are doing intellectually)
    (and the post from “breaking news” is spam, hopefully you’ll delete it)

