The Original Sin of the Reinhart-Rogoff Paper Is the Erroneous Data

I realize many readers will have no idea what the hell “Reinhart-Rogoff” is. We’ll get to that in a bit. But first, as prelude, I want to dredge this bit from the depths of the archives about technique and technical details:

Something that bothers me (among many, many things that do…) is the use of the phrase ‘get technical’ as in “I don’t want to get technical.” I won’t claim I’ve never used the phrase, but it’s really inappropriate when science bumps up against public policy, since the quality and validity of the science depends on the technique. Good science, be it physical or social science, requires proper technique. That includes design and analysis, as well as the scope of claims one draws from the results. So technical details aren’t something to be glossed over–they have to be rigorously assessed. Put another way: if your methods are crap, then your results are garbage. If you are thinking of discussing those results, you need to shut your piehole.

This sciencey stuff is hard, and you have to be very rigorous and triple-check everything. And then check it again.

So onto the Reinhart-Rogoff scandal–and have no doubt this is a scientific scandal that makes the ENCODE kerfuffle look like child’s play. The short version is that economists Carmen Reinhart and Kenneth Rogoff released in 2010 a paper that claimed that when debt reaches ninety percent of GDP (or more), economic growth decreases dramatically. Naturally, the austerity über alles crowd jumped all over this, even though there were some serious methodological issues with the paper–for accessible but, erm, technical discussions, I refer you to Mike Konczal and Noah Smith.

But recently, an independent group got hold of their raw data, and to put it bluntly, they screwed up an Excel sheet (boldface mine):

As Herndon-Ash-Pollin puts it: “A coding error in the RR working spreadsheet entirely excludes five countries, Australia, Austria, Belgium, Canada, and Denmark, from the analysis. [Reinhart-Rogoff] averaged cells in lines 30 to 44 instead of lines 30 to 49…This spreadsheet error…is responsible for a -0.3 percentage-point error in RR’s published average real GDP growth in the highest public debt/GDP category.” Belgium, in particular, has 26 years with debt-to-GDP above 90 percent, with an average growth rate of 2.6 percent (though this is only counted as one total point due to the weighting above)…

This error is needed to get the results they published, and it would go a long way to explaining why it has been impossible for others to replicate these results. If this error turns out to be an actual mistake Reinhart-Rogoff made, well, all I can hope is that future historians note that one of the core empirical points providing the intellectual foundation for the global move to austerity in the early 2010s was based on someone accidentally not updating a row formula in Excel.
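
For readers who want to see how small the slip is, here is a minimal sketch of the kind of check that would have flagged it. It only compares row numbers: the range that was reportedly averaged (rows 30 to 44) against the range that should have been (rows 30 to 49). The column letter L follows the formula as reported in the comments on this post, and the helper name rows_in_range is made up for this illustration. No actual growth figures are used.

```python
# Hypothetical helper: which spreadsheet rows does a single-column range cover?
def rows_in_range(cell_range):
    """Return the set of row numbers covered by a range like 'L30:L44'."""
    start, end = cell_range.split(":")
    first = int("".join(ch for ch in start if ch.isdigit()))
    last = int("".join(ch for ch in end if ch.isdigit()))
    return set(range(first, last + 1))

intended = rows_in_range("L30:L49")  # the range that should have been averaged
used = rows_in_range("L30:L44")      # the range reportedly averaged

# Rows that hold data but never reach the average.
missing = sorted(intended - used)
print(missing)
```

Five rows fall outside the averaged range, one for each of the five excluded countries.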

From what I’ve been able to gather, many economists think this is the least problematic error, that the decision to exclude certain data is more problematic, but I think that’s completely wrong. As best as I can tell, the spreadsheet is about twenty columns (max) by fifty rows. Somewhere around 1,000 cells. There’s nothing wrong with small datasets, but how the fuck do you not check every goddamn cell when the dataset is that small? In genomics, the summary table for some of the things we do has half a million cells. It’s challenging to check for errors, but that’s the job. Hell, yesterday I gave a talk where I alluded to a one-in-a-million error rate–and some physicists and engineers probably think we’re rank amateurs with that error rate (if you sequence the same exact strain of E. coli twice, that error rate yields ten false differences between the two strains; science is hard). In science, rigor is everything–you can’t be any smarter than your data are accurate. No field can allow itself to be this sloppy, especially when the stakes are that high.
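
The E. coli arithmetic can be sketched as follows. The genome size of roughly five million bases is my assumption (it is not stated above), as is the simplification that errors in the two sequencing runs land at different positions, so that every error shows up as a false difference.

```python
ERRORS_PER_MILLION_BASES = 1  # the one-in-a-million error rate from the text
GENOME_SIZE = 5_000_000       # assumption: roughly 5 million bases for E. coli

# Expected number of erroneous bases in one sequenced genome.
errors_per_genome = GENOME_SIZE * ERRORS_PER_MILLION_BASES / 1_000_000

# Sequence the same strain twice: each run contributes its own errors, and
# (by assumption) none coincide, so all of them appear as differences.
false_differences = 2 * errors_per_genome
print(false_differences)
```

Under those assumptions, two runs of an identical strain differ at about ten positions that are not real.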

Because this is what “the global move to austerity in the early 2010s” means:

•children needlessly go hungry.
•new energy technologies to combat global warming are underfunded.
•the compact between the elderly and the young is slowly shredded.
•scientists, through no fault of their own, lose their livelihoods.
•our transportation systems continue to decay.

That’s only part of the short list.

Your field is only as good as the data are accurate.

  1. Min says:

    Unemployment is a public health issue. The death rate for long term unemployed people more or less doubles.

    Austerity kills.

  2. Clonal Antibody says:

    Why do people continue to say that R&R averaged rows 30-44 instead of rows 30-49? They did not do that. Rows 30, 32, 34-36, 38-43, 45 and 47-48 were averaged. This is clearly visible in the graphic of the spreadsheet that has circulated. So why do people continue to say that it was a dragging error?

    • Newcastle says:

      You are reading the sheet wrong. The sky blue cell highlighting is not what people are looking at. The dark blue line around cells L30:L44 defines the values being averaged. The formula in cell L51 should have been =AVERAGE(L30:L49) but it was =AVERAGE(L30:L44).

  3. I would have to disagree. While an Excel coding error is certainly embarrassing, the real sin in my eyes is the complete lack of transparency in the methods and data. That hangs on the authors, the reviewers, and the journal and I think is far more revealing of a broader lack of integrity and research ethics.
    While it is certainly fun to toast Reinhart & Rogoff in our righteous fury, I feel we should ask ourselves whether the absence of their publication would have made any difference in the austerity policies adopted. After all, the austerity peddlers do not exactly have a track record of adjusting their position in response to facts…

  4. coloncancercommunity says:

    One of the problems that I have with the entire paper is that objectivity appears to never have been part of the picture. From what I can see, they set out to prove that a high level of debt would result in a tipping point where growth would reverse, and would probably have been unwilling to accept any other conclusion. You can’t do science that way. You may be right, you may be wrong; the point is to find answers and to look at the data objectively.

