Nevins-Potti, Reinhart-Rogoff

There's an interesting parallel between the Nevins-Potti debacle (a true debacle, in my mind) and the recent Reinhart-Rogoff kerfuffle. Both were exposed by an essentially small detail that had nothing to do with the real problem.

In the case of Reinhart-Rogoff, the Excel error was what made them look ridiculous, but it was in fact the "unconventional weighting" of the data that had the most dramatic effect. Furthermore, academic economists had been debating and challenging the paper's conclusions from the moment it came out. Even when legitimate scientific concerns were raised, policy-makers and other academics were not convinced. But as soon as the Excel error was revealed, suddenly everything needed to be re-examined.
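To see why the weighting could matter more than the spreadsheet slip, here is a minimal sketch in Python with made-up numbers (not Reinhart-Rogoff's actual data): averaging by country instead of by country-year lets a single bad year in one country flip the sign of the overall growth estimate.

```python
# Made-up growth rates (%) for country-years with debt/GDP above 90%.
growth = {
    "Country A": [2.5, 2.3, 2.6, 2.4, 2.5] * 4,  # 20 high-debt years of steady growth
    "Country B": [-7.6],                          # a single terrible high-debt year
}

# Country-weighted average: each country's own mean counts once, so
# Country B's lone year carries as much weight as all 20 of Country A's
# years combined. This is the kind of equal-country weighting critics
# objected to in the Reinhart-Rogoff analysis.
country_means = [sum(v) / len(v) for v in growth.values()]
country_weighted = sum(country_means) / len(country_means)

# Year-weighted average: every country-year counts once.
all_years = [g for v in growth.values() for g in v]
year_weighted = sum(all_years) / len(all_years)

print(f"country-weighted: {country_weighted:+.2f}%")  # about -2.57%
print(f"year-weighted:    {year_weighted:+.2f}%")     # about +1.98%
```

Same data, two defensible-sounding averaging schemes, and the conclusion flips from "high debt coincides with growth" to "high debt coincides with contraction."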

In the Nevins-Potti debacle, Baggerly and Coombes wrote article after article pointing out all the problems and, for the most part, no one in a position of power really cared. The Nevins-Potti errors were real zingers too (e.g. switching the labels between people with disease and people without disease), not some trivial Excel error. But in the end, it took Potti's false claim of being a Rhodes Scholar to bring him down. Clearly, the years of academic debate beforehand were meaningless compared to lying on a CV.
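To make clear why a label swap is a zinger rather than a technicality, here is a minimal sketch in Python (hypothetical numbers, nothing like the actual Duke analysis) in the treatment-response setting: flip the sensitive/resistant labels and every recommendation inverts.

```python
# Hypothetical marker expression values for tumors known to be sensitive
# or resistant to a chemotherapy drug (illustrative only).
sensitive = [0.90, 0.85, 0.80]
resistant = [0.10, 0.15, 0.20]

def mean(xs):
    return sum(xs) / len(xs)

def recommend_drug(expression, sensitive_group, resistant_group):
    """Recommend the drug if the patient's expression is closer to the
    sensitive group's mean than to the resistant group's mean."""
    return abs(expression - mean(sensitive_group)) < abs(expression - mean(resistant_group))

patient = 0.88  # a patient whose tumor should respond to the drug

# Correct labels: the responder is correctly given the drug.
print(recommend_drug(patient, sensitive, resistant))   # True

# Labels switched somewhere in data handling: same data, same analysis,
# and now the drug is withheld from every patient it would help.
print(recommend_drug(patient, resistant, sensitive))   # False
```

A swap like this doesn't add noise; it systematically reverses the conclusions, which is what made the errors so serious in a clinical context.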

In the Reinhart-Rogoff case, reproducibility was an issue: if the data had been made available earlier, the problems would have been discovered sooner, and perhaps that would have headed off years of academic debate (for better or for worse). In the Nevins-Potti example, reproducibility was not an issue--the original Nature Medicine study was done using public data and so was reproducible (although it would have been easier if the code had been made available). The problem there is that no one listened.

One has to wonder if the academic system is working in this regard. In both cases, it took a minor but personal failing to bring down the entire edifice, while the protestations of reputable academics, challenging the research on the merits, were ignored. I'd say that in both cases the original research conveniently said what people wanted to hear (debt slows growth; personalized gene signatures can predict response to chemotherapy), and so no amount of research would convince people to question the original findings.

One also has to wonder whether reproducibility is of any help here. I certainly don't think it hurts, but in the case of Nevins-Potti, where the errors were shockingly obvious to anyone paying attention, the problems were deemed merely technical (i.e. statistical). The truth is, reproducibility will be most necessary in highly technical and complex analyses where it's often not obvious how an analysis was done. But if you can show a flaw in a complicated analysis, what's the use when your work will be written off as merely concerned with technical details (as if those weren't important)? Most of the news coverage of Reinhart-Rogoff characterized the problems as complex and statistical (i.e. not important) rather than as bearing on the fundamental questions of interest.

In both cases, I think science was used to push an external agenda, and when the science was called into question, it was difficult to back down. I'll write more in a future post about these kinds of situations and what, if anything, we can do to improve matters.

  • Justin B. Kinney

    You make a good point. The problems are sociological more than technical. A fair number of high-profile papers are known by experts in the various fields to be almost completely wrong. But there is little incentive for scientists to write refutations, and even when they do, publishers (especially of the original work) often seem eager not to publicize the concerns. Refutations are a central part of the scientific process. It's troubling how little incentive there is to pursue them.

  • http://twitter.com/KevinLDavenport Kevin Davenport

    Check out "Problems With Using Microsoft Excel for Statistics" by Dr. Jonathan D. Cryer

    kevinldavenport.info

  • Bhaskar Majumdar

    These episodes bring out the nature of peer review. The simplistic logic (debt slows growth) behind a complex analytical structure should have been questioned. As everyone who has studied economics and finance knows, debt that is used for productive investments has a gearing effect and enhances growth, whereas debt that is used for unproductive assets and social redistribution does not. While the second aspect is more prominent in the post-crisis scenario, the Reinhart-Rogoff study covered a long historical period and should have been thoroughly vetted by peer review. It does call into question the mechanism of peer review these days.

  • http://pages.stern.nyu.edu/~dbackus/ David Backus

    One difference between the two: the RR conclusion -- debt can cause problems -- remains persuasive, despite the errors in this specific piece of work. We've seen that throughout history. What's also true is that there's no magic threshold like a 90% debt to GDP ratio. Another difference here is the political context of the RR result, with lots of countries now fighting both poor economic performance and high debt. So the errors are black and white, but the overall concern with debt remains gray.

    One more thing: this was not a peer-reviewed AER paper. It was a short note presented at the annual conference and published quickly.

  • Markus Spatz

    Interesting point, except I don't understand why the Excel mistake should count as a "personal" failing... Do you mean "intentional"? In that respect, it would seem that the blowup causes are precisely opposite; for RR the data cherry-picking seems more deliberate than the Excel errors, while for NP the resume seems more deliberate than technical slip-ups.

    But I fully agree with the general point that refutations should be given both more consideration and more means (by requiring reproducible research).

  • Erick Tatro

    I am troubled that a paper that did not go through peer review gained so many citations and so much influence so quickly. Second, I think that peer review should not be anonymous. If peer review were signed, with the reviewers' names attached to manuscripts, I bet we'd see a hockey-stick increase in reproducibility.

    • Roger Peng

      My understanding is that this is par for the course for Economics, in that papers are generally not officially peer-reviewed in journals until long after they are released. However, if you consider the attention the paper got, esp. from economists, you could argue that it was one of the most peer-reviewed papers ever written.

  • rkostadi

    We just need "Forensic Biostatistics" and "Forensic Economics" majors in college if we want to train a generation of detectives whose job is to investigate high-impact, high-potential-human-harm research studies.