Nevins-Potti, Reinhart-Rogoff

Roger Peng

There’s an interesting parallel between the Nevins-Potti debacle (a true debacle, in my mind) and the recent Reinhart-Rogoff kerfuffle. Both were exposed via some essentially small detail that had nothing to do with the real problem.

In the case of Reinhart-Rogoff, the Excel error was what made them look ridiculous, but it was in fact the “unconventional weighting” of the data that had the most dramatic effect. Furthermore, academic economists had been debating and challenging the paper’s conclusions ever since it came out. Yet even when legitimate scientific concerns were raised, policy-makers and other academics were not convinced. Only once the Excel error was revealed did everything suddenly need to be re-examined.

In the Nevins-Potti debacle, Baggerly and Coombes wrote article after article pointing out all the problems and, for the most part, no one in a position of power really cared. The Nevins-Potti errors were real zingers too (e.g., switching the labels between people with disease and people without disease), not some trivial Excel error. But in the end, it took Potti’s false claim of being a Rhodes Scholar to bring him down. Clearly, the years of academic debate beforehand were meaningless compared to lying on a CV.

In the Reinhart-Rogoff case, reproducibility was an issue: if the data had been made available earlier, the problems would have been discovered earlier, and perhaps that would have headed off years of academic debate (for better or for worse). In the Nevins-Potti example, reproducibility was not an issue: the original Nature Medicine study was done using public data and so was reproducible (although it would have been easier if code had been made available). The problem there is that no one listened.

One has to wonder if the academic system is working in this regard. In both cases, it took a minor but _personal_ failing to bring down the entire edifice, while the protestations of reputable academics, challenging the research on the merits, were ignored. I’d say in both cases the original research conveniently said what people wanted to hear (debt slows growth, personalized gene signatures can predict response to chemotherapy), and so no amount of research would convince people to question the original findings.

One also has to wonder whether reproducibility is of any help here. I certainly don’t think it hurts, but in the case of Nevins-Potti, where the errors were shockingly obvious to anyone paying attention, the problems were deemed merely technical (i.e. statistical). The truth is, reproducibility will be most necessary in highly technical and complex analyses where it’s often not obvious how an analysis is done. If you can show a flaw in an analysis that is complicated, what’s the use if your work will be written off as merely concerned with technical details (as if those weren’t important)? Most of the news articles surrounding Reinhart-Rogoff characterized the problems as complex and statistical (i.e. not important) and not concerned with fundamental questions of interest.

In both cases, I think science was used to push an external agenda, and when the science was called into question, it was difficult to back down. I’ll write more in a future post about these kinds of situations and what, if anything, we can do to improve matters.