15
Nov

Reproducible Research: With Us or Against Us?


Last night this article by Chris Drummond of the Canadian National Research Council (Conseil national de recherches Canada) popped up in my Google Scholar alert. The title of the article, “Reproducible Research: a Dissenting Opinion”, would seem to indicate that he disagrees with much of what has been circulating about reproducible research.

Drummond singles out the Declaration published by a Yale Law School Roundtable on Data and Code Sharing (I was not part of the roundtable) as an example of the main arguments in favor of reproducibility and has four main objections. What I found interesting about his piece is that I think I more or less agree with all his objections and yet draw the exact opposite conclusion from him. In his abstract, he concludes that “I would also contend that the effort necessary to meet the [reproducible research] movement’s aims, and the general attitude it engenders, would not serve any of the research disciplines well.”

Let’s take his objections one by one:
  1. Reproducibility, at least in the form proposed, is not now, nor has it ever been, an essential part of science. I would say that with the exception of mathematics, this is true. In math, usually you state a theorem and provide the proof. The proof shows you how to obtain the result, so it is a form of reproducibility. But beyond that I would argue that the need for reproducibility is a more recent phenomenon arising from the great complexity and cost of modern data analyses and the lack of funding for full replication. The rise of “consortium science” (think of the ENCODE project) diminishes our ability to fully replicate an experiment (what he calls “Scientific Replication”) in any reasonable amount of time.
  2. The idea of a single well defined scientific method resulting in an incremental, and cumulative, scientific process is highly debatable. He argues that the idea of a forward moving process by which science builds on top of previous results in an orderly and incremental fashion is a fiction. In particular, there is no single “scientific method” into which you can drop reproducibility as a key component. I think most scientists would agree with this. Science is not some orderly process; it’s messy and can seem haphazard, and discoveries come at unpredictable times. But that doesn’t mean that people shouldn’t provide the details of what they’ve done so that others don’t have to essentially reverse engineer the process. I don’t see how the disorderly reality of science is an argument against reproducibility.
  3. Requiring the submission of data and code will encourage a level of distrust among researchers and promote the acceptance of papers based on narrow technical criteria. I don’t agree with this statement at all. First, I don’t think it will happen. If a journal required code/data, it would be burdensome for some, but it would just be one of the many requirements that journals have. Second, I don’t think good science is about “trust”. Sure, it’s important to be civilized, but if you claim a finding, I’m not going to just trust it because we’re both scientists. Finally, he says “Submitting code, in whatever language, for whatever system, will simply result in an accumulation of questionable software. There may be a some cases where people would be able to use it but I would doubt that they would be frequent.” I think this is true, but it’s not necessarily an argument against submitting code. Think of all the open source/free software packages out there. I would bet that most of that code has only been looked at by one person: the developer. But does that mean open source software as a whole is not valuable?
  4. Misconduct has always been part of science with surprisingly little consequence. The public’s distrust is likely more to do with the apparent variability of scientific conclusions. I agree with the first part and am not sure about the second. I’ve tried to argue previously that reproducible research is not just about preventing fraud/misconduct. If someone wants to commit fraud, it’s easy to make the fraud reproducible.

In the end, I see reproducibility as not necessarily a new concept, but really an adaptation of an old one: describing materials and methods. The problem is that the standard format for publication, the journal article, has simply not caught up with the growing complexity of data analysis. And so we need to update the standards a bit.

I think the benefit of reproducibility is that if someone wants to question or challenge the findings of a study, they have the materials with which to do so. Providing people with the means to ask questions is how science moves forward.

  • Chris Drummond

    As the author of that paper, I am surprised you could largely agree with what I said and yet disagree with my conclusions. I constantly see "good science" as the main justification for reproducible research. Your agreement with my first two points would suggest you have problems with this justification. But surely, take it away and the argument for reproducible research is considerably weakened. I would claim to the point where it shouldn't influence the review practice of journals and conferences.

    In your commentary for the journal Science you say "Replication is the ultimate standard by which scientific claims are judged." But now you say "the need for reproducibility is a more recent phenomenon". To call something scientific must surely be based on the previous practice of the whole scientific community. There may be specific fields where this practice has value, but the claim that "good science" is the reason is a claim much too broad.

    Chris Drummond