De-weaponizing reproducibility

Jeff Leek
2015-03-13

A couple of weeks ago Roger and I went to a conference on statistical reproducibility held at the National Academy of Sciences. The discussion was pretty wide-ranging and I love that the thinking about reproducibility is coming back to statistics. There was pretty widespread support for the idea that prevention is the right way to approach reproducibility.

It turns out I was the last speaker of the whole conference. This is an unenviable position with so many bright folks speaking first, since they covered a huge amount of what I wanted to say. My talk focused on three key points:

  1. The tools for reproducibility already exist, the barrier isn’t tools
  2. We need to de-weaponize reproducibility
  3. Prevention is the right approach to reproducibility

In terms of the first point, tools like IPython, knitr, and Galaxy can be used to make all but the absolute largest analyses reproducible right now. Our group does this all the time with our papers, and so do many others. The problem isn’t a lack of tools.
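To make that concrete, here is a toy sketch of the kind of discipline those tools automate: fix the random seed and record the computational environment alongside the result, so anyone can re-run the analysis and get the same answer. This is my own illustration in plain Python, not code from the talk or from any of those tools.

```python
# A minimal, self-contained sketch of a reproducible analysis:
# the seed, the result, and the environment all get recorded together.
import json
import platform
import random
import sys

SEED = 20150313
random.seed(SEED)  # fixed seed so the simulation is exactly re-runnable

# stand-in for a real analysis: estimate the mean of simulated data
data = [random.gauss(0, 1) for _ in range(1000)]
estimate = sum(data) / len(data)

# record the computational environment alongside the estimate
report = {
    "estimate": estimate,
    "n": len(data),
    "seed": SEED,
    "python": sys.version,
    "platform": platform.platform(),
}
print(json.dumps(report, indent=2))
```

Tools like knitr and IPython notebooks go a step further and weave the code, its output, and the surrounding prose into a single document.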

Speaking to point two, I think many people would agree that part of the issue is culture change. One issue that is increasingly concerning to me is the “weaponization” of reproducibility. What I have been noticing is that some of us (like me, my students, other folks at JHU, and lots of particularly junior computational people elsewhere) are trying really hard to be reproducible. Most of the time this results in really positive reactions from the community. But when a co-author and I wrote that paper about the science-wise false discovery rate, one of the discussants used our code (great), improved on it (great), identified a bug (great), and then did his level best to humiliate us both in front of the editor and the general public because of that bug (not so great).

I have seen this happen several times. Most of the time, if a paper is reproducible, the authors get a pat on the back and their code is either ignored or used in a positive way. But for high-profile and important problems, people largely use reproducibility to:

  1. Impose regulatory hurdles in the short term while people transition to reproducibility. One clear example of this is the Secret Science Reform Act, a bill that imposes strict reproducibility conditions on all science before it can be used as evidence for regulation.
  2. Humiliate people who aren’t good coders or who make mistakes in their code. This is what happened with my paper when I produced reproducible code for my analysis, but it has also happened to other people.
  3. Take advantage of people’s code to plagiarize or outright steal work. I have stories about this I’d rather not put on the internet.

Of the three, I feel like (1) and (2) are the most common. Plagiarism and scooping by theft are, I think, actually relatively rare, based on my own anecdotal experience. But I think that the “weaponization” of reproducibility to block regulation or to humiliate folks who are new to computational sciences is more common than I’d like it to be. Until reproducibility is the standard for everyone - which I think is possible now and will happen as the culture changes - the early adopters are at risk of being bludgeoned with their own reproducibility. As a community, if we want widespread adoption of reproducibility, we have to be ferocious about not allowing this to happen.