Simply Statistics: Reproducible Research and Turkey

Over the Thanksgiving recent break I naturally started thinking about reproducible research in between salting the turkey and making the turkey stock. Clearly, these things are all related.

I sometimes get the sense that many people see reproducibility as essentially binary. A published paper is either reproducible, as in you can compute every single last numerical result to within epsilon precision, or it’s not. My feeling is that there is a spectrum of reproducibility when it comes to published scientific findings. Some papers are more reproducible than others. And that’s where cooking comes in.

I do a bit of cooking and I am a shameless consumer of food blogs/web sites. There seems pretty solid agreement (and my own experience essentially confirms) that the more you can make yourself and not have to rely on other people doing the cooking, the better. For example, for Thanksgiving, you could theoretically buy yourself a pre-roasted turkey that’s ready to eat. My brother tells me this is what homesick Americans do in China because so few people have an oven (I suppose you could steam a turkey?). Or you could buy an un-cooked turkey that is “flavor injected”. Or you could buy a normal turkey and brine/salt it yourself. Or you could get yourself one of those heritage turkeys. Or you could raise your own turkeys…. I think in all of these cases, the turkey would definitely be edible and maybe even tasty. But some would probably be more tasty than others.

And that’s the point. There’s a spectrum when it comes to cooking and some methods result in better food than others. Similarly, when it comes to published research there is a spectrum of what authors can make available to reproduce their work. On the one hand, you have just the paper itself, which reveals quite a bit of information (i.e. the scientific question, the general approach) but usually too few details to actually reproduce (or even replicate) anything. Some authors might release the code, which allows you to study the algorithms and maybe apply them to your own work. Some might release the code and the data so that you can actually reproduce the published findings. Some might make a nice R package/vignette so that you barely have to lift a finger. Each case is better than the previous, but that’s not to say that I would only accept the last/best case. Some reproducibility is better than none.

That said, I don’t think we should shoot low. Ideally, we would have the best case, which would allow for full reproducibility and rapid dissemination of ideas. But while we wait for that best case scenario, it couldn’t hurt to have a few steps in between.