Simply Statistics A statistics blog by Rafa Irizarry, Roger Peng, and Jeff Leek

When does replication reveal fraud?

Here’s a little thought experiment for your weekend pleasure. Consider the following:

Joe Scientist decides to conduct a study (call it Study A) to test the hypothesis that a parameter D > 0 vs. the null hypothesis that D = 0. He designs a study, collects some data, conducts an appropriate statistical analysis and concludes that D > 0. This result is published in the Journal of Awesome Results along with all the details of how the study was done.

Jane Scientist finds Joe’s study very interesting and tries to replicate his findings. She conducts a study (call it Study B) that is similar to Study A but completely independent of it (and does not communicate with Joe). In her analysis she does not find strong evidence that D > 0 and concludes that she cannot rule out the possibility that D = 0. She publishes her findings in the Journal of Null Results along with all the details.

From these two studies, which of the following conclusions can we make?

  1. Study A is obviously a fraud. If the truth were that D > 0, then Jane should have concluded that D > 0 in her independent replication.
  2. Study B is obviously a fraud. If Study A were conducted properly, then Jane should have reached the same conclusion.
  3. Neither Study A nor Study B was a fraud, but the result for Study A was a Type I error, i.e. a false positive.
  4. Neither Study A nor Study B was a fraud, but the result for Study B was a Type II error, i.e a false negative.

I realize that there are a number of subtle details concerning why things might happen but I’ve purposely left them out. My question is, based on the information that you actually have about the two studies, what would you consider to be the most likely case? What further information would you like to know beyond what was given here?_