Simply Statistics: Is most science false? The titans weigh in.

Some of you may recall that a few months ago my colleague and I posted a paper to the ArXiv on estimating the rate of false discoveries in the scientific literature. The paper was picked up by the Tech Review and led to a post on Andrew G.’s blog, on Discover blogs, and on our blog. One other interesting feature of our paper was that we put all of the code and data we collected on GitHub.

At the time this whole thing blew up, our paper still wasn’t published. After the explosion of interest we submitted the paper to Biostatistics. They liked the paper and actually solicited formal discussion of our approach by other statisticians. We were then allowed to respond to the discussions.

Overall, it was an awesome experience at Biostatistics. They did a great job with a thorough but timely review. They got some amazing discussants. Finally, they made our paper open-access. So much goodness. (Conflict of interest disclaimer: I am an associate editor for Biostatistics.)

Here are the papers that came out which I think are all worth reading:

I’m very proud of our paper and the rejoinder. The discussants were very passionate and added a huge amount of value, particularly regarding the collection and analysis of our data and the additional data they collected.

I think it is 100% worth reading all of the papers over at Biostatistics, but for the tl;dr crowd here are some take-home messages that summarize the experience and the discussion above:

Posting to ArXiv can be a huge advantage for a paper like ours but be ready for the heat.
Biostatistics (the journal) is awesome. Great job of reviewing/editing in a timely way and great job of organizing the discussion!
When talking about the science-wise false discovery rate you have to bring data.
We proposed the first formal framework for evaluating the science-wise false discovery rate, which lots of people care about (and there are a ton of ideas in the discussion about ways to estimate it better).
I think, based on our paper and the discussion, that it is pretty unlikely that most published research is false. But that probably varies by your definition of false, what you mean by most, the journal type, the field you are considering, the analysis type, etc.
This is a question people care about. A lot.

Finally, I think this is the most important quote from our rejoinder:

We are encouraged, however, that several of the discussants collected additional data to evaluate the impact of the above decisions on the SWFDR estimates. The discussion illustrates the powerful way that data collection can be used to move the theoretical and philosophical discussion on to a more concrete, scientific footing—discussing the specific strengths and weaknesses of a particular empirical approach. Moreover, the interesting additional data collected by the discussants on study types, journals, and endpoints demonstrate that data beget data and lead to a stronger and more directed conversation.