Tag: publishing


2-D author lists

The order of authors on scientific papers matters a lot. The best places to be on a paper vary by field, but typically the first and corresponding (usually last) author positions are the prime real estate. When people are evaluated on the job market, for promotion, or for grants, the number of first/corresponding author papers can be the difference between success and failure. 

At the same time, many journals list “author contributions” at the end of the manuscript, but this is rarely prominently displayed. The result is that regardless of the true distribution of credit in a manuscript, the first and last authors get the bulk of the benefit. 

This system is antiquated for a few reasons:

  1. In multidisciplinary science, there are often equal and very different contributions from people working in different disciplines. 
  2. Science is increasingly collaborative, even within a single discipline, and papers are rarely the effort of just two people anymore. 

How about a 2-D, re-sortable author list? Each author is a column and each kind of contribution is a row. The contributions are: (1) conceived the idea, (2) collected the data, (3) did the computational analysis, (4) wrote the paper (you could imagine adding others). Each cell then gets a quantitative value: the fraction of effort that author contributed to that component of the paper. Then you build an interactive graphic that lets you sort the authors by each of the categories, so you can see who did what on the paper. 

To get an overall impression of which activities an author performs, you could average their contributions across papers in each category, creating a “heatmap of contributions”. Anyone want to build this? 
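The idea above can be sketched in a few lines of plain Python. All author names, category labels, and effort fractions here are made up for illustration; a real implementation would wire this to an interactive front end:

```python
# Categories of contribution (rows of the 2-D author list).
CATEGORIES = ["conceived", "data", "analysis", "writing"]

# One paper: each author maps to a fraction of effort per category.
# Within a category, fractions across authors should sum to 1.
paper1 = {
    "Alice": {"conceived": 0.6, "data": 0.1, "analysis": 0.7, "writing": 0.4},
    "Bob":   {"conceived": 0.3, "data": 0.8, "analysis": 0.1, "writing": 0.2},
    "Carol": {"conceived": 0.1, "data": 0.1, "analysis": 0.2, "writing": 0.4},
}

def sort_authors(paper, category):
    """Re-sort the author list by contribution to one category."""
    return sorted(paper, key=lambda a: paper[a][category], reverse=True)

def contribution_profile(papers, author):
    """Average an author's contribution per category across papers:
    one row of the 'heatmap of contributions'."""
    rows = [p[author] for p in papers if author in p]
    return {c: sum(r[c] for r in rows) / len(rows) for c in CATEGORIES}

print(sort_authors(paper1, "data"))            # Bob sorts first on data collection
print(contribution_profile([paper1], "Alice"))  # Alice's averaged profile
```

The same `contribution_profile` call over an author's full publication list would give the per-category averages to display as a heatmap.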


What is a major revision?

I posted a little while ago on a proposal for a fast statistics journal. It generated a bunch of comments and even a really nice follow-up post with some great ideas. Since then I’ve gotten reviews back on a couple of papers, and I think I’ve identified one of the key issues that is driving me nuts about the current publishing model. It boils down to one simple question: 

What is a major revision? 

I often get reviews back that suggest “major revisions” in one or many of the following categories:

  1. More/different simulations
  2. New simulations
  3. Re-organization of content
  4. Re-writing language
  5. Asking for more references
  6. Asking me to include a new method
  7. Asking me to implement someone else’s method for comparison

I don’t consider any of these to be major revisions, and I have personally stopped requesting them as such. In my opinion, major revisions should be reserved for issues with the manuscript that suggest it may be reporting incorrect results. Examples include:
  1. No simulations
  2. No real data
  3. The math/computations look incorrect
  4. The software didn’t work when I tried it
  5. The methods/algorithms are unreadable and can’t be followed

The first list is really a list of minor or non-essential revisions, in my opinion. They may improve a paper, but they won’t determine whether it is correct. I find they are often subjective and up to the whims of referees. In my own refereeing I am making an effort to drop subjective major revisions and include only issues that are critical for evaluating the correctness of a manuscript. I also try to separate the question of whether an idea is interesting from the question of whether it is correct. 
I’d be curious to know what other people’s definitions of major and minor revisions are.


When should statistics papers be published in Science and Nature?

Like many statisticians, I was amped to see a statistics paper appear in Science. Given the impact that statistics has on the scientific community, it is a shame that more statistics papers don’t appear in the glossy journals like Science or Nature. As I pointed out in the previous post, if the paper that introduced the p-value were cited every time this statistic was used, the paper would have over 3 million citations!

But a couple of our readers* have pointed to a response to the MIC paper by Noah Simon and Rob Tibshirani. Simon and Tibshirani show that the MIC statistic is underpowered compared to an earlier statistic designed for the same purpose, published in 2009 in the Annals of Applied Statistics. A nice summary of the discussion is provided by Florian over at his blog. 

Given that the AoAS statistic came out first (by two years) and is more powerful (according to simulations), should the MIC statistic have appeared in Science? 

The whole discussion reminds me of a recent blog post suggesting that journals need to choose between groundbreaking and definitive. The post points out that the two are in many ways in opposition to each other. 

Again, I’d argue that statistics papers get short shrift in the glossy journals, and I would like to see more of them there. The MIC statistic is certainly groundbreaking, but it isn’t clear that it is definitive. 

As a comparison, a slightly different story played out with another recent high-impact statistical method, the false discovery rate (FDR). The original papers were published in statistics journals. Then when it was clear that the idea was going to be big, a more general-audience-friendly summary was published in PNAS (not Science or Nature but definitely glossy). This might be a better way for the glossy journals to know what is going to be a major development in statistics versus an exciting - but potentially less definitive - method. 

* Florian M. and John S.


Free access publishing is awesome...but expensive. How do we pay for it?

I am a huge fan of open access journals. I think open access is good both for moral reasons (science should be freely available) and for more selfish ones (I want people to be able to read my work). If given the choice, I would publish all of my work in journals that distribute results freely. 

But it turns out that for most open/free access systems, the publishing charges are paid by the scientists publishing in the journals. I did a quick scan and compiled this little table of how much it costs to publish a paper in different journals (here is a bigger table): 

  • PLoS One: $1,350.00
  • PLoS Biology: $2,900.00
  • BMJ Open: $1,937.28
  • Bioinformatics (Open Access Option): $3,000.00
  • Genome Biology (Open Access Option): $2,500.00
  • Biostatistics (Open Access Option): $3,000.00

The first thing I noticed is that it costs a minimum of about $1,350 to get a paper published open access. That may not seem like a lot of money, and most journals offer discounts to people who can’t pay. But it still adds up: this past year my group published 7 papers. If I had paid for all of them to be published open access, that would have been at least $9,450! That is roughly half the salary of a graduate student researcher for an entire year. For a senior scientist that may be no problem, but for early-career scientists, or scientists with limited access to resources, it is a big challenge.
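As a quick sanity check on that arithmetic, here is the tally using the fees from the table above (the seven-paper count is my own group's; every other number comes straight from the journals' published charges):

```python
# Per-paper open access charges from the table above (USD).
fees = {
    "PLoS One": 1350.00,
    "PLoS Biology": 2900.00,
    "BMJ Open": 1937.28,
    "Bioinformatics (OA option)": 3000.00,
    "Genome Biology (OA option)": 2500.00,
    "Biostatistics (OA option)": 3000.00,
}

cheapest = min(fees.values())      # the lowest listed charge
papers_per_year = 7                # my group's output this past year

# Lower bound on one year of open access charges, all at the cheapest venue.
print(cheapest * papers_per_year)  # 9450.0
```

Publishing even one of those papers at a journal with a $3,000 open access option instead would push the total well past $11,000.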

Publishers that are solely dedicated to open access (PLoS, BMJ Open, etc.) seem to have lower publication charges, on average, than journals that only offer open access as an option. I suspect part of the reason is that journals that aren’t open access by default have to make up some of the revenue they lose by making articles free. I certainly don’t begrudge the journals these costs. They have to maintain the websites, format the articles, and run the peer review process. That all costs money. 

A modest proposal

What I wonder is whether there is a better place for that money to come from. Here is one proposal (hat tip to Rafa): academic and other libraries pay a ton of money for subscriptions to journals like Nature and Science, and they are also required to pay for journals across a wide range of disciplines. What if, instead of investing this money in subscriptions for their own universities, academic libraries pitched in and subsidized the publication costs of open/free access journals?

If all university libraries pitched in, the cost to any individual library would be relatively small, probably less than paying for subscriptions to hundreds of journals. At the same time, it would be an investment that benefits not only the researchers at their own school but also the broader scientific community, by keeping research open. Then neither the people publishing the work nor the people reading it would be on the hook for the bill. 

This approach is close to the route taken by arXiv, a free database of unpublished papers. These papers haven’t been peer reviewed, so they don’t always carry the same weight as papers published in peer-reviewed journals. But there are a lot of really good and important papers in the database - it is an almost universally accepted pre-print server.

The other nice thing about arXiv is that you don’t pay for article processing; the papers are posted as-is. The papers don’t look quite as pretty as they do in Nature/Science or even PLoS, but the process is also much cheaper. The only costs associated with turning something like arXiv into a full-fledged peer-reviewed journal would be refereeing (which scientists do for free anyway) and editorial work (again, mostly volunteered by scientists). 

Related Posts:  Jeff on “Submitting scientific papers is too time consuming”


Errors in Biomedical Computing

Biomedical Computation Review has a nice summary (in which I am quoted briefly) by Kristin Sainani about the many different types of errors in computational research, including the infamous Duke incident and some other recent examples. The reproducible research policy at Biostatistics is described as an example for how the publication process might need to change to prevent errors from persisting (or occurring).