Last week I linked to an ad for a Data Editor position at Nature Magazine. I was super excited that Nature was recognizing data as an important growth area. But the ad doesn’t mention anything about statistical analysis skills; it focuses exclusively on data management expertise. As I pointed out in the earlier post, managing data is only half the equation - figuring out what to do with the data is the other half. The second half requires knowledge of statistics.
The folks over at Nature responded to our post on Twitter:
it’s unrealistic to think this editor (or anyone) could do what you suggest. Curation & accessibility are key. ^ng
I disagree with this statement for the following reasons:
1. Is it really unrealistic to think someone could have data management and statistical expertise? Pick your favorite data scientist and you would have someone with those skills. Most students coming out of computer science, computational biology, bioinformatics, or statistical genomics programs would have a blend of those two skills in some proportion.
But maybe the problem is this:
Applicants must have a PhD in the biological sciences
It is possible that there are few PhDs in the biological sciences who know both statistics and data management (although that is probably changing). But most computational biologists have a pretty good knowledge of biology and a very good knowledge of data - both managing and analyzing. If you are hiring a data editor, this might be the target audience. I’d replace “PhD in the biological sciences” in the ad with “knowledge of biology, statistics, data analysis, and data visualization.” There would be plenty of folks with those qualifications.
2. The response mentions curation, which is a critical issue. But good curation requires knowledge of two things: (i) the biological or scientific problem and (ii) how the data will be analyzed and used by researchers. As the Duke scandal made clear, a statistician with technological and biological knowledge who runs through a data analysis will identify many critical issues in data curation that would be missed by someone who doesn’t actually analyze data.
3. The response says that “Curation and accessibility” are key. I agree that they are part of the key. It is critical that data can be properly accessed by researchers to perform new analyses, verify results in papers, and discover new results. But if the goal is to ensure the quality of science being published in Nature (the role of an editor), curation and accessibility are not enough. The editor should be able to evaluate statistical methods described in papers to identify potential flaws, or to rerun code and make sure that it performs the same analyses and that those analyses are sensible. A bad analysis that is reproducible will be discovered more quickly, but it is still a bad analysis.
To be fair, I don’t think that Nature is the only organization that is missing the value of statistical skill when hiring for data positions. It seems like many organizations are still just searching for folks who can handle and process the massive data sets being generated. But if they want to make accurate and informed decisions, statistical knowledge needs to be at the top of their list of qualifications.