Dropping the Stick in Data Analysis

Roger Peng

When I was a kid growing up in rough-and-tumble suburban New York, one of the major summer activities was roller hockey, the kind with roller blades (remember roller blades?). My friends and I would be playing in some random parking lot and inevitably one of us would be just blowing it the whole game. This would usually lead to an impromptu intervention where the person screwing up (often me) would be told by everyone else on the team to “drop the stick”. The idea was you should stop playing, clear your head, skate around for a bit, and not try to do 20 things at once.

I don’t play much hockey now, but I do a bit more data analysis. Strangely, little has changed.

People come to me at various stages of data analysis. Close collaborators usually come to me with no data because they are planning a study and need some help. In those cases, I’m involved in the beginning and know how the data are generated. Usually, in those cases I analyze the data in the end so there’s less confusion.

Others usually come to me with data in hand, wanting to know what they should do now that they’ve got all this data. Often there’s confusion about where to start, what method to use, what program, what procedure, what function, what test, Bayesian or frequentist, mean or median, R or Stata, random effects or fixed effects, cat or dog, mice or men, etc. That’s usually the point where I tell them to “drop the stick”, or the data analysis version of that, which is “What question are you trying to answer?”

Usually, people know what question they’re trying to answer; they just forgot to tell me. But I’m always amazed at how this question can often be the subject of the entire discussion. We might end up answering a question the investigator hadn’t thought of yet, maybe a question that’s better suited to the data.

So, job #1 if you’re a statistician: Get more people to drop the stick. You’ll make everyone play better in the end.