Jeff and I talk about Jeff's recently completed MOOC on Data Analysis.
I'm not the first one to suggest that Biostatistics has been undervalued in the scientific community, and some of the shortcomings of epidemiology and biostatistics have been noted elsewhere. But this previous work focuses primarily on the contributions of statistics/biostatistics at the purely scientific level.
The Cox Proportional Hazards model is one of the most widely used statistical models in the analysis of data from clinical trials and other medical studies. The corresponding paper has been cited over 32,000 times, and even that is a dramatic underestimate of the number of times the model has been used. It is one of "those methods" that doesn't even require a reference to the original methods paper anymore.
Many of the most influential medical studies, including major studies like the Women's Health Initiative, have used these methods to answer some of our most pressing medical questions. Despite the incredible impact of this statistical technique on the world of medicine and public health, it has not received the Nobel Prize. This isn't an aberration: statistical methods are not traditionally considered for Nobel Prizes in Medicine, which primarily recognize biochemical, genetic, or public health discoveries.
In contrast, many economics Nobel Prizes have been awarded primarily for the discovery of a new statistical or mathematical concept. One example is the ARCH model. The Nobel Prize in Economics in 2003 was awarded to Robert Engle, the person who proposed the original ARCH model. The model has gone on to have a major impact on financial analysis, much like the Cox model has had a major impact on medicine.
So why aren't Nobel Prizes in medicine awarded to statisticians more often? Other methods such as ANOVA, P-values, etc. have also had an incredibly large impact on the way we measure and evaluate medical procedures. Maybe as medicine becomes increasingly driven by data, we will start to see more statisticians recognized for their incredible discoveries and the huge contributions they make to medical research and practice.
Jeff and I talk with Brian Caffo about teaching MOOCs on Coursera.
A recent lunchtime discussion here at Hopkins brought up the somewhat controversial topic of abstract thinking in our graduate program. We, like a lot of other biostatistics/statistics programs, require our students to take measure theoretic probability as part of the curriculum. The discussion started as a conversation about whether we should require measure theoretic probability for our students. It evolved into a discussion of the value of abstract thinking (and whether measure theoretic probability was a good way to measure abstract thinking).
Brian Caffo and I decided an interesting idea would be a point-counterpoint with the prompt, “How important is abstract thinking for the education of statistics graduate students?” Next week Brian and I will provide a point-counterpoint response based on our discussion.
In the meantime we’d love to hear your opinions!
Statistics depends on math, like a lot of other disciplines (physics, engineering, chemistry, computer science). But just like those other disciplines, statistics is not math; math is just a tool used to solve statistical problems. Unlike those other disciplines, statistics gets lumped in with math in headlines. Whenever people use statistical analysis to solve an interesting problem, the headline reads:
“Math can be used to solve amazing problem X”
“The Math of Y”
Here are some examples:
The Mathematics of Lego - Using data on legos to estimate a distribution
The Mathematics of War - Using data on conflicts to estimate a distribution
Usain Bolt can run faster with maths (Tweet) - It turns out the conclusion came from analyzing data on start times
The Mathematics of Beauty - Analysis of data relating dating profile responses and photo attractiveness
These are just a few off the top of my head, but I regularly see headlines like this. I think there are a couple of reasons statistics gets grouped with math: (1) many of the founders of statistics were mathematicians first (though not all of them), (2) many statisticians still identify themselves as mathematicians, and (3) in some cases statistics and statisticians define themselves pretty narrowly.
With respect to (3), consider the following list of disciplines:
All of these disciplines could easily be classified as “applied statistics”. But how many folks in each of those disciplines would classify themselves as statisticians? More importantly, how many would be claimed by statisticians?
I posted a little while ago on a proposal for a fast statistics journal. It generated a bunch of comments and even a really nice follow up post with some great ideas. Since then I’ve gotten reviews back on a couple of papers and I think I realized one of the key issues that is driving me nuts about the current publishing model. It boils down to one simple question:
What is a major revision?
I often get reviews back that suggest “major revisions” in one or many of the following categories:
The game has been dominated for a long time by the folks over in CS. But the value of many recent startups is either based on, or can be magnified by, good data analysis. Here are a few startups that are based on data/data analysis:
To launch a startup you need just a couple of things: (1) a good, valuable source of data (there are lots of these on the web) and (2) a good idea about how to analyze them to create something useful. The second step is obviously harder than the first, but the companies above prove you can do it. Then, once it is built, you can outsource/partner with developers - web and otherwise - to implement your idea. If you can build it in R, someone can make it an app.
These are just a few of the startups whose value is entirely derived from data analysis. But companies from LinkedIn to Bitly to Amazon to Walmart are trying to mine the data they are generating to increase value. Data is now being generated at unprecedented scale by computers, cell phones, even thermostats! With this onslaught of data, the need for people with analysis skills is becoming incredibly acute.
Statisticians, like computer scientists before them, are poised to launch, and make major contributions to, the next generation of startups.
We here at Simply Statistics are big fans of science news reporting. We read newspapers, blogs, and the news sections of scientific journals to keep up with the coolest new research.
But health science reporting, although exciting, can also be incredibly frustrating to read. Many articles have sensational titles, like "How using Facebook could raise your risk of cancer". The articles go on to describe some research and interview a few scientists, then typically make fairly large claims about what the research means. This isn't surprising - eye-catching headlines are important in this era of short attention spans and information overload.
If just a few extra pieces of information were reported in news stories about science, it would be much easier to evaluate whether the cancer risk was serious enough to shut down our Facebook accounts. In particular we thought any news story should report:
So we created a citizen-science website for evaluating health news reporting called HealthNewsRater. It was built by Andrew Jaffe and Jeff Leek, with Andrew doing the bulk of the heavy lifting. We would like you to help us collect data on the quality of health news reporting. When you read a health news story on the Nature website, at nytimes.com, or on a blog, we’d like you to take a second to report on the news. Just determine whether the 6 pieces of information above are reported and input the data at HealthNewsRater.
We calculate a score for each story based on the formula:
HNR-Score = (5 points for a link to the original article + 1 point each for the other criteria)/2
The score weights the link to the original article very heavily, since this is the best source of information about the actual science underlying the story.
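As a rough sketch, the scoring formula above can be written as a small function. This is an illustrative reconstruction, not the actual HealthNewsRater code: the function name is made up, and it assumes six criteria total, one of which is the link to the original article (worth 5 points) and five others (worth 1 point each).

```python
def hnr_score(has_link, other_criteria):
    """Compute an HNR-style score (hypothetical sketch of the formula).

    has_link: True if the story links to the original article (5 points).
    other_criteria: booleans for the five remaining criteria (1 point each).
    """
    return (5 * has_link + sum(other_criteria)) / 2

# A story with a link and 3 of the 5 other criteria met scores (5 + 3) / 2:
score = hnr_score(True, [True, True, True, False, False])  # 4.0
```

Under this sketch a perfect story scores (5 + 5)/2 = 5, and a story missing the link can score at most 2.5, which reflects how heavily the link is weighted.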
In a future post we will analyze the data we have collected, make it publicly available, and let you know which news sources are doing the best job of reporting health science.
Update: If you are a web-developer with an interest in health news contact us to help make HealthNewsRater better!