Author Archives: Rafael Irizarry

Applied Statisticians: people want to learn what we do. Let's teach them.

In this recent opinion piece, Hadley Wickham explains how data science goes beyond Statistics and that data science is not promoted in academia. He defines data science as follows: I think there are three main steps in a data science … Continue reading

Posted in Uncategorized | 3 Comments

Academic statisticians: there is no shame in developing statistical solutions that solve just one problem

I think that the main distinction between academic statisticians and those calling themselves data scientists is that the latter are very much willing to invest most of their time and energy into solving specific problems by analyzing specific data sets. … Continue reading

Posted in Uncategorized | 7 Comments

The Big in Big Data relates to importance not size

In the past couple of years several non-statisticians have asked me "What is Big Data exactly?" or "How big is Big Data?" My answer has been "I think Big Data is much more about 'data' than 'big'." I explain below. … Continue reading

Posted in Uncategorized | Tagged | 5 Comments

Confession: I sometimes enjoy reading the fake journal/conference spam

I've spent a considerable amount of time setting up filters to avoid getting spam from fake journals and conferences. Unfortunately, they are exceptionally good at thwarting my defenses. This does not annoy me as much as I pretend because, secretly, … Continue reading

Posted in Uncategorized | Tagged | 3 Comments

Correlation does not imply causation (parental involvement edition)

The New York Times recently published an article on education titled "Parental Involvement Is Overrated". Most research in this area supports the opposite view, but the authors claim that "evidence from our research suggests otherwise". Before you stop helping your children … Continue reading

Posted in Uncategorized | Leave a comment

Writing good software can have more impact than publishing in high impact journals for genomic statisticians

Every once in a while we see computational papers published in science journals with high impact factors. Genomics-related methods appear quite often in these journals. Several of my junior colleagues express frustration that all their papers get rejected from these journals. … Continue reading

Posted in Uncategorized | 4 Comments

Data Analysis for Genomics edX Course

Mike Love (@mikelove) and I have been working hard the past couple of months preparing a free online edX course on data analysis for genomics. Our target audience is the postdocs, graduate students and research scientists that are tasked with … Continue reading

Posted in Uncategorized | 2 Comments

The fact that data analysts base their conclusions on data does not mean they ignore experts

Paul Krugman recently joined the new FiveThirtyEight-hating bandwagon. I am not crazy about the new website either (although I'll wait more than one week before judging) but in a recent post Krugman creates a false dichotomy that is important to … Continue reading

Posted in Uncategorized | 4 Comments

How to use Bioconductor to find empirical evidence in support of π being a normal number

Happy π day everybody! I wanted to write some simple code (included below) to test the parallelization capabilities of my new cluster. So, in honor of π day, I decided to check for evidence that π is a normal number. A … Continue reading

Posted in Uncategorized | Tagged | Leave a comment

Per capita GDP versus years since women received right to vote

Below is a plot of per capita GDP (in log scale) against years since women received the right to vote for 42 countries. Is this cause, effect, both or neither? We all know correlation does not imply causation, but I … Continue reading

Posted in Uncategorized | 16 Comments