Author Archives: Rafael Irizarry

Writing good software can have more impact than publishing in high impact journals for genomic statisticians

Every once in a while we see computational papers published in science journals with high impact factors. Genomics-related methods appear quite often in these journals. Several of my junior colleagues express frustration that all their papers get rejected from these journals. … Continue reading

Posted in Uncategorized | 3 Comments

Data Analysis for Genomics edX Course

Mike Love (@mikelove) and I have been working hard for the past couple of months preparing a free online edX course on data analysis for genomics. Our target audience is the postdocs, graduate students and research scientists that are tasked with … Continue reading

Posted in Uncategorized | 2 Comments

The fact that data analysts base their conclusions on data does not mean they ignore experts

Paul Krugman recently joined the new FiveThirtyEight hating bandwagon. I am not crazy about the new website either (although I'll wait more than one week before judging), but in a recent post Krugman creates a false dichotomy that is important to … Continue reading

Posted in Uncategorized | 4 Comments

How to use Bioconductor to find empirical evidence in support of π being a normal number

Happy π day everybody! I wanted to write some simple code (included below) to test the parallelization capabilities of my new cluster. So, in honor of π day, I decided to check for evidence that π is a normal number. A … Continue reading

Posted in Uncategorized | Leave a comment

Per capita GDP versus years since women received right to vote

Below is a plot of per capita GDP (on a log scale) against years since women received the right to vote for 42 countries. Is this cause, effect, both or neither? We all know correlation does not imply causation, but I … Continue reading

Posted in Uncategorized | 16 Comments

k-means clustering in a GIF

k-means is a simple and intuitive clustering approach. Here is a movie showing how it works:
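For readers who prefer code, here is a minimal Python sketch of the standard k-means iteration (Lloyd's algorithm). It is an illustration, not the code behind the movie: the function name `kmeans`, the random choice of starting centers, and the restriction to two-dimensional points are all assumptions made for this example.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Cluster a list of (x, y) tuples; returns (centers, labels).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Assumption: start from k data points chosen at random.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center
        # (squared Euclidean distance; ties go to the lower index).
        labels = [
            min(
                range(k),
                key=lambda j: (p[0] - centers[j][0]) ** 2
                + (p[1] - centers[j][1]) ** 2,
            )
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    (
                        sum(p[0] for p in members) / len(members),
                        sum(p[1] for p in members) / len(members),
                    )
                )
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[j])
        if new_centers == centers:
            break  # centers stopped moving, so the labels are final
        centers = new_centers
    return centers, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, labels = kmeans(data, k=2)
    print(centers)
    print(labels)
```

The two steps inside the loop, assigning points and then moving centers, are what the algorithm repeats until the centers stop changing. The result can depend on the starting centers, so in practice the procedure is often run from several random starts.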

Posted in Uncategorized | 5 Comments

loess explained in a GIF

Local regression (loess) is one of the statistical procedures I use most. Here is a movie showing how it works:
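As a rough companion in code, the sketch below fits a loess-style estimate at a single point: it takes the nearest fraction of the data, weights those points with the tricube kernel, and fits a weighted straight line. It is a simplified illustration, not the code behind the movie. The function name `loess_point` and the default `span` are assumptions, and the sketch uses a degree-1 fit with no robustness iterations.

```python
def loess_point(x0, xs, ys, span=0.5):
    """Local linear estimate of y at x0 (hypothetical helper, illustration only)."""
    n = len(xs)
    # Number of neighbours that define the local window.
    q = min(n, max(2, int(round(span * n))))
    dists = [abs(x - x0) for x in xs]
    # Bandwidth: distance from x0 to its q-th nearest observation.
    h = sorted(dists)[q - 1]
    if h == 0:
        # Degenerate window: every neighbour sits exactly at x0.
        at_x0 = [y for y, d in zip(ys, dists) if d == 0]
        return sum(at_x0) / len(at_x0)
    # Tricube weights: near 1 close to x0, falling to 0 at distance h.
    ws = [(1 - (d / h) ** 3) ** 3 if d < h else 0.0 for d in dists]
    sw = sum(ws)
    # Weighted means of x and y.
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    # Weighted least-squares slope around the weighted means.
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


def loess_curve(xs, ys, span=0.5):
    """Evaluate the local fit at every observed x."""
    return [loess_point(x, xs, ys, span) for x in xs]


if __name__ == "__main__":
    xs = list(range(10))
    ys = [0.1, 0.9, 2.2, 2.8, 4.1, 5.3, 5.8, 7.2, 7.9, 9.1]
    print(loess_curve(xs, ys, span=0.5))
```

Sliding `x0` across the range of the data and repeating the local fit traces out the smooth curve. A larger `span` uses more neighbours for each fit and gives a smoother, less wiggly result.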

Posted in Uncategorized | 6 Comments

The three tables for genomics collaborations

Collaborations between biologists and statisticians are very common in genomics. For the data analysis to be fruitful, the statistician needs to understand what samples are being analyzed. For the analysis report to make sense to the biologist, it needs to … Continue reading

Posted in Uncategorized | 1 Comment

Not teaching computing and statistics in our public schools will make upward mobility even harder

In his book Average Is Over, Tyler Cowen predicts that as automatization becomes more common, modern economies will eventually be composed of two groups: 1) a highly educated minority involved in the production of automated services and 2) a vast majority … Continue reading

Posted in Uncategorized | 3 Comments

Missing not at random data makes some Facebook users feel sad

This article, published last week, explained how "some younger users of Facebook say that using the site often leaves them feeling sad, lonely and inadequate". Being a statistician gives you an advantage here because we know that naive estimates from missing … Continue reading

Posted in Uncategorized | 1 Comment