ENAR is in Baltimore - Here's What To Do

This year's meeting of the Eastern North American Region of the International Biometric Society (ENAR) is in lovely Baltimore, Maryland. As local residents, Jeff and I thought we'd put down a few suggestions for what to do during your stay

How to use Bioconductor to find empirical evidence in support of π being a normal number

Happy π day everybody! I wanted to write some simple code (included below) to test the parallelization capabilities of my new cluster. So, in honor of π day, I decided to check for evidence that π is a normal number. A

Oh no, the Leekasso....

An astute reader (Niels Hansen, who is visiting our department today) caught a bug in my code on Github for the Leekasso. I had: lm1 = lm(y ~ leekX) predict.lm(lm1, Unfortunately, this meant that I was getting predictions for the

Per capita GDP versus years since women received right to vote

Below is a plot of per capita GDP (on a log scale) against years since women received the right to vote for 42 countries. Is this cause, effect, both or neither? We all know correlation does not imply causation, but I

PLoS One, I have an idea for what to do with all your profits: buy hard drives

I've been closely following the fallout from PLoS One's new policy for data sharing. The policy says, basically, that if you publish a paper, all data and code to go with that paper should be made publicly available at the

Data Science is Hard, But So is Talking

Jeff, Brian, and I had to record nine separate introductory videos for our Data Science Specialization and, well, some of us were better at it than others. It takes a bit of practice to read effectively from a teleprompter, something

Here's why the scientific publishing system can never be "fixed"

There's been much discussion recently about how the scientific publishing system is "broken". The latest example I saw was a tweet from Princeton biophysicist Josh Shaevitz: Editor at a 'fancy' journal to my postdoc "This is amazing work

Why do we love R so much?

When Jeff, Brian, and I started the Johns Hopkins Data Science Specialization we decided early on to organize the program around using R. Why? Because we love R, we use it every day, and it has an incredible community of developers

k-means clustering in a GIF

k-means is a simple and intuitive clustering approach. Here is a movie showing how it works:
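
For readers who prefer code to pictures, here is a minimal sketch of the standard k-means loop (Lloyd's algorithm): assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. This is not the code behind the animation; it is written in Python rather than R, and the function name `kmeans`, its parameters, and the toy data are all made up for illustration.

```python
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Cluster 2-D points (a list of (x, y) tuples) into k groups.

    Hypothetical helper for illustration only.
    Returns (centroids, labels), where labels[i] is the cluster of points[i].
    """
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance; ties go to the lower index).
        new_labels = [
            min(
                range(k),
                key=lambda j: (p[0] - centroids[j][0]) ** 2
                + (p[1] - centroids[j][1]) ** 2,
            )
            for p in points
        ]
        # Stop once no point changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, labels


# Toy data: two well-separated blobs of three points each.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(toy, k=2)
```

On the toy data the first three points end up in one cluster and the last three in the other. On real data the result can depend on the random starting centroids, so it is common to run the algorithm from several starts and keep the best fit.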

Repost: Ronald Fisher is one of the few scientists with a legit claim to most influential scientist ever

Editor's Note: This is a repost of the post "R.A. Fisher is the most influential scientist ever" with a picture of my pilgrimage to his gravesite in Adelaide, Australia. You can now see profiles of famous scientists on Google Scholar citations.

