Data Analysis for the Life Sciences - a book completely written in R markdown

Rafael Irizarry
2015-09-23

The book Data Analysis for the Life Sciences is now available on Leanpub.

Data analysis is now part of practically every research project in the life sciences. In this book we use data and computer code to teach the statistical concepts and programming skills needed to become a data analyst. Following in the footsteps of Stat Labs, instead of showing theory first and then applying it to toy examples, we start with actual applications and describe the theory as it becomes necessary to solve specific challenges. We use simulations and data analysis examples to teach statistical concepts. The book includes links to computer code that readers can use to program along as they read.

It includes the following chapters: Inference, Exploratory Data Analysis, Robust Statistics, Matrix Algebra, Linear Models, Inference for High-Dimensional Data, Statistical Modeling, Distance and Dimension Reduction, Practical Machine Learning, and Batch Effects.

The text was written entirely in R markdown, and every section contains a link to the document that was used to create that section. This means that you can use knitr to reproduce any section of the book on your own computer. You can also access all these markdown documents directly from GitHub. Please send a pull request if you fix a typo or other mistake! For now we are keeping the R markdown files for the exercises private since they contain the solutions, but you can see the solutions if you take our online course quizzes. If we find that most readers want access to the solutions, we will open them up as well.
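As a rough illustration of what reproducing a section with knitr looks like, here is a minimal sketch. The file name and the toy document contents are made up for this example; in practice you would point `knit()` at the R markdown file linked from the section you want to rebuild.

```r
# Minimal sketch: knit a small R markdown document with knitr.
# The file name and contents below are hypothetical stand-ins for a
# section's .Rmd file downloaded from the book's GitHub repository.
library(knitr)

fence <- strrep("`", 3)  # three backticks, built here to keep the example readable

# Write a toy R markdown document to a temporary directory.
rmd_file <- file.path(tempdir(), "example_section.Rmd")
writeLines(c(
  "# A toy section",
  "",
  paste0(fence, "{r}"),
  "x <- c(1, 2, 3)",
  "mean(x)",
  fence
), rmd_file)

# knit() runs the code chunks and writes a markdown file with the results.
md_file <- knit(rmd_file,
                output = file.path(tempdir(), "example_section.md"),
                quiet = TRUE)

md_lines <- readLines(md_file)
cat(md_lines, sep = "\n")
```

The resulting markdown file contains both the code and its output, which is the same idea that lets each section of the book be regenerated from its source document.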

The material is based on the online courses I have been teaching with Mike Love. As we created the course, Mike and I wrote R markdown documents for the students and put them on GitHub. We then used Jekyll to create a webpage with HTML versions of the markdown documents. Jeff then convinced us to publish it on Leanpub. So we wrote a shell script that compiled the entire book into a Leanpub directory, and after countless hours of editing and tinkering we have a 450+ page book with over 200 exercises. The entire book compiles from scratch in about 20 minutes. We hope you like it.