# Prediction

## Sunday Data/Statistics Link Roundup (10/14/12)

A fascinating article about the debate on whether to regulate sugary beverages. One of the protagonists is David Allison, a statistical geneticist, among other things. It is interesting to see the interplay of statistical analysis and public policy, and it is yet another example of how statistics/data will drive some of the most important policy decisions going forward. A related article is this one on the way risk is reported in the media.

## Prediction contest

I have been seeing this paper all over Twitter/the blogosphere. It’s a sexy idea: can you predict how “high-impact” a scientist will be in the future? It is also a pretty flawed data analysis…so this week’s prediction contest is to identify why the statistics in this paper are so flawed. In my first-pass read I noticed about 5 major flaws. _Editor’s note: I posted the criticisms and the authors respond here: http://disq.

## Prediction: the Lasso vs. just using the top 10 predictors

One incredibly popular tool for the analysis of high-dimensional data is the lasso. The lasso is commonly used when you have many more predictors than independent samples (the n ≪ p problem). It is also often used in the context of prediction. Suppose you have an outcome Y and several predictors X1, …, XM. The lasso fits the model: Y = B0 + B1 X1 + B2 X2 + … + BM XM + E
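
To make the model above concrete, here is a minimal sketch of fitting it in the usual lasso way: least squares plus a penalty on the sum of the absolute values of the coefficients, solved by coordinate descent. Everything specific here is an assumption for illustration and not taken from the post: the simulated data (40 samples, 100 predictors, only the first two truly related to Y), the penalty weight `lam`, and the helper names `soft_threshold` and `lasso_cd`.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t; anything within [-t, t] becomes exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/(2n)) * sum(resid^2) + lam * sum(|B_j|).

    Returns (B0, [B1, ..., BM]); the intercept B0 is not penalized.
    """
    n, m = len(X), len(X[0])
    b0 = sum(y) / n
    b = [0.0] * m
    resid = [yi - b0 for yi in y]
    # Mean squared value of each predictor column (hypothetical helper array).
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(m)]
    for _ in range(n_iter):
        # Re-center the residuals to update the intercept.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
        for j in range(m):
            if col_sq[j] == 0.0:
                continue
            # Covariance of predictor j with the residual that ignores j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                b[j] = new_bj
    return b0, b

# Simulated example (assumed, not from the post): more predictors than samples.
random.seed(0)
n, m = 40, 100
X = [[random.gauss(0, 1) for _ in range(m)] for _ in range(n)]
y = [1.0 + 3.0 * row[0] - 2.0 * row[1] + random.gauss(0, 0.5) for row in X]

intercept, coefs = lasso_cd(X, y, lam=0.5)
selected = [j for j, bj in enumerate(coefs) if bj != 0.0]
print("predictors kept by the lasso:", selected)
```

The soft-thresholding step is what sets many coefficients to exactly zero, which is why the lasso doubles as a variable selection tool. A larger `lam` keeps fewer predictors; in practice it is usually chosen by cross-validation rather than fixed by hand as it is here.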