After reposting on the comparison between the lasso and the always top 10 predictor (leekasso) I got some feedback that the problem could be I wasn’t debiasing the Lasso (thanks Tim T. on Twitter!). The idea behind debiasing (as I understand it) is to use the Lasso to do feature selection and then fit model without shrinkage to “debias” the coefficients. The debiased model is then used for prediction. Noah Simon, who knows approximately infinitely more about this than I do, kindly provided some code for fitting a debiased Lasso. He is not responsible for any mistakes/silliness in the simulation, he was just nice enough to provide some debiased Lasso code. He mentions a similar idea appears in the relaxo package if you set .
I used the same simulation set up as before and tried out the Leekasso, the Lasso and the Debiased Lasso. Here are the accuracy results (more red = higher accuracy):
The results suggest the debiased Lasso still doesn't work well under this design. Keep in mind as I mentioned in my previous post that the Lasso may perform better under a different causal model.
Update: Code available here on Github if you want to play around.comments powered by Disqus