Sunday Data/Statistics Link Roundup (10/28/12)

Admin
2012-10-28
  1. An important article about anti-science sentiment in the U.S. (via David S.). The politicization of scientific issues such as global warming, evolution, and healthcare (think vaccination) makes the U.S. less competitive. I think the lack of statistical literacy and training in the U.S. is one of the sources of the problem. People use/skew/mangle statistical analyses and experiments to support their view and without a statistically well trained public, it all looks “reasonable and scientific”. But when science seems to contradict itself, it loses credibility. Another reason to teach statistics to everyone in high school.
  2. Scientific American was loaded this last week, here is another article on cancer screening.  The article covers several of the issues that make it hard to convince people that screening isn’t always good. The predictive value of the positive confusion is a huge one in cancer screening right now. The author of the piece is someone worth following on Twitter @hildabast.
  3. A bunch of data on the use of Github. Always cool to see new data sets that are worth playing with for student projects, etc. (via Hilary M.). 
  4. A really interesting post over at Stats Chat about why we study seemingly obvious things. Hint, the reason is that “obvious” things aren’t always true. 
  5. A story on “sentiment analysis” by NPR that suggests that most of the variation in a stock’s price during the day can be explained by the number of Facebook likes. Obviously, this is an interesting correlation. Probably more interesting for hedge funders/stockpickers if the correlation was with the change in stock price the next day. (via Dan S.)
  6. Yihui Xie visited our department this week. We had a great time chatting with him about knitr/animation and all the cool work he is doing. Here are his slides from the talk he gave. Particularly check out his idea for a fast journal. You are seeing the future of publishing.  
  7. Bonus Link: R is a trendy open source technology for big data