Sunday data/statistics link roundup (5/19/2013)

Jeff Leek
  1. This is a piece on 20th versus 21st century problems and the rise of the importance of empirical science. I particularly like the discussion of what it means to be a “solved” problem and how that has changed.
  2. A discussion in Science about the (arguably) most important statistics among academics, the impact factor and h-index. This comes on the heels of the San Francisco Declaration on Research Assessment. I like the idea that we should focus on evaluating science for its own merit rather than focusing on summaries like impact factor. But I worry that the “gaming” people are worried about with quantitative numbers like IF will be replaced with “politicking” if it becomes too qualitative. (via Rafa)
  3. A write-up about a survey in Britain that suggests people don’t believe statistics (surprise!). I think this is symptomatic of a bigger issue which is being raised over and over. In an era when scientific problems don’t have deterministic solutions, how do we determine if a problem has been solved? There is no good answer for this yet and it threatens to undermine a major fraction of the scientific enterprise going forward.
  4. Businesses are confusing data analysis and big data. This is so important and true. Big data infrastructure is often critical for creating/running data products. But discovering new ideas from data often happens on much smaller data sets with good intuition and interactive data analysis.
  5. Really interesting article about how the baseball card numbering system matters and how changing it can upset collectors (via Chris V.).