Some interesting data/data visualizations about working conditions in the apparel industry. Here is the full report. Whenever I see reports like this, I wish the raw data were more clearly linked. I want to be able to get in, play with the data, and see if I notice something that doesn’t appear in the infographics. This is an awesome plain-language discussion of how a bunch of methods (CS and Stats) with fancy names relate to each other.
This plan has been making the rounds on Twitter and is being attributed to William Cleveland in 2001 (thanks to Kasper for the link). I’m not sure of the provenance of the document but it has some really interesting ideas and is worth reading in its entirety. I actually think that many Biostatistics departments follow the proposed distribution of effort pretty closely. One of the most interesting sections is the discussion of computing (emphasis mine): Data analysis projects today rely on databases, computer and network hardware, and computer and network software.
When it comes to computing, history has gone back and forth between what I would call the “owner model” and the “renter model”. The question is what’s the best approach and how do you determine that? Back in the day when people like John von Neumann were busy inventing the computer to work out H-bomb calculations, there was more or less a renter model in place. Computers were obviously quite expensive and so not everyone could have one.
So along with a few folks here around Hopkins we have been kicking around the idea of developing an app for the iPhone/Android. I’ll leave the details out for now (other than to say stay tuned!). But to start developing an app for the iPhone, you need a version of Xcode, Apple’s development environment. The latest version of Xcode is version 4, which can only be installed with the latest version of Mac OS X Lion (10.
There’s in interesting discussion over at reddit on the difference between a data scientist and a statistician. My crude summary of the discussion seems to be that by and large they are the same but the phrase “data scientist” is just the hip new name for statistician that will probably sound stupid 5 years from now. My question is why isn’t “statistician” hip? The comments don’t seem to address that much (although a few go in that direction).
Amazon EC2 is #42 on Top 500 supercomputer list