Tag: sports


Sunday data/statistics link roundup (1/27/2013)

  1. Wisconsin is decoupling the education and degree granting components of education. This means if you take a MOOC like mine, Brian's or Roger's and there is an equivalent class to pass at Wisconsin, you can take the exam and get credit. This is big. (via Rafa)
  2. This  is a really cool MLB visualisation done with d3.js and Crossfilter. It was also prototyped in R, which makes it even cooler. (via Rafa via Chris V.)
  3. Harvard is encouraging their professors to only publish in open access journals and to resign from closed access journals. This is another major change and bodes well for the future of open science (again via Rafa - noticing a theme this week?).
  4. This deserves a post all to itself, but Greece is prosecuting a statistician for analyzing data in a way that changed their deficit figure. I wonder what the folks at the International Year of Statistics think about that? (via Alex N.)
  5. Be on the twitters at 10:30AM Tuesday and follow the hashtag #jhsph753 if you want to hear all the crazy stuff I tell my students when I'm running on no sleep.
  6. Thomas at StatsChat is fed up with Nobel correlations. Although I'm still partial to the length of country name association.

Sunday data/statistics link roundup (11/25/2012)

  1. My wife used to teach at Grinnell College, so we were psyched to see that a Grinnell player set the NCAA record for most points in a game. We used to go to the games, which were amazing to watch, when we lived in Iowa. The system the coach has in place there is a ton of fun to watch and is based on statistics!
  2. Someone has to vet the science writers at the Huffpo. This is out of control, basically claiming that open access publishing is harming science. I mean, I'm all about being a curmudgeon and all, but the internet exists now, so we might as well get used to it. 
  3. This one is probably better for Steven's blog, but this is a pretty powerful graph about the life-saving potential of vaccines.  
  4. Roger posted yesterday about the NY Times piece on deep learning. It is one of our most shared posts of all time, you should also check out the comments, which are exceedingly good. Two things I thought I'd point out in response to a lot of the reaction: (1) I think part of Roger's post was suggesting that the statistics community should adopt some of CS's culture of solving problems with already existing, really good methods and (2) I tried searching for a really clear example of "deep learning" yesterday so we could try some statistics on it and didn't find any really clear explanations. Does anyone have a really simple example of deep learning (ideally with code) so we can see how it relates to statistical concepts? 

Sunday data/statistics link roundup (8/26/12)

First off, a quick apology for missing last week, and thanks to Augusto for noticing! On to the links:

  1. Unbelievably the BRCA gene patents were upheld by the lower court despite the Supreme Court coming down pretty unequivocally against patenting correlations between metabolites and health outcomes. I wonder if this one will be overturned if it makes it back up to the Supreme Court. 
  2. A really nice interview with David Spiegelhalter on Statistics and Risk. David runs the Understanding Uncertainty blog and published a recent paper on visualizing uncertainty. My favorite line from the interview might be: “There is a nice quote from Joel Best that “all statistics are social products, the results of people’s efforts”. He says you should always ask, “Why was this statistic created?” Certainly statistics are constructed from things that people have chosen to measure and define, and the numbers that come out of those studies often take on a life of their own.”
  3. For those of you who use Tumblr like we do, here is a cool post on how to put technical content into your blog. My favorite thing I learned about is the Github Gist that can be used to embed syntax-highlighted code.
  4. A few interesting and relatively simple stats for projecting the success of NFL teams.  One thing I love about sports statistics is that they are totally willing to be super ad-hoc and to be super simple. Sometimes this is all you need to be highly predictive (see for example, the results of Football’s Pythagorean Theorem). I’m sure there are tons of more sophisticated analyses out there, but if it ain’t broke… (via Rafa). 
  5. My student Hilary has a new blog that’s worth checking out. Here is a nice review of ProjectTemplate she did. I think the idea of having an organizing principle behind your code is a great one. Hilary likes ProjectTemplate, I think there are a few others out there that might be useful. If you know about them, you should leave a comment on her blog!
  6. This is ridiculously cool. Man City has opened up their data/statistics to the data analytics community. After registering, you have access to many of the statistics the club uses to analyze their players. This is yet another example of open data taking over the world. It’s clear that data generators can create way more value for themselves by releasing cool data, rather than holding it all in house. 
  7. The Portland Public Library has created a website called Book Psychic, basically a recommender system for books. I love this idea. It would be great to have a recommender system for scientific papers

Sunday data/statistics link roundup (6/24)

  1. We’ve got a new domain! You can still follow us on tumblr or here: http://simplystatistics.org/
  2. A cool article on MIT’s annual sports statistics conference (via @storeylab). I love how the guy they chose to highlight created what I would consider a pretty simple visualization with known tools - but it turns out it is potentially a really new way of evaluating the shooting range of basketball players. This is my favorite kind of creativity in statistics.
  3. This is an interesting article calling higher education a “credentials cartel”. I don’t know if I’d go quite that far; there are a lot of really good reasons for higher education institutions beyond credentialing like research, putting smart students together in classes and dorms, broadening experiences etc. But I still think there is room for a smart group of statisticians/computer scientists to solve the credentialing problem on a big scale and have a huge impact on the education industry. 
  4. Check out John Cook’s conjecture on statistical methods that get used: “The probability of a method being used drops by at least a factor of 2 for every parameter that has to be determined by trial-and-error.” I’m with you. I wonder if there is a corollary related to how easy the documentation is to read? 
  5. If you haven’t read Roger’s post on Statistics and the Science Club, I consider it a must-read for anyone who is affiliated with a statistics/biostatistics department. We’ve had feedback by email/on twitter from other folks who are moving toward a more science oriented statistical culture. We’d love to hear from more folks with this same attitude/inclination/approach. 

I don't think it means what ESPN thinks it means

Given ESPN’s recent headline difficulties it seems like they might want a headline editor or something…


Archetypal Athletes

Here is a cool paper on the ArXiv about archetypal athletes. The basic idea is to look at a large number of variables for each player and identify multivariate outliers or extremes. These outliers are the archetypes talked about in the title. 

According to his analysis the author claims the best players (for different reasons, i.e. different archetypes) in the NBA in 2009/2010 were:  Taj Gibson, Anthony Morrow, and Kevin Durant. The best soccer players were Wayne Rooney, Leonel Messi, and Christiano Ronaldo.

Thanks to Andrew Jaffe for pointing out the article. 

Related Posts: Jeff on “Innovation and Overconfidence”, Rafa on “Once in a lifetime collapse