Tag: Link

08
Jan
02
Jan

List of cities/states with open data - help me find more!

It’s the beginning of 2012 and statistics/data science has never been hotter. Some of the most important data is data collected about civic organizations. If you haven’t seen Bill Gate’s TED Talk about the importance of state budgets, you should watch it now. A major key to solving a lot of our economic problems lies in understanding and using data collected about cites and states. 

U.S. cities and states are jumping on this idea and our own Baltimore was one of the earliest adopters. I thought I’d make a list of all the cities that have made an effort to make civic data public. Here are a few I’ve found:

There are also open data sites for many states:

Civic organizations are realizing that opening their data through APIs or by hosting competitions can lead to greater transparency, good advertising, and new and useful applications. If I had one data-related wish for 2012, it would be that the critical mass of data/statistics knowledge being developed could be used with these data to help solve some of our most pressing problems. 

Update: Oh Canada! In the comments Ani Ruhil points to some Canadian cities/provinces with open data pages. 

24
Oct

Web-scraping

The internet is the greatest source of publicly available data. One of the key skills to being able to obtain data from the web is “web-scraping”, where you use a piece of software to run through a website and collect information. 

This technique can be used for collecting data from databases or to collect data that is scattered across a website. Here is a very cool little exercise in web-scraping that can be used as an example of the things that are possible. 

Related Posts: Jeff on APIs, Data Sources, Regex, and The Open Data Movement.

17
Oct
11
Oct
11
Oct
04
Oct

Defining data science

Rebranding of statistics as a field seems to be a popular topic these days and “data science” is one of the potential rebranding options. This article over at Revolutions is a nice summary of where the term comes from and what it means. This quote seems pretty accurate:

My own take is that Data Science is a valuable rebranding of computer science and applied statistics skills.

02
Oct
01
Oct
30
Sep

Battling Bad Science

Here is a pretty awesome TED talk by epidemiologist Ben Goldacre where he highlights how science can be used to deceive/mislead. It’s sort of like epidemiology 101 in 15 minutes. 

This seems like a highly topical talk. Over on his blog, Steven Salzberg has pointed out that Dr. Oz has recently been engaging in some of these shady practices on his show. Too bad he didn’t check out the video first. 

In the comments section of the TED talk, one viewer points out that Dr. Goldacre doesn’t talk about the role of the FDA and other regulatory agencies. I think that regulatory agencies are under-appreciated and deserve credit for addressing many of these potential problems in the conduct of clinical trials. 

Maybe there should be an agency regulating how science is reported in the news?