In my ongoing internal debate about what makes for a good data analysis, one idea that keeps coming back to me is this notion of being able to “reason about the data”. The idea here is that a data analysis should allow you to understand how the data, as opposed to other aspects of an analysis like assumptions or models, played a role in producing the outputs.
I just got back from the rOpenSci OzUnconf that was run in Melbourne last week. I’d like to give a big thanks to the organizers (Nick Tierney, Di Cook, Rob Hyndman and others) for putting on a great unconference. These events are always a great opportunity to meet people just getting started in the R community and to get them involved. As is typical for these unconferences, topic ideas were pitched via issues on the OzUnconf GitHub repo.
On the latest episode of Not So Standard Deviations I talked with Hilary about Apple’s efforts to train machine learning algorithms in their Face ID technology in the iPhone X. The gist of Face ID is that it recognizes your face using a mathematical representation and then unlocks the phone when it can confirm that it is you. In its keynote presentation, Apple mentioned that it’s using machine learning to do this and had even developed its own custom chips to do the computations.
I’m co-teaching a data science class at Johns Hopkins with John Muschelli. I gave the lectures on EDA and he just gave a lecture on how to create an “expository graph”. As we teach it in the class, an exploratory graph is the kind of graph you make for yourself just to try to understand a data set. An expository graph is one where you are trying to communicate information to someone else.
I previously wrote about my editing workflow for podcasts and I thought I’d follow up with some details on how I record both Not So Standard Deviations and The Effort Report. This post is again going to be a bit Mac-specific because, well, that’s what I do.

Communication

Both of my podcasts have a co-host who is not in the same physical location as me. Therefore, we need to use some sort of Internet-based communication software (Skype, Google Hangouts, FaceTime, etc.
I thought I’d write a brief description of how I edit podcasts using Logic Pro X because when I was first getting into podcasts, I didn’t find a lot of useful stuff out there. A lot of it was YouTube videos of advanced editing or very basic stuff. I don’t consider myself a sound expert in any way, but I wanted a good workflow that would produce decent quality stuff.
I have been interested for a while now in how data scientists can better communicate data analysis activities to each other and to people outside the field. I believe that our current methods are inadequate because they have mostly been borrowed from other areas (notably, computer science). Many of those tools are useful, but they were not developed to communicate data analysis concepts specifically and often fall short. I talked about this problem in my Dean’s Lecture earlier this year and how the field of data science could benefit from developing its own theories, to simplify communication as other fields have done.
In a deeply reported article, Casey Ross and Ike Swetlitz report that IBM’s Watson isn’t living up to its hype when it comes to cancer care: The interviews suggest that IBM, in its rush to bolster flagging revenue, unleashed a product without fully assessing the challenges of deploying it in hospitals globally. While it has emphatically marketed Watson for cancer care, IBM hasn’t published any scientific papers demonstrating how the technology affects physicians and patients.
For a long time now—actually ever since we started the blog—I’ve wanted to do a series of deep dives into specific papers that I thought were great. Clearly, it’s taken a bit longer than I expected, but I figure better late than never. Actually, that’s become a bit of a theme for my work these days! One problem I have with much academic writing on the Internet is that I feel like most of it is devoted to (1) promoting one’s own work; or (2) identifying weaknesses in others’ work.
About nine months ago I announced that I was attempting a Chromebook experiment for the second time. At first I thought it was going to be a short-term experiment just to see if it was possible to function with only a Chromebook. But in an interesting twist, I got used to it and have been working exclusively on a Chromebook for the last few months since the experiment started.