Podcast #6: Data Analysis MOOC Post-mortem

Jeff and I talk about Jeff's recently completed MOOC on Data Analysis.

  • http://twitter.com/hspter Hilary Parker

    so it's "m-OO-ck" instead of "mock"? I always say "mock"

    • Roger Peng

      How do you end up with "mock"? I'm dying to know the dialect of English that ends up with that pronunciation :)

      • http://twitter.com/hspter Hilary Parker

        well to be fair, maybe we are both wrong and it should rhyme with "book" -- I mean, look, book, took, cook... mooc.

        this is a terrible acronym.

    • Chris Stehlik

      Don't worry, we won't mock you for your mispronunciation.

    • http://twitter.com/BillvLee Bill

      And I thought it was Moh Oak Massive Open Online Course

  • Benjamin Haley

    I have suggestions for Jeff in regards to signal to noise on the forum and the problem of running user code. Fingers crossed that this gets to him.

    I think Stack Overflow has a great model for forum interaction. The open source clone OSQA was used successfully in Thrun and Norvig's AI class (see aiqus.com). It's not too hard to get OSQA running on EC2 (i.e. I've done it in an afternoon).

    I also think Udacity has managed to solve the problem of running user code. They support in-browser programming. Andrew Ng also solved this problem in his first offering through a code submission system. It might be worth reaching out to them to ask for their insight into how it's done. Still, I'm sure it's a hard problem (i.e. I have not done it in an afternoon).

    • http://twitter.com/Drakpappan M. Palo

      There is also the option of using something like the assignment submission script that Roger was using in his class. There should be a good way of producing a script like that for this course.

  • Sophie

    Jeff, many thanks for this great course. I enjoyed it a lot and learned so much. I can only imagine how much work that must have been. So thanks for your dedication. I am fascinated by the Coursera concept, finally everyone can access education. It's just amazing.

  • http://twitter.com/Drakpappan M. Palo

    It was a very good class, and it helped me understand a lot of statistics that I have heard and read about before. The way of getting knowledge not only from reading but also from trying it out on data was really helpful.

    The quizzes at the end with only five questions were better because then they could go into more depth and I would have to give it a bit more thought before submission. What I missed a bit in the last few weeks was that there seemed to be less lecture material and more was up to us students to get the knowledge from somewhere. I see no problem in this but I would have liked some more in-depth discussion on how and why to use a specific prediction method. I guess there is a lot of material on the web for this and also books to read, but I still like to see and hear it as well as being able to reproduce the analysis.

    The thought of having several different levels within a class is intriguing and would get me to take the course again.

    Thanks both of you for teaching me R and helping me to start getting to grips with data analysis.

  • Rory Mackay

    Great class! Thanks. As for the message boards, they were the heart of the course; most of the tweaks and trip-ups were solved there for me. Thanks again.

  • Mary Howard

    I took both the Data Analysis course and the Computing for Data Analysis course; I can say, despite getting good grades in both, that I came to detest peer-grading and much preferred Roger's method of a submission program for grading. The problem with peer-grading in a class like this is that many students just do not have enough education or the language skills to properly grade an assignment. Also, the idea that the professor really does not answer questions on the forums as to what techniques are correct, nor are the TAs connected to the professor to answer questions for the professor, was something I had a hard time getting used to. In programming courses like the R course this works a bit better, since a program will either work or not, but not so much for a data analysis course, where we really would have liked answers, or even suggestions on where to go, when we were wrestling with issues like whether there should be interaction terms on the first assignment or what to do about the within-subject effect on the second analysis. Nonetheless, it was an interesting course and I'm glad that I did it.

  • Roland Kofler

    The best thing is still coming with the in-class Kaggle competition.

  • http://twitter.com/Y_S_Bacchus Y.S.Bacchus

    Thanks for this fascinating conversation. I attended the course and really enjoyed it. Apart from the mathematical content, the highlight for me was the peer grading system. To be honest, I lost faith in this method during the course, and not because of my own grades :) The weak link in peer grading seems to be that the system provides no motivation for the grader to perform a proper grading. I wonder if you see a way to improve it and whether you have done any statistical experiments related to peer grading. By experiments I mean, for example, comparing peer grading to random grading. From the conversation I guess that the answer is "no experiments", but still I am asking, because at first glance it simply does not feel like a reasonable method to grade mathematical papers, and without statistical proof I would rather consider it a toy method, a part of the social interaction, not a tool to measure educational progress. Thanks for the effort put into preparation of the course!

  • jamesedavies

    I really enjoyed the course, and I really do think I have picked up some useful skills that I can immediately apply in my work. Good job. Thanks.

  • http://bit.ly/ZpOeeR L. Collado Torres

    What about emphasizing writing code & reports using Rmd through RStudio and submitting them to Rpubs? I think that Roger already talks about RStudio in his class, and well, it works well out of the box in any system. I know that you don't want to make the tech part of the class harder, but given some detailed instructions & videos, I think that it should be doable to get the students to learn this pipeline (Rmd -> Rpubs w/RStudio).

    Then, the part of reproducibility is easier to evaluate because you don't need to re-run the code. It's all bundled together.

    Plus, Rpubs allows commenting and sharing.

  • Lucas Castro

    Thanks Jeff! I enjoyed the course so much and I learned a lot. I also think that this course will help me improve my current work in the BI area and keep me hungry for new knowledge. Thanks!

  • Guest

    i re

  • http://www.facebook.com/hauser.quaid.3 Hauser Quaid

    I really hope that you'll do the class again. Unfortunately I didn't have time to go through this run. I was able to watch the first couple of weeks; you're a great lecturer and the course was fantastic: lectures, homework, and especially the projects.

    What I'd like Coursera to do is to make it easier for lecturers to do the runs again. I don't see why videos you've made couldn't be used again, and you could fix only those which you didn't like.

  • Crystal Humphries

    I really enjoyed the course. I didn't earn a certificate due to time constraints but I got 3 other people to watch the videos with me. I also sent the link to the YouTube videos to my fellow students. The knowledge we gained was extremely valuable. Thanks!

  • L. George

    I completed both the Data Analysis course and the CDA course, and appreciate them very much - thanks!

    Regarding the Data Analysis course, I'm curious why there weren't 'official' solutions posted for the analysis assignments - if not specifics, then general comments about viable approaches and/or pitfalls. Forum threads were filled with contradictory statements, and assignment reviewers had a range of skills and backgrounds - huge variability. I'd love to see how an expert would approach the two problems. Otherwise it's like a cupcake without icing: quite good... and a little bit more would top it off :)

  • Chris Stehlik

    OMG , I thought that ambulance was coming from outside my window! :)

  • Chris Stehlik

    I completed the course and really liked it. I got most of it. However, I still don't get PCA, so I would spend more time on that, mindful that some people haven't had linear algebra. It also came at an odd point in the course, where I was sure for a while that it would be needed in the first assignment, so I worked hard trying to understand it and then realized (through the forums) that what I did know about it meant it would not be useful.

  • http://twitter.com/BillvLee Bill

    You could caption, outside the embedded chat video, that it is 17 minutes long, as people do when saying "[link to file] PDF 85 Mobytes." etc.
    That was one of the difficulties with the course, the time to download and watch videos later. Thank you for the splitting, but it was simpler (and faster) to find the (vague, obscure) icon for the subtitle texts and read them first. That and Wiki for any unusual words ("I wonder what meaning he means with that term") ran me through the course.
    Pulling off the videos at work means that I could not listen to them in a 'corner window' on the screens. USB keys to the rescue.
    A pre-test? Or a sort of "try this" with a mini-lecture from mid-level and a few questions would get us into the feel and style of the course.
    I made the mistake of signing up with 4 other courses.
    24 hours - 2 hours travel to - 8 hours work and - 8 hours sleep and - 2 hours meals etc. left less than 4 hours for all the courses, barely time to review without the first watching, and that split with fatigued morning and late evening time.
    I envy students with all their time devoted to reading, study (and the pub).