A programmer will always aim to eliminate repetitive manual work. Usually, this means a small investment of time leads to a much larger windfall later, when the magic happens with the click of a button (or, more likely, a command sent to the shell). Sometimes, however, the opposite happens.
I just turned in Assignment 2 for my Data Analysis course, so I can now share it on here (unfortunately, I’ve been warned that people have been plagiarizing, so I’ve removed my files to prevent cheating… which, ironically, I did not list as a challenge for MOOCs below, but should have). In this assignment, we were given sensor data from the Samsung Galaxy SII recorded while users performed specific activities. The goal was to develop a model on some training data to predict what activity the test subjects were performing. As usual, I wish I had more time to spend on it because I always feel like there is more I can add. Using random forests, I got a misclassification error rate of about 5% on the test subjects. Not too shabby, but at some point I would like to compare it to other models such as SVMs or neural networks.
I thought I’d share my first assignment from the Coursera class, Data Analysis. It’s a very simple analysis, but it did get me back into the feel of writing research papers (as opposed to the terse sentence fragments I email at work). I’m posting it here to demonstrate what level of work you can expect from such a course… and because charts in R blow Excel out of the water.
One aspect of my New Year’s resolution for this year was to hop on the Coursera (and other MOOCs) bandwagon. I started a course last year called Probabilistic Graphical Models, which was fascinating for the one week I stuck with it. But it required about 15 hours per week, which was too many. This time around, I decided to start with baby steps: take a few short refreshers to get into the habit and then expand from there. But there are just too many awesome courses available right now!