Saturday, June 15

Coursera: Machine Learning

I started the latest track of Andrew Ng's Machine Learning Coursera course a month ago. There are about two more weeks left in the class, and my experience so far has been positive. We've covered linear classifiers, logistic regression via gradient descent, multi-layer neural networks, and SVMs. This week's topic is unsupervised learning via K-means clustering.
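Since K-means is this week's topic, here is a minimal sketch of the algorithm of my own in Python with NumPy (the course exercises themselves use Octave, and the toy data and cluster count below are just placeholders): alternate between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points.

    import numpy as np

    def kmeans(X, k, n_iters=100, seed=0):
        """Minimal K-means clustering sketch."""
        rng = np.random.default_rng(seed)
        # Initialize centroids by picking k random training examples.
        centroids = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iters):
            # Assignment step: index of the nearest centroid per point.
            dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Update step: move each centroid to its cluster's mean,
            # keeping the old centroid if a cluster comes up empty.
            centroids = np.array([X[labels == j].mean(axis=0)
                                  if np.any(labels == j) else centroids[j]
                                  for j in range(k)])
        return labels, centroids

    # Toy data: two well-separated 2-D blobs.
    X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
    labels, centroids = kmeans(X, k=2)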

Part of the material has been review for me, as I studied Ng's OpenClassroom materials in the past. I posted a little about that experience in My Introduction to Machine Learning. My primary goal when working through those materials was to understand how to implement backpropagation for supervised learning with artificial neural networks; this iteration of the class on Coursera has greatly sharpened my intuition for how to improve a learning algorithm and which metrics to use when judging its performance.

One notable contrast between this course and the OpenClassroom materials is the in-video questions and review quizzes. Distributed practice is a great method for learning concepts and keeping the viewer engaged; Coursera courses in general seem to do a great job of keeping the learning experience interactive. Ng's Machine Learning, being sort of the "flagship" Coursera course, is no exception.

Good Lectures

Ng's lectures are very good at explaining the motivations and nuances of employing machine learning algorithms. Each algorithm is presented with a prototypical application upon which the analogies and concepts are built. He offers many insights, drawn from his own professional experience with machine learning, into the common pitfalls that people run into when implementing a classification algorithm. For instance, when an algorithm performs poorly, it is common to conclude that the solution is to find more training examples or features. In real-world applications, 'finding more training data' can be a significant project on its own. Furthermore, in the case of overfitting, increasing the number of features in the training set would actually be detrimental.
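To make that last point concrete, here is a small illustration of my own (not from the course): treating polynomial degree as a stand-in for the number of features, a high-degree fit to noisy data drives the training error toward zero while the validation error climbs - the signature of overfitting.

    import numpy as np

    np.random.seed(0)
    # Noisy samples of an underlying quadratic.
    x = np.linspace(0, 1, 20)
    y = 1 + 2 * x - 3 * x**2 + 0.2 * np.random.randn(20)
    x_train, y_train = x[::2], y[::2]   # even indices for training
    x_val, y_val = x[1::2], y[1::2]     # odd indices for validation

    for degree in (1, 2, 8):
        coeffs = np.polyfit(x_train, y_train, degree)
        train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
        val_mse = np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)
        print(f"degree {degree}: train MSE {train_mse:.4f}, val MSE {val_mse:.4f}")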

This type of meta-knowledge about applying learning algorithms is incredibly useful to me as an aspiring data scientist. Some of the techniques, such as cross-validation and generating learning curves, were entirely unknown to me when I was playing around with that Kaggle assignment. Had I been aware of those techniques, I could have achieved much better classification accuracy in that project, and in much less time, by correctly tailoring my algorithm to the data.
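For illustration, here is roughly what those two techniques look like in code, using scikit-learn on its bundled digits dataset (my choice of library and data, not the course's - the exercises use Octave): cross_val_score averages accuracy over k train/validation splits, and learning_curve tracks training versus validation accuracy as the training set grows, which helps diagnose high bias versus high variance.

    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score, learning_curve

    X, y = load_digits(return_X_y=True)
    clf = LogisticRegression(max_iter=1000)

    # 5-fold cross-validation: five train/validation splits, averaged.
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")

    # Learning curve: accuracy as a function of training-set size.
    sizes, train_scores, val_scores = learning_curve(
        clf, X, y, train_sizes=np.linspace(0.1, 1.0, 5), cv=5)
    for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
        print(f"{n:4d} examples: train {tr:.3f}, validation {va:.3f}")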

Programming Exercises

The class involves mandatory weekly programming exercises centered on a particular algorithm from the week's lectures. The tasks range from classifying spam emails to teaching a neural network to recognize hand-written digits. The exercises are well developed and, in the interest of time, come with far more provided content than the student actually writes. For instance, all of the data is imported and pre-processed in the supplied script files and functions. The student is usually only tasked with implementing one or more cogs in the system - some particular cost function or kernel for the task at hand. The submission process is also completely automated by the supplied "submit" script - all the user has to do is update the indicated code and run it.
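For a sense of what such a "cog" looks like, here is a NumPy rendition of a regularized logistic regression cost function and gradient - the kind of piece the student fills in. The real exercises are written in Octave; the names and shapes here are my own.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def cost_function(theta, X, y, lam):
        """Regularized logistic regression cost and gradient.
        X: (m, n) design matrix whose first column is all ones,
        y: (m,) labels in {0, 1}, lam: regularization strength."""
        m = len(y)
        h = sigmoid(X @ theta)
        # Cross-entropy cost plus an L2 penalty (bias term excluded).
        reg = (lam / (2 * m)) * np.sum(theta[1:] ** 2)
        J = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m + reg
        grad = X.T @ (h - y) / m
        grad[1:] += (lam / m) * theta[1:]
        return J, grad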

This is all very impressive, but it can be rather limiting. I understand that the motivation is to make the coding task as clean and standardized as possible, both to allow reasonable evaluation of the student's work and to let the student focus on the particular concepts they are trying to practice. However, the whole experience is so convenient and constrained that it feels too easy. I remember the satisfaction I felt after implementing an ANN from scratch for the tutorial Kaggle Digit Recognition Challenge, and these exercises, although they involve some incredibly exciting algorithms, don't evoke that sensation in me. I could be a complete outlier in that view - I haven't yet had a chance to learn what other students think on the forums.

Moving Forward

Nevertheless, I'm very excited by the knowledge and experience I have gained throughout this course, and I greatly appreciate the efforts of Andrew Ng and his TAs in providing this learning experience. There is a great deal to explore in just the application of machine learning algorithms - not to mention their potential adaptations. My next goal is to read up on the subject of neural networks and build a better context for understanding these machine learning algorithms. I recognize that the ANN implementation Ng presents is just a special case called a 'feed-forward' neural network. It would be interesting to learn how neural networks behave in real time, so I've found an old book called Introduction to the Theory of Neural Computation that looks promising, and I've recently begun reading Sebastian Seung's Connectome.
