OpenClassroom exercises
A couple of months ago, I worked through some of the exercises for Andrew Ng's OpenClassroom Machine Learning class. The site now seems to be defunct (or perhaps it has morphed into Coursera), but the assignments were still available at the time of this post. The experience gave me a solid basis for learning how artificial neural networks (along with other ML algorithms) can be used to solve optimization and classification problems. This is a very exciting field for me to peer into, as it holds promising prospects for the creation of intelligent machines. Deep learning, currently one of the most prominent ML research areas, is essentially the clever use of very large neural network models to learn complex relationships. One of my current goals is to understand some of the strengths, weaknesses, and challenges involved in deep learning networks and their implementation.
[Image: "Childhood dream"]
[Image: "My current conception"]
While neural networks provide powerful and flexible algorithms, there is still a great deal to be discovered about their behavior, and historically a kind of stigma has surrounded their use. Their history includes a period of hype followed by disenchantment: machines weren't powerful enough to train them at the scale required for useful or impressive results, and they were seen as flexible but inefficient toys for solving problems that weren't necessarily novel. Another limitation was the lack of an effective way to apply back-propagation to networks with many layers. Today, neural networks appear in a variety of useful applications, and the niche seems to be growing.
Moving Forward
The "ann" class
The neural network class "ann" is instantiated with a vector of any length; each element of the vector corresponds to a layer in the constructed network, and its value sets the number of nodes in that layer. For example:
>> A = ann([4,4,1]);
builds an ann object with four inputs, one output, and one hidden layer of four nodes, and assigns it to the variable "A".
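For context, here is a minimal sketch of what such a constructor might look like using Octave's old-style @class directory convention (the file would live at @ann/ann.m). The field names, the default learning rate, and the weight-initialization scheme are my assumptions for illustration, not the actual "ann" implementation:

function obj = ann(layers)
  % Store layer sizes and initialize one weight matrix per pair of
  % consecutive layers, with an extra column for the bias term.
  obj.layers = layers(:)';      % e.g. [4 4 1]
  obj.alpha  = 0.1;             % assumed default learning rate
  obj.W = {};
  for k = 1:numel(layers) - 1
    obj.W{k} = 0.1 * randn(layers(k+1), layers(k) + 1);
  endfor
  obj = class(obj, 'ann');      % register the struct as class "ann"
endfunction

With this layout, ann([4,4,1]) would produce a 4x5 weight matrix between the input and hidden layers and a 1x5 matrix between the hidden and output layers.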
The primary challenge I faced with Octave's object-oriented utilities is that class methods can't modify a given instance without an explicit assignment. For example, if I wanted to change a value in A, such as the learning rate "alpha", I would have to do the following:
A = set(A,'alpha',[foo]);
The problem here is that "A" doesn't point to some class structure that gets modified in place. The entire object "A" is passed (copied) into the method "set", which returns the modified object. This is inefficient because all of the object data in "A" has to be copied into the set method and then copied back into "A". A better implementation would modify the object in place, but this doesn't appear to be possible with Octave's available object-oriented utilities.
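To illustrate the copy-in/copy-out semantics, here is a sketch of how such a set method might be written at @ann/set.m (the property handling below is an assumption, not the repository's actual code):

function obj = set(obj, prop, val)
  % The incoming "obj" is a copy of the caller's object; changes made
  % here affect only that copy, so the caller must capture the return
  % value: A = set(A, 'alpha', 0.1);
  switch (prop)
    case 'alpha'
      obj.alpha = val;
    otherwise
      error('ann: set: unknown property "%s"', prop);
  endswitch
endfunction

Because the returned object is a brand-new copy, the assignment back into "A" is mandatory; forgetting it silently discards the change.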
I tried my hand at the Kaggle Digit Recognition challenge with my "ann" code and got about 90% accuracy on the first attempt. That is by no means exceptional, but considering that "ann" was working with unprocessed, raw pixel data from bitmap images to classify the handwritten shapes in those images, I was pleased with the results.
Machine Learning Coursera Course
The code for the Octave ann class is available in this GitHub repository.