tl;dr: I have finally managed to build and train a very simple neural network, but before jumping right in and showing you everything I’ve discovered at once, I’ve decided to split it up, and start with a little background first. In this post, I will outline what a neural network actually is and why I want to build one, and in my next post I’ll show you what I’ve actually made.
In my first post, I talked briefly about my personal fascination with neural networks, but I didn’t really have anything to show anyone. Well today, that changes, because after days of scraping together lecture notes, googling terms, scouring documentation, and fiddling with software, I have finally managed not just to create a neural network, but to get it to learn, and apply what it has learnt for experimental purposes. Hypothetical ladies and gentlemen, behold my tiny, dumb Frankenstein:
I know it doesn’t seem all that impressive to you, but to me, this little terminal printout represents my first real victory in my fascinatingly frustrating personal quest.
Let me explain.
I. What’s a Neural Network?
My interest in neural networks was first piqued by this TEDxWomen TED Talk, in which a Google-award-winning high-school student explained how she built a cloud-based artificial neural network for diagnosing breast cancer. Miss Wenger’s system intrigued me, both in terms of what it seemed to do, and what it actually did.
On the one hand, it seemed like the first real step towards an artificial intelligence, a step towards a computer that could deal with decisions the way a human could. I was amazed that our technology had advanced this quickly, and that we were already on the verge of creating intelligent digital life (spoiler warning: we aren’t), and I resolved to dig into it for myself and learn what a neural network was; after all, if an incredibly smart high-school student can do it, how hard could it be?
As it turns out, pretty damn hard. For all my love of technology and science, I’m afraid I’ve never been much more than a fan-boy; most of the math I’ve been taught thus far in my life has been geared towards Finance and Economic problems, and is therefore not particularly useful at dealing with something this far out of my field. Neural networks are barely a few decades old, and are still intimidating to most computer scientists, so to an outsider like me, it was (and continues to be) a terrifyingly daunting quest.
The first time I tried looking, my search for resources ran me into wall after wall of material that was simply above my reach. I asked questions on Quora and Reddit in hope of finding direction, but I only kept running into more and more people who told me this was simply beyond my abilities, and I slowly started to believe them. While I did learn bits and pieces here and there, I simply did not have context for what I was learning, and I grew increasingly frustrated with the realization that I could not advance without acquiring new math, and was therefore far away from any actual results.
However, I realized that while I’m definitely not the most gifted mathematician, I am very gifted at screwing around on the internet. So I wondered: is it possible to just hack a neural network together?
“Hack: v.: To get a system to do what you want as quickly or easily as possible.”
It’s important to note that when I say ‘hack’, I don’t mean it in any sort of dramatic or technical way; I use it to describe a mindset: basically, it is to get something to do what you want it to do. Hacking is therefore everything from tweaking your parking remote to open your front door, to jailbreaking your iPhone, to tricking a bouncer at a club, to inventing Hummingbirds. If the world is just a bunch of strings tied from one to another, then hacking is figuring out how to play something on them.
So, I decided that instead of wading around and doing math problems for a while, I’d jump in the deep end, try to build whatever I could build from whatever I could find, and see where that got me. See, the one thing that I do know about me is that after I figure out what I want to do, I usually do whatever I have to to get it done; so really, the trick was just figuring out what I wanted to build.
But how do you figure out what you want to build without knowing what you can build?
The answer eluded me for a while; just reading about the technology behind networks seemed daunting, so how could I possibly consider an application of it already?
Then I found some inspiration.
It was during yet another Google scavenger hunt about Backpropagation that I ran into this little example by Devin McAuley (don’t worry about the Java risk; it’s just a really big program), and while it certainly wasn’t very flashy, it was the first real example of what a neural network actually looked like, and more importantly, the first one I came across that I could actually play with.
I learnt more in 5 minutes with this little applet than I had in days of searching, and also realized exactly what I wanted to build.
II. Seriously though, what’s a Neural Network?
Alright alright, here’s what I’ve learnt so far: to put it plainly, a neural network is a circuit of neurons divided into a bunch of layers, connecting input variables to output variables through a system of weights and biases that determines how much each input variable matters to each output variable.
Remember back in high-school biology, where someone told you what a neuron was? Well, all you need to remember is that a neuron is basically a cell in your brain that can receive and transmit an electrical signal (i.e. switch off and on). When you put a bunch of these together, their synapses connect, and they can communicate with each other. Over time, they appear to specialize, and learn when to turn off and on in such a way that complex signals can be transferred to other parts of our body, which in turn creates marvelous little side-effects like moving your arm and self-awareness.
As you can probably guess, we are nowhere near the technology to simulate something as beautifully complicated as our brains (and if we are, it’s probably being dealt with by people way smarter than me), but the principle seems fairly straightforward; a system of on/off switches that, when put together, are greater than the sum of their parts. Well, through some clever math and computational insights in the last few decades, we managed to construct a model of this system using weighted averages and biases.
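To make that a little more concrete, here’s a minimal sketch of one of those artificial switches in Python (my own illustration, not anything from the applet; the weights and bias are arbitrary made-up numbers): the neuron multiplies each input by a weight, adds a bias, and squashes the total through a sigmoid, so instead of a hard off/on it gives something between 0 and 1.

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum of inputs, plus a bias,
    squashed into the range (0, 1) by a sigmoid activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # the 'soft' on/off switch

# Two inputs, arbitrary weights: the output lands somewhere between 0 and 1
print(neuron([1.0, 0.0], weights=[0.6, -0.4], bias=0.1))  # -> roughly 0.67
```

The weights are how much each input ‘matters’, and the bias is how eager the neuron is to fire in the first place; learning, as we’ll see in a moment, is just the business of adjusting those numbers.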
In the early days, all we could construct was the simple Perceptron, which is just one single artificial neuron, and we really couldn’t get it to do much. But when we learnt how to put a bunch of them together, we realized we could suddenly deal with some more complex questions (like the XOR problem), and once we figured that out, we managed to take another very important step further; we figured out how to teach them.
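To show what that limitation actually looks like (another sketch of my own, using the classic perceptron learning rule; the names `perceptron` and `train` are just ones I made up): a single hard-threshold neuron can learn AND without any trouble, but no amount of training will ever get it to reproduce XOR.

```python
def perceptron(x, w, b):
    """One hard-threshold neuron: fires (1) if the weighted sum beats zero."""
    return 1 if x[0] * w[0] + x[1] * w[1] + b > 0 else 0

def train(data, epochs=20, lr=0.1):
    """Classic perceptron learning rule: nudge the weights toward each missed target."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in data:
            error = target - perceptron(x, w, b)
            w[0] += lr * error * x[0]
            w[1] += lr * error * x[1]
            b += lr * error
    return w, b

AND = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
XOR = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

w_and, b_and = train(AND)
print([perceptron(x, w_and, b_and) for x, _ in AND])  # -> [0, 0, 0, 1]: AND learnt
w_xor, b_xor = train(XOR)
print([perceptron(x, w_xor, b_xor) for x, _ in XOR])  # never matches [0, 1, 1, 0]
```

No single straight line can separate XOR’s ‘on’ cases from its ‘off’ cases, which is exactly why stacking neurons into a hidden layer turned out to be such a big deal.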
Sounds awesome, right? Well, the reality behind it is slightly less awesome (but only slightly); by ‘teach’, we mean we’ve figured out a way to minimize the error of the system in a meaningful way. Remember how we said earlier that a neural network is just a bunch of neurons connected to each other? Well, if you have a set of inputs and a set of targets, in principle, you can effectively tell it that there is a ‘correct way’ to turn those switches on, and ‘teach’ it by correcting it over time. One such teaching method is called Backpropagation (in the example I showed you, this is done by hitting the ‘BP Learn’ button – for bonus points, notice what it does to the error over time.)
Backpropagation is actually short for “backward propagation of errors”, which, to poorly summarize, is a method of distributing the error among the weights of the neurons in the hidden layers one step at a time, working backward from the output layer toward the input layer, in such a way as to improve the network’s performance; it’s analogous to teaching a child by correcting him every time he’s wrong until he’s more or less right most of the time.
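Here’s roughly what that correcting looks like in code (a sketch of my own with arbitrary starting weights, not the applet’s actual implementation): a tiny two-input, two-hidden-neuron network gets run through the XOR examples over and over, and after each example, every weight is nudged in proportion to its share of the blame for the output error.

```python
import math

sigmoid = lambda z: 1 / (1 + math.exp(-z))

# Arbitrary (fixed) starting weights so the run is reproducible.
w_h = [[0.5, -0.6], [-0.4, 0.7]]   # input -> hidden weights
b_h = [0.1, -0.2]                  # hidden biases
w_o = [0.8, -0.5]                  # hidden -> output weights
b_o = 0.3                          # output bias

XOR = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

def forward(x):
    """Push an input through the network: hidden activations, then the output."""
    h = [sigmoid(x[0] * w_h[i][0] + x[1] * w_h[i][1] + b_h[i]) for i in range(2)]
    y = sigmoid(h[0] * w_o[0] + h[1] * w_o[1] + b_o)
    return h, y

def total_error():
    return sum((t - forward(x)[1]) ** 2 for x, t in XOR)

before = total_error()
lr = 0.5
for _ in range(10000):                    # the same four examples, again and again
    for x, t in XOR:
        h, y = forward(x)
        d_out = (y - t) * y * (1 - y)     # how wrong the output neuron was
        for i in range(2):
            d_hid = d_out * w_o[i] * h[i] * (1 - h[i])  # blame sent backward
            w_o[i] -= lr * d_out * h[i]
            w_h[i][0] -= lr * d_hid * x[0]
            w_h[i][1] -= lr * d_hid * x[1]
            b_h[i] -= lr * d_hid
        b_o -= lr * d_out
print(f"error: {before:.3f} -> {total_error():.3f}")  # error before vs after training
```

The whole trick is in those `d_out` and `d_hid` lines: the output’s error gets passed backward through the connecting weights, so each hidden neuron knows how much of the mistake was its fault, and shifts accordingly.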
When we teach a child what a dog is, we keep showing him examples of dogs until he gets it right; this is essentially the same principle behind teaching a neural network.
So, we take our little unbiased and unweighted neural network by the hand, and keep pointing at an output and saying “See that? That’s called a Dog. Can you say dog? Can you remember what a dog looks like?” Well, as it turns out, neural networks can learn what a ‘Dog’ is, but unlike our clever (and smellier) human babies, it can’t really do so unless we sit down and tell it over and over again what a dog looks like, and even then it’s still just a knee-jerk response; there is no actual cognition happening, it just learns that if the input is furry and the output is called a ‘dog’, then dogs are probably furry, and therefore anything that’s furry is probably a dog.
Thus, you can probably see how a neural network is not a terribly smart thing; it has no understanding of what it’s learning or why. However, what a neural network can do that our brain can’t is learn the same thing over and over really, really quickly; a neural network is just code, and can therefore run through the same dataset hundreds of times in a matter of seconds, until it is very good (ideally, anyway) at knowing the right answers.
Think of it like a really dumb kid who doesn’t understand what he’s learning, but can memorize it very, very quickly; he may never understand what a dog is or why it’s important (yet), but he can get very good at knowing what a dog looks like, just like a baby, or you and I, would be, if someone sat down and told you a thousand times.
So, now you know the basics of what I’m trying to do, but still nothing about what I’ve actually done. Well, now that my little primer is over and you have a little context, I can finally sit down and show you what I’ve actually made…in my next post. I know, I know, that’s completely unfair, but the actual application of what I’ve learnt will take a little while to explain, and I didn’t want to make one terrifying monster post. Still, you’ve stuck around this long (in theory, anyway), so let me at least tell you what I’m about to do:
I’m going to create a basic neural network, and teach it to tell me if someone is a gamer or not.
See you guys in Part 2.