Baby steps towards machine learning 02 – Supervised, Unsupervised and Reinforcement learning

When faced with a problem that needs a machine learning solution, one of the first things we should do (after analyzing the data and understanding the required output of the final model) is decide which type of learning approach to take. This means deciding whether we are going with a supervised, unsupervised or reinforcement learning approach.

‘Okayyy, but I don’t know what these approaches mean’, one might say. So let’s take a simple look at what they are, and we will take our future cat-and-dog-identifying robot along with us. (If you are unfamiliar with this robot, it’s an example from our previous article, ‘Baby steps towards machine learning 01 – An Introduction’. The idea is that we have a robot whose task is to learn how to tell cats from dogs when we show it pictures.)

Okay, so the most widely used approach is ‘supervised learning’. The name itself gives us a clue: the robot learns under supervision. It’s like having a teacher who first shows the robot what cats look like and what dogs look like. By observing what the teacher shows, the robot recognizes patterns and learns. (Remember? In machine learning we search for patterns.)

What happens is that we give the robot, say, 50,000 images of cats and dogs, and each image is named either cat or dog. The robot takes these pictures one by one and studies how each picture looks and what name (‘label’) it has. As it looks through each image and its name, it begins to see patterns: ‘Okay, the images that show small white animals are mostly named cat. So that’s a pattern I should remember.’

In this way it goes through the images over and over again and comes to a good understanding of the patterns that differentiate cats from dogs. We provided it with ‘labeled data’ (images with proper names), so we supervised its learning the same way a teacher would guide a student. This is why it’s called ‘supervised’ learning. Also notice that we didn’t write any rules; the robot analyzes the images on its own and finds the patterns. That is what makes it ‘supervised machine learning’.
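To make the idea of learning from labeled data concrete, here is a minimal sketch in Python. It is not how a real image model works: instead of pictures it assumes each animal has already been reduced to two made-up numbers (weight and ear length), and it uses a very simple technique, a nearest-centroid classifier, which just remembers the average cat and the average dog. The data, the feature choice and the function names `train` and `predict` are all assumptions for illustration.

```python
import math
from collections import defaultdict

# Hypothetical labeled data: (weight in kg, ear length in cm) plus a name tag.
LABELED = [
    ((4.0, 6.5), "cat"),
    ((3.5, 6.0), "cat"),
    ((5.0, 7.0), "cat"),
    ((20.0, 10.0), "dog"),
    ((30.0, 12.0), "dog"),
    ((25.0, 11.0), "dog"),
]


def train(examples):
    """Learn one average feature vector (centroid) per label."""
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for features, label in examples:
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: tuple(total / counts[label] for total in sums[label])
        for label in sums
    }


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    return min(model, key=lambda label: math.dist(model[label], features))


model = train(LABELED)
print(predict(model, (4.2, 6.3)))    # a small animal the robot has not seen
print(predict(model, (27.0, 11.5)))  # a large animal the robot has not seen
```

Notice that nobody wrote a rule like ‘if the weight is under 10 kg, say cat’. The only things we supplied were examples and their labels; the dividing line between the two animals comes out of the data.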

Then there is ‘unsupervised learning’. Like before, the name gives us a clue about what is going on. If the previous one was supervised, meaning it had guidance, then this one is learning without guidance.

What happens is that, like before, we give the robot a set of images to study. But this time there is no name tag on any image (the data is ‘unlabeled’). So the robot observes the images, sees patterns, and puts images with similar patterns into the same group. It’s like giving a Lego set to a child and asking him to put the Legos of the same color into the same group. Once he has done that, we give him a new Lego and ask him which group it belongs to. The child will look at the pattern (in this case the pattern is the color) and tell us which group it goes with.

The robot does the same thing when asked to learn unsupervised. Since there are no ‘labels’, the robot has no concept of what the words cat or dog mean. In fact, in the robot’s world there are no such words. So it studies the images and puts images with similar patterns into groups. After learning, when we give it an image of a cat, it will say: ‘Okay, this picture you gave me has a lot of resemblance to that group of images, so I am going to predict that this image belongs with those.’

A good example of this is search-by-image engines, where we can give an image to a model and get back all the similar or related images (Google search by image).
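The grouping described above can be sketched with one common clustering technique, k-means. This is only an illustration under some loud assumptions: the animals are again reduced to two made-up numbers (weight and ear length), we tell the algorithm in advance that we want two groups, and the starting centers are picked by hand so the result is repeatable. The helper names `mean`, `nearest` and `k_means` are hypothetical.

```python
import math

# Hypothetical unlabeled data: (weight in kg, ear length in cm), no names attached.
POINTS = [
    (4.0, 6.5), (3.5, 6.0), (5.0, 7.0),
    (20.0, 10.0), (30.0, 12.0), (25.0, 11.0),
]


def mean(points):
    """Average a list of points coordinate by coordinate."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def nearest(centers, point):
    """Return the index of the group center closest to the point."""
    return min(range(len(centers)), key=lambda i: math.dist(centers[i], point))


def k_means(points, centers, rounds=10):
    """Repeatedly assign points to the nearest center, then move the centers."""
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for point in points:
            groups[nearest(centers, point)].append(point)
        # An empty group keeps its old center so we never average nothing.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers


# Assumption: start from the first and last points instead of random picks.
centers = k_means(POINTS, centers=[POINTS[0], POINTS[-1]])

new_animal = (4.5, 6.8)
print("new animal belongs to group", nearest(centers, new_animal))
```

The output is only a group number. Nothing in the code knows the words cat or dog; it can only say that the new animal sits with one group rather than the other, which is exactly the Lego situation.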

Why use unsupervised learning when it can’t say whether it’s a cat or a dog? (The only thing it can do is say that a picture probably belongs to this group or that one.) There are lots of reasons. One is that in applications such as ‘Google search by image’ this is exactly what we need. We don’t need the image’s name; we need to find out which group the image is most closely related to and get more images from that group. Another is that ‘labeled data’ is very expensive. Imagine 50,000 pictures of cats and dogs; who is going to rename all these images as ‘cat-1’, ‘cat-2’… ‘dog-25637’? Usually we either have to label them ourselves or pay others to do it, and neither is feasible most of the time. But the world is filled with ‘unlabeled data’. Therefore ‘unsupervised learning’ has huge potential in the coming years.

Okay, two down, one to go. The last type we will be discussing will be ‘reinforcement learning’. (I must admit this is where my interest lies).

In both previous types we gave the model (in our case the robot) a learning period. But with ‘reinforcement learning’ we start showing it images from day one and asking it, ‘tell me its name’. Based on its answer we give it positive or negative reinforcement (rewards or punishment). On the very first day I show it an image of a dog and it answers (since it doesn’t have the slightest clue about what’s going on, it will just make a guess). Say it said ‘dog’. So I reinforce it with ‘good robot’.

Then I show it a picture of a cat. It remembers what happened last time: when it said ‘dog’ it got positive feedback. So it thinks, ‘ah, saying dog is good’. It says ‘dog’ this time too. Since that’s wrong, I reinforce it with negative feedback. I say ‘bad robot’. Now the robot wonders, ‘What the hell, this guy said that picture was a dog, and now he says this picture is a cat. I must find a pattern that differentiates the two. Then I will know how to answer correctly and also get “good robot” feedback.’

So bit by bit with trial and error the robot learns patterns in the pictures which separate cats from dogs. With time it becomes very good at answering correctly.
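Here is a minimal sketch of that reward-and-punishment loop. It is a deliberately tiny, bandit-style version of reinforcement learning, not a full algorithm: the robot only notices whether the animal in the picture is ‘small’ or ‘large’ (an assumed stand-in for real image features), keeps a score for each answer, and nudges that score up or down after every ‘good robot’ or ‘bad robot’. The pictures, the 10 kg threshold, the step size and all function names are assumptions for illustration.

```python
ANSWERS = ["dog", "cat"]

# Hypothetical pictures: (weight in kg, true animal). The robot never reads the
# true animal; only the trainer uses it to decide on the feedback.
PICTURES = [(30.0, "dog"), (4.0, "cat"), (25.0, "dog"), (3.5, "cat"), (5.0, "cat")]


def observe(weight):
    """All the robot notices about a picture: is the animal small or large?"""
    return "small" if weight < 10.0 else "large"


def best_answer(values, seen):
    """Pick the answer with the highest score; ties go to the first answer."""
    return max(ANSWERS, key=lambda answer: values[(seen, answer)])


def train(pictures, rounds=3, step=0.5):
    """Answer, receive +1 or -1, and move that answer's score toward the reward."""
    values = {(seen, a): 0.0 for seen in ("small", "large") for a in ANSWERS}
    history = []
    for _ in range(rounds):
        for weight, truth in pictures:
            seen = observe(weight)
            answer = best_answer(values, seen)
            reward = 1.0 if answer == truth else -1.0  # good robot / bad robot
            values[(seen, answer)] += step * (reward - values[(seen, answer)])
            history.append(reward)
    return values, history


values, history = train(PICTURES)
print(history[:5])                          # feedback for the first five pictures
print(best_answer(values, observe(4.2)))    # answer for a new small animal
print(best_answer(values, observe(28.0)))   # answer for a new large animal
```

This follows the story above quite closely: the first guess of ‘dog’ happens to be right, the same guess on the first cat picture earns a ‘bad robot’, and from then on the robot prefers the answer that has paid off for that kind of picture. Real reinforcement learning problems usually also need some deliberate exploration, which this sketch leaves out because the feedback here is never misleading.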

In my opinion, ‘reinforcement learning’ might be the most interesting approach when it comes to using machine learning for robotics, while the other two methods are highly effective in data analysis and related fields.

Here we are at the end of the second step. In the next few steps we will go through supervised learning algorithms and write some simple Python code to try them out. Next step: ‘Baby steps towards machine learning 03 – Nearest Neighbor Algorithm’.

