Understanding Deep Learning (Neural Networks) visually

By now you have probably heard of Deep Learning, Artificial Neural Networks (ANN for short) or simply Neural Networks. These terms all refer to the same family of machine learning algorithms, built on the same principle. In this blog post, let us see visually how a neural network learns by itself from data.

Let us check out this website, which has excellent visuals: playground.tensorflow.org

The Homepage looks like this:

There is a lot to look at here: a play button and a few control options such as Learning Rate, Activation and Problem Type. We can also choose the data (left), the features and the hidden layers, and at the right end we have the visual output.

Let us go through a few key terms that you need to know:

1. Play Button  

When clicked, the algorithm starts learning from the training data. It is also evaluated on the test data if that option is enabled.

2. Learning Rate  

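The rate at which you want your algorithm to learn from data. It controls how large a change the algorithm makes to itself at each learning step: too low and learning is slow, too high and it may overshoot and never settle. You need to choose a suitable value based on your experience with the data.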
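To make the effect of the learning rate concrete, here is a minimal sketch. It is not the Playground's own code: the toy loss (w - 3)^2, the function name `gradient_descent` and the three rates are assumptions chosen only for illustration.

```python
def gradient_descent(learning_rate, steps=50, start=0.0):
    """Minimise the toy loss (w - 3)^2 and return the final value of w."""
    w = start
    for _ in range(steps):
        gradient = 2 * (w - 3)          # derivative of (w - 3)^2
        w -= learning_rate * gradient   # the learning rate scales every update
    return w

# Try a small, a moderate and an overly large learning rate
for rate in (0.003, 0.03, 1.1):
    print(f"learning rate {rate}: w = {gradient_descent(rate):.3f}")
```

The best value of w is 3. In this toy case the small rate is still far from it after 50 steps, the moderate rate gets close, and the overly large rate overshoots further on every step and moves away from 3.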

3. Problem Type  

Here we have two options: Classification or Regression. Classification problems assign each row to a category, such as Pass/Fail, Cancer/No Cancer or Terrorist/No Terrorist. In Regression problems the output is a number, for example when predicting stock values or the amount of rain.

4. Data  

Four different datasets are given, which you can run and test your algorithms on. For our post, we will select the “Exclusive Or” dataset.
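For readers who want to experiment outside the browser, an "Exclusive Or" style dataset can be sketched in a few lines. The function name `make_xor_points`, the value range of -5 to 5 and the +1/-1 labels are assumptions for illustration, not the Playground's exact generator.

```python
import random

def make_xor_points(n, seed=0):
    """Return n tuples (x1, x2, label); label is +1 when x1 and x2
    have the same sign and -1 otherwise."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        x1 = rng.uniform(-5, 5)
        x2 = rng.uniform(-5, 5)
        label = 1 if x1 * x2 >= 0 else -1
        points.append((x1, x2, label))
    return points

points = make_xor_points(200)
print(points[:3])
```

Points in diagonally opposite quadrants share a label, which is why no single straight line can separate the two classes.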

5. Epoch

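One epoch is one full pass of the algorithm over the training data, so the epoch counter tells you how many times it has gone through that data. (Batch Size, another term, is the number of training rows used for each learning step.) Within each epoch, the training data is fed to the algorithm one batch at a time.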
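The relationship between epochs and batches can be sketched as a pair of nested loops. This assumes the common definition in which one epoch is a full pass over the training rows; the helper name `iterate_batches` and the row counts are made up for the example.

```python
def iterate_batches(rows, batch_size):
    """Yield consecutive slices of rows, each with at most batch_size items."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]

train_rows = list(range(35))   # stand-in for 35 training rows
batch_size = 10
epochs = 3

updates = 0
for epoch in range(1, epochs + 1):
    for batch in iterate_batches(train_rows, batch_size):
        updates += 1           # a real trainer would adjust its weights here
    print(f"epoch {epoch}: {updates} weight updates so far")
```

With 35 rows and a batch size of 10, each epoch contains four batches (three full ones and a final batch of 5 rows).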

6. Features

Features are simply the columns of the dataset other than the output column. They are the input data that might affect the output. In this problem, we have the features X1 and X2. Other features are available that are variants of X1 and X2: their squares, their sines, or the product of the two. These extra features are useful when, for example, the square of X1 relates to the output more strongly than X1 itself, so the algorithm learns better from it.
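A small sketch shows why a derived feature can help. It assumes an XOR-style labelling (same sign means +1) and compares a rule based on X1 alone with a rule based on the product X1 * X2; the names and the data are hypothetical.

```python
import random

rng = random.Random(1)
rows = []
for _ in range(100):
    x1, x2 = rng.uniform(-5, 5), rng.uniform(-5, 5)
    label = 1 if x1 * x2 >= 0 else -1   # XOR-style labelling (assumed)
    rows.append((x1, x2, label))

def accuracy(predict):
    """Fraction of rows for which predict(x1, x2) matches the label."""
    hits = sum(1 for x1, x2, label in rows if predict(x1, x2) == label)
    return hits / len(rows)

# Rule that looks at the raw feature X1 only
raw_rule = accuracy(lambda x1, x2: 1 if x1 >= 0 else -1)
# Rule that looks at the derived feature X1 * X2
product_rule = accuracy(lambda x1, x2: 1 if x1 * x2 >= 0 else -1)

print(f"X1 only: {raw_rule:.2f}, X1*X2: {product_rule:.2f}")
```

On this kind of data the sign of X1 * X2 already tells the two classes apart, while X1 on its own does not, so a model given the product feature has much less to learn.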

7. Output

A visual representation of the data. Let us imagine this is cancer cell data, where orange cells are cancerous and blue ones are not. Each cell represents a patient, and we need to build an algorithm that learns by itself which cells are cancerous, based on the input data: X1 (length of the cell) and X2 (breadth of the cell).

8. Ratio of Training to Test data and Show data option

This simply means that, out of 100% of the rows, by default 50% are used for learning (train data) and the remaining 50% are used to test the algorithm (test data).

The Show test data option in the bottom-right corner shows you the cells the algorithm will be tested on, to check whether it gives the right output when working with new data.
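In code, such a split can be sketched as a shuffle followed by a cut. The function name `split_rows` and the fixed seed are assumptions for illustration.

```python
import random

def split_rows(rows, train_ratio, seed=0):
    """Shuffle a copy of rows and split it into (train, test) by train_ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

rows = list(range(100))            # stand-in for 100 rows of data
train, test = split_rows(rows, 0.5)
print(len(train), len(test))
```

Shuffling first matters: if the rows happen to be sorted, cutting without shuffling could put one type of cell mostly in the train data and the other mostly in the test data.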

We know enough terms for now, so let’s make a few changes to the webpage for our convenience. In the bottom part of the webpage, find the options below, change them as per the image below and refresh the page.

Now scroll back up and make the changes below:

1. Data – choose the second dataset, “Exclusive Or”, for our problem. As soon as you select it, you will see the Output on the right change to match the data.

2. Ratio of training to test data – by default it is 50%, make it 70%. In real-world applications, unless it is a rocket-science problem, we tend to choose this ratio. You can change it later as you like.

3. Show Test Data – select this option to see the cells that will be new to the algorithm and that it will be tested on.

4. Learning Rate – keep it as it is, at 0.03

Now click the Play button and keep a close eye on the Output visual.

The algorithm has not learned yet. You can see orange cells and blue cells, with the length of the cell on the X axis and the breadth on the Y axis. You can see visually that there are 4 sections, and 2 of them contain cancerous cells.

Pause it after a few seconds. At epoch 39, you will see the output below from the algorithm.

You can see the machine is learning to identify the different cells and is trying to predict that there are two sections. Epoch 39 means the algorithm has gone through the training cells 39 times, learning from them in sets of 10 cells at a time, to identify the 2 different types of cells.

Note that we haven’t given it any information about how many types of cells we have.

Now click the Play button again and let it run until epoch 000183. The output below will be shown.

You can see the algorithm has almost learned: it is showing 4 sections with 2 different types of cells. The cells with a dark border are test data, which is not used for learning in the application.
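For readers who want to see the same kind of learning outside the browser, here is a minimal sketch in plain Python. It is not the Playground's code: the network size (4 hidden neurons), the tanh activation, the squared-error loss, the update after every single point and all names are assumptions made for illustration.

```python
import math
import random

rng = random.Random(0)

# XOR-style data: label +1 when x1 and x2 share a sign, else -1 (assumed)
data = []
for _ in range(200):
    x1, x2 = rng.uniform(-5, 5), rng.uniform(-5, 5)
    data.append((x1, x2, 1.0 if x1 * x2 >= 0 else -1.0))

HIDDEN = 4
# Each hidden neuron has a weight for x1, a weight for x2 and a bias.
# The output neuron has one weight per hidden neuron plus a bias.
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(HIDDEN)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN + 1)]

def forward(x1, x2):
    """Return (hidden activations, network output) for one point."""
    hidden = [math.tanh(w[0] * x1 + w[1] * x2 + w[2]) for w in w_hidden]
    total = sum(w_out[j] * hidden[j] for j in range(HIDDEN)) + w_out[HIDDEN]
    return hidden, math.tanh(total)

def average_loss(rows):
    """Mean of 0.5 * (output - label)^2 over the rows."""
    return sum(0.5 * (forward(x1, x2)[1] - y) ** 2 for x1, x2, y in rows) / len(rows)

def train_one_epoch(rows, learning_rate):
    """One pass over the rows, updating the weights after every point."""
    for x1, x2, y in rows:
        hidden, out = forward(x1, x2)
        # Gradient of the loss with respect to the output neuron's input
        delta_out = (out - y) * (1 - out ** 2)
        for j in range(HIDDEN):
            # Backpropagate through hidden neuron j before changing w_out[j]
            delta_h = delta_out * w_out[j] * (1 - hidden[j] ** 2)
            w_out[j] -= learning_rate * delta_out * hidden[j]
            w_hidden[j][0] -= learning_rate * delta_h * x1
            w_hidden[j][1] -= learning_rate * delta_h * x2
            w_hidden[j][2] -= learning_rate * delta_h
        w_out[HIDDEN] -= learning_rate * delta_out

loss_before = average_loss(data)
for epoch in range(200):
    train_one_epoch(data, learning_rate=0.03)
loss_after = average_loss(data)
print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

Nothing in this sketch describes where the four sections are. The weights start out random and are only nudged, point by point, in the direction that reduces the error.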

You can see Test Loss and Train Loss at 0.002, which is a really low error. In simple terms, and only as a rough reading since loss is not exactly a percentage of correct answers, you can say the algorithm is about 99.8% correct on the train data (train loss), with the same performance on the test data (test loss).
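Loss and accuracy are related but measured differently, which the following sketch illustrates. The squared-error form 0.5 * (output - label)^2 is an assumption about how such a loss could be computed, and the outputs are hypothetical numbers, not values read from the Playground.

```python
def squared_error_loss(outputs, labels):
    """Mean of 0.5 * (output - label)^2 (assumed form of the loss)."""
    return sum(0.5 * (o - y) ** 2 for o, y in zip(outputs, labels)) / len(labels)

def accuracy(outputs, labels):
    """Fraction of outputs whose sign matches the +1/-1 label."""
    hits = sum(1 for o, y in zip(outputs, labels) if (o >= 0) == (y >= 0))
    return hits / len(labels)

labels = [1, -1, 1, -1]
outputs = [0.9, -0.95, 0.92, -0.97]   # hypothetical network outputs

print("loss:", squared_error_loss(outputs, labels))
print("accuracy:", accuracy(outputs, labels))
```

Here every point falls on the correct side, so the accuracy is perfect, while the loss is small but not zero because the outputs are close to, but not exactly, +1 or -1.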

When the train loss is low but the test loss is high, you have an overfitting problem. It means the algorithm has learned its training data so closely that it does not perform well when new data comes in to be tested.
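A simple way to watch for this in code is to compare the two losses. The function name `looks_overfit` and the tolerance of 0.05 are hypothetical; a sensible threshold depends on the problem.

```python
def looks_overfit(train_loss, test_loss, tolerance=0.05):
    """Flag a run whose test loss exceeds its train loss by more than tolerance."""
    return (test_loss - train_loss) > tolerance

print(looks_overfit(0.002, 0.002))  # losses match, as in the run above
print(looks_overfit(0.01, 0.40))    # low train loss, high test loss
```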

You can see a simple line chart above, which shows how much the algorithm is learning against the number of steps (epochs). After a while, around epoch 35-45, learning starts to level off, because no new types of cells are coming in that would help it learn. Hence it pays to find an optimum number of epochs for good performance, and in code you can specify how many steps the algorithm should take. This matters most with large datasets of millions of rows, where each step (epoch) takes from a few minutes to hours to complete.
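One common way to pick the number of epochs automatically is early stopping: stop once the test loss has not improved for a few epochs in a row. The sketch below is one possible version; the function name `best_stopping_epoch`, the `patience` parameter and the loss values are made up for illustration.

```python
def best_stopping_epoch(test_losses, patience=3):
    """Return the 1-based epoch with the lowest test loss seen, giving up
    once `patience` consecutive epochs fail to improve on it."""
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(test_losses, start=1):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch

# Made-up test losses, one per epoch
history = [0.50, 0.31, 0.12, 0.05, 0.03, 0.03, 0.031, 0.03, 0.029]
print(best_stopping_epoch(history))
```

With these numbers the loss stops improving after the fifth epoch, so training would be cut short there instead of running through every remaining epoch.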

I hope you have now “seen” how it works. Note that we haven’t written any logic to determine the output:

  1. Data is the brain behind the output
  2. Rather than giving the output in one step, it learns slowly
  3. You need to keep testing it using the remaining test data
  4. It’s extremely important that the software does not say a cell is cancerous when it is not (a false positive, or Type I error) and does not miss a cell that is cancerous (a false negative, or Type II error). In this problem even a correctness of 99.8% is not enough
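The two kinds of mistakes in the last point can be counted separately, which a single correctness percentage hides. In this sketch `True` means "cancerous"; the function name `count_errors` and the sample predictions are hypothetical.

```python
def count_errors(predicted, actual):
    """Count false positives (Type I) and false negatives (Type II).
    True means 'cancerous'."""
    false_positives = sum(1 for p, a in zip(predicted, actual) if p and not a)
    false_negatives = sum(1 for p, a in zip(predicted, actual) if not p and a)
    return false_positives, false_negatives

actual    = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]

fp, fn = count_errors(predicted, actual)
print(f"false positives: {fp}, false negatives: {fn}")
```

Reporting the two counts side by side lets you decide which mistake is more costly for the problem at hand and tune the algorithm accordingly.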

These are characteristics of most machine learning algorithms. We are training testers and re-skilling them in AI and Machine Learning algorithms via the Learn With Moolya initiative, and we have already covered more than 60% of the syllabus. If you are interested in joining us, please let us know by emailing learn@moolya.com.

You can watch the Learn-With-Moolya sessions on our YouTube playlist.

Thank you and happy testing!

[If you have missed any session – please watch the videos on our Official YouTube Channel. You can subscribe to be automatically notified about new sessions.]

Author – Riyaj Shaikh | Chief Data Officer at Moolya

2017-08-02T05:37:07+00:00
