
Introduction to Neural Networks

With the evolution of Machine Learning, Artificial Intelligence has also taken a vast leap. Deep Learning is a branch of Machine Learning that uses Neural Networks to solve highly complex problems involving the analysis of multi-dimensional data.

A Neural Network is built to mimic the functionality of the human brain.

A Neural Network mainly consists of 3 different layers:
1. Input Layer
2. Hidden Layer
3. Output Layer

[Figure: A simple neural network diagram showing the input, hidden, and output layers.]


Since Neural Networks are a very broad topic, we will proceed step by step.

First, we will try to implement some of these ideas using Logistic Regression, which is an algorithm for Binary Classification. This is the learning algorithm we use when the output Y in a Supervised Learning problem is either zero or one.

Taking a very basic example: suppose we want to write an algorithm that recognizes whether an image is a picture of a dog or not. We will output our prediction (let's call it ŷ), which is our estimate of the probability that the actual label Y is one.

So let's assume we are given an input image of 64 by 64 pixels. Our system sees each image as three matrices of red, green, and blue pixel intensities, commonly known as RGB. So the overall input has dimensions 64x64x3, where the 3 is due to the RGB channels. Now we will unroll all these pixel values into an input feature vector x of dimension 64 × 64 × 3 = 12288, which will be our input. W will be the parameters we learn, which will be a vector of the same dimension as x. And we will also have a real number b, which will act as a bias in this case.

Now the next question: Given all the input, how do we generate the output ŷ?

We could start with the formula
ŷ = Wᵀx + b

But this value can end up being a very large number or, in the worst case, even negative. Since we need our output to be a probability between 0 and 1, we apply the sigmoid function to this value, where the sigmoid of a value z is the quantity 1/(1 + e^(-z)).

So,
ŷ = sigmoid(Wᵀx + b)
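
As a rough sketch of this forward computation in NumPy (the parameter values and the input vector here are placeholders, just to show the shapes involved):

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the open interval (0, 1).
    return 1 / (1 + np.exp(-z))

n_x = 12288                 # input dimension, 64 * 64 * 3
w = np.zeros((n_x, 1))      # placeholder weight vector
b = 0.0                     # placeholder bias
x = np.random.rand(n_x, 1)  # placeholder input feature vector

# The prediction: y_hat = sigmoid(w^T x + b)
y_hat = sigmoid(np.dot(w.T, x) + b)
print(y_hat)  # a value strictly between 0 and 1
```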

Now, to train the parameters W and b, we also need a cost function. To train them we will be given a training set of m training examples.
So for a given training set {(x^(1), y^(1)), (x^(2), y^(2)), ..., (x^(m), y^(m))}, we want each output ŷ^(i) to be approximately equal to the actual value y^(i). We use a loss function to measure how well our algorithm is performing on a single example.
We define the loss function as:
L(ŷ, y) = -(y log ŷ + (1 - y) log(1 - ŷ))

Now, finally, we define the cost function as the average loss over the whole training set:
J(W, b) = (1/m) Σ L(ŷ^(i), y^(i)),  summed over i = 1, ..., m


So this cost function will be used to train the parameters of our Logistic Regression Model.
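
As a sketch of how the loss and cost could be computed over m training examples (vectorized with NumPy; the data and parameters below are random placeholders, with the m examples stacked column-wise into a matrix X):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Placeholder training set: X is (n_x, m), labels Y in {0, 1} are (1, m).
n_x, m = 12288, 5
X = np.random.rand(n_x, m)
Y = np.random.randint(0, 2, size=(1, m))

w = np.zeros((n_x, 1))  # placeholder weights
b = 0.0                 # placeholder bias

# Predictions for all m examples at once: shape (1, m).
Y_hat = sigmoid(np.dot(w.T, X) + b)

# Loss L(y_hat, y) for each example, then the cost J as their average.
losses = -(Y * np.log(Y_hat) + (1 - Y) * np.log(1 - Y_hat))
cost = np.mean(losses)
print(cost)  # with w = 0, every prediction is 0.5, so cost = log(2) ≈ 0.693
```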

So now that we have a clear idea of how to apply the Logistic Regression model, we will move to the next step towards our goal of learning Neural Networks in the upcoming blog post.
