The ABCs of Neural Networks
I, Rushi Prajapati, welcome you to another blog in my “Simplifying Series”, in which I try to explain complex topics by simplifying them. My first blog in this series was on computer vision; my second was on ML-DL; and now I’m bringing you another one on the fascinating topic of neural networks.
“Imagine a network of interconnected virtual brain cells working together, learning from experience, and continuously improving their performance — that’s the beauty of artificial neural networks.”
What Are Neurons?
Think of neurons as tiny messengers that send and receive messages throughout our body, allowing us to think, feel, move, and interact with the world around us. They are specialized cells that transmit information through electrical and chemical signals.
Why Do We Need Artificial Neural Networks?
They can help computers learn and make decisions, just like how our brains do. They are like smart algorithms that can recognize patterns, solve complex problems, make predictions, and even understand things like images, sounds, and language. It’s like giving a computer the ability to think and learn.
PERCEPTRON
The perceptron is the simplest artificial neural network: it consists of a single neuron. An artificial neural network (ANN) consists of layers of nodes, or neurons, connected by edges.
The perceptron functions in a manner similar to a biological neuron. A biological neuron receives electrical signals through its dendrites, modulates them by various amounts, and fires an output signal through its synapses only when the total strength of the input signals exceeds a certain threshold. That output is then fed to another neuron, and so forth.
To model this biological phenomenon, the artificial neuron performs two consecutive operations:
First operation: Calculate the weighted sum (i.e., a linear combination) of the inputs to represent the total strength of the input signals. The weighted sum multiplies each input by its weight, adds the results together, and then adds a bias term. (This produces a straight-line equation.)
Second operation: Pass the weighted sum to an activation function, which determines whether the neuron should fire an output or not.
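To make these two operations concrete, here is a minimal sketch in Python. The input values, weights, bias, and the simple step activation are all illustrative assumptions, not values from any real model:

```python
# A minimal perceptron forward pass: weighted sum followed by an activation.
# All numbers below are made up for illustration.

def step_activation(z):
    """Fire (output 1) only if the total signal exceeds the threshold of 0."""
    return 1 if z > 0 else 0

def perceptron(inputs, weights, bias):
    # First operation: weighted sum (linear combination) of the inputs.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Second operation: the activation decides whether the neuron fires.
    return step_activation(z)

inputs = [0.5, -1.0, 2.0]   # x1, x2, x3
weights = [0.8, 0.2, -0.5]  # w1, w2, w3
bias = 0.1

print(perceptron(inputs, weights, bias))  # 0, since z = 0.4 - 0.2 - 1.0 + 0.1 = -0.7
```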
Not all input features are equally important (or useful). To represent this, each input node is assigned a weight value, called a connection weight, that reflects its importance: each input feature (x1) gets its own weight (w1) that captures its influence on the decision-making process.
Inputs assigned greater weights have a greater effect on the output, and inputs assigned lower weights have a smaller effect. A high weight amplifies its input signal; a low weight diminishes it.
Now, you might be wondering how exactly the perceptron learns. Well, let me break it down for you in a simpler way.
The perceptron follows a four-step approach, sketched in code after these steps:
Feed Forward: Multiply each input by its weight, sum the results, add the bias, and apply the activation function to make a prediction.
Loss Function: Compare the output with the correct label to calculate the error.
Back Propagation: Update the weights according to the error. If the prediction is too high, adjust the weights to produce a lower prediction next time, and vice versa.
Repeat: Repeat the process until the model reaches the desired accuracy.
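Putting the four steps together, here is a minimal sketch of the classic perceptron learning rule in Python. The dataset (a logical AND gate), the learning rate, and the epoch limit are assumptions chosen for illustration:

```python
# Training a single perceptron on the AND gate with the perceptron learning rule.
# The learning rate and epoch limit are illustrative choices.

def step(z):
    return 1 if z > 0 else 0

# Each row: ([x1, x2], correct label) for logical AND.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights = [0.0, 0.0]
bias = 0.0
lr = 0.1  # learning rate

for epoch in range(20):  # Repeat until the model gets every example right.
    errors = 0
    for x, label in data:
        # Feed forward: weighted sum plus bias, then the activation.
        prediction = step(sum(xi * wi for xi, wi in zip(x, weights)) + bias)
        # Loss: how far off was the prediction? (+1, 0, or -1 here)
        error = label - prediction
        if error != 0:
            errors += 1
            # Update: nudge each weight toward a better prediction.
            weights = [wi + lr * error * xi for xi, wi in zip(x, weights)]
            bias += lr * error
    if errors == 0:
        break

print(weights, bias)  # e.g. [0.2, 0.1] and -0.2: a line that separates AND
```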
Perceptron’s Limitations
The perceptron is made of a single neuron computing a linear function, which means the trained neuron produces a straight line that separates our data. Linearly separable data can be split by a straight line, but more complex data cannot.
For high-dimensional or more complex data, we need multiple neurons in the network to perform well.
The perceptron (a single neuron) works fine with simple data that can be separated by a line (i.e., a linearly separable dataset). When we have a more complex dataset that cannot be separated by a straight line (a non-linear dataset), such as the classic XOR problem, we need an MLP (Multilayer Perceptron).
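To see this limitation concretely, here is the same training loop applied to the XOR truth table. Because no straight line separates XOR’s outputs, the single neuron never reaches zero errors; the epoch count here is an arbitrary illustration:

```python
# XOR is not linearly separable: the same learning rule never reaches zero errors.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]  # logical XOR

weights, bias, lr = [0.0, 0.0], 0.0, 0.1
for epoch in range(1000):
    errors = 0
    for x, label in data:
        prediction = 1 if sum(xi * wi for xi, wi in zip(x, weights)) + bias > 0 else 0
        error = label - prediction
        if error != 0:
            errors += 1
            weights = [wi + lr * error * xi for xi, wi in zip(x, weights)]
            bias += lr * error

print(errors)  # Still misclassifying after 1000 epochs: one neuron cannot learn XOR
```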
Multilayer Perceptron
Think of a multi-layer perceptron as an extension of the basic perceptron we discussed earlier. While the perceptron has only one layer of neurons, an MLP consists of multiple layers of neurons, hence the term “multi-layer”.
The key idea behind an MLP is that each neuron in a layer is connected to every neuron in the subsequent layer. These connections have associated weights that the MLP learns during the training process.
Multilayer Perceptron Architecture
- Input layer: Contains the input feature vector.
- Hidden layers: The core feature learning takes place here, with neurons stacked on top of each other within each layer. They are called hidden because we neither see nor control the inputs going into these layers or the outputs coming from them.
- Weight connections: A weight is assigned to each connection between nodes to reflect that connection’s importance to the final output prediction.
- Output layer: This is the final layer of the MLP, which produces the output or prediction. The number of neurons in the output layer depends on the nature of the problem you are trying to solve.
To make a prediction with a multi-layer perceptron (MLP), the input data is passed through each layer, starting from the input layer and moving through the hidden layers until reaching the output layer.
At each neuron in the MLP, the weighted inputs are summed, a bias term is added, and the result is passed through an activation function. This process repeats for every neuron in every layer: each layer’s neurons receive inputs, perform their calculations, and pass the results on to the next layer, gradually transforming the input data until the output layer produces the prediction.
In simpler terms, the input travels through the MLP, and at each step, calculations are performed by each neuron using weights, biases, and activation functions. This sequential process continues until the final output is obtained.
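Here is a minimal sketch of that layer-by-layer computation in Python with NumPy. The layer sizes, the random (untrained) weights, and the sigmoid activation are illustrative assumptions, not a real trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    # A common smooth activation; squashes any value into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative architecture: 3 input features -> 4 hidden neurons -> 1 output.
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)  # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)  # hidden -> output

def mlp_forward(x):
    # Each layer: weighted sum of its inputs, plus bias, through the activation.
    hidden = sigmoid(x @ W1 + b1)       # hidden-layer neurons fire
    output = sigmoid(hidden @ W2 + b2)  # output layer produces the prediction
    return output

x = np.array([0.5, -1.0, 2.0])  # one input feature vector
print(mlp_forward(x))  # a value in (0, 1); the weights are untrained, so it is not meaningful yet
```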
Additionally, techniques like transfer learning, reinforcement learning, and generative adversarial networks (GANs) have expanded the capabilities of neural networks, allowing them to tackle diverse problem domains and generate realistic outputs.
CONCLUSION
In conclusion, neural networks have proven to be a powerful tool in various domains, ranging from computer vision and natural language processing to finance and healthcare. They excel at learning complex patterns and making accurate predictions from large amounts of data. With advancements in hardware and algorithms, neural networks continue to evolve, pushing the boundaries of what is possible in artificial intelligence. Artificial Neural Networks (ANNs), Perceptrons, and Multi-Layer Perceptrons (MLPs) are fundamental concepts in the field of deep learning that every aspiring data scientist or AI enthusiast should be familiar with.
In upcoming blogs, I will delve into more advanced neural networks, such as convolutional neural networks (CNNs) and R-CNNs, and also cover computer vision topics like image classification and object detection, providing detailed explanations in simple terms.
With each blog in my “Simplifying Series”, I want to make these complex topics easier to understand. I want you to feel confident exploring and learning about data science.
Keep an eye out for more blogs in the “Simplifying Series.”