🔰 What is a Neural Network?
A neural network is a type of algorithm in Artificial Intelligence that tries to mimic how the human brain processes information.
Just like the brain has neurons connected to each other, a neural network is made up of artificial neurons (also called nodes or units) arranged in layers. These neurons work together to learn patterns from data and make decisions or predictions.
⚙️ Basic Working
- ➡️ Forward Propagation: Data flows from input to output.
- 🤖 Prediction: The network makes a prediction.
- 📉 Loss Calculation: The error is measured.
- 🔄 Backward Propagation: The error is sent backward to improve the model.
🧪 Example
Suppose you want to teach a neural network to identify if an image contains a cat:
- 📥 Input: You feed it images.
- 🧠 Hidden Layers: The network tries to learn features like eyes, ears, shape.
- 📤 Output: Finally, it predicts “cat” or “no cat”.
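To make this concrete, here is a minimal sketch of such a classifier using Keras (this assumes TensorFlow is installed; the 64×64 RGB image size, the layer sizes, and the `train_images`/`train_labels` names are hypothetical placeholders, not a prescription):

```python
# Minimal sketch of a cat / no-cat classifier, assuming TensorFlow/Keras is available.
# The image size and layer sizes are hypothetical example choices.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 3)),               # input layer: raw pixel values
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),   # hidden layer: learns features (edges, ears, shape)
    tf.keras.layers.Dense(1, activation="sigmoid"),  # output layer: probability of "cat"
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=10)   # placeholder data, shown for shape only
```

The `Flatten` step is the input layer, the `Dense(128)` layer plays the role of the hidden layers, and the final sigmoid unit outputs "cat" vs. "no cat".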
🧠 Perceptron and Multi-Layer Perceptron (MLP)
What is a Perceptron?
A Perceptron is the simplest type of neural network unit — it’s like a mini decision-making machine.
How it works:
- It takes multiple input values (like numbers),
- Multiplies each input by a weight,
- Adds them up with a bias,
- Passes the result through an activation function to decide the output (e.g., 0 or 1).
Output = Activation(w₁×x₁ + w₂×x₂ + ... + wₙ×xₙ + b)
📘 Example:
Imagine a student deciding if they should go out:
- x₁ = “Is it raining?” → 0 or 1
- x₂ = “Do I have an umbrella?” → 0 or 1
- The perceptron weighs these and decides Yes/No.
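Here is a tiny, hand-wired sketch of that perceptron in plain Python. The weights and bias are made-up values chosen to illustrate the formula above, not learned ones:

```python
# A tiny perceptron for the "should I go out?" example.
# Weights and bias are illustrative values, not trained.

def step(z):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    return 1 if z > 0 else 0

def perceptron(x, w, b):
    # Weighted sum of inputs plus bias, passed through the activation
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return step(z)

# x1 = "Is it raining?", x2 = "Do I have an umbrella?"
w = [-2.0, 1.5]   # rain discourages going out, an umbrella helps
b = 1.0           # mild default preference for going out

print(perceptron([1, 0], w, b))  # raining, no umbrella -> 0 (stay in)
print(perceptron([1, 1], w, b))  # raining, with umbrella -> 1 (go out)
print(perceptron([0, 0], w, b))  # not raining -> 1 (go out)
```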
🤖 What is a Multi-Layer Perceptron (MLP)?
An MLP is a type of feedforward neural network that has:
- ✅ One input layer
- ✅ One or more hidden layers
- ✅ One output layer
Think of an MLP like a factory:
- Input layer = raw materials
- Hidden layers = machines that refine and process the material
- Output layer = finished product (decision)
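A quick sketch of what that structure looks like in code, using NumPy to hold one weight matrix and one bias vector per connection between layers (the layer sizes are arbitrary examples):

```python
# Sketch of an MLP's structure: one input layer, two hidden layers, one output layer.
# Layer sizes are arbitrary example choices.
import numpy as np

layer_sizes = [4, 8, 8, 1]   # input -> hidden -> hidden -> output
rng = np.random.default_rng(42)

# One weight matrix and bias vector per connection between consecutive layers
weights = [rng.normal(size=(n_out, n_in)) * 0.1
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

for i, (W, b) in enumerate(zip(weights, biases), start=1):
    print(f"layer {i}: weights {W.shape}, biases {b.shape}")
```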
⚡ Activation Functions
What is it?
An activation function decides whether a neuron should be “activated” (i.e., pass its signal forward). It adds non-linearity, allowing neural networks to learn complex patterns.
📚 Why Do We Need It?
Without activation functions, the entire neural network would act like a simple linear equation, no matter how many layers you add. That means it couldn’t model real-world, non-linear data like images, language, or sound.
🧪 Common Activation Functions
| Function | Formula | Range | Use Case / Notes |
|---|---|---|---|
| Sigmoid | 1 / (1 + e^(−x)) | (0, 1) | Good for binary output; can cause vanishing gradients |
| Tanh | (e^x − e^(−x)) / (e^x + e^(−x)) | (−1, 1) | Zero-centered, so often better than sigmoid in hidden layers |
| ReLU | max(0, x) | [0, ∞) | Very fast, popular for hidden layers; can cause “dead neurons” |
| Leaky ReLU | x if x > 0, else αx | (−∞, ∞) | Avoids dead neurons with a small slope for negative x |
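All four can be written in a few lines of NumPy; the α for Leaky ReLU is just a small constant (0.01 below):

```python
# The four activation functions from the table, working elementwise on arrays.
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))           # squashes to (0, 1)

def tanh(x):
    return np.tanh(x)                      # squashes to (-1, 1), zero-centered

def relu(x):
    return np.maximum(0, x)                # keeps positives, zeros out negatives

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)   # small slope instead of zero for negatives

z = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(sigmoid(z), tanh(z), relu(z), leaky_relu(z), sep="\n")
```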
🖼️ Simple Analogy
Think of an activation function like a filter:
- 🔹 If a signal is strong enough, it passes through (ReLU says “ok, you’re positive!”).
- 🔸 If it’s too weak, it’s blocked or squashed (Sigmoid says “you’re close to 0”).
🔄 Forward and Backward Propagation
These two processes are how a neural network learns from data. Think of them as the input-to-output journey and the error correction feedback loop.
1️⃣ Forward Propagation
This is when the input moves forward through the network, layer by layer, until it reaches the output.
🔹 Steps:
- Input data enters the input layer.
- Each neuron:
  - Multiplies its inputs by weights
  - Adds a bias
  - Applies an activation function
- The result is passed to the next layer until the final output is produced.
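Here is a minimal NumPy sketch of those steps; the layer sizes, random weights, and function names are illustrative only:

```python
# Minimal forward-propagation sketch: each layer multiplies its inputs by weights,
# adds a bias, applies an activation, and passes the result to the next layer.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(x, weights, biases):
    a = x
    for W, b in zip(weights, biases):
        z = W @ a + b        # multiply inputs by weights and add a bias
        a = sigmoid(z)       # apply the activation function
    return a                 # output of the final layer = the prediction

rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 2)), rng.normal(size=(1, 3))]  # a 2 -> 3 -> 1 network
biases  = [np.zeros(3), np.zeros(1)]

print(forward(np.array([0.5, -1.0]), weights, biases))  # a single probability-like output
```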
2️⃣ Backward Propagation (Backprop)
Once the prediction is made, the network checks how wrong it was. This is where the learning happens.
🔹 Steps:
- Calculate the error (the difference between the predicted and actual result) using a loss function.
- Use Gradient Descent to:
  - Calculate how much each weight contributed to the error (via the chain rule from calculus).
  - Update the weights to reduce the error next time.
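A worked sketch for the simplest possible case, one sigmoid neuron with a squared-error loss, shows the chain rule and the weight update explicitly (all numbers are made up for illustration):

```python
# One forward pass and one backward pass for a single sigmoid neuron.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

x = np.array([1.0, 0.5])   # inputs
y_true = 1.0               # desired output
w = np.array([0.2, -0.3])  # current weights
b = 0.0                    # current bias
lr = 0.5                   # learning rate

# forward pass
z = w @ x + b
y_hat = sigmoid(z)
print("loss before:", (y_hat - y_true) ** 2)

# backward pass: chain rule, dLoss/dw = dLoss/dy_hat * dy_hat/dz * dz/dw
dloss_dyhat = 2 * (y_hat - y_true)
dyhat_dz = y_hat * (1 - y_hat)
grad_w = dloss_dyhat * dyhat_dz * x    # dz/dw = x
grad_b = dloss_dyhat * dyhat_dz        # dz/db = 1

# gradient descent update: nudge parameters against the gradient
w -= lr * grad_w
b -= lr * grad_b
print("loss after:", (sigmoid(w @ x + b) - y_true) ** 2)  # slightly smaller
```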
🎯 Combined Cycle:
- Forward Propagation → Output
- Compare with True Output → Loss
- Backward Propagation → Adjust Weights
- Repeat for many examples (epochs)
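Putting the whole cycle together, here is a sketch that trains a single sigmoid neuron on the logical AND function, repeating forward pass, error, backward pass, and weight update for many epochs (the learning rate and epoch count are arbitrary choices):

```python
# The full learning cycle on a tiny dataset: the logical AND truth table.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)   # AND truth table

rng = np.random.default_rng(1)
w, b, lr = rng.normal(size=2), 0.0, 0.5

for epoch in range(2000):
    for xi, yi in zip(X, y):
        y_hat = sigmoid(w @ xi + b)   # 1. forward propagation -> output
        grad = y_hat - yi             # 2. compare with the true output (cross-entropy gradient w.r.t. z)
        w -= lr * grad * xi           # 3. backward propagation: adjust weights
        b -= lr * grad                # 4. ...and the bias; then repeat

print([round(float(sigmoid(w @ xi + b)), 2) for xi in X])  # moves toward [0, 0, 0, 1]
```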
💥 Loss Functions
A loss function measures how wrong the model’s prediction is. The goal of training a neural network is to minimize the loss, so that the model becomes more accurate.
🔍 Why is it important?
Loss is like a teacher’s feedback — it tells the network how bad its answer was. Backpropagation uses this feedback to adjust the model and make it better.
| Loss Function | Formula (Simplified) | Use Case | Explanation |
|---|---|---|---|
| Mean Squared Error (MSE) | (1/n) × Σ(y − ŷ)² | Regression | Penalizes large errors more (squares the difference). |
| Mean Absolute Error (MAE) | (1/n) × Σ\|y − ŷ\| | Regression | Measures the average absolute difference; less sensitive to outliers. |
| Cross-Entropy Loss | −Σ y × log(ŷ) | Classification | Great for probabilistic output; harsh on confident wrong predictions. |
📘 y = actual label (ground truth), ŷ = model prediction
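The same three losses written out in NumPy (the clipping in cross-entropy only avoids log(0); the function names are just examples):

```python
# The three loss functions from the table. y is the ground truth, y_hat the prediction.
import numpy as np

def mse(y, y_hat):
    return np.mean((y - y_hat) ** 2)        # mean squared error

def mae(y, y_hat):
    return np.mean(np.abs(y - y_hat))       # mean absolute error

def cross_entropy(y, y_hat, eps=1e-12):
    y_hat = np.clip(y_hat, eps, 1 - eps)    # avoid log(0)
    return -np.sum(y * np.log(y_hat))       # for one-hot / probabilistic targets

y_true = np.array([10.0, 12.0])
y_pred = np.array([9.5, 14.0])
print(mse(y_true, y_pred), mae(y_true, y_pred))

# classification example: the true class is the second of three
print(cross_entropy(np.array([0, 1, 0]), np.array([0.1, 0.8, 0.1])))
```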
🧠 Simple Analogy:
- If the correct answer is 10:
  - Guess 5 → MSE = (10 − 5)² = 25 → ❌ big penalty
  - Guess 9.5 → MSE = (10 − 9.5)² = 0.25 → ✅ small penalty
- Cross-Entropy would punish you more if you were very confident and wrong.
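A quick numeric check of that analogy:

```python
# Squared error for the two guesses against a true value of 10, and
# cross-entropy for a confident-wrong prediction versus an unsure one.
import math

print((10 - 5) ** 2)      # 25   -> big penalty
print((10 - 9.5) ** 2)    # 0.25 -> small penalty

# Cross-entropy when the true label is 1:
print(-math.log(0.01))    # ~4.6: very confident and wrong -> huge penalty
print(-math.log(0.6))     # ~0.5: unsure but leaning right -> mild penalty
```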
✅ Summary of This Section
- ✅ Neural networks use forward propagation to make predictions.
- ✅ They use backward propagation to learn from mistakes.
- ✅ Loss functions are the key to telling the model how to improve.