🔰 What is a Neural Network?

A neural network is a type of algorithm in Artificial Intelligence that tries to mimic how the human brain processes information. Just like the brain has neurons connected to each other, a neural network is made up of artificial neurons (also called nodes or units) arranged in layers. These neurons work together to learn patterns from data and make decisions or predictions.

⚙️ Basic Working

  • ➡️ Forward Propagation: Data flows from input to output.
  • 🤖 Prediction: The network makes a prediction.
  • 📉 Loss Calculation: The error is measured.
  • 🔄 Backward Propagation: The error is sent backward to improve the model.

🧪 Example

Suppose you want to teach a neural network to identify if an image contains a cat:
  • 📥 Input: You feed it images.
  • 🧠 Hidden Layers: The network tries to learn features like eyes, ears, shape.
  • 📤 Output: Finally, it predicts “cat” or “no cat”.
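💻 A minimal sketch of what that input actually looks like to the network: an image is just a grid of numbers, flattened into one long vector. The 4×4 size and the 0.92 probability below are made-up values for illustration, not a real cat detector.

```python
import numpy as np

# A tiny grayscale "image": 4x4 pixels with values between 0 and 1.
# Real cat photos would be much larger (e.g. 224x224x3), but the idea is the same.
image = np.random.rand(4, 4)

# The input layer of a plain neural network expects a flat vector of numbers.
input_vector = image.flatten()          # shape: (16,)
print(input_vector.shape)               # -> (16,)

# The output layer would produce a single probability, e.g. 0.92,
# which we threshold to decide "cat" (1) or "no cat" (0).
probability = 0.92                      # placeholder value for illustration
prediction = "cat" if probability > 0.5 else "no cat"
print(prediction)
```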

🧠 Perceptron and Multi-Layer Perceptron (MLP)

What is a Perceptron? A Perceptron is the simplest type of neural network unit: a mini decision-making machine.
How it works:
  • It takes multiple input values (like numbers),
  • Multiplies each input by a weight,
  • Adds them up with a bias,
  • Passes the result through an activation function to decide the output (e.g., 0 or 1).
🧮 Mathematical Form: Output = Activation(w₁×x₁ + w₂×x₂ + ... + wₙ×xₙ + b)
📘 Example: Imagine a student deciding whether to go out:
  • x₁ = “Is it raining?” → 0 or 1
  • x₂ = “Do I have an umbrella?” → 0 or 1
  • The perceptron weighs these and decides Yes/No.
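💻 A minimal sketch of that perceptron in Python (using NumPy). The weights and bias below are picked by hand for illustration; a real perceptron would learn them from data.

```python
import numpy as np

def perceptron(inputs, weights, bias):
    """A single perceptron: weighted sum + bias, then a step activation."""
    total = np.dot(inputs, weights) + bias
    return 1 if total > 0 else 0          # step function: fire (1) or not (0)

# Hypothetical weights for the "should I go out?" example:
# rain counts against going out, having an umbrella counts for it.
x = np.array([1, 1])                      # x1 = it is raining, x2 = I have an umbrella
w = np.array([-2.0, 3.0])                 # illustrative weights, not learned
b = -0.5                                  # bias

print(perceptron(x, w, b))                # -> 1 (go out: the umbrella outweighs the rain)
```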

🤖 What is a Multi-Layer Perceptron (MLP)?

An MLP is a type of feedforward neural network that has:
  • ✅ One input layer
  • ✅ One or more hidden layers
  • ✅ One output layer
🔸 Each neuron in one layer is connected to every neuron in the next, hence “fully connected”.
🔸 Hidden layers help extract features and model complex patterns.
🛠️ Real-world analogy: Think of an MLP like a decision factory:
  • Input layer = raw materials
  • Hidden layers = machines that refine and process the material
  • Output layer = finished product (decision)
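💻 A minimal sketch of an MLP as a stack of fully connected layers, assuming PyTorch is available. The layer sizes (16 → 8 → 1) are arbitrary; read them as input layer → hidden layer → output layer from the factory analogy above.

```python
import torch
import torch.nn as nn

# A minimal MLP: input layer -> one hidden layer -> output layer.
# 16 inputs could be the flattened 4x4 "image" from the earlier example.
mlp = nn.Sequential(
    nn.Linear(16, 8),    # input layer -> hidden layer (fully connected)
    nn.ReLU(),           # non-linear activation in the hidden layer
    nn.Linear(8, 1),     # hidden layer -> output layer
    nn.Sigmoid(),        # squash the output into (0, 1), e.g. "probability of cat"
)

x = torch.rand(1, 16)    # one example with 16 input features
print(mlp(x))            # an untrained prediction, some value between 0 and 1
```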

⚡ Activation Functions

What is it? An activation function decides whether a neuron should be “activated”, i.e. pass its signal forward. It adds non-linearity, allowing neural networks to learn complex patterns.
📚 Why Do We Need It? Without activation functions, the entire neural network would behave like a single linear equation, no matter how many layers you add. That means it couldn’t model real-world, non-linear data like images, language, or sound.
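💻 A quick sanity check of that claim: two linear layers with no activation in between collapse into one linear layer. The weights below are random, purely to show that the two computations give the same result.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.random(3)                              # one input example with 3 features
W1, b1 = rng.random((4, 3)), rng.random(4)     # "layer 1" weights and bias
W2, b2 = rng.random((2, 4)), rng.random(2)     # "layer 2" weights and bias

# Two linear layers with NO activation in between...
two_layers = W2 @ (W1 @ x + b1) + b2

# ...collapse into a single linear layer with combined weights and bias.
W_combined = W2 @ W1
b_combined = W2 @ b1 + b2
one_layer = W_combined @ x + b_combined

print(np.allclose(two_layers, one_layer))      # -> True
```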

🧪 Common Activation Functions

| Function | Formula | Range | Use Case / Notes |
|---|---|---|---|
| Sigmoid | 1 / (1 + e⁻ˣ) | (0, 1) | Good for binary output; can cause vanishing gradients |
| Tanh | (eˣ − e⁻ˣ) / (eˣ + e⁻ˣ) | (−1, 1) | Better than sigmoid in hidden layers (centered around 0) |
| ReLU | max(0, x) | [0, ∞) | Very fast, popular for hidden layers; can cause “dead neurons” |
| Leaky ReLU | x if x > 0, else αx | (−∞, ∞) | Avoids dead neurons with a small slope for negative x |
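💻 The four functions from the table, sketched in NumPy so you can see what they do to a few sample values:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))              # squashes any value into (0, 1)

def tanh(x):
    return np.tanh(x)                         # squashes into (-1, 1), zero-centered

def relu(x):
    return np.maximum(0, x)                   # keeps positives, zeroes out negatives

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)      # small slope for negatives avoids "dead neurons"

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(x))       # values between 0 and 1
print(tanh(x))          # values between -1 and 1
print(relu(x))          # [0.  0.  0.  0.5 2. ]
print(leaky_relu(x))    # [-0.02  -0.005  0.  0.5  2. ]
```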

🖼️ Simple Analogy

Think of an activation function like a filter:
  • 🔹 If a signal is strong enough, it passes through (ReLU says “ok, you’re positive!”).
  • 🔸 If it’s too weak, it’s blocked or squashed (Sigmoid says “you’re close to 0”).

🔄 Forward and Backward Propagation

These two processes are how a neural network learns from data. Think of them as the input-to-output journey and the error correction feedback loop.

1️⃣ Forward Propagation

This is when the input moves forward through the network, layer by layer, until it reaches the output.
🔹 Steps:
  • Input data enters the input layer.
  • Each neuron:
    • Multiplies inputs by weights
    • Adds a bias
    • Applies an activation function
  • The result is passed to the next layer until the final output is produced.
🧠 Analogy: Like making a prediction without knowing if it’s right yet — like guessing an exam answer.
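💻 A minimal sketch of forward propagation through a tiny 3 → 4 → 1 network in NumPy. The weights are random (untrained), so the prediction is just a guess, exactly like the exam analogy:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Tiny network: 3 inputs -> 4 hidden neurons -> 1 output.
rng = np.random.default_rng(42)
W1, b1 = rng.standard_normal((4, 3)), np.zeros(4)
W2, b2 = rng.standard_normal((1, 4)), np.zeros(1)

x = np.array([0.5, -1.2, 2.0])       # one input example

hidden = relu(W1 @ x + b1)           # each hidden neuron: weights, bias, activation
output = sigmoid(W2 @ hidden + b2)   # same again in the output layer

print(output)                        # the untrained prediction, a value between 0 and 1
```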

2️⃣ Backward Propagation (Backprop)

Once the prediction is made, the network checks how wrong it was; this is where learning happens.
🔹 Steps:
  • Calculate error (difference between predicted and actual result) using a loss function.
  • Work out how much each weight contributed to the error, using the chain rule from calculus (this is the “backpropagation” of the error).
  • Use Gradient Descent to update the weights so the error is smaller next time.
📉 Goal: Minimize the loss by adjusting weights in the right direction.
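💻 A minimal sketch of one backward pass and one gradient descent update for a single linear neuron with a squared-error loss. The numbers (input 2, target 10, learning rate 0.05) are made up for illustration.

```python
# One linear neuron: prediction = w * x + b, loss = (prediction - y)^2
x, y = 2.0, 10.0            # one training example: input 2, true answer 10
w, b = 0.5, 0.0             # initial guesses for the parameters
lr = 0.05                   # learning rate (step size for gradient descent)

pred = w * x + b                        # forward pass
loss = (pred - y) ** 2                  # squared error

# Backward pass: the chain rule gives the gradient of the loss w.r.t. w and b.
dloss_dpred = 2 * (pred - y)            # d(loss)/d(pred)
grad_w = dloss_dpred * x                # d(pred)/d(w) = x
grad_b = dloss_dpred * 1                # d(pred)/d(b) = 1

# Gradient descent: nudge each parameter against its gradient.
w -= lr * grad_w
b -= lr * grad_b

print(loss)                             # 81.0 before the update
print(w, b)                             # parameters have moved toward a better fit
```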

🎯 Combined Cycle:

  • Forward Propagation → Output
  • Compare with True Output → Loss
  • Backward Propagation → Adjust Weights
  • Repeat for many examples (epochs)
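💻 The whole cycle in one place, sketched with PyTorch (assuming it is installed). The toy task, learning y = 2x + 1 from four points, stands in for “many examples over many epochs”.

```python
import torch
import torch.nn as nn

# Toy data: learn y = 2x + 1 from a few points.
X = torch.tensor([[0.0], [1.0], [2.0], [3.0]])
y = torch.tensor([[1.0], [3.0], [5.0], [7.0]])

model = nn.Linear(1, 1)                       # a single-neuron "network"
loss_fn = nn.MSELoss()                        # loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

for epoch in range(200):                      # 4. repeat for many epochs
    pred = model(X)                           # 1. forward propagation -> output
    loss = loss_fn(pred, y)                   # 2. compare with true output -> loss
    optimizer.zero_grad()
    loss.backward()                           # 3. backward propagation -> gradients
    optimizer.step()                          #    adjust weights

print(loss.item())                            # loss should now be close to 0
```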

💥 Loss Functions

A loss function measures how wrong the model’s prediction is. The goal of training a neural network is to minimize the loss, so that the model becomes more accurate.
🔍 Why is it important? Loss is like a teacher’s feedback: it tells the network how bad its answer was. Backpropagation uses this feedback to adjust the model and make it better.
| Loss Function | Formula (Simplified) | Use Case | Explanation |
|---|---|---|---|
| Mean Squared Error (MSE) | (1/n) Σ (y − ŷ)² | Regression | Penalizes large errors more (squares the difference). |
| Mean Absolute Error (MAE) | (1/n) Σ \|y − ŷ\| | Regression | Measures average absolute difference; less sensitive to outliers. |
| Cross-Entropy Loss | −Σ y × log(ŷ) | Classification | Great for probabilistic output; harsh on confident wrong predictions. |

📘 y = actual label (ground truth), ŷ = model prediction
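💻 The three loss functions sketched in NumPy. Note that the cross-entropy here is the binary form, which matches a “cat / no cat” style of output; the table shows the general −Σ y log(ŷ) version.

```python
import numpy as np

def mse(y, y_hat):
    return np.mean((y - y_hat) ** 2)          # squares errors, so big mistakes hurt more

def mae(y, y_hat):
    return np.mean(np.abs(y - y_hat))         # average absolute error

def binary_cross_entropy(y, y_hat, eps=1e-12):
    y_hat = np.clip(y_hat, eps, 1 - eps)      # avoid log(0)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

y_true = np.array([1.0, 0.0, 1.0])            # ground-truth labels
y_pred = np.array([0.9, 0.2, 0.6])            # model's predicted probabilities

print(mse(y_true, y_pred))                    # ~0.07
print(mae(y_true, y_pred))                    # ~0.23
print(binary_cross_entropy(y_true, y_pred))   # ~0.28
```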

🧠 Simple Analogy:

Suppose the correct answer is 10:
  • Guess 5 → MSE = (10−5)² = 25 → ❌ big penalty
  • Guess 9.5 → MSE = (10−9.5)² = 0.25 → ✅ small penalty
  • Cross-Entropy would punish you more if you were very confident and wrong.

✅ Summary of This Section

  • ✅ Neural networks use forward propagation to make predictions.
  • ✅ They use backward propagation to learn from mistakes.
  • ✅ Loss functions are the key to telling the model how to improve.