🏗️ Architecture Basics

The architecture of a Large Language Model (LLM) refers to the internal structure and mechanisms that allow the model to understand, process, and generate human-like text. Most modern LLMs are based on a highly successful design known as the Transformer architecture.

🏛️ Main Types of AI/LLM Architectures

1️⃣ Transformer Architecture

The backbone of modern LLMs

  • Introduced by Vaswani et al. at Google in “Attention Is All You Need” (2017)
  • Uses self-attention to process entire sequences in parallel (see the sketch after this list)
  • Captures long-range dependencies far better than earlier recurrent models
  • Examples: GPT series, BERT, T5, LLaMA, Gemini
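
To make the self-attention idea concrete, here is a minimal NumPy sketch of scaled dot-product self-attention. The sequence length, dimensions, and random projection matrices are illustrative assumptions, not taken from any specific model.

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a sequence X of shape (seq_len, d_model)."""
    Q = X @ W_q   # queries (seq_len, d_k)
    K = X @ W_k   # keys    (seq_len, d_k)
    V = X @ W_v   # values  (seq_len, d_v)
    scores = Q @ K.T / np.sqrt(K.shape[-1])            # every token scored against every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over the sequence
    return weights @ V   # each output token is a weighted mix of all value vectors

# Illustrative sizes: 4 tokens, model width 8, head width 4
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 4)
```

Because every token attends to every other token in a single matrix multiplication, the whole sequence is processed at once, which is what makes long-range dependencies easy to capture.
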
2️⃣ Encoder-Decoder Architecture

Common in translation and summarization

  • Encoder: Reads and understands input text
  • Decoder: Generates output from encoded understanding
  • Best when the output differs structurally from the input (e.g., translation, as sketched below)
  • Examples: T5, BART, mT5
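
A minimal usage sketch of the encoder-decoder pattern, assuming the Hugging Face transformers package is installed; the small public t5-small checkpoint is used purely for illustration. The encoder reads the prefixed input, and the decoder generates the translated output.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Illustrative encoder-decoder use: English-to-German translation with a small public T5 checkpoint.
tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# The encoder reads the (prefixed) input; the decoder generates the output sequence token by token.
inputs = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```
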
3️⃣ Decoder-Only Architecture

Widely used for text generation tasks

  • Uses only the decoder block
  • Predicts the next token from the preceding context (autoregressive generation, as sketched below)
  • Well suited to chat, writing, and question answering
  • Examples: GPT-2, GPT-3, GPT-4, Claude
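
A minimal sketch of the decoder-only pattern, assuming the Hugging Face transformers and torch packages; the small public gpt2 checkpoint stands in for larger models. It shows a single next-token prediction, then lets the model extend the prompt autoregressively.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Illustrative decoder-only use: predict the next token, then autoregressively extend the prompt.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The Transformer architecture is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits                 # (1, seq_len, vocab_size)
next_token_id = logits[0, -1].argmax().item()        # most likely next token given the previous context
print("Next token:", tokenizer.decode(next_token_id))

# The same model can keep sampling one token at a time to generate longer text.
generated = model.generate(input_ids, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```
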
4️⃣ Encoder-Only Architecture

Focused on understanding, not generating

  • Uses only the encoder stack to build contextual representations of the input
  • Great for classification, semantic search, and other NLU tasks (see the embedding sketch below)
  • Examples: BERT, RoBERTa, DeBERTa
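
A minimal sketch of the encoder-only pattern for semantic search, assuming the Hugging Face transformers and torch packages and the public bert-base-uncased checkpoint. Mean pooling the hidden states is a simple illustrative choice; production semantic-search systems usually use dedicated sentence encoders.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Illustrative encoder-only use: turn sentences into embeddings and compare them.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(sentence: str) -> torch.Tensor:
    """Mean-pool the encoder's final hidden states into one vector per sentence."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (1, seq_len, hidden_size)
    return hidden.mean(dim=1).squeeze(0)

a = embed("How do I reset my password?")
b = embed("I forgot my login credentials.")
c = embed("The weather is sunny today.")
cos = torch.nn.functional.cosine_similarity
print(cos(a, b, dim=0).item(), cos(a, c, dim=0).item())  # the related pair should score higher
```
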

🎯 Quick Summary Table

| Architecture Type | Focus | Examples |
|---|---|---|
| Transformer | Language understanding/generation | GPT, BERT, T5 |
| Encoder-Decoder | Translation, summarization | T5, BART, mT5 |
| Decoder-Only | Text generation | GPT-2, GPT-3, GPT-4 |
| Encoder-Only | Text understanding | BERT, RoBERTa, DeBERTa |
| Retrieval-Augmented (RAG) | Knowledge retrieval + generation | RAG, OpenAI Retrieval Plugin |
| Mixture of Experts (MoE) | Efficient scaling | Switch Transformer, GShard, Mixtral |
| Multimodal | Text + images + video | GPT-4V, Gemini 1.5, Flamingo |