Generative AI Notes

1. What is Generative AI (Gen AI)?

Generative AI (Gen AI) is a type of AI that can create new content – like text, images, audio, or video.

It learns patterns from existing data and uses them to produce new content, much as a person draws on experience when creating something.

Examples:

ChatGPT → writes text
DALL·E → creates images

Example Scenario:
You type:

"Write a short poem about cats."

The AI generates a brand-new poem.

2. What is ChatGPT?

ChatGPT is an AI chatbot developed by OpenAI.
It is based on GPT (Generative Pretrained Transformer) models.

How it works:

You type text (the prompt).
The Transformer model processes your input and captures its context.
It predicts the next token (roughly, the next word or word piece) repeatedly to form a complete reply.

Example:
Input:

"Hello, how are you?"

Output:

"I'm good! How about you?"

3. Google Paper – “Attention Is All You Need”

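Published by researchers at Google in 2017.
Introduced the Transformer architecture, which replaced the recurrence of RNNs & LSTMs with attention.
Core Idea: Self-Attention → the model weighs how relevant every word is to every other word, processing the whole sequence in parallel instead of one word at a time.

Foundation for: GPT, BERT, etc.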

Example:
Sentence: "The cat sat on the mat"

The word "cat" may pay more attention to "sat" → the action is important for its meaning in this sentence.

4. What is a Transformer & How it Predicts the Next Word

A Transformer is a neural network that uses self-attention to model relationships between the words (tokens) in a sequence.

Steps:

Input sequence (tokens)
Apply embeddings + self-attention + feed-forward layers
Output → probability distribution over the vocabulary for the next word (via Softmax)

5. How GPT Generates the Next Word

Input text → split into tokens, each mapped to an embedding vector
Passed through Transformer layers
Output layer gives a probability for every token in the vocabulary
Model picks the token with the highest probability (greedy decoding) or samples from the distribution
Repeat until a stop token or a length limit is reached

Example:
Input: "I am feeling"
Probabilities:

"happy" → 0.6
"sad" → 0.3
"hungry" → 0.1

GPT picks "happy" → Output: "I am feeling happy"
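The loop above can be sketched in a few lines of Python. This is a toy illustration, not how GPT is implemented: `TOY_MODEL` and `next_word_probs` are hypothetical stand-ins for the trained network, and every probability beyond the first entry is made up.

```python
import random

# Hypothetical stand-in for a trained model: maps the text so far to
# next-word probabilities. A real GPT computes these with Transformer layers.
TOY_MODEL = {
    "I am feeling": {"happy": 0.6, "sad": 0.3, "hungry": 0.1},
    "I am feeling happy": {"today": 0.7, "<end>": 0.3},
    "I am feeling happy today": {"<end>": 1.0},
}

def next_word_probs(text):
    # Unknown contexts simply end the sentence in this toy model.
    return TOY_MODEL.get(text, {"<end>": 1.0})

def generate(prompt, greedy=True, max_words=10, seed=0):
    rng = random.Random(seed)
    text = prompt
    for _ in range(max_words):
        probs = next_word_probs(text)
        if greedy:
            # Greedy decoding: take the most probable word.
            word = max(probs, key=probs.get)
        else:
            # Sampling: draw a word in proportion to its probability.
            word = rng.choices(list(probs), weights=list(probs.values()))[0]
        if word == "<end>":  # stopping condition
            break
        text = text + " " + word
    return text

print(generate("I am feeling"))  # I am feeling happy today
```

With `greedy=False` the same prompt can produce "sad" or "hungry" on some runs, which is why sampled replies vary.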

6. What are Input Tokens?

Tokens = smallest units of input text (word, subword, or character).

Example:
Text: "ChatGPT is cool"
Tokens → [Chat, G, PT, is, cool] (the exact split depends on the tokenizer)
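One simple way to get a split like this is greedy longest-match against a fixed vocabulary. The sketch below assumes a tiny hand-written `VOCAB`; real tokenizers learn their vocabulary from data (for example with byte-pair encoding), so treat this as an illustration only.

```python
# Hypothetical toy vocabulary, chosen to reproduce the example above.
VOCAB = {"Chat", "G", "PT", "is", "cool"}

def tokenize(text, vocab=VOCAB):
    tokens = []
    for word in text.split():
        start = 0
        while start < len(word):
            # Try the longest piece first, then shrink until one is in the vocabulary.
            for end in range(len(word), start, -1):
                piece = word[start:end]
                if piece in vocab:
                    tokens.append(piece)
                    start = end
                    break
            else:
                # No known piece: fall back to a single character.
                tokens.append(word[start])
                start += 1
    return tokens

print(tokenize("ChatGPT is cool"))  # ['Chat', 'G', 'PT', 'is', 'cool']
```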

7. What is Input Sequence?

Ordered list of tokens fed into the model.

Example:
Text: "I love AI"
Input sequence → [I, love, AI]

8. Vocabulary, Encoding, Decoding & AI Model

Vocabulary (Vocab): Set of all tokens the model knows
Encoding: Converts text → tokens/IDs
Decoding: Converts tokens/IDs → text
AI Model: Neural network (like GPT) that processes input and generates output
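A minimal sketch of how vocabulary, encoding, and decoding fit together. The word-level `vocab` and its IDs are made up for illustration; real models use much larger subword vocabularies.

```python
# Hypothetical word-level vocabulary; the IDs are arbitrary.
vocab = {"<unk>": 0, "I": 1, "love": 2, "AI": 3}
id_to_token = {i: t for t, i in vocab.items()}

def encode(text):
    # Text -> token IDs; unknown words map to the <unk> ID.
    return [vocab.get(tok, vocab["<unk>"]) for tok in text.split()]

def decode(ids):
    # Token IDs -> text.
    return " ".join(id_to_token[i] for i in ids)

ids = encode("I love AI")
print(ids)          # [1, 2, 3]
print(decode(ids))  # I love AI
```

The model itself only ever sees the IDs; text exists only before encoding and after decoding.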

9. Transformer Architecture

Two main parts:

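Encoder: Reads the full input and builds representations; used alone in BERT, and together with a decoder for translation
Decoder: Generates output token by token; used alone in GPT for text generation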

Core components:

Input Embedding
Positional Encoding
Self-Attention
Multi-Head Attention
Feed Forward Network
Output Layer

10. What is Tokenizer?

Tokenizer splits text into tokens and maps them to IDs the model can understand.

Example:
Text: "Hello" → Token ID [15496] (in the GPT-2 tokenizer; IDs differ between tokenizers)

11. What is Input Embedding?

Converts each token ID into a numerical vector that represents the token's meaning. Context is added later by the attention layers.
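In practice an embedding layer is a lookup table with one row per vocabulary entry. The sketch below fills the table with random numbers as a placeholder; in a real model these values are learned during training. The function names are hypothetical.

```python
import random

def make_embedding_table(vocab_size, dim, seed=0):
    # One vector of length `dim` per token ID. Random here, learned in a real model.
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]

def embed(token_ids, table):
    # Embedding lookup: each ID selects its row from the table.
    return [table[i] for i in token_ids]

table = make_embedding_table(vocab_size=4, dim=3)
vectors = embed([1, 2, 3], table)
print(len(vectors), len(vectors[0]))  # 3 3
```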

12. What is Positional Embedding?

Adds position information (word order) because self-attention processes all tokens in parallel and has no built-in notion of order.

Example:

"I love AI" ≠ "AI love I"

Positional encoding ensures the model understands correct word order.
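One concrete scheme is the sinusoidal encoding from "Attention Is All You Need", sketched below. Some models, such as GPT-2, learn their position vectors instead, so this is one option rather than the only one.

```python
import math

def positional_encoding(seq_len, dim):
    # PE(pos, 2i) = sin(pos / 10000^(2i/dim)), PE(pos, 2i+1) = cos(pos / 10000^(2i/dim))
    pe = []
    for pos in range(seq_len):
        row = []
        for k in range(dim):
            # Dimensions 2i and 2i+1 share one frequency.
            angle = pos / (10000 ** ((2 * (k // 2)) / dim))
            row.append(math.sin(angle) if k % 2 == 0 else math.cos(angle))
        pe.append(row)
    return pe

pe = positional_encoding(seq_len=3, dim=4)
print(pe[0])  # [0.0, 1.0, 0.0, 1.0]
```

Each row is added element-wise to the embedding of the token at that position, so the same word gets a different vector in a different place.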

13. What is Self-Attention?

Each word scores every other word in the sequence to decide how much attention to pay to it. (In GPT, a word can only look at the words before it.)
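A minimal sketch of scaled dot-product self-attention in plain Python. It is simplified: the token vectors themselves serve as queries, keys, and values, whereas a real Transformer first multiplies them by learned weight matrices. The three input vectors are made up.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """x: list of token vectors. Returns (outputs, attention weights)."""
    dim = len(x[0])
    outputs, all_weights = [], []
    for q in x:
        # Score this token against every token, scaled by sqrt(dim).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in x]
        weights = softmax(scores)
        # Output is the weighted average of all token vectors.
        out = [sum(w * v[j] for w, v in zip(weights, x)) for j in range(dim)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

x = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # made-up vectors for three tokens
outputs, weights = self_attention(x)
print([round(w, 2) for w in weights[0]])
```

Each row of `weights` sums to 1 and says how much one token attends to every token, including itself.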

Example:
Sentence: "The cat sat on the mat"

"cat" focuses more on "sat" (the main action).

14. What is Multi-Head Self-Attention?

Runs several attention operations (heads) in parallel.

Each head can learn different relationships → syntax, meaning, context

Example (illustrative; heads learn their roles during training, they are not assigned by hand):
Sentence: "The cat sat on the mat"

Head 1 → Subject-action relationship
Head 2 → Word positions
Head 3 → Contextual relationships
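A minimal sketch of the multi-head idea: split each token vector into equal slices, run attention on each slice independently, then concatenate the results. It is simplified on purpose: real models apply learned projection matrices per head and a final output projection, both omitted here. The input vectors are made up.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(x):
    # Simplified self-attention: the vectors act as queries, keys and values.
    dim = len(x[0])
    out = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(dim)])
    return out

def multi_head_attention(x, num_heads):
    dim = len(x[0])
    if dim % num_heads != 0:
        raise ValueError("vector size must be divisible by num_heads")
    head_dim = dim // num_heads
    combined = [[] for _ in x]
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        # Each head only sees its own slice of every token vector.
        head_out = attention([vec[lo:hi] for vec in x])
        for t, vec in enumerate(head_out):
            combined[t].extend(vec)  # concatenate head outputs per token
    return combined

x = [[1.0, 0.0, 0.5, 0.5], [0.9, 0.1, 0.4, 0.6], [0.0, 1.0, 0.5, 0.5]]
out = multi_head_attention(x, num_heads=2)
print(len(out), len(out[0]))  # 3 4
```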

15. Transformer Phases (Training & Inference)

Training: Model learns patterns from large datasets using backpropagation and gradient descent
Inference: Model uses trained knowledge to generate outputs (like ChatGPT replying)

Example:

Training: Reads millions of sentences → learns patterns
Inference: You type "I am feeling" → model predicts next word "happy"
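The two phases can be sketched with a deliberately tiny model: one score (logit) per candidate word, trained with gradient descent on made-up data. There are no layers here, so the gradient has a simple closed form; in a real Transformer, backpropagation computes the gradients through all layers.

```python
import math

WORDS = ["happy", "sad", "hungry"]
# Made-up training data: words that followed "I am feeling".
TRAINING_DATA = ["happy"] * 6 + ["sad"] * 3 + ["hungry"] * 1

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(data, steps=200, lr=0.5):
    logits = [0.0] * len(WORDS)  # the model's only parameters
    target = [data.count(w) / len(data) for w in WORDS]
    for _ in range(steps):
        probs = softmax(logits)
        # Gradient of the average cross-entropy loss w.r.t. each logit.
        grads = [p - t for p, t in zip(probs, target)]
        # Gradient descent step.
        logits = [l - lr * g for l, g in zip(logits, grads)]
    return logits

def predict(logits):
    probs = softmax(logits)
    return WORDS[probs.index(max(probs))]

logits = train(TRAINING_DATA)  # training phase
print(predict(logits))         # inference phase: happy
```

After training, `softmax(logits)` is close to the word frequencies in the data (0.6, 0.3, 0.1), and inference uses those fixed parameters without changing them.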

16. What is Softmax Function?

Converts raw scores (logits) → probabilities that sum to 1.
During generation, the next word is chosen from these probabilities.

Example:

Logits: [2.1, 1.0, 0.1]
Probabilities: [0.68, 0.23, 0.09]

Highest probability → next word selected (with greedy decoding)
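A minimal sketch of softmax in plain Python, applied to the logits from the example. Subtracting the largest logit first is a common numerical-stability trick and does not change the result.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.1, 1.0, 0.1])
print([round(p, 2) for p in probs])  # [0.68, 0.23, 0.09]
```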
