Hands-On Lab 2


✅ Lab Setup (5–10 min)

Step 0 — Create folder

Create: genai-section2-lab/ with:

Step 1 — Install packages

pip install torch transformers tokenizers datasets accelerate sentencepiece matplotlib numpy pandas scikit-learn

Step 2 — Verify GPU (optional)

python -c "import torch; print('CUDA:', torch.cuda.is_available())"

🧪 Part A — Anatomy of Transformers (2.1)

A1) Visualize Self-Attention on a Sentence

Goal: Extract real attention weights from a transformer and visualize which tokens attend to which.

Step A1.1 — Load a small model

Use: bert-base-uncased (encoder) OR distilbert-base-uncased (lighter).

Step A1.2 — Prepare a sentence

Example:

“The mechanic inspected the engine because it was noisy.”

Step A1.3 — Run model with output_attentions=True

Step A1.4 — Plot one attention head as heatmap

Deliverable: 1 heatmap + 3 bullet observations (e.g., which token attends to “engine” / “noisy”).
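Steps A1.1 to A1.4 can be sketched as follows. This is a minimal sketch, not a reference solution: the function names (`get_attention`, `plot_head`, `top_attended`), the default layer/head, and the output file name are assumptions made for illustration. Depending on the installed transformers version, attention weights may only be returned when the model uses the eager attention implementation.

```python
import numpy as np


def get_attention(sentence, model_name="distilbert-base-uncased"):
    """Return (tokens, attentions); attentions has shape (layers, heads, seq, seq)."""
    # Imported lazily so the pure helpers below work without the heavy dependencies.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, output_attentions=True)

    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    # outputs.attentions is a tuple with one (batch, heads, seq, seq) tensor per layer.
    attentions = torch.stack(outputs.attentions).squeeze(1).numpy()
    return tokens, attentions


def plot_head(tokens, attentions, layer=0, head=0, path="attention_heatmap.png"):
    """Save one attention head as a heatmap (rows attend to columns)."""
    import matplotlib.pyplot as plt

    matrix = attentions[layer, head]
    fig, ax = plt.subplots(figsize=(6, 5))
    image = ax.imshow(matrix, cmap="viridis")
    ax.set_xticks(range(len(tokens)))
    ax.set_xticklabels(tokens, rotation=90)
    ax.set_yticks(range(len(tokens)))
    ax.set_yticklabels(tokens)
    ax.set_xlabel("attended-to token (key)")
    ax.set_ylabel("attending token (query)")
    fig.colorbar(image, ax=ax)
    fig.tight_layout()
    fig.savefig(path)
    return path


def top_attended(tokens, matrix, query, k=3):
    """Return the k tokens that `query` attends to most strongly, with their weights."""
    row = np.asarray(matrix)[tokens.index(query)]
    order = np.argsort(row)[::-1][:k]
    return [(tokens[i], float(row[i])) for i in order]
```

A typical run would call `get_attention(...)` on the example sentence, pass the result to `plot_head(...)`, and then use `top_attended(tokens, attentions[layer, head], "it")` to support the written observations. Different layers and heads often show very different patterns, so it is worth trying several before choosing the one to submit.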

A2) Understand Positional Encoding (By Breaking It)

Goal: Show that token position matters by scrambling the word order and observing how the model's output changes.

Step A2.1 — Create two inputs

  1. Original: “The cat sat on the mat”

  2. Scrambled: “Mat the on sat cat the”

Step A2.2 — Compare embeddings (encoder model)

Step A2.3 — Explain outcome

Deliverable: similarity scores + short explanation: “Why position changes meaning.”
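One possible way to produce the similarity scores is to mean-pool the encoder's last hidden state for each input and compare the two vectors with cosine similarity. The sketch below assumes `distilbert-base-uncased` and mean pooling; both are illustrative choices, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embed_all(sentences, model_name="distilbert-base-uncased"):
    """Return one mean-pooled sentence vector (as a list of floats) per input sentence."""
    # Imported lazily so cosine_similarity can be used without the heavy dependencies.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    vectors = []
    for sentence in sentences:
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state  # shape (1, seq, hidden)
        vectors.append(hidden.mean(dim=1).squeeze(0).tolist())
    return vectors


def compare_orders(original, scrambled):
    """Similarity between the embeddings of the original and scrambled sentence."""
    first, second = embed_all([original, scrambled])
    return cosine_similarity(first, second)
```

With an uncased tokenizer, the two inputs from Step A2.1 contain the same tokens in a different order, so a similarity below 1.0 can only come from how the model uses position. Comparing the original sentence with itself gives a useful baseline for the write-up.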

A3) Encoder vs Decoder Architecture (Hands-On)

Goal: Experience the difference between “understanding” models and “generation” models.

Step A3.1 — Encoder task (BERT): Fill-Mask

Input:

“Transformers are [MASK] at understanding context.”

Step A3.2 — Decoder task (GPT2): Generate

Prompt:

“Transformers are powerful because”

Step A3.3 — Compare in writing

Deliverable: a short comparison paragraph + outputs.
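Both tasks can be run through the Hugging Face `pipeline` helper. The sketch below is one possible setup: the model names `bert-base-uncased` and `gpt2`, the `top_k` and `max_new_tokens` values, and greedy decoding are assumptions, and the function names are hypothetical.

```python
def run_fill_mask(text, model_name="bert-base-uncased", top_k=5):
    """Encoder task: return the top_k candidate fillers for the single [MASK] in text."""
    # Imported lazily so format_predictions can be used without transformers installed.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=model_name)
    return unmasker(text, top_k=top_k)


def run_generation(prompt, model_name="gpt2", max_new_tokens=30):
    """Decoder task: continue the prompt left to right with greedy decoding."""
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_name)
    result = generator(prompt, max_new_tokens=max_new_tokens, do_sample=False)
    return result[0]["generated_text"]


def format_predictions(predictions):
    """Turn fill-mask results into short 'token (score)' strings for the report."""
    return [f"{p['token_str'].strip()} ({p['score']:.3f})" for p in predictions]
```

The comparison paragraph can build on what the two functions expose: the fill-mask model sees the words on both sides of the gap and ranks candidates for one position, while the generator only sees the text to its left and extends it token by token.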

🧪 Part B — Tokens, Embeddings & Context Windows (2.2)

B1) Tokenization Strategies: Word vs Subword

Goal: See why subword tokenization exists.

Step B1.1 — Compare 2 tokenizers

Use:

Step B1.2 — Test words that break naïve tokenizers

Try:

Step B1.3 — Print token lists + token counts

Deliverable: tokenization table + 2 observations:
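A minimal sketch of the comparison, assuming a naive regex word tokenizer on one side and the `bert-base-uncased` WordPiece tokenizer on the other. Both choices, and the helper names, are assumptions made for illustration, since any pair of word-level and subword tokenizers would serve.

```python
import re


def word_tokenize(text):
    """Naive word-level tokenizer: lowercased words and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def subword_tokenize(text, model_name="bert-base-uncased"):
    """Subword tokens from a pretrained tokenizer (downloads the vocabulary on first use)."""
    # Imported lazily so word_tokenize can be used without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_name).tokenize(text)


def comparison_row(text, subword_tokens):
    """One row of the tokenization table: token lists and counts for both strategies."""
    words = word_tokenize(text)
    return {
        "text": text,
        "word_tokens": words,
        "word_count": len(words),
        "subword_tokens": list(subword_tokens),
        "subword_count": len(subword_tokens),
    }
```

Calling `comparison_row(text, subword_tokenize(text))` for each test string gives the rows of the table; `pandas.DataFrame(rows)` is one convenient way to print them. Rare, long, or misspelled words are good candidates because a word-level vocabulary would have to map them to an unknown token, while a subword tokenizer can split them into known pieces.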

B2) Embeddings & Semantic Meaning (Mini Semantic Search)

Goal: Show that embeddings of sentences with similar meaning cluster together.

Step B2.1 — Create 12 sentences (4 topics × 3 sentences)

Example topics:

Step B2.2 — Convert each sentence to embeddings

Use one:

Step B2.3 — Measure similarity

Step B2.4 — Visualize embedding space

Deliverable: similarity matrix + 2D plot + brief conclusion.
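Once each sentence has an embedding vector, Steps B2.3 and B2.4 reduce to a few lines of linear algebra. The sketch below assumes the embeddings are already available as a 12 × d array (from whichever encoder was chosen in Step B2.2) and uses PCA via SVD for the 2D projection; the function names and output file name are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between all rows of an (n, d) array."""
    X = np.asarray(embeddings, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    return X @ X.T


def pca_2d(embeddings):
    """Project (n, d) embeddings onto their first two principal components."""
    X = np.asarray(embeddings, dtype=float)
    X = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:2].T


def plot_2d(points, labels, path="embedding_space.png"):
    """Scatter plot of 2D points, one colour per topic label."""
    import matplotlib.pyplot as plt

    points = np.asarray(points)
    fig, ax = plt.subplots()
    for topic in sorted(set(labels)):
        idx = [i for i, label in enumerate(labels) if label == topic]
        ax.scatter(points[idx, 0], points[idx, 1], label=topic)
    ax.legend()
    fig.tight_layout()
    fig.savefig(path)
    return path
```

If the embeddings capture meaning, the similarity matrix should show higher values within a topic than across topics, and sentences from the same topic should sit close together in the plot. t-SNE from scikit-learn is an alternative projection, though with only 12 points its perplexity must be set well below the default.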

B3) Context Window Limits (Practical Experiment)

Goal: See how a model fails when the prompt exceeds its context window, and why pruning the prompt matters.

Step B3.1 — Create a long prompt

Step B3.2 — Ask at the end

“What is the secret key?”

Step B3.3 — Test with 2 settings

Observe:

Deliverable: 2 outputs + explanation:
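The effect can be simulated before involving a model at all. The sketch below makes several assumptions that are not part of the lab text: the secret is placed at the start of the prompt, the window keeps only the most recent tokens, and whitespace-separated words stand in for model tokens. The function names and the example key are hypothetical.

```python
def build_long_prompt(secret, filler, n_filler, question="What is the secret key?"):
    """State the secret first, bury it under filler sentences, and ask at the end."""
    parts = [f"The secret key is {secret}."] + [filler] * n_filler + [question]
    return " ".join(parts)


def truncate_to_window(tokens, max_tokens):
    """Keep only the most recent max_tokens tokens, dropping the oldest ones."""
    return tokens[-max_tokens:] if max_tokens > 0 else []


def secret_survives(prompt, secret, max_tokens):
    """True if the secret is still present after truncating to the window size."""
    tokens = prompt.split()  # whitespace tokens: a crude stand-in for model tokens
    return secret in " ".join(truncate_to_window(tokens, max_tokens))
```

Running `secret_survives` with one window larger than the prompt and one smaller shows the two regimes: in the first the question can be answered from the prompt, in the second the key has been cut away and the model can only guess. With a real model, the tokenizer's own token count should replace the whitespace count.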

🧪 Part C — How LLMs Are Trained (2.3)

C1) Pretraining Objectives (Mini Demo)

You’ll simulate what models learn during pretraining.

Option 1: Masked Language Modeling (BERT style)

Goal: Predict missing tokens.

Step C1.1 — Make 10 custom sentences (domain-specific)

Example:

Step C1.2 — Mask random tokens

Replace 15% with [MASK].

Step C1.3 — Run BERT fill-mask predictions

Deliverable: masked sentence results + basic “accuracy”.
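The masking step and the “accuracy” score can be sketched without a model. This sketch assumes whitespace-tokenized sentences, at least one masked token per sentence, and exact-match scoring of the top prediction; the function names are hypothetical.

```python
import random


def mask_tokens(tokens, rate=0.15, seed=0, mask_token="[MASK]"):
    """Replace about `rate` of the tokens with mask_token.

    Returns (masked_tokens, targets), where targets maps position -> original token.
    """
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    n_mask = max(1, round(len(tokens) * rate))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = masked[pos]
        masked[pos] = mask_token
    return masked, targets


def accuracy(predictions, targets):
    """Share of masked positions where the predicted token equals the original."""
    correct = sum(1 for pos, word in targets.items() if predictions.get(pos) == word)
    return correct / len(targets)
```

Joining `masked` with spaces gives the input for Step C1.3, and the model's top prediction for each masked position goes into a `predictions` dictionary keyed by position. Masking one position at a time keeps the fill-mask output simple to read, since a sentence with several masks yields one candidate list per mask.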

Option 2: Next Token Prediction (GPT style)

Goal: Predict the probability distribution over the next token.

Step C1.1 — Choose prompt

“In aviation maintenance, reliability depends on”

Step C1.2 — Generate with low temperature

Deliverable: outputs + note: “This is what pretraining optimizes.”
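The link between temperature, the next-token distribution, and the pretraining objective can be made concrete with a small sketch. It assumes the model's raw scores (logits) for a few candidate tokens are given as plain numbers; the function names are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_loss(probs, target_index):
    """Cross-entropy for the token that actually came next in the training text."""
    return -math.log(probs[target_index])
```

At a low temperature almost all probability mass moves to the highest-scoring token, which is why low-temperature generations look repetitive and predictable. Pretraining lowers `next_token_loss` averaged over a very large corpus, that is, it pushes probability toward the token that really follows each context.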

C2) Fine-Tuning vs Instruction Tuning (Hands-On)

Goal: See the difference between fine-tuning for a single task and tuning for instruction-following behavior.

Step C2.1 — Create a tiny dataset (20 rows)

Make a CSV with:

Example tasks:

Step C2.2 — Run inference BEFORE tuning

Step C2.3 — Do a lightweight instruction tuning (conceptual + optional)

If you want a runnable approach without heavy compute:

Deliverable: “Before vs after” comparison table:

Full tuning with LoRA/PEFT fits better in Section 6+.

C3) RLHF (High-Level Intuition) — Mini Simulation

Goal: Understand RLHF as “preference optimization.”

Step C3.1 — Create 6 prompts

Example:

Step C3.2 — Generate 2 candidate answers each

Use two settings:

Step C3.3 — Rank each pair (human preference)

Create a simple table:

Step C3.4 — Explain RLHF mapping

Deliverable: preference table + 5-line explanation of RLHF.
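The mapping from the preference table to RLHF can be illustrated with a small sketch. It assumes each table row records which of two candidates (“A” or “B”) the human preferred, and it uses the pairwise loss commonly used to train reward models, −log σ(r_chosen − r_rejected). The row keys and function names are hypothetical, and no model is trained here.

```python
import math
from collections import Counter


def win_counts(preferences):
    """Count how often each candidate setting ('A' or 'B') was preferred."""
    return Counter(row["preferred"] for row in preferences)


def pairwise_preference_loss(reward_chosen, reward_rejected):
    """-log(sigmoid(r_chosen - r_rejected)).

    Small when the reward model scores the human-preferred answer higher,
    large when it scores the rejected answer higher.
    """
    diff = reward_chosen - reward_rejected
    return math.log1p(math.exp(-diff))
```

In this picture, the ranked pairs from Step C3.3 are the training data for a reward model, the loss above is what that reward model minimizes, and the language model is then adjusted so that its answers earn higher reward. The win counts alone already hint at which generation setting people tend to prefer.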

✅ Final Submission Checklist (Section 2 Lab)

Students submit:

  1. Attention heatmap + observations

  2. Positional encoding experiment results (similarity scores)

  3. Encoder vs decoder outputs + comparison paragraph

  4. Tokenization comparison table

  5. Embedding similarity matrix + PCA/t-SNE plot

  6. Context window experiment outputs + explanation

  7. Pretraining objective demo results

  8. Instruction tuning comparison (base vs instruction-following)

  9. RLHF preference ranking table