Hands on Lab 8


✅ Lab Setup (10–15 min)

Step 0 — Create project folder

Create: genai-section8-agents-lab/ with:

Step 1 — Install dependencies

pip install fastapi uvicorn pydantic python-dotenv requests
pip install jsonschema tenacity rich
pip install chromadb sentence-transformers
pip install networkx

Optional (if using APIs):

pip install openai anthropic google-generativeai

🧪 Part A — Agents vs Prompt-Based Systems (8.1)

A1) Same Task, Two Approaches: Prompt-Only vs Agent Loop

Goal: Feel the difference between a single prompt and an autonomous agent.

Step A1.1 — Choose a realistic task

Pick one:

  1. “Plan a 3-day study schedule based on topics + constraints”

  2. “Analyze a support issue and create a resolution checklist”

  3. “Summarize docs and answer questions with citations (mini RAG)”

Step A1.2 — Run Prompt-Only baseline

Step A1.3 — Run an Agent loop version

Agent loop should:

Deliverable: reports/prompt_vs_agent.md (1 page)

A2) Introduce the 3 agent capabilities: Planning, Memory, Execution

Goal: Make each capability visible and measurable.

Step A2.1 — Planning test

Ask the agent: “Show your plan as steps with success criteria.”

Step A2.2 — Memory test

Give a preference:

Step A2.3 — Execution test

Have it call a tool (even mocked) such as:

Deliverable: outputs/capabilities_demo.json

🧪 Part B — Agent Architectures (8.2)

B1) Implement ReAct Pattern (Reason + Act + Observe)

Goal: Build a ReAct agent that alternates between thinking and tool use.

Step B1.1 — Define tool schema contract

Create src/tool_schemas.py with JSON schemas for:

Step B1.2 — Build a simple tool server (FastAPI)

Create tools/server.py with endpoints:

Run tools:

uvicorn tools.server:app --reload --port 8002

Step B1.3 — Implement ReAct loop

Create src/react_agent.py:
Loop for max_steps=6:

  1. model decides: tool call OR final answer

  2. validate tool call JSON

  3. execute tool

  4. append tool result as “observation”

  5. repeat

Deliverable: outputs/react_runs.json for 5 tasks:

B2) Planner–Executor Architecture

Goal: Separate “planning” from “doing” for better reliability.

Step B2.1 — Implement planner

Create src/planner.py:

Step B2.2 — Implement executor

Create src/executor.py:

Step B2.3 — Compare with ReAct

Run same 3 tasks with both:

Deliverable: reports/react_vs_planner_executor.md

B3) Multi-Agent System (Simple but Real)

Goal: Build 2–3 agents with roles and a coordinator.

Step B3.1 — Define agents

Step B3.2 — Coordinator flow

Create src/multi_agent.py:

  1. Research agent gathers sources

  2. Writer drafts answer

  3. Verifier checks for:

    • missing citations

    • contradictions

    • policy violations

  4. Writer revises if needed

Deliverable: outputs/multi_agent_demo.json for 2 tasks:

🧪 Part C — Building Practical Agents (8.3)

C1) Task Decomposition (Make It Automatic)

Goal: Teach the agent to break a goal into smaller tasks reliably.

Step C1.1 — Add decomposition prompt template

Require the agent to output:

Step C1.2 — Execute sub-tasks

Run sub-tasks sequentially, storing results.

Deliverable: outputs/task_decomposition.json

C2) Long-Term Memory Strategies (Short-term vs Persistent)

Goal: Implement memory that persists across sessions.

Step C2.1 — Short-term memory (session)

Store:

Step C2.2 — Long-term memory (vector store)

Use Chroma:

Step C2.3 — Memory retrieval trigger

Before responding:

Deliverable: reports/memory_demo.md

C3) Human-in-the-Loop (HITL) Control

Goal: Add safe stopping points and approvals.

Step C3.1 — Add “approval required” actions

Mark actions as high-risk:

Step C3.2 — Implement HITL gate

If action requires approval:

Step C3.3 — Simulate human approval

Run 3 tasks where agent must ask for approval before final output.

Deliverable: outputs/hitl_runs.json

🧪 Part D — Observability + Failure Recovery (Production-Ready Agents)

D1) Observability logging

Log every step to logs/agent.jsonl:

Deliverable: log file with 3 full runs

D2) Failure recovery strategies

Simulate tool failures (20% random errors). Implement:

Deliverable: reports/failure_recovery.md with examples

✅ Final Submission Checklist (Section 8 Lab)

Students submit:

  1. Prompt-only vs agent loop comparison

  2. ReAct agent implementation + 5 runs

  3. Planner–Executor plan JSON + execution logs

  4. Multi-agent demo with verifier pass

  5. Task decomposition outputs

  6. Long-term memory demo (Chroma)

  7. HITL approval workflow runs

  8. Observability logs + failure recovery report