Hands on Lab 7


✅ Lab Setup (10–15 min)

Step 0 — Create folder

Create: genai-section7-tools-lab/ with:

Step 1 — Install dependencies

pip install fastapi uvicorn pydantic python-dotenv requests
pip install jsonschema tenacity rich
pip install transformers torch sentencepiece

Optional (if using APIs):

pip install openai anthropic google-generativeai

🧪 Part A — Tool-Using LLMs (7.1)

A1) Define Tool Schemas (Function Calling Contract)

Goal: Teach the model how to call tools with structured JSON arguments.

Step A1.1 — Create 3 tools (schemas first)

Create src/tool_schemas.py defining JSON schemas for:

  1. get_weather(city: str, units: str)

  2. convert_currency(amount: float, from: str, to: str)

  3. search_faq(query: str) (local KB search)

Each schema includes:

Deliverable: tool schema definitions (model-readable)

A2) Build a Structured Output Parser

Goal: Enforce that tool calls are valid JSON.

Step A2.1 — Create src/json_parser.py

If invalid:

Deliverable: parser that never crashes on bad model output

A3) Run First Tool Call (Single-Step)

Goal: LLM decides which tool to call.

Step A3.1 — Create data/tool_tasks.json

Add 10 tasks:

Step A3.2 — Prompt the model to respond in tool-call JSON

Format:

{"tool":"convert_currency","arguments":{"amount":120,"from":"USD","to":"PKR"}}

Step A3.3 — Execute the tool

Deliverable: outputs/single_step_results.json

🧪 Part B — Designing Tools for LLMs (7.2)

B1) Build Tools as APIs (FastAPI Tool Server)

Goal: Wrap tools behind an API like real production systems.

Step B1.1 — Create tools/server.py (FastAPI)

Endpoints:

For the lab:

Deliverable: running tool server:

uvicorn tools.server:app --reload --port 8001

B2) Input Validation & Error Handling (Tool Layer)

Goal: Make tools robust to bad inputs from the model.

Step B2.1 — Add Pydantic validation

Examples:

Step B2.2 — Add predictable error responses

Return JSON errors:

{"ok": false, "error": {"code": "INVALID_INPUT", "message": "..."}}

Deliverable: reports/tool_validation.md with 5 bad-input tests

B3) Stateless vs Stateful Tools

Goal: Understand when a tool must remember state.

Step B3.1 — Stateless tool examples

Step B3.2 — Create a stateful tool: “Task List Manager”

Add endpoints:

Store tasks in memory (for now).

Deliverable: working stateful tool + example task list

🧪 Part C — Multi-Step Reasoning with Tools (7.3)

C1) Tool Chaining: Multi-Step Agent Loop

Goal: Build an agent that calls tools multiple times to solve one request.

Step C1.1 — Create multi-step tasks in data/agent_tasks.json

Examples:

  1. “Plan my day in NYC: check weather, then create outfit recommendation.”

  2. “Convert 200 USD to PKR, then summarize what I can buy in PKR (short).”

  3. “Add 3 tasks to my list and mark one complete.”

Step C1.2 — Implement an agent loop src/agent.py

Loop:

  1. Send current state + user goal to model

  2. Model outputs either:

    • tool call JSON OR

    • final answer JSON

  3. Validate tool call

  4. Execute tool via API

  5. Append tool result to conversation state

  6. Repeat up to max_steps=6

Deliverable: successful tool chaining for all 3 tasks

C2) Observability: Log Every Tool Call

Goal: Make tool usage debuggable.

Step C2.1 — Add structured logs to logs/agent_runs.jsonl

Each line includes:

Step C2.2 — Add a console trace (optional)

Use rich to print:

Deliverable: log file with at least 3 complete runs

C3) Failure Recovery Strategies

Goal: Handle tool failures gracefully.

Step C3.1 — Simulate failures

Make tools randomly fail 20% of the time (return 500 or error JSON).

Step C3.2 — Add recovery behaviors in the agent

Step C3.3 — Validate recovery works

Run tasks multiple times and confirm agent completes.

Deliverable: reports/failure_recovery.md showing:

🧪 Part D — Mini Project: “Support Assistant with Tools” (45–60 min)

D1) Build a Customer Support Agent

Goal: Combine tool calling + state + safety.

Step D1.1 — Tools

Step D1.2 — Guardrails

Step D1.3 — Test scenarios

Deliverable: outputs/support_agent_demo.json with 3 full runs

✅ Final Submission Checklist (Section 7 Lab)

Students submit:

  1. Tool schemas + strict JSON tool-call format

  2. Tool server (FastAPI) with validation + predictable error outputs

  3. Single-step tool calling results

  4. Multi-step agent loop (tool chaining)

  5. Observability logs (JSONL) showing each tool call + latency

  6. Failure recovery report with retries + argument fixing

  7. Mini project: support agent demo with tools + state + guardrails