✅ Lab Setup (10–15 min)
Step 0 — Create folder
Create: genai-section7-tools-lab/ with:
src/tools/data/outputs/logs/reports/
Step 1 — Install dependencies
pip install fastapi uvicorn pydantic python-dotenv requests pip install jsonschema tenacity rich pip install transformers torch sentencepiece
Optional (if using APIs):
pip install openai anthropic google-generativeai
🧪 Part A — Tool-Using LLMs (7.1)
A1) Define Tool Schemas (Function Calling Contract)
Goal: Teach the model how to call tools with structured JSON arguments.
Step A1.1 — Create 3 tools (schemas first)
Create src/tool_schemas.py defining JSON schemas for:
get_weather(city: str, units: str)convert_currency(amount: float, from: str, to: str)search_faq(query: str)(local KB search)
Each schema includes:
namedescriptionparameterswith types + required fields
✅ Deliverable: tool schema definitions (model-readable)
A2) Build a Structured Output Parser
Goal: Enforce that tool calls are valid JSON.
Step A2.1 — Create src/json_parser.py
Extract tool call JSON from model output
Validate JSON strictly
Validate schema with
jsonschema
If invalid:
return an error object:
error_typemessageraw_output
✅ Deliverable: parser that never crashes on bad model output
A3) Run First Tool Call (Single-Step)
Goal: LLM decides which tool to call.
Step A3.1 — Create data/tool_tasks.json
Add 10 tasks:
“What’s the weather in Chicago tomorrow?”
“Convert 120 USD to PKR”
“Search FAQ: refund policy”
Step A3.2 — Prompt the model to respond in tool-call JSON
Format:
{"tool":"convert_currency","arguments":{"amount":120,"from":"USD","to":"PKR"}}
Step A3.3 — Execute the tool
Run tool function
Print tool result
Ask model to produce final response based on tool result
✅ Deliverable: outputs/single_step_results.json
🧪 Part B — Designing Tools for LLMs (7.2)
B1) Build Tools as APIs (FastAPI Tool Server)
Goal: Wrap tools behind an API like real production systems.
Step B1.1 — Create tools/server.py (FastAPI)
Endpoints:
POST /weatherPOST /convertPOST /faq
For the lab:
Weather can be mocked (static dictionary) to avoid external API dependency.
Currency can be mocked with sample rates in a dict.
FAQ search can be simple keyword match on
data/faq.md.
✅ Deliverable: running tool server:
uvicorn tools.server:app --reload --port 8001
B2) Input Validation & Error Handling (Tool Layer)
Goal: Make tools robust to bad inputs from the model.
Step B2.1 — Add Pydantic validation
Examples:
City must be non-empty string
Currency must be 3 letters
Amount must be > 0
Step B2.2 — Add predictable error responses
Return JSON errors:
{"ok": false, "error": {"code": "INVALID_INPUT", "message": "..."}}
✅ Deliverable: reports/tool_validation.md with 5 bad-input tests
B3) Stateless vs Stateful Tools
Goal: Understand when a tool must remember state.
Step B3.1 — Stateless tool examples
Currency conversion
FAQ search
Step B3.2 — Create a stateful tool: “Task List Manager”
Add endpoints:
POST /tasks/addGET /tasks/listPOST /tasks/complete
Store tasks in memory (for now).
✅ Deliverable: working stateful tool + example task list
🧪 Part C — Multi-Step Reasoning with Tools (7.3)
C1) Tool Chaining: Multi-Step Agent Loop
Goal: Build an agent that calls tools multiple times to solve one request.
Step C1.1 — Create multi-step tasks in data/agent_tasks.json
Examples:
“Plan my day in NYC: check weather, then create outfit recommendation.”
“Convert 200 USD to PKR, then summarize what I can buy in PKR (short).”
“Add 3 tasks to my list and mark one complete.”
Step C1.2 — Implement an agent loop src/agent.py
Loop:
Send current state + user goal to model
Model outputs either:
tool call JSON OR
final answer JSON
Validate tool call
Execute tool via API
Append tool result to conversation state
Repeat up to max_steps=6
✅ Deliverable: successful tool chaining for all 3 tasks
C2) Observability: Log Every Tool Call
Goal: Make tool usage debuggable.
Step C2.1 — Add structured logs to logs/agent_runs.jsonl
Each line includes:
timestamp
run_id
step
tool_name
args
response
latency_ms
error (if any)
Step C2.2 — Add a console trace (optional)
Use rich to print:
tool called
inputs
outputs
next decision
✅ Deliverable: log file with at least 3 complete runs
C3) Failure Recovery Strategies
Goal: Handle tool failures gracefully.
Step C3.1 — Simulate failures
Make tools randomly fail 20% of the time (return 500 or error JSON).
Step C3.2 — Add recovery behaviors in the agent
Retry tool call (max 2)
If input invalid → ask model to fix arguments
If repeated failures → fallback response and ask user
Step C3.3 — Validate recovery works
Run tasks multiple times and confirm agent completes.
✅ Deliverable: reports/failure_recovery.md showing:
failures encountered
how recovery fixed them
🧪 Part D — Mini Project: “Support Assistant with Tools” (45–60 min)
D1) Build a Customer Support Agent
Goal: Combine tool calling + state + safety.
Step D1.1 — Tools
FAQ search (KB)
Task creation (“create support ticket”)
Ticket status lookup (stateful)
Step D1.2 — Guardrails
Force JSON tool call schema
Block prompt injection patterns in user input
Output validation (no secrets, no system prompt leakage)
Step D1.3 — Test scenarios
“My refund isn’t processed—what do I do?”
“Create a ticket for my broken product.”
“Check status of ticket #123.”
✅ Deliverable: outputs/support_agent_demo.json with 3 full runs
✅ Final Submission Checklist (Section 7 Lab)
Students submit:
Tool schemas + strict JSON tool-call format
Tool server (FastAPI) with validation + predictable error outputs
Single-step tool calling results
Multi-step agent loop (tool chaining)
Observability logs (JSONL) showing each tool call + latency
Failure recovery report with retries + argument fixing
Mini project: support agent demo with tools + state + guardrails