WEBVTT

00:00.150 --> 00:02.160
-: In the previous lesson, we looked at tool calling.

00:02.160 --> 00:03.570
So how to define a tool,

00:03.570 --> 00:06.270
how to use a tool within a large language model.

00:06.270 --> 00:07.470
In this video we're gonna have a look

00:07.470 --> 00:10.350
at how you can specifically create custom agents

00:10.350 --> 00:11.670
using these tools,

00:11.670 --> 00:13.260
putting them into a loop,

00:13.260 --> 00:15.600
and then breaking that loop accordingly.

00:15.600 --> 00:17.970
The first thing I want you to do is open the notebook

00:17.970 --> 00:20.220
in the openai_features_and_functionality folder

00:20.220 --> 00:22.801
called learning_agents_from_scratch.

00:22.801 --> 00:25.097
So what we're gonna do is import openai,

00:25.097 --> 00:27.420
numpy, matplotlib, and pydantic.

00:27.420 --> 00:31.050
We're gonna have a look at is using a agentic loop

00:31.050 --> 00:32.430
that we'll build from scratch,

00:32.430 --> 00:34.260
and you can either copy

00:34.260 --> 00:36.600
or you can follow along and code it from scratch.

00:36.600 --> 00:38.340
The thing we're gonna be coded from scratch

00:38.340 --> 00:41.160
is a simple agent that's in a while true loop.

00:41.160 --> 00:44.730
So firstly you need to import your OpenAI package,

00:44.730 --> 00:48.780
and the model that we'll be using is gpt-4.1-mini.

00:48.780 --> 00:51.930
Now, we've already set up the get_weather function

00:51.930 --> 00:53.610
and the weather tool for you.

00:53.610 --> 00:56.220
You're going to need to replace your OpenAI key here,

00:56.220 --> 00:57.510
so I'll just let you do that.

00:57.510 --> 00:59.460
Okay, so we have the get_weather function,

00:59.460 --> 01:01.980
we have a weather_tool dictionary here,

01:01.980 --> 01:04.860
and we also have our tools Python list that we've set up

01:04.860 --> 01:07.530
with the information for a single weather tool.

01:07.530 --> 01:09.990
Now, if you scroll down, here's where it gets interesting.

01:09.990 --> 01:11.820
So we're gonna set our messages up

01:11.820 --> 01:13.950
and what we're gonna do for our messages

01:13.950 --> 01:15.990
is we're gonna start with one message.

01:15.990 --> 01:18.633
So I'm gonna put role as equal to user,

01:19.740 --> 01:21.990
and the content is, "What's the weather like

01:21.990 --> 01:23.430
in Paris today?"

01:23.430 --> 01:24.930
Let's get rid of this section here.

01:24.930 --> 01:27.510
So I'm just gonna do just the weather in Paris.

01:27.510 --> 01:29.040
And then what we need to do

01:29.040 --> 01:31.590
is firstly we create our initial response,

01:31.590 --> 01:33.360
given this message,

01:33.360 --> 01:36.840
and we have the model is equal to our gpt-4.1 model.

01:36.840 --> 01:38.700
The input is equal to the messages

01:38.700 --> 01:41.700
and our tools is equal to the tools that we defined earlier.

01:41.700 --> 01:44.130
Then what we're gonna do is process all function calls

01:44.130 --> 01:45.300
in the response to output.

01:45.300 --> 01:48.150
So if there's a response to output,

01:48.150 --> 01:50.880
what we're then gonna do is we're gonna loop through

01:50.880 --> 01:52.620
all of the output items.

01:52.620 --> 01:54.150
So you'll see here, we're going

01:54.150 --> 01:57.180
for output_item in response to output.

01:57.180 --> 01:59.580
If we have a .type

01:59.580 --> 02:01.530
and it's equal to a function call,

02:01.530 --> 02:04.080
what we're then gonna do is append the function call

02:04.080 --> 02:05.400
to the messages.

02:05.400 --> 02:07.470
After that, we're gonna get the tool call

02:07.470 --> 02:10.642
so that would be tool call = output_item

02:10.642 --> 02:12.810
and then we're gonna load in the args,

02:12.810 --> 02:14.410
json.loads(tool_call.arguments).

02:16.620 --> 02:18.990
After that, we're going to execute the function.

02:18.990 --> 02:20.187
So you've got this get_weather

02:20.187 --> 02:23.340
and we're passing in the args of latitude and longitude

02:23.340 --> 02:24.720
and we're printing out the fact

02:24.720 --> 02:26.610
that we've executed that tool call.

02:26.610 --> 02:27.720
And then what we're gonna do

02:27.720 --> 02:30.270
is now that we've executed that tool call,

02:30.270 --> 02:34.530
we need to then append the messages back into the LLM.

02:34.530 --> 02:36.783
So that'll be type function_call_output.

02:36.783 --> 02:40.380
The call_id is equal to the tool_call.call_id,

02:40.380 --> 02:42.900
and the output is the string of this result.

02:42.900 --> 02:45.480
Now, this is similar to what you have with function calling.

02:45.480 --> 02:47.070
The only difference is,

02:47.070 --> 02:48.450
what we're now gonna check for

02:48.450 --> 02:51.120
is if there's any output text.

02:51.120 --> 02:54.150
So we're gonna say if there's any output text,

02:54.150 --> 02:56.283
and we'll do this out here,

02:58.380 --> 03:00.120
if there's any output text,

03:00.120 --> 03:03.300
then we have a final output and we have a break.

03:03.300 --> 03:05.207
And if there isn't a response to output,

03:05.207 --> 03:06.720
we are then just gonna break

03:06.720 --> 03:09.300
just to be on the safe side of things.

03:09.300 --> 03:10.770
So what's gonna happen is,

03:10.770 --> 03:12.810
we have this client.responses.create,

03:12.810 --> 03:15.150
we have a query that's gonna trigger a tool call,

03:15.150 --> 03:16.230
if there's an output,

03:16.230 --> 03:17.637
we're gonna loop through every output

03:17.637 --> 03:19.080
and if it's a function call,

03:19.080 --> 03:21.450
then we're gonna add that tool call to the message.

03:21.450 --> 03:22.830
We're then gonna get that tool call,

03:22.830 --> 03:24.750
get the various bits of information,

03:24.750 --> 03:26.820
we're gonna get the weather and invoke that.

03:26.820 --> 03:29.190
We're then gonna tell the messages

03:29.190 --> 03:31.410
that this is the output of that tool call.

03:31.410 --> 03:33.450
And then this is the clever stuff here

03:33.450 --> 03:36.360
where you can basically say, if we have an output text,

03:36.360 --> 03:38.670
then that basically means we're finished at this point

03:38.670 --> 03:40.890
and we'll break out of the endless while loop.

03:40.890 --> 03:42.270
Now, if we didn't have an output before,

03:42.270 --> 03:43.680
just for simplicity,

03:43.680 --> 03:45.960
then we would also add a break here.

03:45.960 --> 03:46.980
So if you run this now,

03:46.980 --> 03:49.770
what you'll see is if you scroll up here,

03:49.770 --> 03:51.840
it's gonna execute that tool call

03:51.840 --> 03:53.550
and it returns the final output.

03:53.550 --> 03:55.440
Now, we can take this a step further

03:55.440 --> 03:58.050
and just wrap all the code that you just wrote

03:58.050 --> 04:00.360
into a python function called an agent_loop,

04:00.360 --> 04:02.820
which just takes in messages and tools.

04:02.820 --> 04:05.100
And then we can try and run that on a query

04:05.100 --> 04:07.890
that has two different types of intents.

04:07.890 --> 04:10.260
So you'll see here we've got a role of developer,

04:10.260 --> 04:12.990
that's basically the same as the system message,

04:12.990 --> 04:16.410
which is useful for instructional pieces of content.

04:16.410 --> 04:18.210
Then we have a content key,

04:18.210 --> 04:19.417
and under that it says,

04:19.417 --> 04:21.480
"What's the weather like in Paris today?

04:21.480 --> 04:22.980
Before applying, I want you to also

04:22.980 --> 04:24.510
get the weather for Berlin."

04:24.510 --> 04:26.580
Now, when we run our agent_loop function,

04:26.580 --> 04:28.050
what you'll see is it actually decides

04:28.050 --> 04:31.230
to execute two tool calls before responding with the output.

04:31.230 --> 04:34.230
So again, the important point is we put the agent

04:34.230 --> 04:36.330
into a while true loop

04:36.330 --> 04:37.770
and we loop through

04:37.770 --> 04:40.140
and anytime we have tool calls, we're continuing,

04:40.140 --> 04:42.540
but if we have that .output_text property,

04:42.540 --> 04:44.040
then we're breaking out of the loop.

04:44.040 --> 04:45.780
You can also do it slightly differently

04:45.780 --> 04:47.730
with a objective function.

04:47.730 --> 04:49.200
And basically what that means is you have

04:49.200 --> 04:50.700
some kind of custom logic

04:50.700 --> 04:52.470
where rather than just endlessly being

04:52.470 --> 04:54.780
in a while true loop, you might decide,

04:54.780 --> 04:59.250
okay, the agent has a maximum of five times that it can run

04:59.250 --> 05:01.500
or a certain amount of time that it can run.

05:01.500 --> 05:03.780
In our case, we have an objective function

05:03.780 --> 05:06.840
with this objective met in terms of the search count,

05:06.840 --> 05:09.510
is it greater than the maximum number of searches?

05:09.510 --> 05:10.770
We can then do something like this

05:10.770 --> 05:12.960
where the role of developer, you know,

05:12.960 --> 05:14.017
you have the content,

05:14.017 --> 05:15.450
"Your goal is to gather weather

05:15.450 --> 05:17.160
for at least five different cities.

05:17.160 --> 05:19.710
Once you've done that, respond with task complete."

05:19.710 --> 05:21.780
And then what you can see is we have this search,

05:21.780 --> 05:23.100
the weather in Berlin.

05:23.100 --> 05:25.680
Now, we can still use a while true,

05:25.680 --> 05:29.130
but what we can do then is we can then basically do

05:29.130 --> 05:30.630
the same kind of logic,

05:30.630 --> 05:33.240
however we increment the search count.

05:33.240 --> 05:35.400
So then what we do is check if the objective is met

05:35.400 --> 05:36.630
based on that search count,

05:36.630 --> 05:39.990
and if necessary, then we will break out of the loop.

05:39.990 --> 05:42.030
So you can see here, when we run this,

05:42.030 --> 05:44.280
what will happen is we'll be getting weather.

05:46.980 --> 05:48.570
So it's decided to do New York

05:48.570 --> 05:49.680
and it's getting some weather

05:49.680 --> 05:52.260
and it's then it's searched in six cities

05:52.260 --> 05:55.320
and after it's searched over a certain number,

05:55.320 --> 05:57.360
it has exited the while true loop

05:57.360 --> 05:59.940
because of this objective_met function.

05:59.940 --> 06:02.160
So basically all you're doing here is creating

06:02.160 --> 06:05.100
a custom function so that it doesn't endlessly go around

06:05.100 --> 06:06.150
in a while true loop.

06:06.150 --> 06:10.110
It can exit before, by using a custom objective function.

06:10.110 --> 06:12.030
The other common ways to do this

06:12.030 --> 06:14.740
is have, like, a MAXIMUM_NUMBER_OF_ITERATIONS

06:16.321 --> 06:18.450
and then you can then say,

06:18.450 --> 06:20.790
when you're doing these kind of iterations,

06:20.790 --> 06:22.560
you will then have something like this

06:22.560 --> 06:25.383
where if you are over the number of iterations,

06:26.354 --> 06:28.533
we also have to store the current iterations,

06:28.533 --> 06:30.960
CURRENT_NUMBER_oF_ITERATIONS,

06:30.960 --> 06:32.880
if the current number of iterations

06:32.880 --> 06:33.960
is greater than or equal to

06:33.960 --> 06:36.150
the maximum number of iterations,

06:36.150 --> 06:39.300
then we're breaking out of the loop.

06:39.300 --> 06:40.290
And you can see here,

06:40.290 --> 06:41.280
every time we go around,

06:41.280 --> 06:43.950
we're incrementing the current number of iterations.

06:43.950 --> 06:46.230
So this is another way that you could also break

06:46.230 --> 06:47.280
out that while loop

06:47.280 --> 06:49.260
so that your agent doesn't endlessly go on

06:49.260 --> 06:52.620
if it doesn't meet a custom objective function.

06:52.620 --> 06:54.210
Okay, just to recap,

06:54.210 --> 06:56.220
we looked at how we could use the while true loop

06:56.220 --> 06:59.280
to basically create an agentive while loop,

06:59.280 --> 07:00.570
and then we looked at the fact

07:00.570 --> 07:02.220
that we're doing function calling here

07:02.220 --> 07:04.590
and we also have a custom step

07:04.590 --> 07:07.230
which will break us out of the whiled true loop.

07:07.230 --> 07:10.290
This is how a lot of the agents under the hood are designed

07:10.290 --> 07:12.360
and it's well worthwhile, you getting familiar

07:12.360 --> 07:13.920
with this type of pattern.

07:13.920 --> 07:15.390
Secondly, we looked at the fact

07:15.390 --> 07:18.480
that agents can do parallel tool calling.

07:18.480 --> 07:19.770
This mixed intent query

07:19.770 --> 07:21.720
is searching for two types of weather

07:21.720 --> 07:24.420
and we can see that it executed two tool calls

07:24.420 --> 07:26.820
before responding to the user.

07:26.820 --> 07:30.630
Thirdly, we looked at making custom objective functions

07:30.630 --> 07:33.300
and using an objective function

07:33.300 --> 07:35.490
or the maximum number of iterations

07:35.490 --> 07:37.530
to break out of a while loop

07:37.530 --> 07:40.170
to avoid the fact that these agents could get stuck

07:40.170 --> 07:41.520
for larger periods of time.

07:41.520 --> 07:43.270
Cool, I'll see you in the next one.
