WEBVTT

00:00.240 --> 00:02.490
-: And in this video, we're gonna have a look at agents.

00:02.490 --> 00:05.910
Agents are a specific type of workflow

00:05.910 --> 00:08.730
where the LLM will run in a continuous loop,

00:08.730 --> 00:11.160
interacting with its environment and receiving feedback

00:11.160 --> 00:13.470
to refine its actions and decisions.

00:13.470 --> 00:17.280
Agents use tools and then they go through an agentic loop.

00:17.280 --> 00:20.700
After taking an action,

00:20.700 --> 00:23.490
they'll get some feedback from the environment.

00:23.490 --> 00:26.373
At some point, the agent will decide to stop.

00:34.110 --> 00:35.820
Some of the use cases for agents

00:35.820 --> 00:38.310
include personal research assistants,

00:38.310 --> 00:40.200
automated code reviewing,

00:40.200 --> 00:42.030
and customer service, which is a big one.

00:42.030 --> 00:43.740
And also managing social media presence

00:43.740 --> 00:47.160
by analyzing trending topics, generating relevant content

00:47.160 --> 00:48.543
and scheduling posts.

00:50.040 --> 00:53.340
Okay, let's start by importing OpenAI and Pydantic.

00:53.340 --> 00:55.560
We'll also make sure that the json and os modules

00:55.560 --> 00:58.350
are imported and the openai package is installed.

00:58.350 --> 01:00.360
And we'll then also make sure that the client

01:00.360 --> 01:02.103
and model have been created.

01:03.540 --> 01:04.620
The first thing we're gonna do

01:04.620 --> 01:07.683
is we're gonna create a search_knowledge_base function.

01:36.900 --> 01:39.030
And this is just gonna be a fake function

01:39.030 --> 01:41.400
that's gonna take a query, which is a string.

01:41.400 --> 01:43.050
We're going to return a string

01:43.050 --> 01:47.400
and all we're gonna put in here is return summary for,

01:47.400 --> 01:48.840
and then we'll put query

01:48.840 --> 01:53.837
and then we'll put, this is a simulated summary

01:56.160 --> 01:57.453
from the knowledge base.
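
NOTE
A minimal sketch of the fake function described above. The exact wording of the
returned string is an assumption based on what is dictated in the video.
```python
def search_knowledge_base(query: str) -> str:
    """Fake lookup: return a canned summary instead of querying a real store."""
    # Assumed wording; the on-screen string may differ slightly.
    return f"Summary for '{query}': this is a simulated summary from the knowledge base."
# Quick check of the stub.
print(search_knowledge_base("quantum computing"))
```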

01:58.500 --> 02:00.180
Okay, cool, now that we have that function,

02:00.180 --> 02:02.580
the next thing we need to determine is tools.

02:02.580 --> 02:05.370
So obviously tools allow the agent to take actions.

02:05.370 --> 02:08.070
Now, you will have to write the JSON schema out.

02:08.070 --> 02:13.070
So the first thing is you have to have a type of function.

02:13.080 --> 02:14.850
And then after that, then the next thing

02:14.850 --> 02:17.823
that we need to do is we need to step into that function.

02:19.200 --> 02:20.590
So we do function

02:21.960 --> 02:23.100
and then we specifically have

02:23.100 --> 02:25.170
to start listing out the individual things.

02:25.170 --> 02:27.030
So for example, what is the name of that function?

02:27.030 --> 02:28.710
Well, it's search_knowledge_base.

02:28.710 --> 02:30.750
What is the description?

02:30.750 --> 02:35.040
Query a knowledge base to retrieve relevant info on a topic.

02:35.040 --> 02:38.340
And you can see it's populated the parameters required

02:38.340 --> 02:39.360
and additional properties.

02:39.360 --> 02:41.310
And I'm gonna get rid of this one here.

02:42.761 --> 02:43.740
And there we go.

02:43.740 --> 02:47.670
So we've got our function, which basically is a schema.

02:47.670 --> 02:50.520
This is a JSON schema definition of a tool.
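
NOTE
A sketch of what the tool definition described above might look like. The name and
description come from the video; the "query" parameter description is hypothetical,
and it is an assumption that "required" and "additionalProperties" were both kept,
since the video does not say which autocompleted entry was removed.
```python
import json
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_knowledge_base",
            "description": "Query a knowledge base to retrieve relevant info on a topic.",
            "parameters": {
                "type": "object",
                "properties": {
                    # Hypothetical description text for the single argument.
                    "query": {"type": "string", "description": "The topic to look up."},
                },
                "required": ["query"],
                "additionalProperties": False,
            },
        },
    }
]
# Print the schema as JSON to see the shape that is sent with the request.
print(json.dumps(tools, indent=2))
```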

02:50.520 --> 02:52.110
Now we need to generate some messages.

02:52.110 --> 02:53.590
So we'll have a messages list

02:54.840 --> 02:57.510
and we'll have a role of user and content.

02:57.510 --> 02:59.970
I'd like to know more about quantum computing.

02:59.970 --> 03:02.580
Could you give me a quick summary?

03:02.580 --> 03:05.250
Then we create a chat completions request

03:05.250 --> 03:08.340
with client.chat.completions.create,

03:08.340 --> 03:11.040
and we pass in the tools alongside the model

03:11.040 --> 03:12.150
and the messages.

03:12.150 --> 03:16.063
Now, what you'll see here is it has a .tool_calls.

03:17.040 --> 03:18.840
And basically what this is gonna mean

03:18.840 --> 03:21.630
is that ChatGPT wants to make a tool call.

03:21.630 --> 03:23.430
And the idea of that is this,

03:23.430 --> 03:26.220
and this is the function with the arguments

03:26.220 --> 03:28.380
and the name of the function that it wants to call.

03:28.380 --> 03:31.170
So we can tell ChatGPT, these are the tool_calls.

03:31.170 --> 03:33.210
So let's just do that manually to start with.

03:33.210 --> 03:38.210
So we'll say tool_calls = completion_1.choices

03:39.028 --> 03:42.300
[0].message.tool_calls.

03:42.300 --> 03:43.620
And then we can get those tool_calls out

03:43.620 --> 03:44.460
and just have a look at them.

03:44.460 --> 03:45.900
So pasting in tool_calls here,

03:45.900 --> 03:48.870
you can see that we have a ChatCompletionMessageToolCall,

03:48.870 --> 03:50.910
which provides a tool ID

03:50.910 --> 03:54.180
and also the function with the arguments of the function

03:54.180 --> 03:55.410
and the name of the function.

03:55.410 --> 03:57.240
Now, the first thing we're gonna do is make sure

03:57.240 --> 04:01.500
that the messages also has the most recent message

04:01.500 --> 04:04.320
specifically from completion_1.

04:04.320 --> 04:07.904
So we'll just append completion_1.choices

04:07.904 --> 04:09.900
[0].message.

04:09.900 --> 04:11.760
And then that just basically means

04:11.760 --> 04:16.230
that the messages list will now have this additional one

04:16.230 --> 04:19.170
in here, which is the ChatCompletionMessage.

04:19.170 --> 04:21.630
So now how can we access these tool_calls?

04:21.630 --> 04:22.463
We're gonna do a for loop.

04:22.463 --> 04:24.480
So we'll say for tool_call in tool_calls,

04:24.480 --> 04:26.400
we're gonna get the function name,

04:26.400 --> 04:31.110
which is the tool_call.function.name.

04:31.110 --> 04:32.790
And we're also gonna get the function args,

04:32.790 --> 04:34.590
and then we're gonna have something like this where we say

04:34.590 --> 04:37.410
if the function name is the search_knowledge_base,

04:37.410 --> 04:40.500
we're gonna execute that function with those arguments.

04:40.500 --> 04:42.300
And then we're gonna add to the messages

04:42.300 --> 04:45.180
and we just need to add a role of tool,

04:45.180 --> 04:47.220
the tool_call_id and the result.

04:47.220 --> 04:49.680
And if it's not this function, then we're gonna go

04:49.680 --> 04:51.210
and just raise a ValueError.

04:51.210 --> 04:53.130
So I'm just gonna raise a ValueError

04:53.130 --> 04:56.163
and then we're gonna have a look at the messages as well.

04:57.120 --> 04:59.010
And now you can see here we've got the original message.

04:59.010 --> 05:01.530
We've got this ChatCompletionMessage in the middle,

05:01.530 --> 05:04.500
and we also have the tool_call_id that specifically corresponds

05:04.500 --> 05:06.750
to this one here, the first tool_call.

05:06.750 --> 05:09.270
Now we can create the completion_2 request

05:09.270 --> 05:11.430
passing in the updated messages and tools,

05:11.430 --> 05:13.580
and looking at the content of that message.

05:15.390 --> 05:18.150
And it says quantum computing is an area of computing.

05:18.150 --> 05:20.640
So it has basically decided this is the final answer.

05:20.640 --> 05:23.070
So it executed that search on the knowledge base

05:23.070 --> 05:25.440
and then it hasn't come up with any more tool_calls

05:25.440 --> 05:28.530
and therefore, it's given us a final answer here.

05:28.530 --> 05:29.880
You can confirm that is the case

05:29.880 --> 05:34.230
by looking at completion_2.choices[0].message.tool_calls

05:34.230 --> 05:35.490
and it is essentially empty.

05:35.490 --> 05:37.440
Now, there is an alternative way to write this

05:37.440 --> 05:39.330
that is a little bit better in my opinion.

05:39.330 --> 05:40.950
So you have a messages list

05:40.950 --> 05:42.885
and we're just gonna write it in a while True loop.

05:42.885 --> 05:46.210
So we'll say messages.append

05:47.220 --> 05:48.750
and then we're gonna basically

05:48.750 --> 05:51.960
just have some information, "role": "user".

05:51.960 --> 05:54.660
And then instead of asking about quantum computing,

05:54.660 --> 05:57.170
we're going to ask about ChatGPT.

05:57.170 --> 06:01.350
So we'll say, can you find some information

06:01.350 --> 06:04.623
about ChatGPT in the AI knowledge base?

06:05.520 --> 06:07.710
We're also gonna create a function just above this.

06:07.710 --> 06:10.320
So we'll say def search_knowledge_base.

06:10.320 --> 06:12.750
And we're gonna say some information about ChatGPT

06:12.750 --> 06:13.740
just to mimic this.

06:13.740 --> 06:14.730
So we'll say ChatGPT

06:14.730 --> 06:17.250
is a large language model developed by OpenAI.

06:17.250 --> 06:18.540
And then what we're gonna do

06:18.540 --> 06:20.730
is we're gonna set up our while true loop.

06:20.730 --> 06:23.160
And then this is where you can basically have all

06:23.160 --> 06:25.680
of your agentic code running in a while true loop.

06:25.680 --> 06:27.750
So you start with your first completion

06:27.750 --> 06:28.583
and then what you're doing

06:28.583 --> 06:32.760
is saying client.chat.completions.create.

06:32.760 --> 06:35.340
And then after that, you paste in the model,

06:35.340 --> 06:37.293
we paste in the messages,

06:39.657 --> 06:41.550
and we're also pasting in the tools.

06:41.550 --> 06:44.130
We then always append the returned message to the messages.

06:44.130 --> 06:45.540
If there aren't tool_calls,

06:45.540 --> 06:48.060
we can then break out of the while loop.

06:48.060 --> 06:50.850
If there is a tool_call, then we get the function name

06:50.850 --> 06:52.170
and the arguments.

06:52.170 --> 06:56.130
And then we also get the result of that tool_call.

06:56.130 --> 07:00.120
We pass in the role, the tool_call_id and the content.

07:00.120 --> 07:03.180
And then what you'll see is then basically this will run

07:03.180 --> 07:06.810
in a while true loop and it finishes after 2.7 seconds.

07:06.810 --> 07:09.090
So it actually does finish

07:09.090 --> 07:11.070
because it decided

07:11.070 --> 07:12.690
that there weren't any tool calls

07:12.690 --> 07:15.093
and therefore, all of the agentic work was done.

07:15.960 --> 07:17.550
So you can have a look and you can see we've got

07:17.550 --> 07:18.930
that initial first message.

07:18.930 --> 07:20.586
Then we have our ChatCompletionMessage

07:20.586 --> 07:22.530
that did ask to do a tool_call.

07:22.530 --> 07:23.790
We then did that tool_call.

07:23.790 --> 07:25.440
We got some information about the fact

07:25.440 --> 07:28.140
that ChatGPT is a large language model

07:28.140 --> 07:31.440
and then also, it decided to use that information

07:31.440 --> 07:32.550
to then respond.

07:32.550 --> 07:33.720
So the key point here

07:33.720 --> 07:36.900
is that you are basically stuck in a while true loop

07:36.900 --> 07:38.970
and you also have the ability

07:38.970 --> 07:42.030
to break out if there are no more tool_calls to be made.

07:42.030 --> 07:43.890
Now, just to give you a bit of a word of warning,

07:43.890 --> 07:45.780
the problem that you can run into with this

07:45.780 --> 07:47.640
is that it endlessly makes tool_calls,

07:47.640 --> 07:50.790
trying to search or do additional steps.

07:50.790 --> 07:52.950
So a nice quality of life feature

07:52.950 --> 07:55.080
is to add a maximum number of iterations.

07:55.080 --> 07:58.890
So you can say maximum number of iterations,

07:58.890 --> 08:00.510
and you can just put maybe 10.

08:00.510 --> 08:03.060
And then we can say we can adjust the while true loop.

08:03.060 --> 08:04.440
So I'm just gonna use Cursor to do this.

08:04.440 --> 08:06.940
So we'll say adjust the while true loop

08:07.800 --> 08:12.030
to work only up to 10 iterations.

08:12.030 --> 08:14.010
And so that's just gonna say every single time

08:14.010 --> 08:16.800
we do an iteration, we're basically just gonna add on one

08:16.800 --> 08:18.180
to the iteration variable.

08:18.180 --> 08:20.790
And then after that, then we make sure that we don't hit

08:20.790 --> 08:22.140
that number of iterations,

08:22.140 --> 08:24.240
and then we can track how many iterations we make

08:24.240 --> 08:25.080
in real time.

08:25.080 --> 08:27.750
So this can be like another good way of figuring out,

08:27.750 --> 08:30.360
so we made two iterations, the zero and then the one.

08:30.360 --> 08:31.890
And this can be a very good way

08:31.890 --> 08:33.690
of avoiding endless while loops.

08:33.690 --> 08:35.520
Something else that I also want you to consider

08:35.520 --> 08:36.990
is you might also not want

08:36.990 --> 08:39.360
to break when there are no tool_calls.

08:39.360 --> 08:41.460
You might want to break on a specific goal.

08:41.460 --> 08:44.220
So imagine you're trying to get a certain number of leads

08:44.220 --> 08:46.320
or a certain number of email addresses scraped.

08:46.320 --> 08:49.950
You could have that as your goal inside of the while loop

08:49.950 --> 08:53.490
and you could endlessly scan new areas of Google

08:53.490 --> 08:55.980
to specifically go and look for that information.

08:55.980 --> 08:57.600
I actually created a demo version

08:57.600 --> 08:59.880
of this in an AI-powered outreach assistant.

08:59.880 --> 09:02.310
So I'll put a link to the public repository

09:02.310 --> 09:03.540
for you to go and have a look at.

09:03.540 --> 09:05.070
But if we scroll down and have a look

09:05.070 --> 09:06.690
at the agentic while loop,

09:06.690 --> 09:08.670
you can see this is where we start to do this.

09:08.670 --> 09:10.560
So we have our while true loop.

09:10.560 --> 09:13.470
We also then decide okay, if we've gone past the time,

09:13.470 --> 09:14.760
we break out of the loop

09:14.760 --> 09:16.170
and if we've hit our objective,

09:16.170 --> 09:18.030
we also break out of the loop.

09:18.030 --> 09:20.100
So those are some easy ways to make sure

09:20.100 --> 09:21.960
that we always break out of the loop.
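
NOTE
A hypothetical sketch of a goal-driven loop with a time limit. This is not the code
from the repository mentioned above; step and goal_reached are placeholder
callables standing in for one agent iteration and the objective check.
```python
import time
def run_until_goal(step, goal_reached, max_seconds, max_iterations=10):
    """Run step() until the goal is met, time runs out, or the iteration cap is hit."""
    deadline = time.monotonic() + max_seconds
    iteration = 0
    while True:
        if time.monotonic() >= deadline:
            return "timeout"
        if goal_reached():
            return "goal"
        if iteration >= max_iterations:
            return "max_iterations"
        step()
        iteration += 1
# Example: collect three fake leads, then stop.
leads = []
reason = run_until_goal(
    step=lambda: leads.append(f"lead-{len(leads)}"),
    goal_reached=lambda: len(leads) >= 3,
    max_seconds=5.0,
)
print(reason, leads)
```
Returning the reason for stopping lets the caller tell a met objective apart from
a timeout.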

09:21.960 --> 09:24.150
And also, the other thing you'll want to be aware of

09:24.150 --> 09:27.240
is when you're running agents for a large amount of time,

09:27.240 --> 09:29.220
the chat message history keeps growing,

09:29.220 --> 09:32.250
so every request becomes more and more expensive.

09:32.250 --> 09:34.500
So you might want to prune the messages,

09:34.500 --> 09:37.860
but you must keep the tool pairs, the tool_call

09:37.860 --> 09:41.340
and the tool_call result paired messages next to each other,

09:41.340 --> 09:44.100
otherwise OpenAI will throw an error at you.

09:44.100 --> 09:45.540
So make sure that if you are going

09:45.540 --> 09:47.010
to prune the message history,

09:47.010 --> 09:50.460
have a look at the helpers.py inside of here

09:50.460 --> 09:52.800
where you can see there is some advanced pruning

09:52.800 --> 09:55.391
so that we can specifically always keep the system message

09:55.391 --> 10:00.060
and keep these remaining pairs of the tool_call_id

10:00.060 --> 10:03.240
with the tool_call result chat message history.
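
NOTE
A hypothetical sketch of pruning that keeps tool pairs together. It is not the
helpers.py from the repository. It assumes every message is a plain dict with a
"role" key, so SDK message objects would need converting first, and the name
prune_messages and the max_recent parameter are illustrative.
```python
def prune_messages(messages, max_recent=6):
    """Keep system messages plus a recent tail that never starts with an orphaned tool result."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if len(rest) <= max_recent:
        return system + rest
    start = len(rest) - max_recent
    # A tool result is only valid right after the assistant message that requested it,
    # so move the cut earlier until the tail begins with a non-tool message.
    while 0 < start < len(rest) and rest[start]["role"] == "tool":
        start -= 1
    return system + rest[start:]
# Example history: the cut would land on the tool result, so it moves back one message.
history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me about ChatGPT."},
    {"role": "assistant", "content": None, "tool_calls": [{"id": "call_1"}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "ChatGPT is a large language model."},
    {"role": "assistant", "content": "ChatGPT is a large language model developed by OpenAI."},
    {"role": "user", "content": "Thanks!"},
]
pruned = prune_messages(history, max_recent=3)
print([m["role"] for m in pruned])
```
The result can be slightly longer than max_recent, which is the price of never
splitting a tool call from its result.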

10:03.240 --> 10:05.490
Cool, hopefully this is a good intro into agents

10:05.490 --> 10:08.153
and feel free to let me know if you've got any questions.
