WEBVTT

00:00.570 --> 00:02.790
Instructor: Just one minor change we need to make

00:02.790 --> 00:04.470
to our get text length function.

00:04.470 --> 00:08.340
So sometimes the LLM may pass as our text

00:08.340 --> 00:12.030
some unnecessary characters which are non-alphabetical.

00:12.030 --> 00:16.350
So for example, backslash N or some single quote.

00:16.350 --> 00:19.980
So here I'm simply removing those and cleaning it up.
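
NOTE
Editor's sketch of the cleanup described above, assuming the tool is a plain function named get_text_length and that the stray characters are newlines and quotes. The sample input is hypothetical.
```python
def get_text_length(text: str) -> int:
    """Return the length of a text in characters."""
    # Strip stray newlines and quote characters the LLM may wrap around the input.
    text = text.strip("'\n").strip('"')
    return len(text)
# Hypothetical input with a leading quote and a trailing quote plus newline.
print(get_text_length("'hello'\n"))
```
Stripping only at the ends keeps any quotes that are genuinely part of the text.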

00:19.980 --> 00:22.980
Let's now review the ReAct algorithm again.

00:22.980 --> 00:27.210
And after the tool execution, we got back the observation.

00:27.210 --> 00:30.200
It's the result of the tool after executing it.

00:30.200 --> 00:31.770
So now we have a choice,

00:31.770 --> 00:33.810
whether we want to give back an answer

00:33.810 --> 00:35.389
because we have enough information

00:35.389 --> 00:38.010
or we want to take the observation,

00:38.010 --> 00:40.080
the tool execution result,

00:40.080 --> 00:43.110
and we want to run another iteration of the ReAct loop.

00:43.110 --> 00:45.060
But this time with all the history

00:45.060 --> 00:46.650
of what's been done so far

00:46.650 --> 00:49.960
so the agent won't be making redundant steps.
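
NOTE
Editor's sketch of the loop described above: answer if there is enough information, otherwise run the tool and iterate again with the history. fake_llm is a scripted, purely hypothetical stand-in for a real model call, and the tool is simulated with len(). The video itself invokes the agent twice by hand rather than looping.
```python
def fake_llm(question: str, scratchpad: str) -> str:
    """Scripted stand-in for a real model call (purely hypothetical)."""
    if "Observation:" in scratchpad:
        return "Final Answer: " + scratchpad.split("Observation: ")[-1].strip()
    return "Action: get_text_length\nAction Input: " + question
def run(question: str, max_iterations: int = 5) -> str:
    scratchpad = ""  # history of earlier steps, sent back on every iteration
    for _ in range(max_iterations):
        reply = fake_llm(question, scratchpad)
        if reply.startswith("Final Answer:"):
            return reply[len("Final Answer:"):].strip()
        tool_input = reply.split("Action Input: ")[-1]
        observation = len(tool_input)  # stand-in for executing the chosen tool
        scratchpad += f"{reply}\nObservation: {observation}\n"
    raise RuntimeError("No final answer within the iteration limit")
print(run("hello"))
```
The iteration cap is an added safeguard, not something the video mentions.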

00:49.960 --> 00:53.340
So that's what we'll be implementing in this video.

00:53.340 --> 00:54.995
So let's go back to our code

00:54.995 --> 00:57.390
and let's head up to the ReAct prompt.

00:57.390 --> 00:59.880
Now, remember that we removed the agent scratchpad

00:59.880 --> 01:01.876
from the original prompt.

01:01.876 --> 01:04.440
So now we're going to bring it back.

01:04.440 --> 01:06.060
And now we have a better understanding

01:06.060 --> 01:07.140
of what it's supposed to do.

01:07.140 --> 01:09.480
This is going to contain all the history

01:09.480 --> 01:11.459
and all the information that we had so far

01:11.459 --> 01:13.283
in the ReAct execution.

01:13.283 --> 01:15.930
So we need to also add it to the dictionary

01:15.930 --> 01:17.310
that we'll be sending to our prompt.

01:17.310 --> 01:21.060
So we'll have here another key of agent scratchpad.

01:21.060 --> 01:23.160
And just like before we're going to fill it up

01:23.160 --> 01:26.550
with a lambda function that is going to access the key

01:26.550 --> 01:27.923
of agent scratchpad.
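
NOTE
Editor's sketch of a dictionary whose values are lambdas that each pull one key out of the incoming payload. The payload contents are hypothetical, and the dict comprehension is only a plain-Python stand-in for how a chain would resolve the lambdas.
```python
# Hypothetical payload: the keys mirror the ones named in the video.
payload = {"input": "some question", "agent_scratchpad": []}
# Each key maps to a lambda that pulls the matching value out of the payload.
prompt_inputs = {
    "input": lambda x: x["input"],
    "agent_scratchpad": lambda x: x["agent_scratchpad"],
}
# Plain-Python stand-in for how a chain would resolve those lambdas.
resolved = {key: getter(payload) for key, getter in prompt_inputs.items()}
print(resolved)
```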

01:27.923 --> 01:31.620
So in order to keep track of the history of our agent

01:31.620 --> 01:32.850
and what happened so far,

01:32.850 --> 01:34.830
we're going to create a new variable

01:34.830 --> 01:36.420
called intermediate steps.

01:36.420 --> 01:38.220
And this is going to be a list

01:38.220 --> 01:40.710
which is going to start as an empty list.

01:40.710 --> 01:43.800
So let's plug it into our agent invocation.

01:43.800 --> 01:45.731
So we're going to add here another key,

01:45.731 --> 01:48.420
and it's going to be the agent scratchpad.

01:48.420 --> 01:52.080
And this is going to hold the value of the empty list,

01:52.080 --> 01:54.480
which is the intermediate steps.

01:54.480 --> 01:56.670
Now, every time we run an iteration,

01:56.670 --> 01:59.066
we want to update this list and append to it the history

01:59.066 --> 02:01.410
and what we have performed.

02:01.410 --> 02:05.520
So after we perform the tool selection and tool execution

02:05.520 --> 02:07.784
and get back the observation, I'm going to append

02:07.784 --> 02:10.648
to our intermediate steps, which hold the history,

02:10.648 --> 02:12.630
our agent step.

02:12.630 --> 02:14.310
So this is what we got back

02:14.310 --> 02:16.445
after we parsed the LLM's answer,

02:16.445 --> 02:19.200
and I'm going to append the observation,

02:19.200 --> 02:21.990
what we got back from the tool after running it.

02:21.990 --> 02:24.510
So that way our agent will have both

02:24.510 --> 02:28.500
its reasoning engine history and what it has chosen already

02:28.500 --> 02:31.350
and also the result

02:31.350 --> 02:33.390
of the tool execution.
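
NOTE
Editor's sketch of appending one (agent step, observation) tuple to the history list. The AgentAction class here is a simplified stand-in written for illustration, not the real LangChain class, and all values are hypothetical.
```python
from dataclasses import dataclass
@dataclass
class AgentAction:
    """Simplified stand-in for LangChain's AgentAction (not the real class)."""
    tool: str
    tool_input: str
    log: str
intermediate_steps = []  # starts empty, grows by one tuple per iteration
# Hypothetical values for one iteration of the loop.
agent_step = AgentAction(
    tool="get_text_length",
    tool_input="hello",
    log="I should measure the text.\nAction: get_text_length\nAction Input: hello",
)
observation = len(agent_step.tool_input)  # stand-in for running the tool
intermediate_steps.append((agent_step, str(observation)))
print(intermediate_steps)
```
Storing the observation as a string keeps the history ready to be turned into prompt text later.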

02:33.390 --> 02:34.680
Let's run it in debug.

02:34.680 --> 02:37.747
And we want to examine the values of agent step

02:37.747 --> 02:39.480
and the observation

02:39.480 --> 02:41.910
and what we're saving in the intermediate steps,

02:41.910 --> 02:43.288
which is our history.

02:43.288 --> 02:46.680
So I'm simply going to select everything here

02:46.680 --> 02:48.205
and evaluate it.

02:48.205 --> 02:52.740
Now everything looks good so far except for the fact

02:52.740 --> 02:56.850
that the first element in our tuple is an agent action,

02:56.850 --> 02:58.950
which is a LangChain object.

02:58.950 --> 03:01.890
And the LLM doesn't understand LangChain objects.

03:01.890 --> 03:05.100
So we will need to translate this into text.

03:05.100 --> 03:07.143
So we'll need to format it nicely.

03:08.049 --> 03:10.950
So don't worry, we're not going to do it ourselves.

03:10.950 --> 03:12.570
LangChain is going to supply us

03:12.570 --> 03:15.026
with a utility function that performs it.

03:15.026 --> 03:19.873
So I'm going to import the function format_log_to_str.

03:19.873 --> 03:23.850
And this function is going to take our intermediate steps,

03:23.850 --> 03:28.350
which is a list that contains tuples of the agent action,

03:28.350 --> 03:31.350
what the agent has decided, which tool to use,

03:31.350 --> 03:34.620
and the observation, what was the result of the tool.

03:34.620 --> 03:37.470
So it's then going to construct the scratchpad

03:37.470 --> 03:39.690
that is going to enable the agent

03:39.690 --> 03:41.910
to continue in its thought process

03:41.910 --> 03:45.329
and basically it's going to format all those strings nicely.

03:45.329 --> 03:49.110
So instead of passing the original agent scratchpad,

03:49.110 --> 03:51.665
we're going to first format it nicely

03:51.665 --> 03:55.680
by using the format_log_to_str function,

03:55.680 --> 03:59.040
and that way we'll pass to the LLM only strings

03:59.040 --> 04:01.500
and that way the LLM will be able

04:01.500 --> 04:03.930
to take account of our history.
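
NOTE
Editor's sketch of what such a formatting helper roughly does: concatenate each action's log with its observation and end with a prompt for the next thought. format_steps and the Action namedtuple are hypothetical names for this illustration, and the real LangChain utility may differ in details.
```python
from collections import namedtuple
# Stand-in for an agent action; only the `log` field is used by the formatter.
Action = namedtuple("Action", ["tool", "tool_input", "log"])
def format_steps(intermediate_steps, observation_prefix="Observation: ", llm_prefix="Thought: "):
    """Simplified re-implementation, for illustration only."""
    thoughts = ""
    for action, observation in intermediate_steps:
        thoughts += action.log
        thoughts += f"\n{observation_prefix}{observation}\n{llm_prefix}"
    return thoughts
steps = [(Action("get_text_length", "hello", "Action: get_text_length\nAction Input: hello"), "5")]
print(format_steps(steps))
```
An empty history produces an empty string, so the first iteration sends no scratchpad text.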

04:03.930 --> 04:06.810
So let's add another step of the agent.

04:06.810 --> 04:10.650
So I'm simply going to copy this agent step invocation

04:10.650 --> 04:12.180
and going to paste it.

04:12.180 --> 04:15.423
So let's print what we get in the second step of the agent.

04:16.650 --> 04:18.690
And now let's run everything in debug

04:18.690 --> 04:19.990
and let's see what we get.

04:21.660 --> 04:24.480
So when we run it in debug, we'll get an error

04:24.480 --> 04:29.480
that the output parser got an input

04:29.790 --> 04:34.225
that has both a final answer and a parsable action.

04:34.225 --> 04:36.360
And that confuses the output parser

04:36.360 --> 04:38.310
because it should either have an action

04:38.310 --> 04:39.930
and action input to perform

04:39.930 --> 04:41.460
or it should have the final answer.

04:41.460 --> 04:42.450
It can't have both.

04:42.450 --> 04:44.010
This is the ReAct algorithm

04:44.010 --> 04:47.130
and this is the logic of the output parser.
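
NOTE
Editor's sketch of the either/or rule described above: the output must contain an action with its input or a final answer, never both. parse_llm_output is a toy parser written for this note, not LangChain's parser, and its regular expression and error messages are assumptions.
```python
import re
FINAL_ANSWER = "Final Answer:"
ACTION_PATTERN = re.compile(r"Action\s*:\s*(.*?)\s*Action\s*Input\s*:\s*(.*)", re.DOTALL)
def parse_llm_output(text: str):
    """Toy parser: returns ("action", tool, tool_input) or ("finish", answer)."""
    has_answer = FINAL_ANSWER in text
    match = ACTION_PATTERN.search(text)
    if has_answer and match:
        raise ValueError("Output contains both a final answer and a parsable action")
    if match:
        return ("action", match.group(1).strip(), match.group(2).strip())
    if has_answer:
        return ("finish", text.split(FINAL_ANSWER)[-1].strip())
    raise ValueError("Could not parse LLM output")
print(parse_llm_output("Action: get_text_length\nAction Input: hello"))
```
Failing loudly on an ambiguous output keeps the loop from executing a tool after the model has already answered.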

04:47.130 --> 04:50.610
So from my experience, the usual fix for those kinds

04:50.610 --> 04:53.160
of issues is to work around

04:53.160 --> 04:56.100
and refine the prompt that is being sent to the agent.

04:56.100 --> 04:58.020
So you can see right now that in this prompt,

04:58.020 --> 05:00.720
we have a space at the end before the question mark,

05:00.720 --> 05:02.223
so we're going to remove it.

05:04.710 --> 05:07.707
And let's also not forget the previous invocation

05:07.707 --> 05:09.540
of the agent.

05:09.540 --> 05:11.409
So we also need to fix that.

05:11.409 --> 05:13.526
And let's now rerun everything

05:13.526 --> 05:16.209
and examine the output.

05:16.209 --> 05:19.740
We can see right now we got the first step of the agent,

05:19.740 --> 05:20.940
the first invocation.

05:20.940 --> 05:22.710
It was the first iteration.

05:22.710 --> 05:25.650
We saw that we got the answer from the LLM

05:25.650 --> 05:28.590
that decided that we should use the tool get text length.

05:28.590 --> 05:31.320
We then used the output parser to parse it,

05:31.320 --> 05:33.329
and we then chose the correct tool

05:33.329 --> 05:35.910
and we invoked its function.

05:35.910 --> 05:39.780
We did that using the agent action object of LangChain,

05:39.780 --> 05:42.960
which holds all the information about the tool to execute.

05:42.960 --> 05:44.340
We then executed the tool

05:44.340 --> 05:45.707
and we now have the result

05:45.707 --> 05:48.840
of that tool, get text length.

05:48.840 --> 05:51.585
So this result is considered an observation,

05:51.585 --> 05:55.230
and then we take the agent action and the observation

05:55.230 --> 05:57.606
and we append them to the intermediate steps list

05:57.606 --> 06:01.920
and we send it as the agent scratchpad.

06:01.920 --> 06:04.860
So this is the history of what has been done so far,

06:04.860 --> 06:08.040
what tools were chosen and what was the result.

06:08.040 --> 06:10.566
And this is being sent to the second iteration

06:10.566 --> 06:12.630
of the ReAct loop.

06:12.630 --> 06:16.049
So now we're starting all over again, but with the history.

06:16.049 --> 06:20.460
However, now in the second invocation in the scratchpad,

06:20.460 --> 06:22.050
the agent has the observation,

06:22.050 --> 06:23.940
which is the result of the tool

06:23.940 --> 06:27.600
that counts the number of letters in the word doc.

06:27.600 --> 06:29.363
So we have the final answer.

06:29.363 --> 06:32.550
And actually the object that we're dealing with right now

06:32.550 --> 06:35.446
is called agent finish and not agent action.

06:35.446 --> 06:37.530
And I want to run it in debug

06:37.530 --> 06:39.787
in order for you to see it for yourself.

06:39.787 --> 06:44.040
So, remember that we said that the output parser returns

06:44.040 --> 06:46.860
either an agent action or an agent finish.

06:46.860 --> 06:49.969
So in this case it returned an agent finish.

06:49.969 --> 06:53.820
So if we go to the agent step, we can see

06:53.820 --> 06:56.065
that we're dealing with an agent finish object,

06:56.065 --> 07:00.150
and this is what the output parser returned for us.

07:00.150 --> 07:02.010
And why did it return an agent finish

07:02.010 --> 07:03.540
and not an agent action?

07:03.540 --> 07:05.940
Because if you take a look at the log entry,

07:05.940 --> 07:08.785
the LLM responded to us, I know the final answer

07:08.785 --> 07:11.370
and then backslash n, then final answer,

07:11.370 --> 07:14.700
and the output parser parsed it and noticed it,

07:14.700 --> 07:17.730
and it decided that it now has the final answer,

07:17.730 --> 07:20.760
so it should return an agent finish object.

07:20.760 --> 07:23.280
Lastly, in the field of return values,

07:23.280 --> 07:25.804
we have a dictionary with the output key

07:25.804 --> 07:29.820
and the value is the agent's answer to our question.

07:29.820 --> 07:33.450
So now I remove the print statement of the agent action,

07:33.450 --> 07:36.780
and we're checking if the agent step is an instance

07:36.780 --> 07:38.340
of the class agent finish.

07:38.340 --> 07:41.130
And if this is the case, then the agent step is going

07:41.130 --> 07:44.730
to have the field of return values and we can print it

07:44.730 --> 07:47.100
and finish our execution.
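
NOTE
Editor's sketch of the instance check described above. Both classes are simplified stand-ins written for this note, not the real LangChain classes, and describe is a hypothetical helper name.
```python
from dataclasses import dataclass
@dataclass
class AgentAction:
    """Simplified stand-in, not the real LangChain class."""
    tool: str
    tool_input: str
    log: str
@dataclass
class AgentFinish:
    """Simplified stand-in, not the real LangChain class."""
    return_values: dict
    log: str
def describe(agent_step):
    # Only a finish step carries return_values with the final output.
    if isinstance(agent_step, AgentFinish):
        return agent_step.return_values["output"]
    return f"next tool: {agent_step.tool}"
final_step = AgentFinish(return_values={"output": "5"}, log="I know the final answer\nFinal Answer: 5")
print(describe(final_step))
```
Checking the type first avoids reading return_values from a step that is still an action.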

07:47.100 --> 07:48.843
So let's now rerun it.

07:51.840 --> 07:56.040
And we can indeed see that we had two ReAct iterations

07:56.040 --> 07:58.410
and we got the final answer from our agent,

07:58.410 --> 07:59.493
which used our tool.
