WEBVTT

00:00.110 --> 00:00.710
Hey there.

00:00.740 --> 00:01.310
Eden here.

00:01.310 --> 00:06.500
And in this video I'm going to show the differences between the function calling agent and

00:06.500 --> 00:12.500
the ReAct agent in the LangChain implementation, when to use one over the other, and what are the

00:12.500 --> 00:15.350
advantages and disadvantages of each one.

00:16.460 --> 00:22.730
And like all things in software today, it all boils down to where we shift the responsibility: who

00:22.730 --> 00:27.770
is responsible for the agent selecting the correct tool with the correct input?

00:27.770 --> 00:34.910
And I think the analogy to serverless is appropriate here, because in serverless we shift all the responsibility

00:34.910 --> 00:42.530
for the availability, durability, scalability, all those "-ilities", we shift it to the vendor, to the

00:42.530 --> 00:43.640
cloud provider.

00:43.940 --> 00:48.500
Now, when it comes to agents, we have two paradigms of tool selection.

00:48.500 --> 00:50.330
The first one is function calling.

00:50.330 --> 00:57.350
And in function calling we supply a certain schema to the vendor, specifying all of our functions that

00:57.350 --> 00:59.450
we want to equip our agent with.

00:59.450 --> 01:03.900
And after that, our LLM is sort of equipped with those tools.

01:03.930 --> 01:10.620
The LLM may tell us in its response that we need to invoke a tool with certain arguments, according

01:10.650 --> 01:12.540
to the specification we sent it.
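
NOTE
Editor's sketch of the function calling flow described above, hedged and not any vendor's exact API.
The tool get_text_length, the TOOL_SPECS layout, and the simulated reply are assumptions for illustration;
real vendors each use their own schema fields and response format.
```python
import json
# Hypothetical tool: returns the number of characters in a string.
def get_text_length(text: str) -> int:
    return len(text)
# JSON-schema-style description of the tool; the exact field layout varies by vendor.
TOOL_SPECS = [{
    "name": "get_text_length",
    "description": "Return the number of characters in the given text.",
    "parameters": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
    },
}]
# Maps the names in the spec to the callables we run locally.
TOOLS = {"get_text_length": get_text_length}
def run_tool_call(tool_call: dict):
    """Look up the tool the model asked for and invoke it with the model's arguments."""
    func = TOOLS[tool_call["name"]]
    args = json.loads(tool_call["arguments"])
    return func(**args)
# Stand-in for the model's reply; a real reply would come from the vendor's API.
simulated_tool_call = {"name": "get_text_length", "arguments": json.dumps({"text": "DOG"})}
result = run_tool_call(simulated_tool_call)
```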

01:12.780 --> 01:19.050
Now, just like in serverless, here we are not exposed to the logic that happens under the hood,

01:19.050 --> 01:23.370
to how the LLM determines which tool to select.

01:23.520 --> 01:29.940
By the way, my guess here is that this is a fine-tuned model, specifically fine-tuned for tool selection,

01:29.940 --> 01:32.040
similar to the Gorilla paper.

01:32.820 --> 01:38.670
So in the function calling agent, the responsibility for selecting the tools is on the vendor, the

01:38.670 --> 01:39.510
LLM vendor.

01:39.510 --> 01:45.750
And of course I have to note it's a shared responsibility model because we need to make sure that our

01:45.750 --> 01:49.920
tool descriptions are not ambiguous and that they are well defined.

01:49.950 --> 01:52.740
So the LLM has an easier job selecting the tool.

01:53.190 --> 01:56.790
And today many vendors support function calling or tool calling.

01:57.000 --> 01:58.200
OpenAI.

01:58.230 --> 01:59.160
Google.

01:59.190 --> 02:00.030
Anthropic.

02:00.030 --> 02:00.690
Mistral.

02:00.720 --> 02:02.700
They all support tool calling.

02:02.700 --> 02:09.420
And luckily for us, LangChain implemented an abstraction for this, a well-defined interface for using

02:09.450 --> 02:12.450
tool calling for every vendor that supports it.

02:12.480 --> 02:18.570
So at the moment, switching between models that support tool calling is super easy.

02:18.570 --> 02:22.260
And we can switch models with tool calling like we switch our socks.

02:22.770 --> 02:23.550
Alrighty.

02:23.580 --> 02:30.780
Now, a ReAct agent is an agent that specifically uses ReAct prompting, which is based

02:30.780 --> 02:32.460
on the ReAct paper.

02:32.520 --> 02:37.770
Here it all boils down to the ReAct prompt that is sent to the LLM.

02:37.770 --> 02:45.150
This is a well-crafted prompt that the LangChain team wrote; it was inspired by the ReAct paper, and

02:45.150 --> 02:50.250
incorporates a lot of prompt engineering techniques like chain-of-thought and few-shot prompting.

02:50.250 --> 02:55.800
And I have to say, I think it's the most beautiful prompt today in prompt engineering, so good job,

02:55.800 --> 02:56.370
LangChain.

02:56.370 --> 02:57.720
This is an awesome prompt.

02:57.720 --> 03:04.900
Anyways, the whole idea of this prompt is to make the LLM become our reasoning engine to select which

03:04.900 --> 03:05.950
tools to use.

03:05.980 --> 03:11.680
And it turns out that this prompt is actually very useful for turning the LLM into a reasoning engine,

03:11.680 --> 03:18.580
and that in a lot of cases, it does return the correct tool to use with the correct inputs, and the

03:18.580 --> 03:26.110
LLM output is usually an Action and an Action Input, which LangChain knows how to parse to deduce from

03:26.110 --> 03:26.290
it

03:26.320 --> 03:27.970
which tool we need to use.

03:28.000 --> 03:30.520
They do it with some regular expressions.
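
NOTE
Editor's sketch of the parsing step just mentioned, hedged: this is not LangChain's actual parser.
The pattern ACTION_PATTERN, the helper parse_action, and the sample reply are assumptions for illustration;
they only show how a regular expression can pull a tool name and its input out of ReAct-style text.
```python
import re
# Pattern for the "Action: ..." line followed by the "Action Input: ..." line.
ACTION_PATTERN = re.compile(r"Action\s*:\s*(.*?)\s*\n\s*Action\s*Input\s*:\s*(.*)", re.DOTALL)
def parse_action(llm_output: str):
    """Return (tool_name, tool_input), or None when the reply names no action."""
    match = ACTION_PATTERN.search(llm_output)
    if match is None:
        return None
    tool_name = match.group(1).strip()
    # Drop surrounding whitespace and optional double quotes around the input.
    tool_input = match.group(2).strip().strip('"')
    return tool_name, tool_input
# Hypothetical model reply in the ReAct text format.
sample_output = (
    "Thought: I need to count the characters in the word.\n"
    "Action: get_text_length\n"
    "Action Input: \"DOG\""
)
parsed = parse_action(sample_output)
```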

03:30.550 --> 03:37.090
Now, after we run the tool and we get a result back from our tool execution, then LangChain labels

03:37.090 --> 03:38.800
it as an observation.

03:38.800 --> 03:44.080
And then we iterate and run the prompt again with the LLM.

03:44.080 --> 03:49.060
But this time it will contain the history of what the LLM decided so far.

03:49.060 --> 03:52.060
Which tool did we use and what was the result of that tool?

03:52.060 --> 03:53.410
What was the observation?

03:53.410 --> 03:57.970
And this was a very high-level explanation of the ReAct loop.
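
NOTE
Editor's sketch of the loop described above, hedged: a toy version, not LangChain's agent executor.
fake_llm returns scripted replies so the example runs offline; get_text_length, react_loop, and the
reply text are assumptions for illustration. Each pass appends the reply and its Observation to the history.
```python
import re
def get_text_length(text: str) -> int:
    """Hypothetical tool: number of characters in the text."""
    return len(text)
TOOLS = {"get_text_length": get_text_length}
# Scripted replies standing in for a real model, so the loop runs offline.
SCRIPTED_REPLIES = iter([
    "Thought: I should measure the word.\nAction: get_text_length\nAction Input: DOG",
    "Thought: I now know the answer.\nFinal Answer: 3",
])
def fake_llm(prompt: str) -> str:
    return next(SCRIPTED_REPLIES)
def react_loop(question: str, max_steps: int = 5) -> str:
    scratchpad = ""  # history of decisions and observations so far
    for _ in range(max_steps):
        reply = fake_llm(f"Question: {question}\n{scratchpad}")
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        match = re.search(r"Action:\s*(.+)\nAction Input:\s*(.+)", reply)
        if match is None:
            raise ValueError("could not parse an action from the reply")
        tool_name, tool_input = match.group(1).strip(), match.group(2).strip()
        observation = TOOLS[tool_name](tool_input)
        # Feed the decision and its result back in on the next iteration.
        scratchpad += f"{reply}\nObservation: {observation}\n"
    raise RuntimeError("no final answer within max_steps")
answer = react_loop("How many characters are in DOG?")
```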

03:57.970 --> 04:05.530
But basically here it all boils down to the ReAct prompt, and the ReAct prompt is what pushes the

04:05.530 --> 04:07.660
LLM to become our reasoning engine.

04:07.660 --> 04:10.660
So we have full control of this prompt.

04:10.660 --> 04:15.730
And if we want, we can tweak it and customize it according to our needs.

04:15.730 --> 04:19.990
So we have much more flexibility here for the tool selection part.
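
NOTE
Editor's sketch of what customizing the prompt could look like, hedged: REACT_TEMPLATE is a paraphrase
of a ReAct-style prompt, not the exact LangChain text. The extra_rules slot and build_prompt helper
are assumptions for illustration of where our own instructions could be added.
```python
# Paraphrased ReAct-style template, not the exact LangChain prompt text.
REACT_TEMPLATE = "\n".join([
    "Answer the following question as best you can. You have access to these tools:",
    "{tools}",
    "Use the following format:",
    "Question: the input question you must answer",
    "Thought: you should always think about what to do",
    "Action: the action to take, one of [{tool_names}]",
    "Action Input: the input to the action",
    "Observation: the result of the action",
    "Final Answer: the final answer to the original question",
    "{extra_rules}",
    "Question: {input}",
    "Thought:{agent_scratchpad}",
])
def build_prompt(question, tools, scratchpad="", extra_rules=""):
    """Fill the template; extra_rules is the hook for our own customization."""
    return REACT_TEMPLATE.format(
        tools="\n".join(f"{name}: {desc}" for name, desc in tools.items()),
        tool_names=", ".join(tools),
        extra_rules=extra_rules,
        input=question,
        agent_scratchpad=scratchpad,
    )
prompt = build_prompt(
    "How many characters are in DOG?",
    {"get_text_length": "Return the number of characters in a text."},
    extra_rules="Always answer in one short sentence.",
)
```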

04:20.410 --> 04:21.250
Alrighty.

04:21.250 --> 04:22.990
So which one is better?

04:22.990 --> 04:25.870
And to be honest, I don't have the exact answer for that.

04:25.870 --> 04:28.510
Like everything in software, it depends.

04:28.660 --> 04:34.180
If we want full control and full flexibility, we can use the ReAct prompt.

04:34.180 --> 04:41.830
But the problem here is that all the responsibility is on us as developers, while in function calling

04:41.830 --> 04:48.460
the tool selection responsibility is on the vendor, and this saves us a lot of work and a lot of

04:48.460 --> 04:53.200
thinking, because all the work for the tool selection was done by the vendor.

04:53.200 --> 04:54.730
We have less control here.

04:54.760 --> 05:01.000
However, we also have much less of a headache. And that was pretty much it.

05:01.000 --> 05:02.380
I hope you enjoyed the video.

05:02.380 --> 05:04.360
Please let me know in the comments what you think.
