WEBVTT

00:00.180 --> 00:01.440
Hey there, Eden here.

00:01.440 --> 00:03.210
So today I want to talk about

00:03.210 --> 00:05.370
LangChain's tool calling feature.

00:05.370 --> 00:08.850
And this feature in my opinion is not getting enough hype

00:08.850 --> 00:10.740
and is super-important

00:10.740 --> 00:13.620
because it gives us a lot of flexibility

00:13.620 --> 00:17.310
switching between models that support function calling.

00:17.310 --> 00:20.670
So up until now if we wanted to use LangChain

00:20.670 --> 00:24.510
for function calling, we pretty much had to use OpenAI,

00:24.510 --> 00:26.400
because the whole implementation,

00:26.400 --> 00:28.920
for example in the function_calling agent,

00:28.920 --> 00:32.670
was specifically tailor-made for OpenAI's API.

00:32.670 --> 00:34.950
And when other vendors came out

00:34.950 --> 00:39.540
with function calling, like Vertex Gemini or Anthropic Sonnet,

00:39.540 --> 00:42.180
they had different API schemas for function calling,

00:42.180 --> 00:45.210
and because the implementation was tailor-made for OpenAI,

00:45.210 --> 00:49.710
supporting it with other models was quite troublesome.

00:49.710 --> 00:53.580
But now what LangChain did is to level the playing field

00:53.580 --> 00:56.010
and to offer one interface for function calling

00:56.010 --> 00:58.680
or tool calling as LangChain calls it.
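
NOTE
A minimal sketch of why one interface helps, assuming simplified versions of the
OpenAI-style and Anthropic-style response shapes. The helper names from_openai and
from_anthropic are hypothetical and are not LangChain functions; they only show two
vendor formats collapsing into one name/args/id structure.
```python
import json
# Hypothetical helper: OpenAI-style replies carry arguments as a JSON string.
def from_openai(message: dict) -> list[dict]:
    return [
        {"name": c["function"]["name"], "args": json.loads(c["function"]["arguments"]), "id": c["id"]}
        for c in message.get("tool_calls", [])
    ]
# Hypothetical helper: Anthropic-style replies use "tool_use" content blocks with parsed input.
def from_anthropic(message: dict) -> list[dict]:
    return [
        {"name": b["name"], "args": b["input"], "id": b["id"]}
        for b in message.get("content", [])
        if b.get("type") == "tool_use"
    ]
# Illustrative payloads only; the IDs and arguments are made up.
openai_msg = {"tool_calls": [{"id": "call_1", "type": "function",
    "function": {"name": "multiply", "arguments": '{"a": 3, "b": 4}'}}]}
anthropic_msg = {"content": [{"type": "text", "text": "Let me compute that."},
    {"type": "tool_use", "id": "toolu_1", "name": "multiply", "input": {"a": 3, "b": 4}}]}
unified_openai = from_openai(openai_msg)
unified_anthropic = from_anthropic(anthropic_msg)
```
After normalization, code that consumes the tool calls no longer needs to know which
vendor produced them.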

00:58.680 --> 01:01.830
And basically it supports all the famous models

01:01.830 --> 01:05.850
with function calling, of course, OpenAI, Vertex Gemini,

01:05.850 --> 01:09.390
Mistral, Fireworks, and Anthropic Sonnet.

01:09.390 --> 01:11.820
And this was something that people have requested

01:11.820 --> 01:14.640
for a while now because they were locked

01:14.640 --> 01:17.490
into using only OpenAI functions.

01:17.490 --> 01:19.830
So the interface consists of a couple of things,

01:19.830 --> 01:21.330
a bind_tools method,

01:21.330 --> 01:23.820
which takes the functions that we wrote

01:23.820 --> 01:26.760
and tells the LLM that it may use them,

01:26.760 --> 01:30.150
and the tool_calls attribute, which is returned from the LLM.

01:30.150 --> 01:32.760
And now when the LLM is returning an answer,

01:32.760 --> 01:35.670
this tool_calls is going to be populated

01:35.670 --> 01:37.740
if there is a function calling invocation.

01:37.740 --> 01:40.860
And my favorite part is the tool_calling_agent.

01:40.860 --> 01:42.930
So up until now we only had

01:42.930 --> 01:45.840
the OpenAI function_calling agent.

01:45.840 --> 01:47.820
Maybe I can show you here in the documentation.

01:47.820 --> 01:50.790
So we had here the OpenAI function_calling agent

01:50.790 --> 01:53.670
and if we wanted to use other models, for example,

01:53.670 --> 01:56.793
Vertex Gemini or Anthropic Sonnet, we couldn't.

01:57.630 --> 02:00.420
So now we can, with this new function

02:00.420 --> 02:02.940
which creates a function_calling agent

02:02.940 --> 02:05.280
regardless of which vendor we are using

02:05.280 --> 02:09.390
and supports all vendors that support function calling.

02:09.390 --> 02:11.130
So we can see the example over here,

02:11.130 --> 02:12.870
which is pretty straightforward.

02:12.870 --> 02:16.920
We define some tools: we can even define a function

02:16.920 --> 02:20.550
or simply describe the tool directly in the OpenAI format

02:20.550 --> 02:22.230
for function calling.

02:22.230 --> 02:26.280
And when we create the LLM, we can bind it with those functions.

02:26.280 --> 02:28.440
Then every time we make an LLM call,

02:28.440 --> 02:29.640
the LLM can decide

02:29.640 --> 02:31.950
whether we should invoke those functions or not

02:31.950 --> 02:33.453
and return that decision in the output.
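
NOTE
A rough sketch of what "describing a tool in the OpenAI format" can look like when
derived from a plain Python function. The helper to_openai_tool and the JSON_TYPES
mapping are assumptions made for illustration, not LangChain's own conversion code,
and the multiply function is a stand-in for the tool in the demo.
```python
import inspect
# Assumed mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}
# Hypothetical helper: describe a plain Python function in the OpenAI tool format.
def to_openai_tool(fn) -> dict:
    params = inspect.signature(fn).parameters
    properties = {name: {"type": JSON_TYPES.get(p.annotation, "string")} for name, p in params.items()}
    # Parameters without a default value are treated as required.
    required = [name for name, p in params.items() if p.default is inspect.Parameter.empty]
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {"type": "object", "properties": properties, "required": required},
        },
    }
# Illustrative tool; the docstring becomes the description the model sees.
def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b
multiply_schema = to_openai_tool(multiply)
```
The resulting dictionary is the kind of description a model receives when tools are
bound to it; the model then decides per call whether to request the tool.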

02:34.980 --> 02:38.460
And let's go to the agent part of this new release.

02:38.460 --> 02:39.570
And right here you can see

02:39.570 --> 02:42.570
that we're creating a function_calling agent,

02:42.570 --> 02:46.500
providing it with the Anthropic Sonnet model.

02:46.500 --> 02:48.390
And this is something that up until now

02:48.390 --> 02:49.653
was super-hard to do.

02:51.510 --> 02:53.260
Let's go to the demo I've prepared.

02:55.350 --> 02:57.360
So the demo is pretty simple.

02:57.360 --> 03:00.480
I defined some tools, the multiply tool,

03:00.480 --> 03:03.480
and I'm using also the Tavily search tool,

03:03.480 --> 03:07.110
which is a way to search online for real-time data.

03:07.110 --> 03:09.370
And I'm initializing the tool_calling_agent

03:10.221 --> 03:13.320
one time with OpenAI's GPT-4.

03:13.320 --> 03:15.450
And the other time I'm going to initialize it

03:15.450 --> 03:18.603
with Anthropic Sonnet, which now supports function calling.

03:19.440 --> 03:21.660
And I'm going to ask a simple question,

03:21.660 --> 03:23.550
what is the weather right now in Dubai,

03:23.550 --> 03:25.560
how does it compare with San Francisco,

03:25.560 --> 03:28.260
and I want the output in Celsius.

03:28.260 --> 03:30.780
I'm also tracing everything with LangSmith

03:30.780 --> 03:32.310
so we can examine the results

03:32.310 --> 03:34.500
and see how everything is working.

03:34.500 --> 03:37.623
So let's run it in debug for the first time with OpenAI,

03:47.070 --> 03:49.383
and we got a result, let's examine it.

03:51.990 --> 03:54.750
And we can see that the temperature in Dubai

03:54.750 --> 03:59.583
right now is 28 Celsius and in San Francisco, 8.9 Celsius.

04:02.310 --> 04:05.250
Now let's run it with Anthropic Sonnet.

04:05.250 --> 04:07.560
And the goal here is to show you how easy it is

04:07.560 --> 04:09.483
to switch between the models now.

04:25.740 --> 04:28.563
So we got the result, let's go and examine it.

04:30.510 --> 04:33.840
And we see here that the temperature in Dubai

04:33.840 --> 04:38.583
is around 28 Celsius and San Francisco is 8.9 Celsius.

04:40.530 --> 04:43.110
So this is pretty much what we expected.

04:43.110 --> 04:47.643
Now let's go to LangSmith and let's now compare the traces.

04:50.070 --> 04:53.370
So this is the first trace of using OpenAI.

04:53.370 --> 04:55.920
So we can see that the first time the LLM was called

04:55.920 --> 04:57.360
was with the prompt,

04:57.360 --> 05:00.330
and the result is to invoke the tools twice.

05:00.330 --> 05:04.080
One time to run the Tavily search for Dubai

05:04.080 --> 05:07.290
and the other one to search the weather for San Francisco.

05:07.290 --> 05:09.630
So one API call with function calling

05:09.630 --> 05:11.610
gave us two invocations of tools

05:11.610 --> 05:13.800
and we can see right now the invocation.

05:13.800 --> 05:16.260
So we simply ran the function with Tavily,

05:16.260 --> 05:18.540
and we got the answers there.

05:18.540 --> 05:20.850
You can see it right over here.

05:20.850 --> 05:23.673
And the second time is for San Francisco.

05:25.350 --> 05:28.650
And the final call to the LLM, the second call,

05:28.650 --> 05:31.050
was simply to wrap everything together

05:31.050 --> 05:33.210
and to summarize those results.

05:33.210 --> 05:37.050
We can see this is the output that we saw when we debugged.

05:37.050 --> 05:39.990
And now let's go examine what happened

05:39.990 --> 05:41.970
with Anthropic Sonnet.

05:41.970 --> 05:45.000
So we can see here we have three API calls

05:45.000 --> 05:47.640
to Sonnet instead of two.

05:47.640 --> 05:49.470
And the first time the LLM told us

05:49.470 --> 05:51.660
that we need to invoke the Tavily search tool

05:51.660 --> 05:53.433
with the weather in Dubai.

05:54.780 --> 05:58.050
And then we invoked it, we got a result

05:58.050 --> 05:59.550
of what's the weather in Dubai.

05:59.550 --> 06:03.840
And then the agent decided that we need to run it again.

06:03.840 --> 06:05.070
And then we went to the agent

06:05.070 --> 06:06.690
to see if we can return a result

06:06.690 --> 06:10.710
and the agent told us that we needed to invoke another tool,

06:10.710 --> 06:14.190
and this time the Tavily search tool with San Francisco.

06:14.190 --> 06:17.043
So we went and we invoked it.

06:19.410 --> 06:20.580
And then finally,

06:20.580 --> 06:22.110
after we had both results

06:22.110 --> 06:24.510
of the two function calling invocations,

06:24.510 --> 06:27.810
then we can see that we have here the output summary,

06:27.810 --> 06:29.010
and this is what we saw.
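
NOTE
A toy sketch of the agent loop behind the two traces, assuming scripted stand-in
models instead of real API calls. ScriptedModel, run_agent, search and search_call
are all hypothetical names; the point is only that a model requesting both searches
in one turn needs two model calls, while one search per turn needs three.
```python
# Hypothetical tool standing in for the search tool in the demo.
def search(query: str) -> str:
    return f"results for {query!r}"
TOOLS = {"search": search}
# A scripted "model": each invoke returns the next canned reply (no real API involved).
class ScriptedModel:
    def __init__(self, replies):
        self.replies = list(replies)
        self.calls = 0
    def invoke(self, messages):
        self.calls += 1
        return self.replies.pop(0)
# Minimal agent loop: call the model, run any requested tools, repeat until it answers.
def run_agent(model, question: str) -> str:
    messages = [("user", question)]
    while True:
        reply = model.invoke(messages)
        if not reply["tool_calls"]:
            return reply["content"]
        for tool_call in reply["tool_calls"]:
            result = TOOLS[tool_call["name"]](**tool_call["args"])
            messages.append(("tool", result))
# Helper that builds one tool-call request for a city.
def search_call(city: str) -> dict:
    return {"name": "search", "args": {"query": f"weather in {city}"}}
final = {"content": "summary", "tool_calls": []}
# Both searches requested in a single turn: two model calls in total.
parallel = ScriptedModel([{"content": "", "tool_calls": [search_call("Dubai"), search_call("San Francisco")]}, final])
# One search per turn: three model calls in total.
sequential = ScriptedModel([
    {"content": "", "tool_calls": [search_call("Dubai")]},
    {"content": "", "tool_calls": [search_call("San Francisco")]},
    final,
])
answers = (run_agent(parallel, "compare weather"), run_agent(sequential, "compare weather"))
```
Both runs end with the same summary; only the number of round trips to the model
differs, which matches the two-call and three-call traces described above.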

06:29.010 --> 06:30.390
And that's it for this demo.

06:30.390 --> 06:33.780
And my goal here was to show you how easy it is to switch

06:33.780 --> 06:36.630
between models now with function calling.

06:36.630 --> 06:39.000
And up until now that was not possible.

06:39.000 --> 06:41.250
And thanks to LangChain for implementing it.

06:41.250 --> 06:44.550
I think it's a huge step in commoditizing machine learning,

06:44.550 --> 06:47.220
and it was requested by a lot of folks

06:47.220 --> 06:49.770
who wanted the flexibility to try out

06:49.770 --> 06:51.993
and work with different models.