1
00:00:03,000 --> 00:00:06,960
So what is tool calling? Tool calling is a powerful technique where you make the

2
00:00:06,960 --> 00:00:12,600
LLM context aware of real-time data such as databases or APIs. Typically you use

3
00:00:12,600 --> 00:00:18,520
tool calling via a chat interface. So you would have your client application in

4
00:00:18,520 --> 00:00:27,799
one hand and then the LLM on the other side. For your client application you

5
00:00:27,799 --> 00:00:32,639
would send a set of messages together with a tool definition to the LLM. So you

6
00:00:32,639 --> 00:00:41,080
would have your messages here together with your list of tools. The LLM will

7
00:00:41,080 --> 00:00:45,159
look at both your message and the list of tools and it's going to recommend a

8
00:00:45,159 --> 00:00:52,479
tool you should call. From your client application you should call this tool

9
00:00:52,479 --> 00:00:57,959
and then supply the answer back to the LLM. So this tool response will be

10
00:00:57,959 --> 00:01:02,799
interpreted by the LLM and this will either tell you the next tool to call or

11
00:01:02,799 --> 00:01:08,519
it will give you the final answer. In your application you're responsible for

12
00:01:08,519 --> 00:01:13,839
creating the tool definition. So this tool definition includes a couple of

13
00:01:13,839 --> 00:01:19,160
things such as the name of every tool. It also includes a description for the tool.

14
00:01:19,160 --> 00:01:22,519
So this is where you can give additional information about how to use

15
00:01:22,519 --> 00:01:27,120
the tool or when to use it. And it also includes the input parameters needed to

16
00:01:27,120 --> 00:01:32,160
make a tool call. And the tools can be anything. So the tools could be APIs or

17
00:01:32,160 --> 00:01:41,900
databases but it could also be code that you interpret via Code Interpreter. So

18
00:01:41,900 --> 00:01:47,000
let's look at an example. Assume you want to find the weather in Miami. You might

19
00:01:47,000 --> 00:01:54,040
ask the LLM about the temperature in Miami. You also provide a list of tools

20
00:01:54,040 --> 00:02:01,160
and one of these tools is the weather API. The LLM will look at both your

21
00:02:01,160 --> 00:02:04,519
question which is what is the temperature in Miami. It would also look

22
00:02:04,519 --> 00:02:09,119
at the weather API and then based on the tool definition for the weather API it's

23
00:02:09,119 --> 00:02:13,000
going to tell you how to call the weather tool. So in here it's going to

24
00:02:13,000 --> 00:02:17,399
create a tool that you can use right here on this side where you call the API

25
00:02:17,399 --> 00:02:21,039
to collect the weather information. You would then supply the weather

26
00:02:21,039 --> 00:02:26,639
information back to the LLM. So let's say it would be 71 degrees. The LLM will look

27
00:02:26,639 --> 00:02:31,559
at the tool response and then give the final answer which might be something in

28
00:02:31,559 --> 00:02:37,360
the trend of the weather in Miami is pretty nice it's 71 degrees. This has

29
00:02:37,360 --> 00:02:40,600
some downsides. So when you do traditional tool calling where you have

30
00:02:40,600 --> 00:02:48,080
an LLM and a client application you could see the LLM hallucinate. Sometimes

31
00:02:48,080 --> 00:02:54,000
the LLM can also make up incorrect tool calls. That's why I also want to look at

32
00:02:54,000 --> 00:02:58,479
embedded tool calling. We just looked at traditional tool calling but traditional

33
00:02:58,479 --> 00:03:02,160
tool calling has its flaws. As I mentioned the LLM could hallucinate or

34
00:03:02,160 --> 00:03:06,479
create incorrect tool calls. That's why you also want to take embedded tool

35
00:03:06,600 --> 00:03:10,919
calling into account. With embedded tool calling you use a library or framework

36
00:03:10,919 --> 00:03:15,520
to interact with the LLM and your tool definitions. The library would be

37
00:03:15,520 --> 00:03:24,919
somewhere between your application and the large language model. In the library

38
00:03:24,919 --> 00:03:29,520
you would do the tool definition but you would also execute the tool calls. Let's

39
00:03:29,520 --> 00:03:34,479
draw a line between these sections here. So the library will contain your tool

40
00:03:34,479 --> 00:03:42,360
definition. It would also contain the tool execution. So when you send a

41
00:03:42,360 --> 00:03:45,839
message from your application to the large language model it will go through

42
00:03:45,839 --> 00:03:54,479
the library. So your message could still be what is the temperature in Miami. The

43
00:03:54,479 --> 00:03:57,960
library will then append the tool definition and send your message

44
00:03:57,960 --> 00:04:04,000
together with the tools to the LLM. So this will be your message plus your list

45
00:04:04,039 --> 00:04:10,039
of tools. Instead of sending the tool to call to the application or the user it

46
00:04:10,039 --> 00:04:14,759
will be sent to the library which will then do the tool execution. In this way

47
00:04:14,759 --> 00:04:19,920
the library will provide you with the final answer which could be it's 71

48
00:04:19,920 --> 00:04:24,000
degrees in Miami. When you use embedded tool calling the LLM will no longer

49
00:04:24,000 --> 00:04:28,559
hallucinate as the library to help you with the tool calling or the embedded

50
00:04:28,559 --> 00:04:32,399
tool calling is going to take care of the tool execution and will retry the

51
00:04:32,399 --> 00:04:36,600
tool calls in case it's needed. So in this video we looked at both traditional

52
00:04:36,600 --> 00:04:40,200
tool calling and also embedded tool calling where especially embedded tool

53
00:04:40,200 --> 00:04:43,959
calling will help you to prevent hallucination or help you with the

54
00:04:43,959 --> 00:04:47,119
execution of tools which could be APIs databases or code.