WEBVTT

00:00.200 --> 00:01.800
Hey there Eden here.

00:01.800 --> 00:07.280
And in this video I want to show you the gist of the MCP protocol I want to cover.

00:07.440 --> 00:11.640
What's the interaction between every component in the MCP protocol?

00:11.640 --> 00:14.760
So how does the client interact with the server?

00:14.800 --> 00:20.600
What's happening with the host and with the user and with the LM and how everything is playing together.

00:21.000 --> 00:26.160
So let's start with the diagram that we're seeing right over here on the left side we have the user,

00:26.160 --> 00:30.200
and the user is the one who's going to make queries into our application.

00:30.400 --> 00:36.560
Our application can be cursor windsurf can be cloud desktop and can be our agent that we wrote or any

00:36.560 --> 00:38.200
other agent which is deployed.

00:38.560 --> 00:40.040
Then we have the LM.

00:40.280 --> 00:43.120
So the application eventually is going to use an LM.

00:43.120 --> 00:47.080
So it's going to make requests to and we have the MCP server here.

00:47.080 --> 00:52.360
So the MCP server that we're going to be used that we're going to integrate into our application which

00:52.360 --> 00:53.440
supports MCP.

00:54.040 --> 00:56.240
Now you're wondering where is the client.

00:56.240 --> 00:59.760
So the client is actually residing within the application itself.

01:00.000 --> 01:04.220
And you can think about the application as the host of the MCP as well.

01:04.500 --> 01:08.940
So we have here the app which is going to perform the role of also the host.

01:08.940 --> 01:10.900
And it's going to also have the client.

01:11.340 --> 01:14.300
And the client is going to be connected to an MCP server.

01:14.500 --> 01:19.660
And in an app we can have multiple clients, and each client is going to be connected into a different

01:19.660 --> 01:20.700
MCP server.

01:20.740 --> 01:21.060
All right.

01:21.060 --> 01:22.460
So let's start at the very beginning.

01:22.460 --> 01:24.940
And this is when our application is loaded.

01:25.140 --> 01:30.860
So this is when we fire up cursor or when we fire up cloud desktop or our own agent.

01:31.140 --> 01:37.340
So the first thing that is going to happen we're going to make a connection to an MCP server or servers

01:37.340 --> 01:39.860
which are supported and integrated into the app.

01:40.260 --> 01:43.940
Now who is going to make those connections to the MCP servers?

01:43.940 --> 01:48.980
It's going to be the client inside the host, which lives in our application.

01:48.980 --> 01:54.420
So it's going to use the MCP protocol to initialize a connection, sending the messages back and forth.

01:54.620 --> 01:59.920
The MCP server is going to say that it acknowledges the client, and we're going to set the connection

01:59.920 --> 02:02.600
between the client and the MCP server.

02:02.600 --> 02:07.240
And if we have multiple MCP servers, then we're going to have a bunch of clients making connections

02:07.240 --> 02:08.800
into MCP servers here.

02:09.120 --> 02:15.800
Now when we initialize a connection, the server is going to let the client know which available tool

02:15.800 --> 02:17.040
does the server have.

02:17.080 --> 02:18.520
And let me reiterate on this.

02:18.520 --> 02:23.840
This is not only the tools, this is also including everything that the server exposes.

02:23.840 --> 02:29.160
So it can be all the resources that the server exposes and all the prompts and all the tools that the

02:29.160 --> 02:30.240
server exposes.

02:30.320 --> 02:32.560
I'm using tools here just for the example.

02:33.600 --> 02:37.840
If we're talking about the weather MCP server that we talked in the beginning of the course.

02:37.840 --> 02:41.280
So here's going to be the alert tool and the forecast tool.

02:41.440 --> 02:46.920
And the MCP server is letting to know the client which lives inside our application.

02:46.920 --> 02:50.800
And our host is going to let them know which available tools.

02:50.920 --> 02:57.800
So this is the interaction between the application and inside it, the clients and the MCP servers.

02:58.080 --> 03:02.060
And we see this is happening even before we have a user interaction.

03:02.060 --> 03:04.300
So this is when we fire up the application.

03:04.300 --> 03:09.860
Once we do that, we finish the MCP initialization and we finish setting up our application.

03:10.980 --> 03:18.460
So when a user is going to send a query to our application let's say cursor, then we are going to then

03:18.740 --> 03:23.700
send that message with the tools that we have to them.

03:23.980 --> 03:31.220
So because the application clients here, they know which tools the MCP servers expose, they're taking

03:31.220 --> 03:34.380
the tools that the MCP server returned that are available.

03:34.660 --> 03:39.860
And with the user query they are augmenting the user query with those tools.

03:39.900 --> 03:43.500
Now remember in the previous video I told you about the special prompt.

03:43.500 --> 03:45.580
So this is pretty much what's happening there.

03:45.620 --> 03:50.420
They're taking the original user query and they're listing the bunch of tools that are available.

03:50.700 --> 03:54.420
So now the Lem is not going to receive only the user's query.

03:54.540 --> 03:58.880
It's going to receive the user's query alongside with the available tools.

03:59.400 --> 04:01.640
So the LM now is going to respond.

04:01.680 --> 04:07.960
It's going to respond with an answer, or it's going to respond with a tool call that needs to be invoked.

04:08.000 --> 04:12.400
And remind you the MCP protocol is only working for tool calling LMS.

04:12.960 --> 04:18.320
So the tool call is going to have which tool needs to be called and what are the arguments that we need

04:18.320 --> 04:19.080
to call it with.

04:19.080 --> 04:22.240
So we have all the information about what needs to be executed.

04:22.680 --> 04:24.480
Now how do we execute that.

04:24.840 --> 04:30.680
So and this is the key difference by the way between the MCP and between frameworks like long chain.

04:31.000 --> 04:34.840
In long chain we execute everything in our application layer.

04:35.000 --> 04:36.160
And we will take that.

04:36.160 --> 04:39.000
And we're going to run everything inside the application.

04:39.000 --> 04:45.360
Usually what's happening with MCP, we are simply sending the tool call to the MCP server.

04:45.360 --> 04:52.040
So whether it's via Stdio or a server send event, we are sending to the MCP server which tool we need

04:52.040 --> 04:54.960
to invoke, which arguments do we need to send it.

04:54.960 --> 04:57.520
And the MCP server is going to run it.

04:57.520 --> 04:59.570
So everything is going to run now.

04:59.570 --> 05:02.930
The tool execution is going to happen in the MCP server.

05:02.930 --> 05:07.130
So it's not going to happen in the application in the graph or cursor application.

05:07.130 --> 05:09.090
It's going to happen in the MCP server.

05:09.130 --> 05:13.770
And this is a big advantage because once we do this we actually decouple everything.

05:13.930 --> 05:18.290
We decouple the MCP server and the tool execution from the agent itself.

05:18.290 --> 05:21.810
So the runtime of the server is what's going to run the tool.

05:22.250 --> 05:27.250
And this is going to help us if we want in the future to scale this out, maybe to deploy it on Kubernetes

05:27.250 --> 05:30.890
or in serverless and to monitor it in a different system.

05:30.890 --> 05:32.850
So this has a lot of advantages.

05:32.850 --> 05:36.250
And we'll talk about this when we talk about system design later in this course.

05:37.330 --> 05:37.730
All right.

05:37.730 --> 05:44.490
So the MCP server is now executing the tool it used for example the forecast tool and got us the forecast

05:44.490 --> 05:45.730
for California.

05:46.050 --> 05:50.250
Then it's going to send back the response to the application here.

05:50.250 --> 05:55.690
And and it's not really sending it to the application because it has this proxy of the MCP client.

05:55.690 --> 06:00.510
So the MCP client is going to handle the sending of the request and the receiving of the request, and

06:00.550 --> 06:06.830
the MCP client is then going to be integrated into our application, into our graph agent or cursor

06:06.870 --> 06:07.510
or whatever.

06:07.830 --> 06:10.470
So we got the answer from the MCP server.

06:11.150 --> 06:15.790
And now in the application layer we're going to make another request to the Lem.

06:16.510 --> 06:22.910
But it's going to be the user query with the response of the tool that was executed in the server.

06:23.270 --> 06:29.190
And now the Lem is going to decide whether we want to finish or whether we want to make another call.

06:29.190 --> 06:30.790
But let's say we want to finish.

06:30.790 --> 06:31.630
So what happens?

06:31.670 --> 06:36.150
The Lem sends its final answer and we receive it in the application layer.

06:36.310 --> 06:38.990
And then we return it to the user.

06:39.510 --> 06:46.750
And I want to note a very key difference between the Linkchain graph react agent and here the MCP flow.

06:46.950 --> 06:48.870
And we have a bunch of things that are together.

06:48.870 --> 06:54.150
And I want to note a very important difference in the Linkchain react agent.

06:54.150 --> 07:00.050
If we take it vanilla then the tools which are executing are going to execute within our app, within

07:00.050 --> 07:00.770
our agent.

07:01.210 --> 07:08.410
And if we'll take that and we'll integrate MCP into our graph agent, then what's going to happen is

07:08.410 --> 07:11.610
that we're going to run the tools in the MCP server.

07:11.770 --> 07:17.850
So we have here a decoupling of the tools component into a different service.

07:17.850 --> 07:19.090
And this is very useful.

07:19.090 --> 07:22.730
It's very helpful when debugging when logging helper for cost.

07:22.770 --> 07:23.970
It's helper for scaling.

07:24.130 --> 07:29.170
And I believe it's a better architectural decision to run everything in the MCP servers.

07:29.170 --> 07:35.650
And technically we can actually inside the graph tools, we can simply make dummy tools that will simply

07:35.650 --> 07:40.690
make requests into a different service, and we'll get a very similar behavior.

07:40.890 --> 07:46.930
However, the key difference here is that the MCP protocol is standard authorizing this, and it's going

07:46.930 --> 07:49.770
to give us one interface in order to do things.

07:49.930 --> 07:52.330
And this is something which is very, very cool.

07:52.330 --> 07:57.670
And I want to list another advantage of decoupling the tools from the agents itself.

07:57.670 --> 08:01.070
So the agent is going to be responsible for the orchestration.

08:01.230 --> 08:07.750
When to call the tool, and maybe to make another tool call, or to return a prompt to the user asking

08:07.750 --> 08:09.230
for feedback, whatever.

08:09.230 --> 08:14.030
So we decouple the logic of the orchestration from the tool execution.

08:14.030 --> 08:19.750
And by doing this decoupling, it actually gives us the point where we can actually update the server

08:19.750 --> 08:24.870
dynamically and maybe deploy it a new version of it, and we can set up that.

08:24.870 --> 08:30.430
The client is going to do this initialization not only once, but every once in a while, so that our

08:30.430 --> 08:33.230
agent is going to receive tools dynamically.

08:33.230 --> 08:36.910
And I think it's very cool and gives us the behavior of dynamic tool calling.

08:37.030 --> 08:42.430
So we don't need to redeploy our agent because our agent is going to have multiple initializations every

08:42.430 --> 08:45.270
once in a while, and it's going to receive the tools that it needs.

08:45.270 --> 08:46.710
This is another advantage.

08:47.230 --> 08:48.670
So I hope you enjoyed this video.

08:48.670 --> 08:53.910
And in the next video we're going to implement an MCP client inside our agent.

08:53.910 --> 08:56.270
And this is going to help us better understand these.
