WEBVTT

00:00.400 --> 00:01.120
Hey there.

00:01.160 --> 00:01.920
Eden here.

00:01.920 --> 00:06.000
And in this video I want to talk about context engineering.

00:06.520 --> 00:13.640
Now, if you've been working with AI agents, probably coding agents like Cursor and Claude Code,

00:13.640 --> 00:19.240
or maybe you've even developed AI agents for your company or for your own use, you probably know

00:19.240 --> 00:26.840
that basically, it all boils down to a prompt being sent to an LLM and a lot of engineering around

00:26.840 --> 00:27.200
it.

00:27.440 --> 00:34.640
So there is some truth to calling applications like Cursor and Claude Code just wrappers around LLMs.

00:35.040 --> 00:41.880
However, building really good wrappers requires a lot of deep knowledge and a lot of engineering work,

00:42.200 --> 00:47.640
because we know that those calls to those LLMs come with context.

00:47.640 --> 00:50.440
And this context comes from various sources.

00:50.680 --> 00:53.800
Context can come from the developer of the application.

00:53.800 --> 00:55.560
It can come from the user.

00:55.560 --> 01:01.640
It can come from the user's previous interactions, from tool calls, and from other external data.

01:01.920 --> 01:10.210
The number of context sources is increasing every day, and sending the correct and relevant context

01:10.250 --> 01:11.210
to the LLM

01:11.690 --> 01:15.730
is not quite as simple as we thought it was in the early days.

01:15.730 --> 01:20.810
We thought, hey, we have prompt engineering and we'll write some fancy prompts and this will fix all

01:20.810 --> 01:21.690
of the problems.

01:22.210 --> 01:25.930
However, the problem is that prompts are static.

01:25.930 --> 01:33.050
Those pieces of context, on the other hand, are extremely dynamic, and if they're extremely dynamic, then in

01:33.050 --> 01:39.290
order to construct the correct context, we need to have a dynamic system as well.

01:39.290 --> 01:41.570
So it's not just a static prompt.

01:41.730 --> 01:45.930
And this is why we're entering the realm of context engineering.

01:45.930 --> 01:49.250
And it's the natural evolution of prompt engineering.

01:49.250 --> 01:52.610
But it's a much deeper concept.

01:53.090 --> 01:56.450
Now, we all know the saying "garbage in, garbage out."

01:56.450 --> 02:02.450
And this is a common reason why agentic systems don't perform the way they should, because they're

02:02.450 --> 02:04.770
simply not provided with the right context.

02:05.210 --> 02:07.370
LLMs cannot read our minds.

02:07.530 --> 02:10.410
We actually need to give them the right information.

02:10.610 --> 02:13.250
And by the way, it's not always information and data.

02:13.250 --> 02:18.650
Sometimes we need to give them the correct tools so they'll be able to fetch other information and take

02:18.650 --> 02:22.810
some actions and do some stuff for us, and then they'll achieve the task.

02:23.970 --> 02:26.370
So let's focus a bit on agents.

02:26.610 --> 02:29.610
And LLMs are becoming better and better.

02:29.610 --> 02:30.890
And this is not new.

02:31.130 --> 02:33.650
They can reason very, very well.

02:33.650 --> 02:36.730
And we have tool calling, and with it we can build

02:36.730 --> 02:43.650
very cool AI agent functionality: agents that invoke tools, get the output

02:43.650 --> 02:47.810
of those tools, and keep running in a loop until they finish a task.

02:48.290 --> 02:55.090
However, when it comes to long-running, complex tasks, we often accumulate the feedback

02:55.090 --> 03:00.970
from the tool calls, and this means that the context is going to keep growing and growing,

03:01.170 --> 03:05.130
filling up with lots of tokens from all the tool call results.

03:05.410 --> 03:11.820
So this can cause a lot of problems: it can, of course, exceed the size of the context window,

03:11.980 --> 03:15.420
and it can increase the cost and the latency.

03:15.660 --> 03:20.500
And eventually, if we don't do anything about it, it's going to

03:20.500 --> 03:24.380
degrade the agent's performance. This degradation

03:24.380 --> 03:25.580
has many causes.

03:25.580 --> 03:28.260
For one, we can have context poisoning.

03:28.260 --> 03:36.140
So this is when one tool call or one call introduces a hallucination that makes it into the context.

03:36.140 --> 03:38.300
And then it starts to degrade the system.

03:38.340 --> 03:40.420
Another thing that can happen is context confusion.

03:40.420 --> 03:47.020
This is when we introduce some unnecessary context that ends up influencing the response.

03:47.020 --> 03:50.100
So this is context which is not needed for the task.

03:50.580 --> 03:57.900
And of course, there is the possibility of a context clash, when parts of the context contradict each

03:57.940 --> 03:58.340
other.

03:59.300 --> 03:59.740
All right.

03:59.740 --> 04:02.340
So let me summarize this video in a couple of sentences.

04:02.340 --> 04:09.940
We discussed what context engineering is: in very simple terms, it is simply a way to give the LLM the

04:09.940 --> 04:11.060
correct context.

04:11.100 --> 04:11.780
We discussed

04:11.780 --> 04:17.700
how this concept evolved from prompt engineering and why it is especially important for agents.

04:17.700 --> 04:20.980
And in the next video we're going to discuss some techniques.

04:21.180 --> 04:25.620
And those techniques are eventually going to give the LLM better context.

04:25.860 --> 04:30.380
Now some of those techniques are going to be on the application developer side.

04:30.380 --> 04:36.380
So, for example, developers of applications like Claude Code will implement those kinds of solutions.

04:36.580 --> 04:40.500
However, some of those techniques are on the user side.

04:40.500 --> 04:47.100
So we, as users of Claude Code, have a lot of influence on the answers that we get and the context that we're

04:47.100 --> 04:49.100
going to eventually supply to the LLM.

04:49.700 --> 04:55.660
So this means that even non-developers need to know context engineering and need to understand this

04:55.660 --> 05:01.700
principle if they want to get better responses and better answers from their AI systems and AI agents.

05:02.020 --> 05:08.220
Just to finish, an excellent example of context engineering techniques and how to implement them from

05:08.220 --> 05:13.980
both the developer side and the user side is coding agents like Claude Code.

05:13.980 --> 05:15.940
And this is what we're going to be discussing

05:15.940 --> 05:20.420
in the next video: how to better engineer our context.
