WEBVTT

00:00.170 --> 00:06.140
All right, so memory in LangChain has gone through a lot of iterations, and right now I'm giving you

00:06.140 --> 00:12.290
an overview of the latest best practice for setting up memory in your chatbot.

00:12.620 --> 00:13.400
All right.

00:13.400 --> 00:19.910
And overall, you can think about memory as simply stuffing all the memory and all the messages into

00:19.910 --> 00:20.840
the LLM call.

00:21.380 --> 00:25.640
In some cases, that's simply not going to work because we'll exceed the token limit.

00:25.640 --> 00:31.370
It's going to cost us a lot of money, and we don't really need to send everything every time to the

00:31.370 --> 00:39.470
LLM, because even if we are using an LLM with a large context window, like Gemini 1.5 Pro with its 1 million

00:39.470 --> 00:46.310
tokens, it would cost us more money, it would be slower, and we may get worse results,

00:46.310 --> 00:51.170
because we're simply sending a lot of garbage that the LLM doesn't really need to handle.

00:51.170 --> 00:54.440
And as you remember the saying, garbage in, garbage out.

00:54.590 --> 00:58.250
Currently in LangChain, there are three main strategies to handle this.

00:58.250 --> 01:05.130
And the first strategy is to simply ignore this problem and not do anything and still stuff everything

01:05.130 --> 01:06.630
into your LLM call.

01:06.660 --> 01:12.090
This is useful when you have short chats between the user and your bot.

01:12.180 --> 01:14.310
It's the easiest way to start.

01:14.340 --> 01:22.080
The second way is to trim out old messages, so we'll simply get rid of those messages at the very beginning

01:22.080 --> 01:25.590
that probably are not going to be relevant to our chatbot.

01:25.920 --> 01:27.540
And this is a heuristic here.

01:27.570 --> 01:29.370
Of course, this is not always the case.

01:29.370 --> 01:34.050
And the third strategy is to do some processing over the messages.

01:34.050 --> 01:40.350
For example, we can summarize all of the messages and save only the summary plus the last couple

01:40.380 --> 01:41.340
of messages.

01:42.900 --> 01:49.860
Up until now we discussed which messages to save: whether to save them all, to filter out the

01:49.860 --> 01:52.230
old messages or to save a summary.

01:52.230 --> 01:57.750
But we didn't discuss where we're going to save those messages and how we're going to persist them.

01:57.750 --> 02:04.680
And the new way to do it in the LangChain ecosystem is to use LangGraph, which has

02:04.710 --> 02:08.100
checkpointers, which are going to help us persist those messages.

02:08.130 --> 02:09.550
And it's very easy to use.

02:09.550 --> 02:11.170
I'm going to show you the examples.

02:11.170 --> 02:15.130
We're not going to have live demos or dive deep into those examples.

02:15.130 --> 02:18.130
However, I do go over them in my LangGraph course.

02:18.130 --> 02:22.840
So if you want to get a coupon for those courses, feel free to ping me or post in the groups.

02:22.840 --> 02:28.270
I love sharing coupons with you, but this is out of this course's scope because it really introduces

02:28.300 --> 02:32.770
a lot of new topics, which I think are better covered in a separate course.

02:32.800 --> 02:39.910
All right, so let me show you an example of how we pass all the past messages into our LLM.

02:39.910 --> 02:47.110
So you can see right over here we have this ChatPromptTemplate, which we create with the from_messages function.

02:47.110 --> 02:52.150
So right over here you can see our system message with our system instructions.

02:52.150 --> 02:57.640
And here you can see a MessagesPlaceholder object with variable_name equal to "messages".

02:57.640 --> 03:03.970
And this is our way of telling LangChain that in place of this variable we're going to dynamically inject

03:03.970 --> 03:06.490
here all the past history of the user.

03:06.490 --> 03:10.240
And we're going to inject here a bunch of other messages.

03:10.540 --> 03:12.130
The format is going to be a dictionary,

03:12.140 --> 03:14.210
and you can see it right at the bottom here.

03:14.210 --> 03:16.700
Then we are invoking the chain.

03:16.700 --> 03:20.000
And we send it a dictionary with a "messages" key.

03:20.000 --> 03:24.440
And then we have here a list of all the past messages of the user.

03:24.440 --> 03:26.810
So we have here the human message.

03:26.840 --> 03:30.710
Then the AI response, and then we have a human message again.

03:30.710 --> 03:33.890
So this is the history that we are sending right now.

03:33.890 --> 03:36.830
So again, this is what we are sending to the LLM.

03:36.830 --> 03:38.030
We still haven't discussed

03:38.030 --> 03:39.500
how we are sending it.

03:39.500 --> 03:44.930
So in a real application, we'll use a persistent DB to save all of those messages, retrieve them,

03:44.930 --> 03:46.970
and then send them.

03:47.000 --> 03:47.240
Okay.

03:47.270 --> 03:49.610
So this is the sending part of the messages.

03:49.610 --> 03:52.160
And you can see it's very simple to use.
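
NOTE
Editor's sketch, not the code shown on screen. To make the placeholder idea above
concrete, here is a rough plain-Python illustration of what injecting the history
into a prompt amounts to. It does not use LangChain; build_prompt, SYSTEM_PROMPT and
the (role, text) tuples are hypothetical names chosen for illustration only.
```python
# Plain-Python sketch of what a messages placeholder does; not LangChain code.
# build_prompt and the (role, text) tuples are hypothetical illustrations.
SYSTEM_PROMPT = "You are a helpful assistant."
def build_prompt(inputs: dict) -> list:
    """Return the system message followed by the injected history."""
    # The "messages" key plays the role of the placeholder variable.
    return [("system", SYSTEM_PROMPT)] + list(inputs["messages"])
# The history: a human message, the AI response, and a human message again.
history = [
    ("human", "Hi, my name is Bob."),
    ("ai", "Hello Bob! How can I help?"),
    ("human", "What is my name?"),
]
prompt = build_prompt({"messages": history})
for role, text in prompt:
    print(f"{role}: {text}")
```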

03:54.620 --> 03:55.250
All right.

03:55.250 --> 04:02.420
So let's now discuss how LangChain, or specifically LangGraph, can help us persist those messages.

04:02.420 --> 04:08.570
And LangGraph introduces a new term, which is called a checkpointer, or checkpointing.

04:08.570 --> 04:15.470
And it basically means that on every iteration, every user message that we send, or every AI message

04:15.470 --> 04:18.460
that we send, we are simply going to... well, not we,

04:18.460 --> 04:23.620
but LangGraph is simply going to take this information and persist it in a DB.

04:23.710 --> 04:30.400
So here we have a MemorySaver checkpointer, which saves it in memory, so it doesn't actually persist it.

04:30.430 --> 04:37.840
We have other checkpointers, like the PostgreSQL, MySQL, Redis, and MongoDB savers.

04:37.840 --> 04:44.020
So there are a lot of integrations, with more to come, in LangGraph, which help us persist those messages in persistent

04:44.020 --> 04:45.040
databases.

04:45.070 --> 04:51.280
All we do is create this checkpointer object and pass it into our LangGraph graph.

04:51.280 --> 04:55.960
And again, I know that you are not very familiar with LangGraph graphs.

04:55.960 --> 04:58.990
I do cover it in my LangGraph course.

04:58.990 --> 05:04.840
So again, sorry I could not fit this into this course, but it's simply too much and I just want to

05:04.840 --> 05:05.770
show you the concept.

05:05.770 --> 05:12.010
But the most important thing that you need to understand is that this checkpointer is going to do all

05:12.010 --> 05:14.200
the persistence for us, and it's going to persist

05:14.200 --> 05:15.400
it in the DB.
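
NOTE
Editor's sketch, not the code shown on screen. The checkpointing idea described above
can be illustrated with a rough plain-Python example: save the messages of a
conversation thread after each turn and load them back later. This is not LangGraph's
API; SqliteCheckpointer, put, get and the thread id are hypothetical names, and sqlite3
is just a stand-in database. With ":memory:" nothing survives a restart, while a file
path would make the data persistent.
```python
# Plain-Python sketch of the checkpointing idea; not LangGraph's API.
# SqliteCheckpointer, put and get are hypothetical names.
import json
import sqlite3
class SqliteCheckpointer:
    """Stores the message list of each conversation thread in a database."""
    def __init__(self, path: str = ":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints "
            "(thread_id TEXT PRIMARY KEY, messages TEXT)"
        )
    def put(self, thread_id: str, messages: list) -> None:
        # Overwrite the saved state for this thread with the latest messages.
        self.conn.execute(
            "INSERT OR REPLACE INTO checkpoints VALUES (?, ?)",
            (thread_id, json.dumps(messages)),
        )
        self.conn.commit()
    def get(self, thread_id: str) -> list:
        # Return the saved messages, or an empty list for an unknown thread.
        row = self.conn.execute(
            "SELECT messages FROM checkpoints WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        return json.loads(row[0]) if row else []
saver = SqliteCheckpointer()
saver.put("thread-1", [["human", "Hi"], ["ai", "Hello!"]])
saver.put("thread-1", saver.get("thread-1") + [["human", "What did I say?"]])
restored = saver.get("thread-1")
```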

05:15.760 --> 05:16.390
All right.

05:16.390 --> 05:18.880
So let's discuss how to trim messages.

05:18.880 --> 05:21.530
So, how to ignore messages that are old,

05:21.530 --> 05:28.340
or that we don't want, in order to save on tokens, latency, and cost. LangChain introduces

05:28.370 --> 05:30.140
the concept of a trimmer.

05:30.140 --> 05:33.290
So this is an object that is going to trim those messages.

05:33.290 --> 05:36.620
And we can create a trimmer with the trim_messages function.

05:36.620 --> 05:42.440
We can give it a strategy; LangChain offers a bunch of ways to handle the trimming and

05:42.440 --> 05:43.370
what to remove.

05:43.400 --> 05:47.450
You can see we set max_tokens here, and the token counter to be len.

05:47.450 --> 05:50.570
So you can trim by number of tokens.

05:50.570 --> 05:53.180
You can trim by number of messages.

05:53.180 --> 05:55.010
And you have a lot of options here.

05:55.010 --> 05:58.880
And the terminology is very reminiscent of the text splitter.

05:59.000 --> 06:03.140
And in order to invoke the trimmer we use the invoke method.

06:03.140 --> 06:07.640
And we plug in all the messages that we want to process.

06:07.640 --> 06:14.000
And then we're left with the trimmed messages that we can simply send to the LLM, like we're seeing

06:14.000 --> 06:14.930
right over here.
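
NOTE
Editor's sketch, not the code shown on screen. To illustrate the trimming idea above,
here is a rough plain-Python version of a "keep the last messages that fit" strategy.
It is not LangChain's trim_messages; trim_last is a hypothetical helper, and the
max_tokens and token_counter parameters only mirror the names mentioned in the video.
Passing len as the counter makes every message cost 1, so the budget becomes a
message count rather than a true token count.
```python
# Plain-Python sketch of "keep the last messages that fit"; not trim_messages itself.
# trim_last is a hypothetical helper operating on (role, text) tuples.
def trim_last(messages: list, max_tokens: int, token_counter) -> list:
    """Keep the system message plus the newest messages within the budget."""
    system = [m for m in messages if m[0] == "system"]
    rest = [m for m in messages if m[0] != "system"]
    budget = max_tokens - token_counter(system)
    kept = []
    # Walk from newest to oldest and stop at the first message that does not fit.
    for message in reversed(rest):
        cost = token_counter([message])
        if cost > budget:
            break
        kept.insert(0, message)
        budget -= cost
    return system + kept
messages = [
    ("system", "You are a helpful assistant."),
    ("human", "Hi, I'm Bob."),
    ("ai", "Hello Bob!"),
    ("human", "I like vanilla ice cream."),
    ("ai", "Nice choice."),
    ("human", "What is 2 + 2?"),
]
# With len as the counter, every message costs 1, so max_tokens is a message count.
trimmed = trim_last(messages, max_tokens=4, token_counter=len)
```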

06:14.960 --> 06:20.240
Now, you might be wondering, why do we need LangChain to do it for us when we can do it ourselves?

06:20.240 --> 06:24.360
But this is a very convenient function which is already implemented for us.

06:24.360 --> 06:30.240
So why reinvent the wheel when LangChain knows about all the use cases and gives us a solution for

06:30.240 --> 06:30.450
them?

06:30.480 --> 06:32.550
I mean, that's at least my opinion.

06:33.660 --> 06:34.560
All right.

06:34.560 --> 06:36.900
So we discussed trimming.

06:36.900 --> 06:38.940
Let's discuss summarization.

06:38.940 --> 06:45.990
So this is another technique for saving tokens and for sending a very precise context to the LLM.

06:45.990 --> 06:48.270
And this is basically done with the prompt here.

06:48.270 --> 06:50.460
And you can see here this summary prompt.

06:50.460 --> 06:55.650
And this prompt is simply going to receive all the history that we have.

06:55.650 --> 06:59.430
And it's going to summarize it into one summary.

06:59.430 --> 07:04.050
And this is what we are going to save in our persistent storage.

07:04.050 --> 07:07.470
Once we summarize the messages, we no longer need the raw messages.

07:07.470 --> 07:09.900
So we simply go and delete them later.

07:09.900 --> 07:12.630
And this is how we save tokens.

07:12.630 --> 07:15.510
So the checkpointer stayed the same.

07:15.510 --> 07:19.830
But we did do some manipulation to the data before checkpointing it.
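
NOTE
Editor's sketch, not the code shown on screen. The summarize-then-delete flow described
above can be sketched in plain Python as follows. All names are hypothetical:
summarize_history is an illustrative helper, and fake_summarizer is a stand-in for a
real LLM call driven by a summary prompt. The result, one summary plus the last few
messages, is what would then be handed to the checkpointer.
```python
# Plain-Python sketch of summarize-then-delete; not code from the video.
# summarize_history and fake_summarizer are hypothetical names.
def fake_summarizer(messages: list) -> str:
    """Stand-in for an LLM call that uses a summary prompt."""
    return "Summary of %d earlier messages." % len(messages)
def summarize_history(messages: list, keep_last: int, summarizer) -> list:
    """Replace older messages with one summary and keep the newest ones."""
    # keep_last is assumed to be at least 1.
    if len(messages) <= keep_last:
        return list(messages)
    old, recent = messages[:-keep_last], messages[-keep_last:]
    # The raw old messages are dropped; only their summary is kept.
    return [("system", summarizer(old))] + recent
history = [
    ("human", "Hi, I'm Bob."),
    ("ai", "Hello Bob!"),
    ("human", "I like vanilla ice cream."),
    ("ai", "Nice choice."),
    ("human", "What is my name?"),
]
compact = summarize_history(history, keep_last=2, summarizer=fake_summarizer)
```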

07:19.860 --> 07:20.490
All right.

07:20.490 --> 07:25.800
So I just want to reiterate what we saw in the video and what I wanted to show.

07:25.800 --> 07:30.310
So first of all, you don't need to understand LangGraph right now.

07:30.310 --> 07:37.150
I just want you to understand the concept of what we are saving to memory: we can save either

07:37.150 --> 07:42.220
all the raw messages, we can save some trimmed messages, or we can save some summarization of the

07:42.220 --> 07:42.790
messages.

07:42.790 --> 07:45.820
And this is what we saw here in this LangChain documentation.

07:45.820 --> 07:52.060
Now of course, you can add your own processing of the memory of what you want to save, and you can

07:52.060 --> 07:56.770
extend this functionality with whatever logic is best for your application.

07:56.860 --> 08:00.730
So LangChain does give you that freedom, and it's very easy to do so.

08:00.760 --> 08:03.700
The actual saving of the history and messages,

08:03.700 --> 08:10.300
so persisting it, is actually done by the LangGraph checkpointer, which is the new preferred way of doing

08:10.300 --> 08:10.840
things.

08:10.840 --> 08:16.690
And the checkpointer is an object which is simply going to take this data and make DB

08:16.690 --> 08:20.770
queries to store it in the target DB, nothing more.

08:20.770 --> 08:24.640
And I do elaborate on this in the LangGraph course.

08:24.640 --> 08:29.830
So that's pretty much it for this video and see you in the next one.
