1
00:00:04,040 --> 00:00:04,340
Okay.

2
00:00:04,340 --> 00:00:10,070
Let's talk a little bit about memory in LLM applications.

3
00:00:10,250 --> 00:00:18,680
As we already know, the memory of foundation LLMs is limited by their context window.

4
00:00:18,710 --> 00:00:23,420
This is one of the main problems of foundation LLMs.

5
00:00:23,420 --> 00:00:30,050
And we know that the best way to solve this problem is using what we call the RAG technique.

6
00:00:30,050 --> 00:00:35,840
And LangChain is good at providing RAG solutions.

7
00:00:36,230 --> 00:00:43,220
But apart from that, in addition to the RAG technique, LangChain offers several features to

8
00:00:43,220 --> 00:00:45,770
manage the memory of LLM apps.

9
00:00:46,520 --> 00:00:57,920
These are interesting features, but in practice we will seldom use them.

10
00:00:58,010 --> 00:01:03,710
So we will not often use these alternatives to the RAG technique.

11
00:01:03,710 --> 00:01:06,290
We will mostly use the RAG technique.

12
00:01:06,290 --> 00:01:13,580
And in some very specific cases we may use these buffer memories that LangChain offers us.

13
00:01:13,580 --> 00:01:18,200
And you will see very quickly in the code on the right side of the screen.

14
00:01:18,200 --> 00:01:20,480
What are these buffer memories?

15
00:01:20,480 --> 00:01:25,310
We are going to present four kinds of buffer memories provided by LangChain.

16
00:01:26,360 --> 00:01:28,820
The first one is the buffer memory.

17
00:01:28,820 --> 00:01:37,610
So after, as usual, loading the .env file and getting the credentials, etc., we will import

18
00:01:37,610 --> 00:01:42,950
the necessary modules to

19
00:01:44,430 --> 00:01:49,770
design a very simple exercise to demonstrate the conversation buffer.

20
00:01:50,100 --> 00:01:56,160
So what we do here is we are going to create a variable called buffer memory.

21
00:01:56,160 --> 00:01:57,960
We can call it whatever we want.

22
00:01:57,960 --> 00:02:04,950
And we are going to use the ConversationBufferMemory class we have imported.

23
00:02:05,340 --> 00:02:12,630
So now we have an instance of the LLM and a buffer memory.

24
00:02:13,830 --> 00:02:22,470
We will now create a conversation chain using the ConversationChain class from LangChain.

25
00:02:22,950 --> 00:02:29,940
And in this conversation chain, which we call conversation, we are going to configure the LLM.

26
00:02:30,740 --> 00:02:37,340
We are going to set the LLM to the one we have created, and then the memory, and for the

27
00:02:37,340 --> 00:02:41,870
memory we are going to use the buffer memory we created.

28
00:02:41,870 --> 00:02:42,620
Okay.

29
00:02:42,620 --> 00:02:50,870
In this case, we are also going to set verbose to true because we want to see what is happening

30
00:02:50,870 --> 00:02:52,970
in the background.

31
00:02:54,140 --> 00:03:02,990
With this configured, we can start talking with the LLM, and we can say: hi, my name is Julio

32
00:03:02,990 --> 00:03:06,080
and I have moved 33 times.

33
00:03:06,350 --> 00:03:15,770
We send this message to the LLM and we see here how the LLM behaves.

34
00:03:15,770 --> 00:03:21,890
So since we have verbose as true, we can see what is happening in the background.

35
00:03:22,040 --> 00:03:29,420
So the chain is building this prompt.

36
00:03:30,380 --> 00:03:37,340
And you will see that in this prompt here, the definition of the role of the LLM is something

37
00:03:37,340 --> 00:03:41,330
that the conversation chain creates by default.

38
00:03:41,330 --> 00:03:41,720
Okay.

39
00:03:41,720 --> 00:03:46,880
So by default this chain tells the LLM the following.

40
00:03:46,880 --> 00:03:51,320
The following is a friendly conversation between a human and an AI.

41
00:03:51,320 --> 00:03:57,200
The AI is talkative and provides lots of specific details from its context.

42
00:03:57,200 --> 00:04:04,610
If the AI does not know the answer to a question, it truthfully says it does not know.

43
00:04:04,610 --> 00:04:07,790
Okay, and this is the interesting part.

44
00:04:08,450 --> 00:04:18,800
Apart from this role information, the prompt includes the current conversation that the LLM is having

45
00:04:18,800 --> 00:04:19,250
with me.

46
00:04:19,760 --> 00:04:26,420
So as you can see here in the prompt, the LLM has my first message.

47
00:04:27,150 --> 00:04:30,300
Now it is going to respond.

48
00:04:31,630 --> 00:04:40,990
Then I enter another message and you will see that inside the prompt it has the collection of messages.

49
00:04:42,100 --> 00:04:44,110
It responds to my message.

50
00:04:44,110 --> 00:04:50,440
And the prompt, as you see, is storing all the messages of the conversation.
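
The prompt growth described here can be sketched in plain Python. This is only an illustration of the idea, not LangChain's own code: the `FullBuffer` class, its method names, and the AI reply are hypothetical, while the role text is the one read out above.

```python
# Plain-Python sketch, not LangChain's code: every turn is kept and replayed.
PREAMBLE = (
    "The following is a friendly conversation between a human and an AI. "
    "The AI is talkative and provides lots of specific details from its context. "
    "If the AI does not know the answer to a question, it truthfully says it "
    "does not know."
)


class FullBuffer:
    """Hypothetical helper that stores the whole conversation."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) pairs, oldest first

    def add(self, speaker, text):
        self.turns.append((speaker, text))

    def build_prompt(self, new_input):
        # Role text + every stored turn + the new human message.
        history = "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)
        return (
            f"{PREAMBLE}\n\nCurrent conversation:\n{history}\n"
            f"Human: {new_input}\nAI:"
        )


buffer = FullBuffer()
first = buffer.build_prompt("Hi, my name is Julio and I have moved 33 times.")
buffer.add("Human", "Hi, my name is Julio and I have moved 33 times.")
buffer.add("AI", "Nice to meet you, Julio!")  # made-up reply for illustration
second = buffer.build_prompt("What is my name?")
print(len(first), len(second))  # the second prompt is longer than the first
```

Because each new prompt carries all earlier turns, its length keeps growing with the conversation, which is why the context window becomes the limit.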

51
00:04:50,440 --> 00:04:54,280
So you will immediately guess what the problem with this is.

52
00:04:54,310 --> 00:04:56,710
The problem with this is the context window.

53
00:04:56,710 --> 00:05:01,480
So this method is limited by the context window.

54
00:05:01,480 --> 00:05:03,430
So it's not very interesting for us.

55
00:05:03,430 --> 00:05:06,250
It's not solving the real problem here.

56
00:05:06,250 --> 00:05:10,360
So that's why we are going to use RAG instead of these

57
00:05:12,530 --> 00:05:16,160
temporary solutions that LangChain offers us, right?

58
00:05:16,160 --> 00:05:23,570
So if we dig a little bit deeper into the variable we have created, the buffer memory variable, we

59
00:05:23,570 --> 00:05:30,320
can see different things like the content of this buffer memory or the variables that are included,

60
00:05:30,320 --> 00:05:34,010
etc. All this functionality is described in the LangChain documentation.

61
00:05:34,550 --> 00:05:42,680
Apart from this conversation buffer memory, we have another couple of very similar buffer memories.

62
00:05:42,680 --> 00:05:50,330
The first one limits the number of messages that it can store, and the second one limits the

63
00:05:50,330 --> 00:05:52,760
tokens, the number of tokens it can store.

64
00:05:52,760 --> 00:06:01,580
So as you can see here in the second kind of buffer memory, we are saying that we only want three messages

65
00:06:01,580 --> 00:06:03,440
stored in the buffer memory.

66
00:06:03,440 --> 00:06:13,190
And in the third kind of memory we are setting the maximum number of tokens we want to store in the

67
00:06:13,190 --> 00:06:13,970
buffer memory.

68
00:06:13,970 --> 00:06:14,360
Okay.

69
00:06:14,360 --> 00:06:19,220
So anything beyond this number of tokens is not going to be stored.

70
00:06:19,220 --> 00:06:22,460
And anything beyond this number of messages is not going to be stored.

71
00:06:22,460 --> 00:06:31,160
So if we ask anything about an earlier message that was dropped, the LLM is not going to be able to respond.
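
The two limits just described can be sketched in plain Python. This is a minimal illustration under stated assumptions, not LangChain's implementation: `WindowBuffer` and `TokenBuffer` are hypothetical names, the "I like tea" message is made up, and tokens are approximated by counting whitespace-separated words instead of using a real tokenizer.

```python
from collections import deque


class WindowBuffer:
    """Hypothetical buffer that keeps only the last k messages."""

    def __init__(self, k):
        self.messages = deque(maxlen=k)  # older items fall off automatically

    def add(self, text):
        self.messages.append(text)


class TokenBuffer:
    """Hypothetical buffer that keeps recent messages within a token budget."""

    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.messages = []

    @staticmethod
    def count_tokens(text):
        # Assumption: whitespace-separated words stand in for real tokens.
        return len(text.split())

    def add(self, text):
        self.messages.append(text)
        # Drop the oldest messages until the total fits the budget.
        while sum(self.count_tokens(m) for m in self.messages) > self.max_tokens:
            self.messages.pop(0)


window = WindowBuffer(k=3)
for msg in ["My name is Julio", "I have moved 33 times", "I like tea", "What is my name?"]:
    window.add(msg)
print(list(window.messages))  # "My name is Julio" is no longer stored
```

With only three messages kept, the message that contained the name has already been dropped by the time the question arrives, so nothing in the stored history can answer it.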

72
00:06:33,060 --> 00:06:36,780
So, the last buffer memory.

73
00:06:36,780 --> 00:06:44,250
The conversation summary buffer memory is more interesting than the previous ones because instead of

74
00:06:44,250 --> 00:06:50,460
storing messages, it stores a summary of the current conversation.

75
00:06:50,460 --> 00:06:56,700
So this is, you know, shorter and it can be more interesting for us.

76
00:06:56,700 --> 00:07:02,520
The only problem with this approach is that, okay, we have a summary of the conversation,

77
00:07:02,520 --> 00:07:06,480
but we are going to lose the details of the conversation

78
00:07:06,870 --> 00:07:09,690
if it is a long conversation.

79
00:07:09,690 --> 00:07:10,050
Right.

80
00:07:10,050 --> 00:07:13,440
We have a summary, but we don't know the details.

81
00:07:13,440 --> 00:07:19,770
So as I was telling you, these buffer memories that LangChain provides can be interesting, you

82
00:07:19,770 --> 00:07:23,460
know, for a small exercise or for a very specific case.

83
00:07:23,460 --> 00:07:31,740
But in most cases, we are going to use the RAG technique in order to solve this memory problem

84
00:07:31,740 --> 00:07:33,510
that the foundation models have.

85
00:07:33,510 --> 00:07:34,080
Okay.

86
00:07:34,080 --> 00:07:36,840
So if you want to know more about the buffer memory,

87
00:07:36,840 --> 00:07:39,900
you can go to the LangChain documentation.