1
00:00:04,960 --> 00:00:09,730
In this lesson we are going to talk about RAG optimization.

2
00:00:16,309 --> 00:00:24,290
If you remember, in previous lessons we have been talking about RAG a lot, and we said

3
00:00:24,290 --> 00:00:28,010
that there is a basic

4
00:00:28,910 --> 00:00:37,340
range of techniques to apply the RAG technique in your prototype or initial demo, and then

5
00:00:37,340 --> 00:00:49,100
you will have many other advanced techniques to optimize your RAG application for a professional

6
00:00:49,100 --> 00:00:52,280
stage.

7
00:00:53,000 --> 00:01:00,380
So we have already been talking about some of the optimization techniques you need to consider.

8
00:01:00,380 --> 00:01:01,370
In this lesson,

9
00:01:01,370 --> 00:01:11,570
we are going to focus our attention on a couple of areas. Probably, when

10
00:01:11,570 --> 00:01:21,590
you study this topic further, you are going to find different terminology,

11
00:01:21,590 --> 00:01:25,220
new names, new ways of doing things.

12
00:01:25,220 --> 00:01:30,680
And we are not going to explain all of them here in detail.

13
00:01:30,680 --> 00:01:37,280
We just want to give you a quick idea of the things you are going to find and the things that can be

14
00:01:37,280 --> 00:01:47,150
interesting for you to study at a more advanced stage, when you are optimizing your application

15
00:01:47,150 --> 00:01:53,390
or when you are preparing the launch of your professional LLM application.

16
00:01:53,390 --> 00:02:02,780
So one of the areas that we wanted to explore a little bit more with you is the area of alternative

17
00:02:02,780 --> 00:02:06,500
techniques to index and store vectors.

18
00:02:06,500 --> 00:02:19,820
So remember that this refers to how we save and how we search data in vector databases,

19
00:02:19,820 --> 00:02:23,270
which, as you know, is an important part of the RAG technique.

20
00:02:23,270 --> 00:02:32,660
So in this particular step, storing and indexing vectors, you may see different terminology, like

21
00:02:32,690 --> 00:02:43,730
Hierarchical Navigable Small World. A difficult term, so I repeat: Hierarchical Navigable

22
00:02:43,730 --> 00:02:45,440
Small World.

23
00:02:46,600 --> 00:02:47,620
Sometimes

24
00:02:47,620 --> 00:02:53,650
you will see just the letters HNSW.

25
00:02:54,650 --> 00:02:58,250
This is one of the techniques for storing and indexing vectors.

26
00:02:58,250 --> 00:03:03,650
You may find it among the advanced techniques you would like to explore.

27
00:03:03,740 --> 00:03:09,620
You have another one, locality-sensitive hashing, or LSH.

28
00:03:10,870 --> 00:03:14,740
You have product quantization or PQ.

29
00:03:15,370 --> 00:03:21,640
And finally, you have filtering (pre- and post-filtering) and random projections.

30
00:03:21,820 --> 00:03:28,270
These are all different advanced techniques for storing and indexing vectors.

31
00:03:28,270 --> 00:03:34,000
And they are not all the ones you have available.

32
00:03:34,000 --> 00:03:35,440
These are the most popular.

33
00:03:35,440 --> 00:03:41,200
So these are the kinds of techniques you will have to explore, study, and practice.

34
00:03:41,200 --> 00:03:47,260
And in many cases, just try and see the results of each of these techniques in your

35
00:03:47,470 --> 00:03:54,610
professional application, in order to see how each of them impacts the performance of the application.

36
00:03:54,610 --> 00:03:58,660
In many cases, it's just a case of trial and error.

37
00:03:58,660 --> 00:04:05,440
Try and compare the different techniques and see the impact on your application.

38
00:04:05,440 --> 00:04:14,260
The second area where you will probably explore different alternatives is the area where you are

39
00:04:14,260 --> 00:04:20,560
measuring vector similarity, or search similarity, or semantic similarity.

40
00:04:21,130 --> 00:04:31,090
So remember that semantic similarity is the main technique to search data in vector databases.

41
00:04:31,090 --> 00:04:37,120
So for this purpose, we have different ways of doing things.

42
00:04:37,120 --> 00:04:41,560
So we have different ways of measuring semantic similarity.

43
00:04:41,560 --> 00:04:44,020
The most common, the most popular,

44
00:04:44,020 --> 00:04:53,260
the ones that you will probably try first, are cosine similarity and, second, Euclidean distance.

45
00:04:54,290 --> 00:04:59,540
So they are both popular in the math world.

46
00:04:59,570 --> 00:05:06,890
Cosine similarity measures the angle between vectors, and Euclidean distance measures

47
00:05:06,890 --> 00:05:11,180
the straight-line distance between items, between vectors as well.

48
00:05:11,180 --> 00:05:15,350
But don't worry about the math right now.

49
00:05:15,350 --> 00:05:21,830
This is something that you may find in the literature, in the articles or the courses

50
00:05:21,830 --> 00:05:24,050
or advanced videos you may find about that.

51
00:05:24,050 --> 00:05:31,250
The important thing right now is for you to understand that when you are working with a RAG application,

52
00:05:31,250 --> 00:05:36,920
which is going to be very frequent when you are working in the professional world, you have

53
00:05:36,920 --> 00:05:44,270
two main stages: the basic development stage of the RAG technique, and then the advanced one, where you

54
00:05:44,270 --> 00:05:52,580
are going to try different approaches in order to optimize the results that the basic approach

55
00:05:52,580 --> 00:05:55,250
initially gave you.

56
00:05:55,250 --> 00:05:55,820
Okay.

57
00:05:55,820 --> 00:06:05,030
So this is just a quick review of two areas where you would probably like to explore different

58
00:06:05,030 --> 00:06:06,110
alternatives.

59
00:06:08,540 --> 00:06:15,950
In the next lesson, we are going to talk about a very important question regarding LLM applications,

60
00:06:15,950 --> 00:06:18,080
which is the speed.

61
00:06:18,410 --> 00:06:22,340
And in some cases this is known as latency.

62
00:06:22,460 --> 00:06:27,170
We will see the relationship between speed and latency in the next lesson.
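
A note on the two similarity measures mentioned in this lesson: the sketch below is not from the lesson itself. It is a minimal, plain-Python illustration of how cosine similarity (angle between vectors) and Euclidean distance (straight-line distance) can rank the same candidates differently. The vectors and the names `query`, `same_direction`, and `nearby` are made up for the example; a real vector database would compute these measures for you.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between a and b (0.0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy vectors, chosen only to make the contrast visible.
query = [1.0, 0.0]
same_direction = [3.0, 0.0]  # same angle as the query, but farther away
nearby = [1.0, 1.0]          # closer in space, but at a 45 degree angle

for name, vec in [("same_direction", same_direction), ("nearby", nearby)]:
    print(
        name,
        "cosine:", round(cosine_similarity(query, vec), 4),
        "euclidean:", round(euclidean_distance(query, vec), 4),
    )
```

With these toy vectors, cosine similarity prefers `same_direction` (higher is better), while Euclidean distance prefers `nearby` (lower is better). This is one reason it can be worth trying both measures on your own data and comparing the results.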

