WEBVTT

1
00:00:00.780 --> 00:00:05.040
<v Maximilian>Now, besides uploading and handling images,</v>

2
00:00:05.040 --> 00:00:09.090
which, again, must be supported by the model you're using

3
00:00:09.090 --> 00:00:09.990
to be an option,

4
00:00:09.990 --> 00:00:13.530
besides doing that, you can also attach other files,

5
00:00:13.530 --> 00:00:18.030
like, for example, PDF, plain-text, or Word documents.

6
00:00:18.030 --> 00:00:19.590
Now, at least at the point in time

7
00:00:19.590 --> 00:00:20.850
where I'm recording this,

8
00:00:20.850 --> 00:00:22.410
an info pop-up opens

9
00:00:22.410 --> 00:00:25.380
if I choose the option to attach a file.

10
00:00:25.380 --> 00:00:26.820
And in this info pop-up,

11
00:00:26.820 --> 00:00:30.210
I get informed about the file types I can add,

12
00:00:30.210 --> 00:00:32.940
and that, at least right now when I'm recording this,

13
00:00:32.940 --> 00:00:36.090
I can upload up to five files

14
00:00:36.090 --> 00:00:39.900
with a maximum combined size of 30 megabytes.

15
00:00:39.900 --> 00:00:42.540
Now, of course, when you are viewing this video,

16
00:00:42.540 --> 00:00:44.490
you might no longer get this message.

17
00:00:44.490 --> 00:00:46.050
The limits also might have changed.

18
00:00:46.050 --> 00:00:49.620
But in general, when working with those local models,

19
00:00:49.620 --> 00:00:52.680
just slamming dozens of large files

20
00:00:52.680 --> 00:00:55.050
all into this one chat session

21
00:00:55.050 --> 00:00:57.120
typically won't be a good idea.

22
00:00:57.120 --> 00:00:59.640
If you wanna process multiple files,

23
00:00:59.640 --> 00:01:01.410
you should do it in chunks.

24
00:01:01.410 --> 00:01:03.960
The context window size will also matter,

25
00:01:03.960 --> 00:01:06.660
but that's something I'll get back to later.

26
00:01:06.660 --> 00:01:09.810
In general, what LM Studio will try to do

27
00:01:09.810 --> 00:01:11.550
with any files you give it

28
00:01:11.550 --> 00:01:16.110
is it will try to load the content of this file

29
00:01:16.110 --> 00:01:19.890
into the chat history invisibly behind the scenes,

30
00:01:19.890 --> 00:01:22.350
as if you had copied and pasted

31
00:01:22.350 --> 00:01:24.810
that content into the history.

32
00:01:24.810 --> 00:01:28.560
If that's not possible because the context window,

33
00:01:28.560 --> 00:01:31.140
and again, I'll get back to this, is not sufficient,

34
00:01:31.140 --> 00:01:33.510
so if the model as it's currently configured

35
00:01:33.510 --> 00:01:36.570
would not be able to handle that amount of input,

36
00:01:36.570 --> 00:01:38.370
if that's not possible,

37
00:01:38.370 --> 00:01:41.580
LM Studio will actually try to split the content

38
00:01:41.580 --> 00:01:44.070
into smaller chunks for you

39
00:01:44.070 --> 00:01:46.890
and try to retrieve the relevant chunks

40
00:01:46.890 --> 00:01:50.430
based on the prompt you sent to the model.

41
00:01:50.430 --> 00:01:53.790
That's a technique called Retrieval-Augmented Generation,

42
00:01:53.790 --> 00:01:57.870
and LM Studio will do it for you when uploading files.

43
00:01:57.870 --> 00:01:59.640
And again, the context window part

44
00:01:59.640 --> 00:02:01.560
is something we'll explore later.

45
00:02:01.560 --> 00:02:03.090
Now here for this demo,

46
00:02:03.090 --> 00:02:07.053
I will upload a pretty small PDF document.

47
00:02:07.890 --> 00:02:12.360
It's just a very simple made-up financial report PDF

48
00:02:12.360 --> 00:02:14.700
that includes a bunch of information

49
00:02:14.700 --> 00:02:18.270
about made-up financials and numbers.

50
00:02:18.270 --> 00:02:19.103
But, of course,

51
00:02:19.103 --> 00:02:21.960
we can also use locally running large language models,

52
00:02:21.960 --> 00:02:25.320
like this one here, for summarizing documents like this.

53
00:02:25.320 --> 00:02:29.040
And indeed, that is a task they're really, really good at.

54
00:02:29.040 --> 00:02:31.560
I will also say that the Gemma models, for example,

55
00:02:31.560 --> 00:02:34.200
do support quite large context windows

56
00:02:34.200 --> 00:02:35.970
if you configure them appropriately,

57
00:02:35.970 --> 00:02:39.210
so they can handle quite a lot of information,

58
00:02:39.210 --> 00:02:41.790
and, for example, summarize it for you.

59
00:02:41.790 --> 00:02:44.400
So for example, here I'll ask Gemma 3,

60
00:02:44.400 --> 00:02:46.620
the 12-billion-parameter model again,

61
00:02:46.620 --> 00:02:47.880
to summarize this document

62
00:02:47.880 --> 00:02:51.210
and extract the key insights and financials for me,

63
00:02:51.210 --> 00:02:52.860
and that, again, as I just mentioned,

64
00:02:52.860 --> 00:02:57.000
is one core use case for using large language models

65
00:02:57.000 --> 00:02:58.830
that are running locally on your system.

66
00:02:58.830 --> 00:03:00.360
They're really good at that,

67
00:03:00.360 --> 00:03:01.920
and of course, the huge advantage

68
00:03:01.920 --> 00:03:04.650
is that you don't need to share information like this,

69
00:03:04.650 --> 00:03:07.650
which potentially might, of course, be confidential

70
00:03:07.650 --> 00:03:10.020
or which you simply don't wanna share with everyone,

71
00:03:10.020 --> 00:03:12.270
with OpenAI or any other provider.

72
00:03:12.270 --> 00:03:15.870
Instead, you can perform tasks like this

73
00:03:15.870 --> 00:03:17.490
locally on your system

74
00:03:17.490 --> 00:03:19.890
by simply uploading documents to LM Studio

75
00:03:19.890 --> 00:03:21.330
and then working on them

76
00:03:21.330 --> 00:03:23.700
with your favorite large language model,

77
00:03:23.700 --> 00:03:25.593
in this case, the Gemma 3 model.