WEBVTT

1
00:00:00.600 --> 00:00:02.550
<v Maximilian>So let's explore some code examples.</v>

2
00:00:02.550 --> 00:00:04.620
You could, for example, write code like this

3
00:00:04.620 --> 00:00:06.030
and you find this and a couple

4
00:00:06.030 --> 00:00:08.190
of other examples attached to this lecture,

5
00:00:08.190 --> 00:00:10.290
by the way, you can write code like this,

6
00:00:10.290 --> 00:00:12.180
set up this base URL,

7
00:00:12.180 --> 00:00:14.340
and also set up the API key so

8
00:00:14.340 --> 00:00:16.710
that you don't get any warnings or errors.

9
00:00:16.710 --> 00:00:18.060
Though the value doesn't matter

10
00:00:18.060 --> 00:00:21.570
because this locally running API isn't protected.

11
00:00:21.570 --> 00:00:23.705
So you can set any value here,

12
00:00:23.705 --> 00:00:25.890
but you can set it up like this

13
00:00:25.890 --> 00:00:28.920
and then use this approach that could also be used

14
00:00:28.920 --> 00:00:32.820
with the open AI models if you were communicating with them,

15
00:00:32.820 --> 00:00:36.060
to communicate with your locally running models.

16
00:00:36.060 --> 00:00:38.490
Now what's important is this model identifier

17
00:00:38.490 --> 00:00:42.690
because that tells LM Studio or the server

18
00:00:42.690 --> 00:00:44.640
provided by LM Studio

19
00:00:44.640 --> 00:00:47.910
with which locally running model you want to communicate

20
00:00:47.910 --> 00:00:49.367
or which model should be loaded

21
00:00:49.367 --> 00:00:52.200
if it hasn't been loaded yet.

22
00:00:52.200 --> 00:00:53.520
Of course, it must be a model

23
00:00:53.520 --> 00:00:55.290
that has been downloaded though.

24
00:00:55.290 --> 00:00:57.900
So loaded just means loaded into memory.

25
00:00:57.900 --> 00:00:59.700
If it doesn't exist on your system,

26
00:00:59.700 --> 00:01:03.660
you can't communicate it no matter if it's loaded or not.

27
00:01:03.660 --> 00:01:05.700
But this must be a valid identifier

28
00:01:05.700 --> 00:01:08.745
and you get the value for this identifier from here,

29
00:01:08.745 --> 00:01:13.140
from your developer view in LM Studio

30
00:01:13.140 --> 00:01:15.480
for a given model that you loaded.

31
00:01:15.480 --> 00:01:17.460
So for this model, for example,

32
00:01:17.460 --> 00:01:19.653
you find the identifier here.

33
00:01:20.640 --> 00:01:24.660
You can also find valid identifiers by going to your models,

34
00:01:24.660 --> 00:01:26.940
and then it's these identifiers here

35
00:01:26.940 --> 00:01:29.110
that have to be inserted in your code

36
00:01:30.388 --> 00:01:33.120
that tells the LM Studio server

37
00:01:33.120 --> 00:01:35.970
to which model the request should be sent.

38
00:01:35.970 --> 00:01:38.640
And again, if it hasn't been loaded into memory yet,

39
00:01:38.640 --> 00:01:41.550
if you got just in time model loading enabled,

40
00:01:41.550 --> 00:01:44.190
it will be loaded as soon as a request

41
00:01:44.190 --> 00:01:46.380
tries to reach that model.

42
00:01:46.380 --> 00:01:50.010
Well then here I'm sending a made up chat history

43
00:01:50.010 --> 00:01:54.000
including a system message to that model.

44
00:01:54.000 --> 00:01:57.180
Therefore, here, if I execute this basic PY file

45
00:01:57.180 --> 00:01:59.220
with Python, which can take a while

46
00:01:59.220 --> 00:02:01.890
because streaming is not enabled here,

47
00:02:01.890 --> 00:02:04.110
so I only get the complete response

48
00:02:04.110 --> 00:02:07.353
once it's done and generating that can take a while.

49
00:02:08.250 --> 00:02:10.650
But once I sent this and wait a while,

50
00:02:10.650 --> 00:02:13.521
eventually I will get back the response

51
00:02:13.521 --> 00:02:15.840
of my locally running model.

52
00:02:15.840 --> 00:02:19.650
But now programmatically with help of that server

53
00:02:19.650 --> 00:02:21.883
that's spun up by LM Studio.

54
00:02:21.883 --> 00:02:25.410
And as mentioned, I got some other examples here as well.

55
00:02:25.410 --> 00:02:29.970
For example, a basic chat where I do have streaming enabled

56
00:02:29.970 --> 00:02:32.430
and where I got an infinite loop to ask

57
00:02:32.430 --> 00:02:34.530
for more and more user input.

58
00:02:34.530 --> 00:02:37.320
So if I run that, I got a chat interface here

59
00:02:37.320 --> 00:02:38.550
where I can send messages

60
00:02:38.550 --> 00:02:40.440
and then the responses are streamed in

61
00:02:40.440 --> 00:02:43.260
and I can send up follow up messages.

62
00:02:43.260 --> 00:02:45.330
And then I also got another example

63
00:02:45.330 --> 00:02:49.440
for an application you could build with help of local AI.

64
00:02:49.440 --> 00:02:52.020
I got an image parser application,

65
00:02:52.020 --> 00:02:55.920
which tries to find all jpeg and png images

66
00:02:55.920 --> 00:02:58.200
in a folder named Images.

67
00:02:58.200 --> 00:03:02.223
This folder here, which got 2 demo images here, in my case.

68
00:03:03.060 --> 00:03:05.580
And then for every image file I find,

69
00:03:05.580 --> 00:03:10.110
I in the end convert that to a format that's called base 64,

70
00:03:10.110 --> 00:03:12.570
which is a text representation of that image

71
00:03:12.570 --> 00:03:13.950
that can be converted back

72
00:03:13.950 --> 00:03:18.510
to the actual image on the server I send it to, for example,

73
00:03:18.510 --> 00:03:23.510
and I sent this converted image to the AI model

74
00:03:23.520 --> 00:03:26.280
by setting the content of the message

75
00:03:26.280 --> 00:03:29.087
I'm sending to the model to a combination of text

76
00:03:29.087 --> 00:03:32.280
and an image URL.

77
00:03:32.280 --> 00:03:36.966
And that image URL is that base 64 encoded image.

78
00:03:36.966 --> 00:03:39.330
Of course you have to send that to a model

79
00:03:39.330 --> 00:03:40.890
that supports image input,

80
00:03:40.890 --> 00:03:41.970
but that's the case

81
00:03:41.970 --> 00:03:44.523
for this Gemma 3 model I'm using here.

82
00:03:45.660 --> 00:03:49.140
And with that, I'm sending this lake image,

83
00:03:49.140 --> 00:03:51.090
which I took from a plane

84
00:03:51.090 --> 00:03:54.503
and this image here of me in my recording set up,

85
00:03:54.503 --> 00:03:59.040
I'm sending this to the AI here,

86
00:03:59.040 --> 00:04:02.310
again programmatically by sending it to that server

87
00:04:02.310 --> 00:04:05.105
that was spun up by LM Studio.

88
00:04:05.105 --> 00:04:07.667
Now of course this also takes a while here,

89
00:04:07.667 --> 00:04:12.030
but once it's done, I got a detailed description

90
00:04:12.030 --> 00:04:16.724
of my Max image here, which is correct that it's a selfie,

91
00:04:16.724 --> 00:04:20.823
by a man in a home studio or recording setup.

92
00:04:20.823 --> 00:04:24.343
And I got the thumbs up gesture,

93
00:04:24.343 --> 00:04:26.070
if we take another look at the image.

94
00:04:26.070 --> 00:04:28.890
You see that is correct. Thumbs up.

95
00:04:28.890 --> 00:04:30.900
I got a description of my shirt

96
00:04:30.900 --> 00:04:34.200
and a bunch of other information about the image.

97
00:04:34.200 --> 00:04:37.290
And then the same also for a lake

98
00:04:37.290 --> 00:04:42.225
where it describes a lake, which of course also is true.

99
00:04:42.225 --> 00:04:45.360
It also correctly detected that this was taken from

100
00:04:45.360 --> 00:04:47.340
inside an airplane.

101
00:04:47.340 --> 00:04:48.450
So that's another example

102
00:04:48.450 --> 00:04:52.110
for using a locally running AI model programmatically

103
00:04:52.110 --> 00:04:55.620
by sending a request to that locally running server

104
00:04:55.620 --> 00:04:58.410
that is created and managed by LM Studio,

105
00:04:58.410 --> 00:05:00.810
and that exposes those locally running

106
00:05:00.810 --> 00:05:02.703
AI models through that server.