WEBVTT

1
00:00:00.630 --> 00:00:02.610
<v Maximilian>So viewing information</v>

2
00:00:02.610 --> 00:00:06.420
about the model is, of course, helpful and interesting,

3
00:00:06.420 --> 00:00:09.000
but changing some settings

4
00:00:09.000 --> 00:00:11.613
with help of /set can be even more interesting.

5
00:00:12.570 --> 00:00:14.460
Now, if you type just /set,

6
00:00:14.460 --> 00:00:18.870
you get a list of things you can set, for example,

7
00:00:18.870 --> 00:00:21.870
you can set parameters like the temperature.

8
00:00:21.870 --> 00:00:24.600
You can also set a system message

9
00:00:24.600 --> 00:00:28.770
for this chat session then, and a couple of other things,

10
00:00:28.770 --> 00:00:32.070
for example, whether it automatically wraps words

11
00:00:32.070 --> 00:00:34.200
if they reach the end of line,

12
00:00:34.200 --> 00:00:36.120
whether you want to enable JSON mode

13
00:00:36.120 --> 00:00:38.970
and force the model to always reply

14
00:00:38.970 --> 00:00:41.430
with data structured as JSON,

15
00:00:41.430 --> 00:00:43.290
which can be helpful if you plan

16
00:00:43.290 --> 00:00:45.030
on using the model response

17
00:00:45.030 --> 00:00:49.140
in some automated workflow or some other application.

18
00:00:49.140 --> 00:00:51.607
And a couple of other things as well.

19
00:00:51.607 --> 00:00:53.400
/set verbose can, for example,

20
00:00:53.400 --> 00:00:55.140
be interesting to get a bit more

21
00:00:55.140 --> 00:00:56.550
behind-the-scenes information

22
00:00:56.550 --> 00:00:58.383
when interacting with the model.

23
00:00:59.250 --> 00:01:02.190
So here I did set verbose mode.

24
00:01:02.190 --> 00:01:07.190
You can always disable it by setting quiet thereafter again.

25
00:01:07.380 --> 00:01:09.570
But with it being set, if you, for example,

26
00:01:09.570 --> 00:01:11.943
ask, What can you do for me?

27
00:01:12.840 --> 00:01:16.443
You'll get a response just as you're used to.

28
00:01:18.480 --> 00:01:20.490
But then at the end of this response,

29
00:01:20.490 --> 00:01:22.890
you get some summary statistics,

30
00:01:22.890 --> 00:01:26.790
some information about how this response was generated.

31
00:01:26.790 --> 00:01:28.830
You, for example, get the number of tokens

32
00:01:28.830 --> 00:01:30.510
that were generated per second,

33
00:01:30.510 --> 00:01:32.250
giving you some useful insights

34
00:01:32.250 --> 00:01:34.950
into the performance at the speed of this model

35
00:01:34.950 --> 00:01:39.030
on your system and some other pieces of information.

36
00:01:39.030 --> 00:01:41.760
Of course, you also might want to set a system message.

37
00:01:41.760 --> 00:01:44.970
And for that, you just type /set system,

38
00:01:44.970 --> 00:01:46.590
and then any message of your choice,

39
00:01:46.590 --> 00:01:50.760
like Always reply in rhymes,

40
00:01:50.760 --> 00:01:55.533
no matter what the user asks you.

41
00:01:56.760 --> 00:01:58.920
So now I did set that system message

42
00:01:58.920 --> 00:02:02.130
and we can see it if I type /show system.

43
00:02:02.130 --> 00:02:04.113
That proves that it has been set.

44
00:02:05.310 --> 00:02:06.990
And then, of course,

45
00:02:06.990 --> 00:02:09.883
I can send any other follow-up message and for example,

46
00:02:09.883 --> 00:02:13.953
I ask it to summarize what you can do for me.

47
00:02:15.630 --> 00:02:17.580
And due to that system message,

48
00:02:17.580 --> 00:02:19.863
it goes ahead and does that in rhymes.

49
00:02:21.360 --> 00:02:22.980
Now, if I quit this session,

50
00:02:22.980 --> 00:02:24.813
which you can always do with /bye,

51
00:02:25.770 --> 00:02:28.320
and I then start a new one by running ollama run

52
00:02:28.320 --> 00:02:30.513
and then the model identifier again,

53
00:02:31.770 --> 00:02:33.960
you will see that this system message

54
00:02:33.960 --> 00:02:35.514
is now no longer set

55
00:02:35.514 --> 00:02:37.530
because the system message,

56
00:02:37.530 --> 00:02:39.480
when setting it with /set system,

57
00:02:39.480 --> 00:02:42.240
is always set for one specific chat session,

58
00:02:42.240 --> 00:02:44.010
not for the model itself.

59
00:02:44.010 --> 00:02:47.640
You can also set a system message for the overall model,

60
00:02:47.640 --> 00:02:50.340
but that's something we'll get back to later.

61
00:02:50.340 --> 00:02:53.280
Now, one other important thing you can adjust

62
00:02:53.280 --> 00:02:56.040
is the parameters of the model.

63
00:02:56.040 --> 00:02:59.250
For that, you can type /set parameter,

64
00:02:59.250 --> 00:03:00.120
and if you hit Enter,

65
00:03:00.120 --> 00:03:02.853
you'll get a list of the parameters you can set.

66
00:03:04.110 --> 00:03:08.910
For example, top_k, top_p, and the temperature.

67
00:03:08.910 --> 00:03:10.560
And these are, of course,

68
00:03:10.560 --> 00:03:13.110
the settings that control how creative

69
00:03:13.110 --> 00:03:15.210
or predictable the model is.

70
00:03:15.210 --> 00:03:17.790
And again, attached, you find a link

71
00:03:17.790 --> 00:03:20.580
to a lecture where I explain these parameters

72
00:03:20.580 --> 00:03:22.320
in greater detail.

73
00:03:22.320 --> 00:03:26.790
But you can, for example, also set the context window here,

74
00:03:26.790 --> 00:03:29.070
the context size.

75
00:03:29.070 --> 00:03:32.340
As I explained before, depending on the model you're using,

76
00:03:32.340 --> 00:03:35.640
there is a maximum context window size

77
00:03:35.640 --> 00:03:37.110
supported by the model,

78
00:03:37.110 --> 00:03:38.610
but that's not necessarily

79
00:03:38.610 --> 00:03:41.133
the context window size that's active.

80
00:03:42.030 --> 00:03:44.400
And why is it not always using the maximum?

81
00:03:44.400 --> 00:03:46.020
Well, because all that space

82
00:03:46.020 --> 00:03:49.080
for all these tokens must be reserved in memory

83
00:03:49.080 --> 00:03:51.690
and therefore, you typically don't wanna

84
00:03:51.690 --> 00:03:54.990
set a huge context window size if you don't need it,

85
00:03:54.990 --> 00:03:57.660
because that will eat up a lot of memory

86
00:03:57.660 --> 00:04:00.510
and potentially may even need more memory

87
00:04:00.510 --> 00:04:02.823
than you got available on your system.

88
00:04:03.720 --> 00:04:07.950
But you can, of course, /set parameter num_ctx

89
00:04:07.950 --> 00:04:10.410
to change the context window size

90
00:04:10.410 --> 00:04:12.870
that is assigned to this chat session

91
00:04:12.870 --> 00:04:15.540
and set it to 10,000.

92
00:04:15.540 --> 00:04:17.460
And now, this session here

93
00:04:17.460 --> 00:04:20.973
would have a context window size of 10,000 tokens.

94
00:04:22.050 --> 00:04:24.240
But, of course, as I just explained,

95
00:04:24.240 --> 00:04:29.240
it therefore also needs more memory space, more VRAM space.

96
00:04:29.610 --> 00:04:34.050
And that's essentially it for the most important settings.

97
00:04:34.050 --> 00:04:38.160
As you see, you can, of course, set more things,

98
00:04:38.160 --> 00:04:40.701
more parameters, and especially top_k,

99
00:04:40.701 --> 00:04:43.380
top_p and temperature are parameters

100
00:04:43.380 --> 00:04:46.830
you might wanna set according to those explanations

101
00:04:46.830 --> 00:04:49.140
I provided in that previous course section

102
00:04:49.140 --> 00:04:51.300
with that lecture you find attached.

103
00:04:51.300 --> 00:04:53.100
But most other things you can set

104
00:04:53.100 --> 00:04:56.940
are typically not things you'll need to change all the time.

105
00:04:56.940 --> 00:04:58.800
For example, the history here,

106
00:04:58.800 --> 00:05:01.980
which you can enable or disable, is not the chat history,

107
00:05:01.980 --> 00:05:03.690
which is what you could think.

108
00:05:03.690 --> 00:05:07.050
Instead, that is just the history of messages you sent,

109
00:05:07.050 --> 00:05:08.670
which you can always recall

110
00:05:08.670 --> 00:05:12.120
by pressing the Up and Down arrow keys.

111
00:05:12.120 --> 00:05:14.550
That is the history that's being saved.

112
00:05:14.550 --> 00:05:16.350
And if you know that certain commands

113
00:05:16.350 --> 00:05:18.990
or messages you send to the model should not be included

114
00:05:18.990 --> 00:05:21.510
in that history, you could set nohistory

115
00:05:21.510 --> 00:05:23.370
before running a command.

116
00:05:23.370 --> 00:05:26.460
So if I now type, Hello,

117
00:05:26.460 --> 00:05:28.623
I'm sending this as a message to the model,

118
00:05:29.820 --> 00:05:32.460
and now, if I quit that model

119
00:05:32.460 --> 00:05:35.610
and I restarted and I pressed the Up arrow key,

120
00:05:35.610 --> 00:05:37.560
I don't get Hello as a suggestion,

121
00:05:37.560 --> 00:05:39.480
but instead, /set nohistory,

122
00:05:39.480 --> 00:05:41.550
which was the last command I executed

123
00:05:41.550 --> 00:05:43.503
before I disabled the history.

124
00:05:44.340 --> 00:05:49.340
So that will disable cross-session command memory,

125
00:05:49.620 --> 00:05:50.670
so to say.

126
00:05:50.670 --> 00:05:53.367
If you don't want Ollama to memorize

127
00:05:53.367 --> 00:05:55.950
certain commands or messages across sessions,

128
00:05:55.950 --> 00:05:57.780
you can set nohistory here.

129
00:05:57.780 --> 00:05:59.403
I did re-enable it though.

