WEBVTT

1
00:00:00.480 --> 00:00:01.650
<v Maximilian>Here's another example</v>

2
00:00:01.650 --> 00:00:04.380
where these models are quite good.

3
00:00:04.380 --> 00:00:06.270
And for that, I'll reload the model,

4
00:00:06.270 --> 00:00:08.850
still the 12-billion-parameter model

5
00:00:08.850 --> 00:00:12.090
with a bigger context window.

6
00:00:12.090 --> 00:00:15.390
So I'll ramp it up to, let's say, 10,000 tokens,

7
00:00:15.390 --> 00:00:18.960
just to be sure that I don't run out of context length.

8
00:00:18.960 --> 00:00:21.630
Because you can also use these models

9
00:00:21.630 --> 00:00:23.790
as content generation machines,

10
00:00:23.790 --> 00:00:27.690
especially when combining them with few-shot prompting,

11
00:00:27.690 --> 00:00:29.430
which is a prompt engineering technique,

12
00:00:29.430 --> 00:00:31.650
and quite an important one.

13
00:00:31.650 --> 00:00:33.990
And you can learn all about prompt engineering

14
00:00:33.990 --> 00:00:37.530
in my complete Generative AI course, by the way.

15
00:00:37.530 --> 00:00:39.570
The idea behind few-shot prompting

16
00:00:39.570 --> 00:00:42.330
is that you provide some examples to the model

17
00:00:42.330 --> 00:00:43.890
for a given task.

18
00:00:43.890 --> 00:00:47.670
For example, I want my model to generate a LinkedIn post.

19
00:00:47.670 --> 00:00:52.670
So I will say, "You are an expert LinkedIn post generator.

20
00:00:54.360 --> 00:00:59.360
Here are some example LinkedIn posts I wrote in the past."

21
00:01:01.920 --> 00:01:05.970
And then I'll add some LinkedIn articles.

22
00:01:05.970 --> 00:01:10.970
And I use delimiters here, XML delimiters,

23
00:01:11.010 --> 00:01:13.470
which is not required technically,

24
00:01:13.470 --> 00:01:15.240
but which can help the model,

25
00:01:15.240 --> 00:01:16.740
not just open models,

26
00:01:16.740 --> 00:01:19.020
but Large Language Models in general,

27
00:01:19.020 --> 00:01:24.020
distinguish between examples and other parts of your prompt.

28
00:01:24.750 --> 00:01:27.900
So, when having longer, more complex prompts,

29
00:01:27.900 --> 00:01:31.170
structuring your prompt by adding blocks like this

30
00:01:31.170 --> 00:01:32.640
can be useful.

31
00:01:32.640 --> 00:01:33.930
Because my idea here

32
00:01:33.930 --> 00:01:38.930
is that I wanna provide two LinkedIn article examples

33
00:01:38.940 --> 00:01:41.370
to this model here,

34
00:01:41.370 --> 00:01:43.140
to this prompt here.

35
00:01:43.140 --> 00:01:45.960
For that, I'm on my LinkedIn page here,

36
00:01:45.960 --> 00:01:48.510
and I'll simply go to my posts,

37
00:01:48.510 --> 00:01:52.173
and I'll just grab some of my more recent posts here.

38
00:01:53.190 --> 00:01:55.540
So that's the first one, I'll paste it in here.

39
00:01:57.690 --> 00:02:01.593
And then let's say I also wanna grab this one.

40
00:02:02.430 --> 00:02:04.950
So these are just some LinkedIn articles

41
00:02:04.950 --> 00:02:06.450
I wrote in the past,

42
00:02:06.450 --> 00:02:10.320
and I'll add them between those XML tags here

43
00:02:10.320 --> 00:02:14.699
to clearly highlight the places where one post starts

44
00:02:14.699 --> 00:02:17.430
and then ends and the next one starts.

45
00:02:17.430 --> 00:02:20.257
And then after this second example, I'll say,

46
00:02:20.257 --> 00:02:25.013
"Above, you see two example posts, or LinkedIn posts,

47
00:02:26.310 --> 00:02:28.473
I wrote in the past.

48
00:02:29.550 --> 00:02:34.550
Use the same writing style as shown in those posts,

49
00:02:35.190 --> 00:02:38.580
but don't use the content.

50
00:02:38.580 --> 00:02:43.580
Instead, using the above shown writing style,

51
00:02:43.920 --> 00:02:45.540
generate a new...

52
00:02:45.540 --> 00:02:49.270
Generate a new LinkedIn post

53
00:02:50.400 --> 00:02:55.143
about using open LLMs locally.

54
00:02:56.580 --> 00:02:57.750
The post should cover

55
00:02:57.750 --> 00:03:00.810
the following core concepts and topics."

56
00:03:00.810 --> 00:03:03.090
And then I created a list of concepts

57
00:03:03.090 --> 00:03:04.683
I want the post to cover.

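NOTE
The few-shot prompt described in the cues above could also be assembled in code.
This is only a minimal sketch: the function name, the <example> tag name, and the
placeholder posts and concepts are assumptions for illustration, not the presenter's
actual prompt. In the video, the prompt is typed into a chat window by hand.
```python
def build_few_shot_prompt(examples, topic, concepts):
    """Assemble a few-shot prompt; each example post is wrapped in XML-style tags."""
    # <example> is an arbitrary tag name chosen for this sketch.
    example_blocks = "\n".join(f"<example>\n{post.strip()}\n</example>" for post in examples)
    # One bullet per concept the generated post should cover.
    concept_list = "\n".join(f"- {concept}" for concept in concepts)
    return (
        "You are an expert LinkedIn post generator.\n"
        "Here are some example LinkedIn posts I wrote in the past.\n"
        f"{example_blocks}\n"
        f"Above, you see {len(examples)} example LinkedIn posts I wrote in the past. "
        "Use the same writing style as shown in those posts, but don't use the content. "
        f"Instead, using the above shown writing style, generate a new LinkedIn post about {topic}.\n"
        "The post should cover the following core concepts and topics:\n"
        f"{concept_list}"
    )
# Placeholder inputs; real posts and concepts would be pasted in here.
prompt = build_few_shot_prompt(
    examples=["First placeholder post.", "Second placeholder post."],
    topic="using open LLMs locally",
    concepts=["placeholder concept A", "placeholder concept B"],
)
print(prompt)
```
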
58
00:03:06.300 --> 00:03:08.550
Now, I could, of course, refine that,

59
00:03:08.550 --> 00:03:10.350
add more instructions to this prompt,

60
00:03:10.350 --> 00:03:11.910
but I can also now send this

61
00:03:11.910 --> 00:03:15.780
to this 12-billion-parameter Gemma 3 model.

62
00:03:15.780 --> 00:03:19.830
So not even the most capable or powerful Gemma 3 model.

63
00:03:19.830 --> 00:03:23.400
I could also run the 27-billion-parameter model here,

64
00:03:23.400 --> 00:03:24.840
but I'm using this one,

65
00:03:24.840 --> 00:03:28.860
and I'm sending this as a few-shot prompt to this model

66
00:03:28.860 --> 00:03:31.230
because I included some examples.

67
00:03:31.230 --> 00:03:34.350
And because of these examples I included here,

68
00:03:34.350 --> 00:03:37.080
I am getting a LinkedIn post,

69
00:03:37.080 --> 00:03:41.250
which I could now take and tweak and then share on LinkedIn.

70
00:03:41.250 --> 00:03:45.150
I'm getting that post from that locally running model.

71
00:03:45.150 --> 00:03:49.440
So without using any paid or proprietary model,

72
00:03:49.440 --> 00:03:54.060
and to me, at first sight, this post doesn't look too bad.

73
00:03:54.060 --> 00:03:56.460
If I had included more examples here

74
00:03:56.460 --> 00:04:00.450
and/or used the more capable model, it might be even better.

75
00:04:00.450 --> 00:04:02.490
And just to show you the difference I get

76
00:04:02.490 --> 00:04:05.730
from including these examples,

77
00:04:05.730 --> 00:04:10.730
let me grab that same prompt without the examples.

78
00:04:12.360 --> 00:04:15.127
I'll open a new chat, insert it here, and say,

79
00:04:15.127 --> 00:04:17.130
"Generate a new LinkedIn post

80
00:04:17.130 --> 00:04:18.780
about using open LLMs locally."

81
00:04:18.780 --> 00:04:23.780
And I'll say, "You are an expert LinkedIn post generator."

82
00:04:26.010 --> 00:04:28.920
So now, I have no examples in there.

83
00:04:28.920 --> 00:04:31.480
I'll bring back my bullets here

84
00:04:32.460 --> 00:04:34.623
and I'll send this to that same model.

85
00:04:35.640 --> 00:04:36.660
And as you see,

86
00:04:36.660 --> 00:04:40.710
I get quite a different kind of post with way more emojis,

87
00:04:40.710 --> 00:04:43.530
as these Large Language Models tend to do.

88
00:04:43.530 --> 00:04:45.570
They tend to include more emojis.

89
00:04:45.570 --> 00:04:47.400
I also got hashtags here,

90
00:04:47.400 --> 00:04:51.093
even though I don't want to have hashtags in my posts.

91
00:04:52.170 --> 00:04:55.830
So, you see these posts are quite different

92
00:04:55.830 --> 00:04:58.080
from the posts I got before.

93
00:04:58.080 --> 00:04:59.280
Here they are again,

94
00:04:59.280 --> 00:05:01.410
these are the posts I got

95
00:05:01.410 --> 00:05:04.320
with my extra examples that I added.

96
00:05:04.320 --> 00:05:05.250
And this shows you

97
00:05:05.250 --> 00:05:07.800
that for tasks that involve few-shot prompting,

98
00:05:07.800 --> 00:05:10.650
like for example here, content generation,

99
00:05:10.650 --> 00:05:13.230
locally running models can be amazing

100
00:05:13.230 --> 00:05:14.970
and can be all you need,

101
00:05:14.970 --> 00:05:17.040
and can therefore potentially replace

102
00:05:17.040 --> 00:05:19.113
proprietary models.