WEBVTT

00:00.990 --> 00:02.310
Instructor: Large language models

00:02.310 --> 00:06.600
are trained on enormous amounts of data.

00:06.600 --> 00:09.600
It is believed that GPT-3 was trained

00:09.600 --> 00:12.510
on over a billion words.

00:12.510 --> 00:14.403
So just to give you a comparison.

00:15.300 --> 00:20.040
If you were to take a stack of a billion one dollar bills,

00:20.040 --> 00:25.040
this stack would be over 67 miles high.

00:25.080 --> 00:25.913
Now,

00:25.913 --> 00:28.440
this amount of data is directly translated

00:28.440 --> 00:30.750
to the knowledge of the model.

00:30.750 --> 00:35.040
So it is totally capable of answering questions

00:35.040 --> 00:36.840
and performing instructions

00:36.840 --> 00:40.143
without having any input data provided to it.

00:41.130 --> 00:45.060
So a zero-shot prompt is a type of prompt

00:45.060 --> 00:47.880
in which the model generates an output

00:47.880 --> 00:52.140
for a task it has not been explicitly trained on.

00:52.140 --> 00:55.260
So this means that the model is asked to perform

00:55.260 --> 00:58.950
a task without any training data or examples

00:58.950 --> 01:01.080
for that specific task.

01:01.080 --> 01:04.830
Instead the model uses its preexisting knowledge

01:04.830 --> 01:07.980
to perform the task based on the information provided

01:07.980 --> 01:09.240
in the prompt.

01:09.240 --> 01:10.680
So for example,

01:10.680 --> 01:14.310
a language model that has been trained on English text

01:14.310 --> 01:17.850
can still generate accurate outputs for French text

01:17.850 --> 01:21.693
despite it not being specifically trained on French.

01:22.560 --> 01:23.393
Now,

01:23.393 --> 01:24.240
let's take a look at an example

01:24.240 --> 01:27.480
of a prompt that is a zero-shot prompt.

01:27.480 --> 01:28.313
So the prompt is,

01:28.313 --> 01:31.447
"Create a list of the 10 must-visit cities

01:31.447 --> 01:34.110
"in the world in no particular order."

01:34.110 --> 01:37.350
So you can see that we didn't supply it with any examples

01:37.350 --> 01:40.440
or any input data to indicate the answer.

01:40.440 --> 01:44.400
And it gave us a beautiful, coherent answer.

01:44.400 --> 01:48.390
So this is the zero-shot prompt.

01:48.390 --> 01:49.223
Now,

01:49.223 --> 01:51.780
as you go deeper and deeper into prompt engineering,

01:51.780 --> 01:55.020
you'll notice that the zero-shot prompt is actually

01:55.020 --> 01:58.290
the most popular kind of prompt that people are using

01:58.290 --> 02:00.150
when they're getting into AI.

02:00.150 --> 02:01.230
Because at the beginning,

02:01.230 --> 02:04.170
you're starting to learn how to interact with the model.

02:04.170 --> 02:07.650
And it's super intuitive to simply ask it questions

02:07.650 --> 02:10.230
without providing it any examples

02:10.230 --> 02:12.660
or any way of thinking.

02:12.660 --> 02:16.320
But zero-shot prompting does come with a bunch of limitations.

02:16.320 --> 02:17.880
For example, the accuracy.

02:17.880 --> 02:20.340
So what we're getting back from the model

02:20.340 --> 02:22.740
may not be exactly what we're looking for

02:22.740 --> 02:27.060
because we didn't supply it with any data or any guidance.

02:27.060 --> 02:29.220
So the scope might be limited,

02:29.220 --> 02:32.160
and we definitely have less control

02:32.160 --> 02:34.320
because it's only one prompt

02:34.320 --> 02:37.620
that relies on the model's preexisting knowledge,

02:37.620 --> 02:41.913
and it cannot be fine-tuned to any specific use case.