WEBVTT

00:00.140 --> 00:05.660
Large language models with 100 billion parameters can achieve great things.

00:05.660 --> 00:14.000
They can produce strong results across a wide range of tasks, and they can do it even with

00:14.000 --> 00:17.090
no training or little training by us.

00:17.150 --> 00:25.100
However, even the largest LLMs can still struggle with certain kinds of problems.

00:25.100 --> 00:33.680
Multi-step reasoning tasks, such as math word problems and commonsense reasoning, that are easy

00:33.680 --> 00:37.400
for us humans are not so easy for the models.

00:37.400 --> 00:44.150
So for that, researchers at Google came up with a method which is called chain of thought. Chain

00:44.180 --> 00:44.480
of thought

00:44.480 --> 00:52.070
prompting aims to improve the reasoning abilities of LLMs, and this method enables the models to

00:52.100 --> 00:58.970
decompose multi-step problems into intermediate steps, allowing the model to solve complex reasoning

00:58.970 --> 01:07.820
problems that are not solvable with standard prompting methods. Chain of thought, abbreviated

01:07.850 --> 01:15.740
CoT, breaks down a problem into a series of intermediate reasoning steps, and it significantly

01:15.740 --> 01:19.340
improves the ability of LLMs to perform complex reasoning.

01:19.490 --> 01:26.600
So by breaking down a complex problem into smaller, more manageable steps, the chain of thought allows

01:26.690 --> 01:31.160
LLMs to reason more accurately and more efficiently.

01:31.580 --> 01:38.660
So, like most prompt engineering techniques, the chain of thought technique was first introduced in

01:38.660 --> 01:41.510
a research paper on prompt engineering.

01:41.510 --> 01:46.850
You can see the paper right over here, and its URL is also listed in the course's resources.

01:48.050 --> 01:55.280
Okay, so let's start introducing this topic from the examples that are noted in this paper over here.

01:55.310 --> 02:01.910
Okay, so we want to start with standard prompting, introduce the problem, and see

02:01.910 --> 02:03.200
that it's not sufficient.

02:03.230 --> 02:03.590
Okay.

02:03.620 --> 02:08.300
And we'll see how the chain of thought prompting resolves this insufficiency.

02:08.630 --> 02:13.550
So we have two prompts over here that we asked our AI model.

02:13.580 --> 02:16.580
The first one is a question and it's kind of a riddle.

02:16.580 --> 02:19.460
So Sean has five toys. For Christmas,

02:19.460 --> 02:23.000
he got two toys each from his mom and his dad.

02:23.000 --> 02:25.490
And how many toys does he have in total?

02:25.490 --> 02:30.650
So the AI model responded with nine, which is the correct answer.

02:30.680 --> 02:33.380
Okay, so basically he had five toys already.

02:33.410 --> 02:36.020
Two he got from his mom and two he got from his dad.

02:36.020 --> 02:38.900
So it's five plus two plus two.

02:38.930 --> 02:39.980
This is great.

02:40.250 --> 02:46.970
Now, afterwards, the researchers asked another riddle, which is kind of similar.

02:47.150 --> 02:48.740
It has the same thought process.

02:48.740 --> 02:53.270
So the prompt is: John takes care of ten dogs.

02:53.270 --> 02:58.250
Each dog takes half an hour a day to walk and take care of their business.

02:58.250 --> 03:03.170
How many hours a week does John need to take care of his dogs?

03:03.170 --> 03:06.140
So the AI model's output

03:06.140 --> 03:12.470
was 50, so probably the calculation was ten times five, which gives 50.

03:12.500 --> 03:14.870
Now this is not the correct solution.

03:14.900 --> 03:18.020
The correct answer for this is 35 hours.

03:18.050 --> 03:24.080
So with the standard prompting that we saw right now, which, if we classify it, falls under

03:24.110 --> 03:27.770
the topic of zero shot prompting,

03:27.770 --> 03:31.160
we got insufficient answers to our questions.

03:31.190 --> 03:31.490
Okay.

03:31.520 --> 03:33.950
So we had kind of a limitation over here.

03:33.950 --> 03:38.540
So the paper here introduced this limitation with the standard prompting.

03:38.540 --> 03:41.780
And here was an example of zero shot prompting.

03:41.930 --> 03:47.990
And the paper suggested a new prompting technique, which is called chain of thought.

03:48.050 --> 03:51.110
So chain of thought is actually pretty simple.

03:51.110 --> 03:58.580
What we want to do is guide the model to solve the problem, like we as humans solve those problems.

03:58.580 --> 04:05.030
And by doing that, we'll help the model reason better, and hopefully it will output the answer

04:05.030 --> 04:07.160
that we want, the correct answer.

04:07.190 --> 04:07.640
Okay.

04:07.670 --> 04:14.960
So in the example over here, the researchers took the first question with the toys

04:14.960 --> 04:17.430
and they gave an example answer.

04:17.460 --> 04:21.690
Okay, so you can think of it as one shot prompting.

04:21.720 --> 04:22.290
Okay.

04:22.290 --> 04:24.390
But they did more than that.

04:24.390 --> 04:29.850
They also supplied the chain of thought: how we as humans solve this problem.

04:29.880 --> 04:33.900
So in their answer, they said that Sean started with five toys.

04:33.900 --> 04:36.900
He got two from his mom and two from his dad.

04:36.900 --> 04:42.120
So the calculation should be five plus two plus two, which is nine.

04:42.150 --> 04:45.330
Okay, that's why the answer is nine.

04:45.360 --> 04:45.870
Okay.

04:45.900 --> 04:49.260
And then they asked the second question, the one about the dogs, again.

04:49.290 --> 04:55.590
And it turned out that the model was smart enough to take the chain of thought that the researchers

04:55.590 --> 05:01.590
supplied for the previous question, which was kind of similar, and to apply this chain of thought

05:01.590 --> 05:03.030
in the next answer.

05:03.060 --> 05:03.540
Okay.

05:03.570 --> 05:08.700
So you can see that in the second response, with chain of thought prompting,

05:08.700 --> 05:14.640
the model got the correct answer, and it actually supplied the steps that it took in order to get

05:14.640 --> 05:15.390
to that answer.

05:15.420 --> 05:20.700
Okay, so the AI model's output was that John takes care of ten dogs.

05:20.700 --> 05:25.080
And because each dog takes half an hour a day to walk and take care of,

05:25.110 --> 05:30.630
the calculation should be ten, which is ten dogs, times the time it takes to take care of each one,

05:30.630 --> 05:32.670
which gives five hours per day.

05:32.700 --> 05:39.720
So if we have five hours per day for taking care of the dogs and we have seven days in a week, then

05:39.720 --> 05:44.610
the calculation should be five times seven, which is going to be equal to 35.

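NOTE
A minimal sketch, not shown in the video, of the two sub-calculations in the dog-walking answer above. The variable names are assumptions made for illustration; the numbers (ten dogs, half an hour per dog per day, seven days in a week) are the ones used in the model's answer.
```python
# Hypothetical variable names; the values come from the dog-walking example.
dogs = 10
hours_per_dog_per_day = 0.5  # half an hour per dog
days_per_week = 7
# Step 1: total hours per day across all dogs.
hours_per_day = dogs * hours_per_dog_per_day
# Step 2: scale the daily total to a full week.
hours_per_week = hours_per_day * days_per_week
print(hours_per_day, hours_per_week)  # prints "5.0 35.0"
```
Each step depends only on the previous result, which is the kind of decomposition that chain of thought prompting asks the model to spell out.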
05:44.640 --> 05:45.060
Okay.

05:45.090 --> 05:50.700
So the thought process was pretty much the same over here with both questions.

05:50.700 --> 05:57.600
It was to take the calculation that we need to do and break it down into two sub-calculations.

05:57.630 --> 05:58.110
Okay.

05:58.140 --> 05:59.190
So yeah.

05:59.190 --> 06:04.770
So that was the chain of thought prompting in the research paper that you see below.

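NOTE
A rough sketch, not from the video or the paper's code, of how a prompt like the one described above could be assembled. The helper name build_few_shot_cot_prompt and the Q/A layout are assumptions for illustration, and the example wording paraphrases the toy and dog questions. No model is called here; the sketch only builds the text.
```python
# Hypothetical helper: joins worked examples (with reasoning) and a new question.
def build_few_shot_cot_prompt(examples, question):
    """Each example is a (question, reasoning, answer) tuple."""
    parts = []
    for q, reasoning, answer in examples:
        # The reasoning goes before the final answer so the model can imitate it.
        parts.append(f"Q: {q}\nA: {reasoning} The answer is {answer}.")
    # The new question ends with an open "A:" for the model to complete.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)
toy_example = (
    "Sean has five toys. For Christmas, he got two toys each from his mom and dad. "
    "How many toys does he have now?",
    "Sean started with 5 toys. He got 2 from his mom and 2 from his dad. 5 + 2 + 2 = 9.",
    "9",
)
dog_question = (
    "John takes care of ten dogs. Each dog takes half an hour a day to walk and "
    "take care of their business. How many hours a week does he spend on his dogs?"
)
prompt = build_few_shot_cot_prompt([toy_example], dog_question)
print(prompt)
```
Passing more tuples in the list would turn this one shot prompt into a few shot one without changing the helper.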
06:05.940 --> 06:13.290
Chain of thought is a significant development in prompt engineering and LLM usage because it allows

06:13.290 --> 06:20.340
the LLM to approach problem solving in a more human-like manner, and it allows the models to break

06:20.340 --> 06:26.430
down complex problems into intermediate steps that are resolved individually.

06:26.460 --> 06:34.890
So this is a major step in prompt engineering, which opens the door to solving a lot more problems.

06:35.760 --> 06:41.970
So we saw in earlier videos that we had zero shot, one shot and few shot prompting.

06:41.970 --> 06:46.200
So in chain of thought we actually have something very similar.

06:46.230 --> 06:46.650
Okay.

06:46.680 --> 06:51.390
So we have something which is called a zero shot chain of thought.

06:51.390 --> 07:01.440
So in this kind of prompt we simply add to our prompt something like let's think step by step and then

07:01.440 --> 07:07.500
the model will output the answer while explaining the steps it took to get there.

07:07.530 --> 07:14.790
Okay, so we did not supply the reasoning to the model, and we let it have its creative freedom

07:14.790 --> 07:15.690
and choose for us.

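NOTE
A minimal sketch, not from the video, of the zero shot chain of thought idea described above. The helper name build_zero_shot_cot_prompt and the Q/A layout are assumptions for illustration; only the trigger phrase comes from the video. No model is called here.
```python
# Hypothetical helper: appends a reasoning trigger instead of a worked example.
def build_zero_shot_cot_prompt(question, trigger="Let's think step by step."):
    """Return the question followed by an answer line that starts with the trigger."""
    return f"Q: {question}\nA: {trigger}"
zero_shot_prompt = build_zero_shot_cot_prompt(
    "John takes care of ten dogs. Each dog takes half an hour a day to walk and "
    "take care of their business. How many hours a week does he spend on his dogs?"
)
print(zero_shot_prompt)
```
Because no example is supplied, the model chooses its own intermediate steps, which is the difference from the few shot variant.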
07:15.720 --> 07:16.320
Okay.

07:16.320 --> 07:24.300
And if we use few shot chain of thought prompting, then in our prompt we simply supply the model

07:24.300 --> 07:30.300
with the answer we want and explain to it what the steps are and what chain of thought

07:30.300 --> 07:32.520
we used to work through those steps,

07:32.520 --> 07:34.800
and how we came up with this answer.

07:34.800 --> 07:40.080
And the model takes this data, processes it, and then outputs the correct answer.

07:40.110 --> 07:40.650
Okay.

07:40.680 --> 07:44.520
So zero shot chain of thought can actually be beneficial.

07:44.550 --> 07:44.880
Okay.

07:44.910 --> 07:48.660
And you can actually see the reasoning process of the AI model.

07:48.690 --> 07:49.080
Okay.

07:49.110 --> 07:51.660
But it will have no worked example to follow.

07:51.660 --> 07:59.370
And in the few shot chain of thought prompt, we will attach to our prompt the real answer that we're

07:59.370 --> 08:06.600
expecting for a very similar problem and explain how we got to this answer, and the AI model will take

08:06.600 --> 08:14.280
it, process it, learn from it, and then the model will be able to carry over the chain of thought

08:14.310 --> 08:21.480
that we supplied to other problems, apply this chain of thought, and

08:21.510 --> 08:28.920
solve similar, but not identical, problems in the way that we wanted it to solve them.