1
00:00:01,670 --> 00:00:02,030
Hello.

2
00:00:02,060 --> 00:00:03,350
Welcome back.

3
00:00:03,350 --> 00:00:10,710
So we are going to explain gradient descent using this simple graph over here.

4
00:00:10,880 --> 00:00:17,090
This blue curve over here represents every value of the error.

5
00:00:17,090 --> 00:00:25,580
The yellow dot (not the pointer, but the yellow dot on the curve) marks both the current weight

6
00:00:25,700 --> 00:00:26,960
and error.

7
00:00:27,350 --> 00:00:35,270
The dotted circle is where we want to go, which is where the error equals zero, and we will explain

8
00:00:35,270 --> 00:00:43,670
this later. As we can see, the slope points toward the bottom. Because of this, no matter where we are on the

9
00:00:43,670 --> 00:00:53,930
curve, we can use the slope to help the neural network reduce the error. Let's say this is the initial state,

10
00:00:53,980 --> 00:01:00,730
number one over here, with the error at 2.64 and the weight at 0.

11
00:01:01,130 --> 00:01:10,910
Let's say we compute the weight delta and get a value of -0.88. When we update

12
00:01:10,970 --> 00:01:21,800
the weight with this weight delta value, we get this new plot over here. The error decreases: as we can

13
00:01:21,800 --> 00:01:30,110
see, the dot is closer to zero than it used to be. The weight value updates to

14
00:01:30,170 --> 00:01:39,590
0.88. As we can see, the dot also overshoots our desired destination. We will

15
00:01:39,590 --> 00:01:46,490
have to update the weight again until we get as close as possible to our desired destination, which is

16
00:01:46,490 --> 00:01:50,240
the dashed circle over here.

17
00:01:53,090 --> 00:02:00,170
Let's say we update the weight again after a computation of our weight delta, and it slightly overshoots

18
00:02:00,260 --> 00:02:08,390
in the opposite direction, but with less error than before. This updating of the weight continues until we

19
00:02:08,390 --> 00:02:15,690
get as close to the destination as possible.

20
00:02:16,880 --> 00:02:25,010
This updating of the weight continues until we get as close to that destination as possible and the error

21
00:02:25,070 --> 00:02:27,650
is as close to zero as possible.

22
00:02:27,650 --> 00:02:31,250
So basically, that is what we want to do with gradient descent.

23
00:02:31,250 --> 00:02:40,100
We want to reduce the error as much as possible and, you know, descend the gradient to this point of zero, which

24
00:02:40,100 --> 00:02:40,930
is our destination.

25
00:02:40,930 --> 00:02:46,080
We want to get as close to it as possible.

26
00:02:46,220 --> 00:02:48,590
That is the point of gradient descent.

27
00:02:48,590 --> 00:02:50,720
If it's not so clear now, you need not worry.

28
00:02:50,720 --> 00:02:57,400
You simply need to remember there is a curve and there is a ball, and you want to come down here.

29
00:02:57,710 --> 00:02:59,980
You want to come over here on the horizontal line.

30
00:02:59,990 --> 00:03:01,700
That's your destination.

31
00:03:01,700 --> 00:03:04,250
That is gradient descent essentially.

32
00:03:04,310 --> 00:03:10,820
So that's all there is for this lesson. We shall talk more about gradient descent, and we will

33
00:03:10,970 --> 00:03:14,450
actually spend some time proving why it works later.

34
00:03:15,050 --> 00:03:16,300
So I'll see you in the next lesson.
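
The update loop described in this lesson can be sketched in a few lines of Python. This is only an illustration, not the instructor's code: the single-weight model, the squared error, the function name `gradient_descent`, and the values `input_value=1.1` and `goal=0.8` are all assumptions, chosen so that starting from a weight of 0 the first weight delta comes out to -0.88 as in the narration. Other numbers, such as the error, may differ from those shown on the graph.

```python
def gradient_descent(weight, input_value, goal, steps):
    """Repeatedly nudge a single weight so the prediction approaches the goal.

    Assumed setup (not from the lesson): prediction = input_value * weight,
    error = (prediction - goal) ** 2.
    """
    history = []
    for _ in range(steps):
        prediction = input_value * weight
        delta = prediction - goal            # how far off we are, with sign
        error = delta ** 2                   # height of the dot on the curve
        weight_delta = delta * input_value   # slope-based step (the "weight delta")
        history.append((weight, error, weight_delta))
        weight -= weight_delta               # move against the slope
    return weight, history


# Hypothetical values: input 1.1 and goal 0.8 are assumptions for illustration.
final_weight, history = gradient_descent(weight=0.0, input_value=1.1, goal=0.8, steps=20)

for step, (w, err, w_delta) in enumerate(history[:4], start=1):
    print(f"step {step}: weight={w:.4f} error={err:.6f} weight_delta={w_delta:.4f}")
```

Under these assumptions the weight jumps from 0 to 0.88, overshoots the lowest point of the curve, then steps back in the opposite direction with a smaller error, and keeps alternating with smaller and smaller steps until the error is very close to zero.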
