1
00:00:00,720 --> 00:00:01,110
Hello.

2
00:00:01,140 --> 00:00:02,310
Welcome back.

3
00:00:02,310 --> 00:00:09,510
Before we continue with our cut problem, I would like to use this opportunity to introduce a visualization

4
00:00:09,510 --> 00:00:12,230
tool known as computation graphs.

5
00:00:12,240 --> 00:00:21,990
Let's say we have a function J(a, b, c) = 3(a + bc), as you can see over here.

6
00:00:22,560 --> 00:00:26,770
Let's say a = 5, b = 3, and c = 2.

7
00:00:27,000 --> 00:00:32,700
When we insert these into our function, the result is 33, as we can see over here.

8
00:00:32,730 --> 00:00:36,280
Now let's break the function down into smaller functions.

9
00:00:36,330 --> 00:00:42,660
We are going to represent the b times c part of the function with a variable u, and then create a new

10
00:00:42,660 --> 00:00:46,000
variable v such that v = a + u.

11
00:00:46,110 --> 00:00:55,020
If we substitute these new variables into our function, J becomes equal to 3v. We can represent this

12
00:00:55,020 --> 00:01:02,360
in a computation graph by creating three nodes: one for u, one for v, and one for J.

13
00:01:02,490 --> 00:01:11,610
As you can see over here, b multiplied by c gives us the variable u, which is equal to 6, as we see

14
00:01:11,700 --> 00:01:23,640
over here, and variable u plus variable a gives us variable v: v = a + u, which gives us 11.

15
00:01:23,640 --> 00:01:28,190
Just like we have in this sub-equation over here, right?

16
00:01:28,200 --> 00:01:36,080
This is the second node. Now, if we take v and compute J such that J = 3v, we get

17
00:01:36,080 --> 00:01:37,370
33.

18
00:01:37,590 --> 00:01:44,310
So we've essentially represented our equation in this visualization tool known as a computation graph over

19
00:01:44,310 --> 00:01:46,970
here. We start with u here.

20
00:01:46,970 --> 00:01:56,990
u = bc. The result of u is taken to give us v, because v = a + u.

21
00:01:57,090 --> 00:02:03,680
And that's 11. The result of v is taken to give us J, because J = 3v.

22
00:02:03,720 --> 00:02:04,890
Simple as that.

23
00:02:04,890 --> 00:02:10,700
Now let's see how to compute the derivatives using the same visualization method.

24
00:02:10,710 --> 00:02:13,560
Let's start by computing the derivative of J

25
00:02:13,560 --> 00:02:14,850
with respect to v.

26
00:02:15,600 --> 00:02:18,510
This is written as dJ/dv.

27
00:02:19,710 --> 00:02:22,750
This simply means: if we change the value of v,

28
00:02:22,800 --> 00:02:25,700
how much will the value of J change?

29
00:02:25,740 --> 00:02:28,750
Essentially.

30
00:02:29,420 --> 00:02:35,150
And the answer is 3. We know J is always three times the value of v.

31
00:02:35,750 --> 00:02:43,460
So whatever increment we make to v, that increment is going to be multiplied by 3 to get the corresponding

32
00:02:43,460 --> 00:02:53,120
J increment. As we can see in this example, when we increment v by 0.001, J is incremented

33
00:02:53,150 --> 00:02:56,140
by 0.003.

34
00:02:56,210 --> 00:03:04,370
By computing the derivative of J with respect to v, we have essentially done one step of backpropagation. Backpropagation

35
00:03:04,430 --> 00:03:13,250
involves computing the derivative of the final output variable with respect to the intermediary variables.

36
00:03:13,250 --> 00:03:20,290
Now let's try to compute the derivative of the final variable J with respect to the variable a.

37
00:03:20,330 --> 00:03:26,890
In other words, let's see how much the final variable J will increase if we increase the variable a.

38
00:03:27,050 --> 00:03:27,970
Here, a is 5.

39
00:03:27,980 --> 00:03:37,430
If we increase a by 0.001, a becomes 5.001. v = a + u, and u is 6.

40
00:03:37,430 --> 00:03:42,090
So the new value of v becomes 6 + 5.001.

41
00:03:42,110 --> 00:03:46,340
This gives us v = 11.001.

42
00:03:46,550 --> 00:03:55,930
J is three times v, and 3 times 11.001 is equal to 33.003.

43
00:03:56,040 --> 00:04:01,480
As we can see, we increased a by 0.001, and J

44
00:04:01,480 --> 00:04:04,180
increased by 0.003.

45
00:04:04,190 --> 00:04:12,350
This shows that the derivative of J with respect to a is 3. But semantically, what we essentially did

46
00:04:12,440 --> 00:04:19,880
is find the derivative of v with respect to a, and then multiply the result

47
00:04:19,910 --> 00:04:23,740
by the derivative of J with respect to v.

48
00:04:23,750 --> 00:04:30,140
This is because, in order to find how much J will increase if we increase a, we have to find how much

49
00:04:30,140 --> 00:04:37,920
v will increase if we increase a, and then find out how much J will increase if we increase v.

50
00:04:37,970 --> 00:04:45,210
This is because, in our computation graph, a connects to v and v connects to J.

51
00:04:45,230 --> 00:04:48,830
This also illustrates the chain rule of derivatives

52
00:04:48,830 --> 00:04:56,930
we mentioned in the previous lesson: dJ/da equals dJ/dv times dv/da.

53
00:04:56,940 --> 00:05:04,670
Now let's see how to solve dJ/db: when we increase the value of the variable b, how much will

54
00:05:04,670 --> 00:05:06,490
the final variable increase?

55
00:05:06,620 --> 00:05:13,700
To do this, we first have to find du/db, which means: when we increase the variable b, how much

56
00:05:13,700 --> 00:05:15,660
will the variable u increase?

57
00:05:15,770 --> 00:05:24,300
Remember, u equals b multiplied by c. We set the variable c equal to 2, so the equation for u is effectively u

58
00:05:24,350 --> 00:05:33,230
equals 2b. If we replace c here by its value, 2, then indeed u becomes equal to 2 multiplied by

59
00:05:33,230 --> 00:05:34,170
b.

60
00:05:34,220 --> 00:05:40,220
This means that whatever increase we make to the variable b, it's going to be multiplied by 2, such

61
00:05:40,220 --> 00:05:47,530
that when we increase the current value of b, which is 3, to 3.001, u

62
00:05:47,530 --> 00:05:55,380
becomes 6.002, because we derive u by computing 2 multiplied by

63
00:05:55,400 --> 00:06:00,870
3.001. So du/db is equal to 2.

64
00:06:00,890 --> 00:06:08,930
This means when we increase the variable b by a particular amount, the variable u is going to increase

65
00:06:09,110 --> 00:06:10,900
by twice that amount.

66
00:06:10,910 --> 00:06:15,390
Now that we know the derivative of u with respect to b is 2,

67
00:06:15,590 --> 00:06:19,350
we simply need to find the derivative of J with respect to u.

68
00:06:19,580 --> 00:06:25,790
But wait a minute, we have already computed that. We have already computed how much the final variable

69
00:06:25,790 --> 00:06:29,580
J will increase when we increase the variable u.

70
00:06:29,960 --> 00:06:36,470
And the answer is 3. By multiplying dJ/du by

71
00:06:36,500 --> 00:06:40,390
du/db, we get dJ/db.

72
00:06:40,700 --> 00:06:44,620
As you can see over here, right?

73
00:06:44,840 --> 00:06:48,500
We apply the same method to derive dJ/dc.

74
00:06:48,710 --> 00:06:54,890
Let's see how we can apply what we've learned about computation graphs to our logistic regression example.

75
00:06:54,920 --> 00:07:03,470
We said we compute z as w transpose x plus b. We then pass z through our activation function to derive

76
00:07:03,530 --> 00:07:10,220
y-hat, or a. We also defined our loss function to be this equation.

77
00:07:10,340 --> 00:07:13,350
We can represent this in a computation graph like this.

78
00:07:13,430 --> 00:07:22,550
We take w, x, and b to compute z like this, and then we take z to compute y-hat like this, and then

79
00:07:22,550 --> 00:07:22,890
y-hat.

80
00:07:22,910 --> 00:07:26,180
I should mention y-hat is the same as a over here.

81
00:07:26,180 --> 00:07:29,180
After computing a, we apply our loss function.

82
00:07:29,210 --> 00:07:31,160
Remember, a is the same as y-hat,

83
00:07:31,190 --> 00:07:34,270
like I just said, which is our predicted value.

84
00:07:34,460 --> 00:07:37,660
Y is the expected value.

85
00:07:37,670 --> 00:07:38,210
Right.

86
00:07:38,240 --> 00:07:43,070
So now our aim is to compute the derivatives of the loss.

87
00:07:43,070 --> 00:07:48,350
Our aim is to compute the derivatives of the loss, like I mentioned earlier.

88
00:07:48,350 --> 00:07:53,360
We start by computing the derivative of the loss with respect to the variable a.

89
00:07:53,510 --> 00:08:00,320
When we take the loss equation up here and compute the derivative, this is the answer we get. When

90
00:08:00,320 --> 00:08:03,490
we compute the derivative of the loss with respect to z,

91
00:08:03,530 --> 00:08:09,520
we get this other answer, a minus y, and this is the proof of the answer.

92
00:08:09,590 --> 00:08:13,960
You can go through it if you're interested. When we compute further,

93
00:08:14,000 --> 00:08:22,670
we find that the derivative with respect to w1, which we have named dw1, is equal to x1 times dz.

94
00:08:23,630 --> 00:08:31,760
Similarly, the derivative with respect to w2, which we've named dw2, is equal to x2 times dz.

95
00:08:31,760 --> 00:08:38,890
This essentially gives us how much we need to change the w values in order to minimize the loss.

96
00:08:39,320 --> 00:08:43,210
And finally, db is equal to dz, like this.

97
00:08:43,220 --> 00:08:47,530
This is all there is to this lesson, and I shall see you in the next lesson.
