1
00:00:00,910 --> 00:00:01,990
Hello, welcome back.

2
00:00:02,040 --> 00:00:04,850
In this lesson we shall create our model.

3
00:00:05,190 --> 00:00:13,000
So first of all, let's create constants for the various attributes of the model.

4
00:00:13,590 --> 00:00:16,500
So I'm going to say n_x, the number of pixels.

5
00:00:16,540 --> 00:00:24,280
The number of pixels is going to be, well, we know our image size was 64.

6
00:00:27,150 --> 00:00:32,670
I'm going to run the script again just to verify. I'm gonna run this bit, Ctrl+S, and then I'm gonna

7
00:00:32,700 --> 00:00:35,790
come over here and choose Run Module.

8
00:00:36,270 --> 00:00:36,570
Okay.

9
00:00:36,570 --> 00:00:39,870
64 by 64 by 3.

10
00:00:40,130 --> 00:00:41,580
Okay, so the

11
00:00:46,100 --> 00:00:53,280
number of pixels is going to be the total number of pixels.

12
00:00:53,600 --> 00:00:57,050
It's gonna be this by this by three.

13
00:00:59,150 --> 00:00:59,740
Okay.

14
00:00:59,750 --> 00:01:02,450
n_x. So I'm gonna print n_x to see what I get.

15
00:01:06,390 --> 00:01:10,130
I'm simply going to comment this out.

16
00:01:10,150 --> 00:01:16,950
Ctrl+S, Run Module. Hmm.

17
00:01:17,010 --> 00:01:17,580
Interesting.

18
00:01:17,590 --> 00:01:18,490
Why do I get this

19
00:01:25,530 --> 00:01:25,880
the

20
00:01:33,960 --> 00:01:35,520
It is because of my typo here.

21
00:01:35,520 --> 00:01:38,020
That should be multiplied by three.

22
00:01:38,010 --> 00:01:46,040
So width by height by three channels.

23
00:01:46,430 --> 00:01:46,840
Okay.

24
00:01:46,880 --> 00:01:47,990
This is what we have.

25
00:01:48,080 --> 00:01:49,320
So this is n_x.

26
00:01:49,580 --> 00:01:51,550
This is the total number of pixel values in a single image.

27
00:01:51,560 --> 00:01:52,150
This is quick.

28
00:01:52,160 --> 00:01:54,080
This is also the size of our column vector.

29
00:01:54,800 --> 00:01:55,780
Okay.

30
00:01:55,850 --> 00:02:02,560
And over here we're going to set the hidden layer length.

31
00:02:02,570 --> 00:02:07,250
Meaning the number of nodes in the hidden layer, n_h, here.

32
00:02:07,400 --> 00:02:09,350
So you can just set it to any number.

33
00:02:09,350 --> 00:02:14,120
I'll start with six. You can change this value, so if you want the number of nodes in the hidden layer to

34
00:02:14,120 --> 00:02:15,130
be 10,

35
00:02:15,140 --> 00:02:21,460
you can set it to 10. The output layer is going to have a single node, so I'll just pass one over here.

36
00:02:23,030 --> 00:02:23,470
Okay.

37
00:02:23,870 --> 00:02:27,160
And this.

38
00:02:27,200 --> 00:02:32,080
So these, yeah, we shall use these in our function.

39
00:02:32,090 --> 00:02:37,970
Also, we can set the layer dimensions. We can create a variable called layer dimensions,

40
00:02:44,840 --> 00:02:46,520
and then this is going to be

41
00:02:49,360 --> 00:02:54,260
n_x, and then n_h,

42
00:02:58,270 --> 00:03:02,600
and then n_y, like this.

43
00:03:08,120 --> 00:03:14,300
So we can just make sure the user can pass the dimensions.

44
00:03:14,330 --> 00:03:16,790
This list can be passed to the function we shall write.

45
00:03:20,120 --> 00:03:22,340
I'll just make it plural.

46
00:03:22,540 --> 00:03:23,500
Okay.
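
As a rough sketch of the constants described so far: the spellings n_x, n_h, n_y and layers_dims are assumed from the spoken names, and num_px is a hypothetical name for the 64-pixel image side.

```python
# Sketch of the model constants. All identifier spellings are assumptions
# reconstructed from the spoken names in the lesson.
num_px = 64                    # image width and height in pixels
n_x = num_px * num_px * 3      # width * height * 3 colour channels
n_h = 6                        # nodes in the hidden layer (any number works)
n_y = 1                        # the output layer has a single node
layers_dims = [n_x, n_h, n_y]  # the list that is passed to the model function

print(n_x)          # 12288
print(layers_dims)  # [12288, 6, 1]
```

Multiplying by three matters here: leaving the channels out would give 4096 instead of 12288, which would not match the length of a flattened image column.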

47
00:03:23,980 --> 00:03:27,100
So now we're going to write a function to take everything we've created

48
00:03:31,180 --> 00:03:40,570
Okay, I'm going to come over here and say def. I'm gonna call the function two layer,

49
00:03:44,010 --> 00:03:45,010
layer, and then model.

50
00:03:52,050 --> 00:03:54,190
you sped up with this okay.

51
00:03:54,350 --> 00:04:01,070
So we're going to take our input, our training dataset, then the labels.

52
00:04:01,070 --> 00:04:01,980
Which is Y.

53
00:04:02,150 --> 00:04:09,860
And then we're gonna take the layer dimensions over here, and then we're going to take the learning rate.

54
00:04:12,640 --> 00:04:21,610
This learning rate, we can set it to a constant, so in case the user doesn't provide a learning rate we

55
00:04:21,610 --> 00:04:27,900
can start with this. And then the number of iterations.

56
00:04:28,060 --> 00:04:30,810
I'm gonna say num underscore

57
00:04:35,000 --> 00:04:40,910
iterations, and set this to 3500.
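
A minimal sketch of the signature just described. The default of 3500 iterations comes from the lesson; the learning-rate default of 0.0075 is only a placeholder, since no value is stated here, and the body is a stub that exists to make the defaults visible.

```python
def two_layer_model(X, Y, layers_dims, learning_rate=0.0075, num_iterations=3500):
    """Skeleton only: the training loop is built up over the rest of the lesson.

    learning_rate=0.0075 is a placeholder default (assumption, not from the lesson).
    """
    # Returning the settings is just a way to show which defaults apply.
    return {"learning_rate": learning_rate, "num_iterations": num_iterations}


# The caller may omit both hyperparameters and fall back on the defaults.
settings = two_layer_model(X=None, Y=None, layers_dims=[12288, 6, 1])
print(settings["num_iterations"])  # 3500
```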

58
00:04:41,120 --> 00:04:43,990
Okay.

59
00:04:44,240 --> 00:04:46,290
Right.

60
00:04:46,700 --> 00:04:58,310
So we start off by creating some lists to hold some of the things that we shall compute.

61
00:04:58,310 --> 00:05:05,090
So this function is going to return our parameters, which are the weight W1, the bias b1,

62
00:05:05,130 --> 00:05:07,950
the weight W2, and then the bias b2.

63
00:05:08,000 --> 00:05:10,650
I'm gonna hold the gradients over here.

64
00:05:10,890 --> 00:05:20,030
I'll have grads, with open and close braces, and then have this list, costs.

65
00:05:24,800 --> 00:05:34,450
And then I'll get the number of training examples and store it in a variable called m. I'll say m equals

66
00:05:36,740 --> 00:05:44,770
X.shape[1], like this. And then I'm going to extract what is passed. This is going to be a list, so I'm gonna

67
00:05:45,050 --> 00:05:48,490
extract what is passed here and put it into three variables.

68
00:05:48,800 --> 00:06:02,180
So I'll say n underscore x, comma, n underscore h, n underscore y, equals layer dimensions,

69
00:06:02,720 --> 00:06:05,240
like this. Once that is done,

70
00:06:05,270 --> 00:06:11,810
I'm going to call our initialize parameters function from our library.

71
00:06:11,810 --> 00:06:16,030
Let's see what it looks like.

72
00:06:17,380 --> 00:06:17,690
Okay.

73
00:06:17,720 --> 00:06:20,440
So it takes n_x, n_h, and n_y.

74
00:06:24,880 --> 00:06:28,460
I'm going to close our help but we don't need it.

75
00:06:28,630 --> 00:06:39,030
Okay, so I'm going to store this in a list called parameters.

76
00:06:39,400 --> 00:06:42,990
The return value is gonna be stored in this variable called parameters here.

77
00:06:43,360 --> 00:06:46,210
And so I'm going to say parameters equals

78
00:06:50,620 --> 00:06:52,300
initialize parameters

79
00:06:57,230 --> 00:07:00,660
n underscore x, n underscore h,

80
00:07:00,830 --> 00:07:03,920
and n underscore y, like this.

81
00:07:04,640 --> 00:07:04,940
Right.

82
00:07:04,970 --> 00:07:09,060
So after running this, we have parameters here.

83
00:07:09,080 --> 00:07:12,630
So now I'm going to take what is in this.

84
00:07:12,650 --> 00:07:15,070
This actually is a dictionary.

85
00:07:15,080 --> 00:07:17,320
This is not a simple list.

86
00:07:17,330 --> 00:07:19,980
It's a dictionary. Let's see the function.

87
00:07:20,120 --> 00:07:20,290
Yeah.

88
00:07:20,300 --> 00:07:22,900
The function returns this dictionary called parameters.

89
00:07:22,910 --> 00:07:30,080
So this is going to be a dictionary, and we can extract Y1 with the key Y1, Y2 with the key

90
00:07:30,080 --> 00:07:35,270
Y2, b1, etc. It's a key-value pair.

91
00:07:35,790 --> 00:07:40,610
So I'll come over here and extract the results. I'll say Y1 equals

92
00:07:44,830 --> 00:07:45,640
parameters

93
00:07:48,660 --> 00:07:55,800
Y1. b1 equals parameters

94
00:08:02,470 --> 00:08:02,960
b1.

95
00:08:07,950 --> 00:08:10,820
I kept saying Y. This is W1, sorry about that.

96
00:08:10,980 --> 00:08:16,500
I'm losing my mind. So it's W1, b1. Weight, bias, weight, bias.

97
00:08:16,520 --> 00:08:17,930
And I said Y.

98
00:08:17,930 --> 00:08:19,660
Y for a while, actually.

99
00:08:19,800 --> 00:08:23,540
So W1 equals parameters

100
00:08:29,360 --> 00:08:32,520
W1. W2 equals W2.

101
00:08:32,570 --> 00:08:35,620
Okay, so the last is b2. b2 equals

102
00:08:45,290 --> 00:08:50,230
parameters b2. Right.

103
00:08:51,800 --> 00:08:57,860
So now we have our weight and bias for the two layers.
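
The setup steps above could look roughly like this. The course library's initialize parameters function is not shown in the transcript, so the version here is a hypothetical stand-in (small random weights, zero biases); only its inputs and the returned dictionary keys follow the lesson.

```python
import numpy as np


def initialize_parameters(n_x, n_h, n_y):
    """Hypothetical stand-in for the library function: returns a dictionary."""
    rng = np.random.default_rng(1)
    return {
        "W1": rng.standard_normal((n_h, n_x)) * 0.01,  # hidden-layer weights
        "b1": np.zeros((n_h, 1)),                      # hidden-layer bias
        "W2": rng.standard_normal((n_y, n_h)) * 0.01,  # output-layer weights
        "b2": np.zeros((n_y, 1)),                      # output-layer bias
    }


X = np.zeros((12288, 5))      # dummy input: one flattened image per column
layers_dims = [12288, 6, 1]

grads = {}                    # will hold the gradients
costs = []                    # will hold the recorded costs
m = X.shape[1]                # number of training examples
(n_x, n_h, n_y) = layers_dims

parameters = initialize_parameters(n_x, n_h, n_y)

# Pull each entry out of the dictionary by its key.
W1 = parameters["W1"]
b1 = parameters["b1"]
W2 = parameters["W2"]
b2 = parameters["b2"]

print(m, W1.shape, b1.shape, W2.shape, b2.shape)
```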

104
00:08:57,860 --> 00:09:04,310
This is the architecture again. We use W1 and b1 over here, and then we use W2 and b2 over

105
00:09:04,310 --> 00:09:08,800
here. W1 and b1 over here, and then over here.

106
00:09:09,380 --> 00:09:19,490
Okay, so now we can create a loop, our gradient descent loop. I'll say for i in range,

107
00:09:22,080 --> 00:09:24,150
zero to number of iterations

108
00:09:30,940 --> 00:09:34,090
I'm gonna perform

109
00:09:36,640 --> 00:09:40,780
our linear forward propagation

110
00:09:45,530 --> 00:09:50,990
Or perhaps, rather than take a two-step approach, we can perform the propagation and activation at once.

111
00:09:52,070 --> 00:09:56,440
So we have linear activation forward.

112
00:09:56,450 --> 00:10:02,810
This one takes the input, or the previous A value, the weight, and the bias, and then it calls our linear forward

113
00:10:02,840 --> 00:10:03,650
propagation here.

114
00:10:03,650 --> 00:10:04,040
Right.

115
00:10:04,040 --> 00:10:04,320
Good.

116
00:10:04,340 --> 00:10:06,740
So we can just call this one. I'll copy this.

117
00:10:06,740 --> 00:10:20,540
This is from our library. I'm gonna come down here, and this is going to take X, W1, and it's gonna

118
00:10:20,540 --> 00:10:21,560
take the bias

119
00:10:24,150 --> 00:10:35,090
b1, and the activation function we're going to use is the ReLU. And this is going to return, it's gonna

120
00:10:35,090 --> 00:10:46,940
return a dictionary... no, let's see. This is going to return the cache. So this is going to return

121
00:10:46,940 --> 00:10:52,880
the cache and a single value, which is the activation. Right.

122
00:10:52,940 --> 00:11:01,130
So I'm going to store this. I'm gonna come over here and store this in A1, and

123
00:11:01,130 --> 00:11:07,890
then store the cache in cache1. Right.

124
00:11:07,980 --> 00:11:11,740
So now we're going to have A1 and then cache1.

125
00:11:12,030 --> 00:11:16,770
We're going to take A1 and pass it through the neural network.

126
00:11:16,770 --> 00:11:21,060
We're gonna take A1 and pass it through here.

127
00:11:21,330 --> 00:11:21,780
Right.

128
00:11:21,780 --> 00:11:24,630
We've computed this bit.

129
00:11:24,810 --> 00:11:28,080
Now we are on this side.

130
00:11:28,080 --> 00:11:29,530
We pass it through a sigmoid.

131
00:11:29,580 --> 00:11:30,650
This should be a sigmoid.

132
00:11:30,660 --> 00:11:31,410
Not ReLU.

133
00:11:32,760 --> 00:11:33,500
Okay.

134
00:11:33,600 --> 00:11:42,320
So I'm gonna come over here and call the same function.

135
00:11:42,540 --> 00:11:52,260
I'm gonna pass A1 and weight matrix two, W2, over here, and b2, and the activation function I'm gonna use

136
00:11:52,340 --> 00:12:06,400
is the sigmoid. And this is going to return A2 and then cache2, like this.
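
Here is a small sketch of the two forward calls just described. The helper below is a hypothetical stand-in for the library's linear activation forward function; its cache layout (a linear cache plus an activation cache) follows the description in the lesson, but the exact implementation is assumed.

```python
import numpy as np


def linear_activation_forward(A_prev, W, b, activation):
    """Hypothetical stand-in for the library helper. Returns (A, cache)."""
    Z = W @ A_prev + b                 # linear step
    linear_cache = (A_prev, W, b)      # inputs needed later for backprop
    if activation == "relu":
        A = np.maximum(0, Z)
    elif activation == "sigmoid":
        A = 1 / (1 + np.exp(-Z))
    else:
        raise ValueError("unknown activation: " + activation)
    return A, (linear_cache, Z)        # Z plays the role of the activation cache


rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))        # toy input: 4 features, 3 examples
W1, b1 = rng.standard_normal((2, 4)), np.zeros((2, 1))
W2, b2 = rng.standard_normal((1, 2)), np.zeros((1, 1))

A1, cache1 = linear_activation_forward(X, W1, b1, "relu")      # hidden layer
A2, cache2 = linear_activation_forward(A1, W2, b2, "sigmoid")  # output layer
print(A1.shape, A2.shape)  # (2, 3) (1, 3)
```

Note that A1 and A2 are arrays with one column per training example, not single numbers.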

137
00:12:06,640 --> 00:12:07,850
Okay.

138
00:12:08,230 --> 00:12:11,830
So once that is done we can compute the cost.

139
00:12:12,160 --> 00:12:18,460
Let's see the function we created for computing the cost. It's as simple as this, compute cost. We pass Y

140
00:12:18,460 --> 00:12:26,290
hat and the Y value. Remember, Y hat is the same as A2. Y hat is just the last activation. So I'll copy

141
00:12:26,290 --> 00:12:38,530
this compute cost, and then I'm gonna come over here and pass A2 and then Y, and the return value is gonna be

142
00:12:38,530 --> 00:12:42,130
stored in a variable that we shall call cost like this.
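
For reference, a compute cost function of this kind might look like the sketch below. The library version is not shown in the transcript, so the binary cross-entropy formula here is an assumption, chosen because it matches the dA2 expression used in the backward pass.

```python
import numpy as np


def compute_cost(AL, Y):
    """Assumed implementation: mean binary cross-entropy over m examples."""
    m = Y.shape[1]
    cost = -np.sum(Y * np.log(AL) + (1 - Y) * np.log(1 - AL)) / m
    return np.squeeze(cost)            # drop any leftover dimensions


A2 = np.array([[0.8, 0.1, 0.5]])       # toy predictions (Y hat)
Y = np.array([[1, 0, 1]])              # true labels
cost = compute_cost(A2, Y)
print(float(cost))
```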

143
00:12:43,960 --> 00:12:51,930
And once that is done we can initialize these variables for back propagation.

144
00:12:52,090 --> 00:12:54,570
So now let's compute dA2.

145
00:12:54,790 --> 00:13:10,120
So we say dA2 equals, and this is what we do, we use np.divide.

146
00:13:10,210 --> 00:13:18,320
np.divide over here, and we pass Y, comma, A2,

147
00:13:22,240 --> 00:13:26,740
and then minus np dot

148
00:13:26,830 --> 00:13:27,460
divide,

149
00:13:31,060 --> 00:13:40,880
1 minus Y, comma, 1 minus A2. Then put a minus here.

150
00:13:41,110 --> 00:13:41,470
Right.

151
00:13:41,470 --> 00:13:43,080
So this will give us dA2.
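
Written out, the expression dictated above is the derivative of the cross-entropy cost with respect to A2. A quick check on toy values (the numbers are made up for illustration):

```python
import numpy as np

A2 = np.array([[0.8, 0.1, 0.5]])   # toy output activations
Y = np.array([[1, 0, 1]])          # true labels

# Y / A2 minus (1 - Y) / (1 - A2), with a minus sign in front of the whole thing.
dA2 = -(np.divide(Y, A2) - np.divide(1 - Y, 1 - A2))
print(dA2)
```

For a positive example (Y = 1) the entry is -1 / A2, and for a negative example (Y = 0) it is 1 / (1 - A2), so each entry points in the direction that moves the prediction toward its label.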

152
00:13:48,720 --> 00:13:52,220
Once this is done we can take dA2.

153
00:13:52,950 --> 00:14:00,270
I mean dA2, the derivative of the cost with respect to A2. We can take dA2 and then the content

154
00:14:00,270 --> 00:14:10,110
of our cache2 and then pass them through our linear backpropagation function.

155
00:14:10,110 --> 00:14:13,050
Let's see whether we have a function for the linear backpropagation.

156
00:14:16,180 --> 00:14:18,120
Let's see, linear backward...

157
00:14:21,830 --> 00:14:25,030
Yeah we have this function linear activation.

158
00:14:25,220 --> 00:14:30,110
So I'm gonna copy this from our library and then come over here.

159
00:14:31,130 --> 00:14:36,050
And this function is gonna take as argument dA2

160
00:14:39,680 --> 00:14:51,620
and then cache2. Remember the content of cache2? Let's see the content of cache2. cache2 is given

161
00:14:51,620 --> 00:14:53,960
to us from our forward propagation

162
00:14:58,060 --> 00:15:02,570
The cache holds the linear cache and the activation cache.

163
00:15:05,310 --> 00:15:10,370
If we want to see what the linear cache is, we simply go to the linear forward function we wrote. Here, the

164
00:15:10,470 --> 00:15:20,520
forward cache holds the weight, the bias, and the input. The parameters, as

165
00:15:20,520 --> 00:15:29,060
well as the input, which is A here, the previous A, because that's what we pass. We do AW plus b to get

166
00:15:29,070 --> 00:15:41,610
Z. W dot A plus b, just like aw plus b, to get Z. And then we take A, W, and b and put them into the cache. That

167
00:15:41,610 --> 00:15:46,800
is the same thing we are getting and that is what has permeated through the various functions to arrive

168
00:15:46,800 --> 00:15:54,390
here. So we take cache2 and we apply our activation. We're gonna use the sigmoid

169
00:15:57,320 --> 00:16:06,020
over here. And then once that is done, this function is going to return three, it's gonna return three

170
00:16:06,020 --> 00:16:14,810
variables: the derivative of A1, the derivative of W2, and the derivative of b2. We can verify this

171
00:16:14,810 --> 00:16:17,330
from the function in our library

172
00:16:21,640 --> 00:16:21,980
Okay.

173
00:16:22,210 --> 00:16:29,540
So these are the return values: dA previous, dW, and then db.

174
00:16:30,010 --> 00:16:41,940
Okay, so I'm gonna come over here, collect them, and put them in variables. I'll have dA1, dW2, because we

175
00:16:41,940 --> 00:16:50,140
say dA previous, so its index has to be one less than the others, dW2, and then db2.

176
00:16:50,520 --> 00:16:51,680
Okay.

177
00:16:51,780 --> 00:16:56,090
So once this is done, we've got to take what we have

178
00:16:56,130 --> 00:17:00,060
and then pass it through our next activation,

179
00:17:00,100 --> 00:17:00,950
our ReLU.

180
00:17:02,100 --> 00:17:05,280
So I'm gonna call the same linear activation backward again.

181
00:17:08,500 --> 00:17:13,450
I'm gonna pass A1, I mean, I'm going to pass dA1, sorry,

182
00:17:20,170 --> 00:17:22,580
dA1 over here, and then cache1,

183
00:17:27,590 --> 00:17:37,340
and then the ReLU activation over here. And this is going to return dA previous, dW1, and then

184
00:17:37,430 --> 00:17:48,560
db1. The previous A, it's gonna be dA0. Put the equals sign here, and then we're going to

185
00:17:48,560 --> 00:17:55,640
have dW1 and then db1. Okay.
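
The two backward calls can be sketched as follows. The helper is a hypothetical stand-in for the library's linear activation backward function, assuming the cache layout ((A_prev, W, b), Z) described in the lesson; the toy forward pass is only there to produce caches to feed it.

```python
import numpy as np


def linear_activation_backward(dA, cache, activation):
    """Hypothetical stand-in. Returns (dA_prev, dW, db)."""
    (A_prev, W, b), Z = cache
    if activation == "relu":
        dZ = dA * (Z > 0)                  # gradient passes only where Z > 0
    elif activation == "sigmoid":
        s = 1 / (1 + np.exp(-Z))
        dZ = dA * s * (1 - s)
    else:
        raise ValueError("unknown activation: " + activation)
    m = A_prev.shape[1]
    dW = dZ @ A_prev.T / m
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ
    return dA_prev, dW, db


# Toy forward pass to build cache1 and cache2.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))
Y = np.array([[1, 0, 1]])
W1, b1 = rng.standard_normal((2, 4)), np.zeros((2, 1))
W2, b2 = rng.standard_normal((1, 2)), np.zeros((1, 1))
Z1 = W1 @ X + b1
A1 = np.maximum(0, Z1)
cache1 = ((X, W1, b1), Z1)
Z2 = W2 @ A1 + b2
A2 = 1 / (1 + np.exp(-Z2))
cache2 = ((A1, W2, b2), Z2)

# Backward pass: output layer first, then the hidden layer.
dA2 = -(np.divide(Y, A2) - np.divide(1 - Y, 1 - A2))
dA1, dW2, db2 = linear_activation_backward(dA2, cache2, "sigmoid")
dA0, dW1, db1 = linear_activation_backward(dA1, cache1, "relu")
print(dW1.shape, db1.shape, dW2.shape, db2.shape)  # (2, 4) (2, 1) (1, 2) (1, 1)
```

Each gradient has the same shape as the parameter it belongs to, which is what makes the update step a simple elementwise subtraction.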

186
00:17:56,040 --> 00:17:56,440
Right.

187
00:17:56,510 --> 00:18:02,120
So this is the end of the backpropagation. We can take the gradients and store them

188
00:18:02,180 --> 00:18:06,970
in key-value pairs, in the empty one that we created here,

189
00:18:06,980 --> 00:18:08,100
known as grads.

190
00:18:08,510 --> 00:18:13,260
So I'm gonna come over here and then I'll say grads.

191
00:18:13,690 --> 00:18:19,190
I'm gonna give this the key dW1, I'm sorry about that.

192
00:18:26,590 --> 00:18:28,680
dW1. Then grads,

193
00:18:33,550 --> 00:18:34,310
db1,

194
00:18:36,980 --> 00:18:40,010
this is going to hold db1. Grads,

195
00:18:43,720 --> 00:18:45,580
key dW2,

196
00:18:50,580 --> 00:18:53,900
this is going to hold dW2. Grads,

197
00:18:57,880 --> 00:19:01,670
db2, this is going to hold

198
00:19:01,700 --> 00:19:03,970
db2. Right.

199
00:19:05,090 --> 00:19:09,320
So once we've done this we can update our parameters.

200
00:19:09,320 --> 00:19:14,820
Let's see our update parameters function. Come over here.

201
00:19:14,870 --> 00:19:19,350
Update parameters takes the parameters, the gradients, and the learning rate.

202
00:19:19,350 --> 00:19:22,400
I'm gonna call this function from our library.

203
00:19:22,730 --> 00:19:26,470
Come over here paste this over here.

204
00:19:26,720 --> 00:19:29,730
The first argument is parameters.

205
00:19:29,760 --> 00:19:34,310
I'm gonna copy parameters and paste it over here.

206
00:19:34,310 --> 00:19:35,930
The second argument is the gradient.

207
00:19:35,960 --> 00:19:39,920
I'm going to copy grads and paste it over here.

208
00:19:39,920 --> 00:19:41,800
The next argument is the learning rate.

209
00:19:41,810 --> 00:19:45,480
I'm just gonna take the learning rate the user passes to this function.

210
00:19:45,500 --> 00:19:53,130
Copy this, paste this over here, and let's see what this returns, this update

211
00:19:53,150 --> 00:19:55,500
parameters function.

212
00:19:56,000 --> 00:19:59,330
Okay it returns.

213
00:19:59,570 --> 00:19:59,830
Yeah.

214
00:19:59,990 --> 00:20:03,230
Well it would return a dictionary of the new parameters.
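
Storing the gradients and applying the update could be sketched like this. The update parameters function below is a hypothetical stand-in that assumes a plain gradient descent step (parameter minus learning rate times gradient); the library version is not shown in the transcript, and the numbers are made up.

```python
import numpy as np


def update_parameters(parameters, grads, learning_rate):
    """Hypothetical stand-in: one plain gradient descent step per parameter."""
    return {
        name: value - learning_rate * grads["d" + name]
        for name, value in parameters.items()
    }


parameters = {
    "W1": np.ones((2, 3)), "b1": np.zeros((2, 1)),
    "W2": np.ones((1, 2)), "b2": np.zeros((1, 1)),
}

# Fill the empty grads dictionary, one key per gradient.
grads = {}
grads["dW1"] = np.full((2, 3), 0.5)
grads["db1"] = np.full((2, 1), 0.5)
grads["dW2"] = np.full((1, 2), 0.5)
grads["db2"] = np.full((1, 1), 0.5)

parameters = update_parameters(parameters, grads, learning_rate=0.1)

# Retrieve the new values so the next iteration uses them.
W1 = parameters["W1"]
b1 = parameters["b1"]
W2 = parameters["W2"]
b2 = parameters["b2"]
print(W1[0, 0], b1[0, 0])
```

Re-reading W1, b1, W2 and b2 from the returned dictionary is what carries the update into the next pass of the loop; skipping that step would leave the forward pass using the old values.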

215
00:20:03,230 --> 00:20:04,730
So I'm just gonna store this

216
00:20:09,820 --> 00:20:22,210
in parameters, like this. Right, and then I'm gonna retrieve the parameters and put them in

217
00:20:22,210 --> 00:20:25,210
variables. I'm gonna retrieve the new parameters.

218
00:20:25,210 --> 00:20:27,040
We have a dictionary that will be returned.

219
00:20:27,040 --> 00:20:29,560
I'm going to take each item out.

220
00:20:30,190 --> 00:20:31,300
So W1

221
00:20:33,970 --> 00:20:39,590
equals parameters W1.

222
00:20:43,130 --> 00:20:45,280
b1

223
00:20:45,790 --> 00:20:46,280
equals

224
00:20:48,730 --> 00:20:49,540
parameters

225
00:20:52,790 --> 00:20:56,110
b1. W2

226
00:20:58,760 --> 00:21:02,340
equals parameters.

227
00:21:04,500 --> 00:21:09,040
W2, and then b2

228
00:21:11,540 --> 00:21:25,230
equals parameters b2. Actually, we are done here. What we can do is provide some visual feedback as the

229
00:21:25,230 --> 00:21:26,500
model is being trained.

230
00:21:26,670 --> 00:21:31,870
So we are going to print the cost every 100 iterations.

231
00:21:32,010 --> 00:21:42,370
So I'm gonna come over here and say if i modulo 100 equals zero, then we'll

232
00:21:42,400 --> 00:21:43,890
print the cost. Print,

233
00:21:47,960 --> 00:21:50,510
cost after iteration

234
00:21:58,260 --> 00:22:04,320
and then I'm simply going to pass the iteration number and the cost.

235
00:22:05,700 --> 00:22:15,510
So I simply do dot format, and then i over here, then np dot squeeze. We're going to squeeze.

236
00:22:15,540 --> 00:22:23,060
And the reason we squeeze, I think I explained this before, I'll explain again. The reason why we squeeze

237
00:22:23,060 --> 00:22:29,160
is that sometimes our return values have excess dimensions.

238
00:22:29,670 --> 00:22:38,580
We could have a value such as this, 20. After a particular computation, or after doing something,

239
00:22:38,580 --> 00:22:39,860
we end up with something like this.

240
00:22:39,870 --> 00:22:42,410
But what we want is just 20.

241
00:22:42,600 --> 00:22:45,390
So when we use squeeze, this becomes just 20.

242
00:22:45,390 --> 00:22:49,740
If we pass the 20 this way to our function it would be rejected.

243
00:22:49,740 --> 00:22:51,810
That's why we sometimes use squeeze.
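
A tiny illustration of what squeeze does, assuming the on-screen value was 20 wrapped in two pairs of brackets (the exact value shown is not captured in the transcript):

```python
import numpy as np

cost = np.array([[20.0]])      # a single number stored as a 1x1 matrix
print(cost.shape)              # (1, 1)

squeezed = np.squeeze(cost)    # removes every axis of length one
print(squeezed.shape)          # ()
print(float(squeezed))         # 20.0
```

The value itself is unchanged; squeeze only removes the length-one axes, so the result prints and plots as a plain number.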

244
00:22:55,070 --> 00:23:04,180
Right. If someone in the course has a better explanation, you can share it in the Q&A

245
00:23:04,180 --> 00:23:05,000
section.

246
00:23:05,960 --> 00:23:16,220
Okay, so we're going to do this, and we're going to append this to the costs. We've got a list for

247
00:23:16,220 --> 00:23:16,660
cost.

248
00:23:16,670 --> 00:23:20,140
Yeah, we're going to keep appending to it as well.

249
00:23:20,480 --> 00:23:25,610
So I'm gonna come over here and then I'll say if...

250
00:23:30,010 --> 00:23:41,890
I'll just append it in the same block. I'll say costs dot append, and I'm going to append our cost over

251
00:23:41,890 --> 00:23:43,050
here like this.
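
The reporting step inside the loop might look like this sketch. The cost value here is a dummy stand-in so the snippet runs on its own; in the real loop it comes from the compute cost call.

```python
import numpy as np

costs = []
num_iterations = 300

for i in range(0, num_iterations):
    # Dummy stand-in for the cost computed in this iteration (assumption).
    cost = np.array([[1.0 / (i + 1)]])

    # Report and record the cost once every 100 iterations.
    if i % 100 == 0:
        print("Cost after iteration {}: {}".format(i, np.squeeze(cost)))
        costs.append(cost)

print(len(costs))  # 3
```

Because the append sits in the same block as the print, the costs list gets one entry per hundred iterations rather than one per iteration.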

252
00:23:43,350 --> 00:23:52,120
Also, we can provide some visual feedback in the form of a graph. We can actually make it plot a graph.

253
00:23:52,120 --> 00:23:54,370
So I'm gonna come over here

254
00:24:00,380 --> 00:24:05,110
Sorry, I'm trying to set my indentation to column four.

255
00:24:05,630 --> 00:24:08,930
Okay, so I'm going to plot. I'll say plt dot plot,

256
00:24:11,790 --> 00:24:16,710
and then I'll do np dot squeeze again to squeeze my costs.

257
00:24:16,710 --> 00:24:23,460
And this is the costs list. We need to append the various costs so that we'll be able

258
00:24:23,460 --> 00:24:34,950
to use it to plot a graph. Then I'll do plt dot ylabel, and the y label is going to have

259
00:24:34,950 --> 00:24:39,750
the word cost, and then plt dot xlabel,

260
00:24:43,160 --> 00:24:46,160
and this is going to be iterations

261
00:24:50,570 --> 00:24:55,560
And then I'm gonna come over here and do plt dot title,

262
00:24:58,380 --> 00:25:09,620
give this a title. I'm gonna put the learning rate here, learning rate, and we simply take what is provided

263
00:25:09,620 --> 00:25:10,430
by the user

264
00:25:20,140 --> 00:25:31,190
and then plt dot show. And then we can return the parameters of the model so that we can

265
00:25:31,190 --> 00:25:32,360
use them for prediction.
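
The plotting calls described above, as a standalone sketch. The cost values and the learning rate are made up, and the Agg backend is selected only so the snippet runs without a display; in an interactive session that line can be dropped.

```python
import matplotlib
matplotlib.use("Agg")              # assumption: headless backend for this sketch
import matplotlib.pyplot as plt
import numpy as np

costs = [0.69, 0.52, 0.41, 0.35]   # made-up values, one per recorded step
learning_rate = 0.0075             # placeholder value

plt.plot(np.squeeze(costs))
plt.ylabel("cost")
plt.xlabel("iterations")
plt.title("Learning rate = " + str(learning_rate))
ax = plt.gca()                     # keep a handle so the labels can be inspected
plt.show()                         # opens a window only with an interactive backend
```

Since the cost is recorded once every 100 iterations, each step on the x axis corresponds to 100 iterations of training.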

266
00:25:41,150 --> 00:25:41,570
Right.

267
00:25:46,140 --> 00:25:53,790
So let's conclude this lesson here. In the next lesson we shall run our model, and then once it's

268
00:25:53,790 --> 00:26:00,600
done training we shall run some predictions to see its performance. So I'll see you in the next.
