1
00:00:00,790 --> 00:00:01,300
Hello.

2
00:00:01,330 --> 00:00:02,630
Welcome back.

3
00:00:02,630 --> 00:00:10,120
In this lesson we are going to see how to perform the complete forward propagation for our case study.

4
00:00:10,180 --> 00:00:10,900
This one over here.

5
00:00:10,900 --> 00:00:19,210
We're going to do the forward propagation, meaning we are going to take our input, compute its interaction

6
00:00:19,210 --> 00:00:25,230
with the weights to get the output, and pass it through the activation function until we get our y hat.

7
00:00:25,300 --> 00:00:27,860
We're gonna do that here.

8
00:00:28,450 --> 00:00:30,580
So I'm gonna make a copy of the last project

9
00:00:40,950 --> 00:00:54,830
call this forward propagation.

10
00:00:56,350 --> 00:00:56,680
Okay.

11
00:00:56,770 --> 00:00:58,000
So I'm gonna open this

12
00:01:02,750 --> 00:01:04,410
and this is where we left off.

13
00:01:04,460 --> 00:01:09,080
So what we're going to do first is write our activation function.

14
00:01:09,150 --> 00:01:11,070
Let's implement it.

15
00:01:11,150 --> 00:01:14,480
We're gonna write a function to compute the sigmoid

16
00:01:20,060 --> 00:01:26,220
and we said this is what the sigmoid looks like when we pass a value to the function.

17
00:01:26,270 --> 00:01:29,810
This is the computation we perform to get the sigmoid.

18
00:01:29,810 --> 00:01:38,700
We do 1 over 1 plus e to the power minus... sorry, if we pass z to the function

19
00:01:38,820 --> 00:01:43,520
f, and we want to find the sigmoid of the number z,

20
00:01:43,770 --> 00:01:51,360
then we perform 1 over 1 plus e to the power minus z, and we have to implement this in C code.

21
00:01:51,570 --> 00:01:52,180
Let's do that.

22
00:01:52,180 --> 00:01:57,870
Now I'm going to come over here and add this function.

23
00:01:57,960 --> 00:02:00,700
I'm gonna call it sigmoid.

24
00:02:01,290 --> 00:02:04,940
Why do I keep starting from here?

25
00:02:10,640 --> 00:02:11,140
Okay.

26
00:02:11,180 --> 00:02:11,920
Sorry about that.

27
00:02:13,490 --> 00:02:15,390
Okay.

28
00:02:15,920 --> 00:02:24,830
This is of type double; it returns a double. I'm gonna call it sigmoid, and it's going to take z or x. Let's call

29
00:02:24,830 --> 00:02:36,130
it x over here as the argument. Open and close, and I'm simply going to come over here and say double result equals

30
00:02:36,340 --> 00:02:49,220
1 over 1 plus, and we have the exponential function known as exp, of minus the argument, which is x over

31
00:02:49,220 --> 00:02:50,000
here.

32
00:02:50,000 --> 00:02:59,030
So this over here will get us the sigmoid of a single value, but we're

33
00:02:59,030 --> 00:03:01,150
dealing with vectors and matrices.

34
00:03:01,160 --> 00:03:04,960
So let's find a way to get a sigmoid of an entire vector.

35
00:03:05,540 --> 00:03:10,130
So I'm gonna come over here and say void

36
00:03:14,770 --> 00:03:18,180
vector sigmoid open close.

37
00:03:18,490 --> 00:03:23,640
The first argument is a pointer to the input vector.

38
00:03:23,870 --> 00:03:30,960
The second argument is a pointer to the output vector.

39
00:03:31,140 --> 00:03:34,220
The last argument is the length of the vector.

40
00:03:34,310 --> 00:03:35,770
Since they both have the same length,

41
00:03:35,790 --> 00:03:45,810
we can have just one length parameter: uint32_t len. Open and close, and I'll suggest

42
00:03:45,870 --> 00:03:48,630
you pause the video and try to implement this on your own,

43
00:03:48,660 --> 00:03:53,210
since we've already implemented a sigmoid function here. Right.

44
00:03:53,220 --> 00:03:56,400
Once you are done we can do it together.

45
00:03:56,470 --> 00:04:02,280
Start off by saying for int i equals zero.

46
00:04:02,860 --> 00:04:10,830
And of course i is less than the length, i plus plus, then open and close, and we come over here and say

47
00:04:11,040 --> 00:04:13,710
the output vector index.

48
00:04:13,740 --> 00:04:17,100
i equals sigmoid of

49
00:04:22,700 --> 00:04:27,350
input vector index i over here. Right.

50
00:04:27,830 --> 00:04:33,050
So we can expose the vector sigmoid function. We can leave this one here; if you want to expose it you

51
00:04:33,050 --> 00:04:36,900
can, but we don't need this in our main file. We need just this one.

52
00:04:36,920 --> 00:04:40,040
I'm gonna bring this one over here like this

53
00:04:43,200 --> 00:04:50,760
paste it over here, add a semicolon here, and now we can go to the main .c file to perform our forward propagation

54
00:04:52,500 --> 00:05:01,200
and I'm going to clean everything from the internals here I'm going to come over here just clean this

55
00:05:01,440 --> 00:05:08,360
and we have our data in our buffers. One thing I'm gonna do concerns synapse 1: we realize our

56
00:05:08,350 --> 00:05:15,440
synapse 1 doesn't need to be a matrix. It's a vector, because we saw it had one row and three columns. So

57
00:05:15,460 --> 00:05:20,220
synapse 1 over here, rather than it being a matrix, we can simply

58
00:05:23,360 --> 00:05:24,500
we can change it.

59
00:05:26,450 --> 00:05:28,330
Let me see should we change it.

60
00:05:29,590 --> 00:05:32,690
Yeah we can change it to a vector.

61
00:05:32,990 --> 00:05:33,620
No problem

62
00:05:37,060 --> 00:05:39,020
right.

63
00:05:39,160 --> 00:05:39,850
Right.

64
00:05:39,850 --> 00:05:48,310
So actually I'm going to rename these bits over here clean this and I'm simply going to put it together

65
00:05:48,400 --> 00:05:58,750
and say raw x, and then vectorize x. We're gonna put the two input types, x1 and x2, together just

66
00:05:58,750 --> 00:06:01,560
like we have it in our image here.

67
00:06:01,940 --> 00:06:07,570
We're going to keep all of this in a matrix so that we can compute it all at once,

68
00:06:07,570 --> 00:06:09,820
like it's often done in the real world.

69
00:06:09,820 --> 00:06:13,630
I'm gonna say double raw x.

70
00:06:14,320 --> 00:06:20,470
And notice I said raw x, not raw x1 or anything. Clean this as well.

71
00:06:20,470 --> 00:06:35,350
Raw x: the size of raw x is number of features, which is the same as the number of inputs, by number of examples.

72
00:06:35,350 --> 00:06:39,840
And it's a two dimensional array or a matrix.

73
00:06:39,910 --> 00:06:41,710
The first row is 2, 5, 1.

74
00:06:44,870 --> 00:06:50,220
Let me show you what I mean by 2, 5, 1.

75
00:06:50,240 --> 00:06:54,270
So the first row belongs to the first feature type.

76
00:06:54,290 --> 00:06:57,800
The second row belongs to the second feature type or input type.

77
00:06:57,800 --> 00:07:04,310
So 2, 5, 1 here belongs to hours of workout, and then we

78
00:07:08,180 --> 00:07:10,590
just arrange it in a matrix form.

79
00:07:10,610 --> 00:07:18,030
The second row here, 8, 5, 8, is hours of rest. Right.

80
00:07:18,440 --> 00:07:22,670
And I'm going to have to have another double type here.

81
00:07:28,700 --> 00:07:32,330
Double raw y.

82
00:07:32,800 --> 00:07:43,150
And just for uniformity's sake I'm going to make it a matrix, and it's got one row, so

83
00:07:43,170 --> 00:07:47,080
I've passed the constant 1 there, and this is number of examples,

84
00:07:50,310 --> 00:07:52,260
and we know the y values.

85
00:07:52,260 --> 00:07:58,080
200, 90, and then 190.

86
00:07:58,230 --> 00:08:05,040
Right.

87
00:08:05,190 --> 00:08:11,280
So I think I should put a comment here for you just to allow you to picture what we have

88
00:08:14,480 --> 00:08:16,720
Let's see.

89
00:08:16,930 --> 00:08:17,560
I'm going to say.

90
00:08:17,560 --> 00:08:18,370
Train X

91
00:08:21,240 --> 00:08:29,760
Train x is this data: it's 2, 5, 1. That's the training data for x,

92
00:08:39,580 --> 00:08:53,170
8, 5, 8. And then the dimension equals n x, meaning number of features, by m, meaning number of training

93
00:08:53,170 --> 00:09:00,820
examples. n x here is number of features. How many features do we have? We have two features: hours of work

94
00:09:00,820 --> 00:09:09,820
out and hours of rest. By m, number of training examples. How many training examples do we have? We have three

95
00:09:09,820 --> 00:09:10,840
of them.

96
00:09:10,840 --> 00:09:16,860
Column 1 Column 2 Column 3 and we can see this in our architecture here.

97
00:09:17,050 --> 00:09:19,840
Column 1 Column two column three.

98
00:09:19,870 --> 00:09:27,520
So actually, what I've written in the code is a transpose of this table here.

99
00:09:28,120 --> 00:09:35,910
In what I have in the code, the rows have become columns and the columns have become rows, essentially. Right.

100
00:09:36,070 --> 00:09:40,270
So I'm gonna put a similar comment here just to give a clearer picture.

101
00:09:40,280 --> 00:09:46,990
Because my goal in these initial sections is for you to understand. If you do not understand how it works

102
00:09:47,590 --> 00:09:54,310
then my goal hasn't been accomplished so I'm going to heavily comment this and you know sort of drill

103
00:09:54,310 --> 00:10:00,820
down every little thing you might get frustrated you know but you can always skip like I say or have

104
00:10:00,820 --> 00:10:04,470
a go at me in the review like some of you like to do.

105
00:10:04,980 --> 00:10:07,180
Um so this is it.

106
00:10:07,540 --> 00:10:08,440
And the dimension

107
00:10:13,170 --> 00:10:24,780
is, of course, 1 by number of training examples: one row by number of training examples. And we can

108
00:10:24,780 --> 00:10:38,680
see that over here. Like I said, this is a column here; the column has

109
00:10:38,680 --> 00:10:42,310
become a row essentially and this is because I've arranged it like this.
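The layout described above can be sketched as follows: features along the rows and training examples along the columns, so raw x is n x by m and raw y is 1 by m. The macro and array names are assumptions; the values are the ones read out in the lesson.

```c
#define NUM_OF_FEATURES 2  /* n_x: hours of workout, hours of rest */
#define NUM_OF_EXAMPLES 3  /* m: number of training examples */

/* Shape n_x by m: one row per feature, one column per training example. */
double raw_x[NUM_OF_FEATURES][NUM_OF_EXAMPLES] = {
    {2, 5, 1},  /* hours of workout */
    {8, 5, 8}   /* hours of rest */
};

/* Shape 1 by m: kept as a one-row matrix for uniformity with raw_x. */
double raw_y[1][NUM_OF_EXAMPLES] = {
    {200, 90, 190}
};
```

With this arrangement, training example one is column 0: raw_x[0][0] and raw_x[1][0] for the inputs, and raw_y[0][0] for the target.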

110
00:10:42,460 --> 00:10:47,540
And once you understand what goes on in the background, it really doesn't matter what you make a row and what

111
00:10:47,560 --> 00:10:48,400
you make a column.

112
00:10:48,730 --> 00:10:52,150
Especially since we are starting from scratch like this.

113
00:10:52,150 --> 00:10:57,280
But once you start dealing with libraries, there's a particular format in which data is arranged, and you

114
00:10:57,280 --> 00:11:02,890
have to take that into account as you work with them. Later on we shall see how to work

115
00:11:02,890 --> 00:11:10,030
with Keras and TensorFlow and other Python libraries, after we've seen

116
00:11:10,030 --> 00:11:16,870
some of our C code, because in the real world you would be sort of writing the code in Python and then

117
00:11:17,230 --> 00:11:24,790
training with Python, and then running inference only on your microcontroller. So you would have to know

118
00:11:24,790 --> 00:11:31,580
a bit of Python too, but I already have lessons on Python for those of you who are not conversant or who

119
00:11:31,620 --> 00:11:37,000
have never studied Python you can go to the last sections of this course there is a section known as

120
00:11:37,000 --> 00:11:42,280
Python primer and you can learn everything you need to know about python there cos I've got other Python

121
00:11:42,280 --> 00:11:49,690
courses on digital image processing and DSP. I've just taken the Python primer lessons to include them

122
00:11:49,690 --> 00:11:56,360
here so that you can get acquainted with Python. When we start using TensorFlow and Keras and you

123
00:11:56,380 --> 00:12:01,950
know, you don't already know Python, you can get a start there. Right.

124
00:12:02,050 --> 00:12:09,220
So having done this we can go on. We said we have raw y and raw x, so we're gonna create buffers to

125
00:12:09,220 --> 00:12:13,170
store the normalized versions. Right.

126
00:12:13,180 --> 00:12:21,100
So I'm gonna come over here and say double train x, and the size of this is just the size of

127
00:12:21,100 --> 00:12:34,780
raw x, so copy this, paste it over here, and I'm gonna come over here: double train y. The size of this is

128
00:12:34,840 --> 00:12:38,650
the same as the size of raw y.

129
00:12:38,890 --> 00:12:40,260
Okay.

130
00:12:40,340 --> 00:12:42,110
Right.

131
00:12:42,520 --> 00:12:48,370
So I'm going to create some buffers too. I'm gonna have z1.

132
00:12:48,370 --> 00:12:51,790
Let's see what we have.

133
00:12:52,300 --> 00:12:58,870
So I'm going to create buffers to hold z1, to hold the z, a, z and y hat values.

134
00:12:58,870 --> 00:13:05,590
Remember we have z superscript 2 over here, a superscript 2, z superscript 3, etc. We're going to create

135
00:13:05,590 --> 00:13:12,220
buffers to hold these. This slide over here shows the entire forward propagation process. This is what

136
00:13:12,220 --> 00:13:14,100
we're going to implement in this lesson.

137
00:13:15,490 --> 00:13:18,700
So having minimized that, I'll come over here and say

138
00:13:22,960 --> 00:13:23,410
double

139
00:13:26,460 --> 00:13:27,470
Okay let's see.

140
00:13:27,570 --> 00:13:34,170
Let's say we want to start training example by example we want to process training example 1 so I'm

141
00:13:34,170 --> 00:13:39,330
saying train x eg1, and its size is number of features,

142
00:13:46,530 --> 00:13:47,230
and then

143
00:13:54,510 --> 00:14:08,910
we do train y eg1, and then we do a double z1 for training example one. We're taking only

144
00:14:09,030 --> 00:14:12,390
training example one just to make it understandable.

145
00:14:12,510 --> 00:14:14,770
And then the size of z1 is, of course,

146
00:14:16,310 --> 00:14:18,350
number of

147
00:14:19,380 --> 00:14:23,820
hidden nodes.

148
00:14:19,380 --> 00:14:23,820
That's the size. Double a1.

149
00:14:23,980 --> 00:14:27,860
This should be the same size as z1.

150
00:14:27,890 --> 00:14:30,330
This is also, of course, number of hidden nodes.

151
00:14:35,090 --> 00:14:37,010
And then I can say double,

152
00:14:39,870 --> 00:14:42,930
you can have double over here.

153
00:14:43,080 --> 00:14:43,860
z2

154
00:14:46,650 --> 00:14:47,480
of training.

155
00:14:47,490 --> 00:14:48,410
Example one.

156
00:14:48,420 --> 00:14:54,120
And then I say double y hat, which is the same as a2,

157
00:14:58,630 --> 00:15:03,880
of training example one, like this. Right.

158
00:15:04,480 --> 00:15:07,980
So we start off by normalizing our data.

159
00:15:08,140 --> 00:15:21,550
I'm gonna come over here and say normalize data, and I'm going to pass number of features by number

160
00:15:21,550 --> 00:15:23,290
of examples.

161
00:15:23,290 --> 00:15:34,150
So normalize data takes the number of features, the number of examples, and the input. The input

162
00:15:34,180 --> 00:15:36,570
matrix is raw x.

163
00:15:37,120 --> 00:15:45,110
The output is train x, like this.

164
00:15:45,460 --> 00:15:46,880
We have an error.

165
00:15:50,610 --> 00:15:51,060
okay.

166
00:15:51,210 --> 00:15:58,170
So the reason we have an error is that our normalize function takes two vectors and we're trying to normalize

167
00:15:58,170 --> 00:15:58,800
a matrix.

168
00:15:58,920 --> 00:16:03,260
So we've got to create a new normalize.

169
00:16:03,840 --> 00:16:06,680
Um, yeah, we've got to create a new normalize.

170
00:16:07,270 --> 00:16:10,360
So I'm gonna come over here and create a new function here.

171
00:16:10,410 --> 00:16:18,810
Call this normalize 2D, and after it I'm gonna create another function

172
00:16:18,810 --> 00:16:25,380
for 1D and 2D stuff, just so that we have the option.

173
00:16:25,380 --> 00:16:28,470
Anyway, I'm gonna come over here and quickly do void

174
00:16:31,820 --> 00:16:42,060
normalize underscore data underscore 2d, and the first argument is going to be the number of rows,

175
00:16:42,090 --> 00:16:44,530
and we can make number of columns second.

176
00:16:44,760 --> 00:16:50,460
So come over here: uint32_t row,

177
00:16:54,500 --> 00:17:02,560
and then uint32_t col, and then

178
00:17:05,810 --> 00:17:09,960
say double. I can simply call this input array,

179
00:17:13,240 --> 00:17:14,410
or input matrix.

180
00:17:16,980 --> 00:17:21,960
Input matrix is better, and it's row by column.

181
00:17:27,100 --> 00:17:29,620
The next argument is going to be the output matrix,

182
00:17:34,900 --> 00:17:41,890
and this also is row by column in size.

183
00:17:42,520 --> 00:17:45,520
And yeah we can implement it.

184
00:17:45,520 --> 00:17:49,450
I think this should be fine.

185
00:17:49,480 --> 00:17:55,770
And like we did earlier we start off by going through the matrix to find the maximum number.

186
00:17:55,850 --> 00:18:01,680
So this time it's not just a 1D array or vector; it's a matrix.

187
00:18:01,690 --> 00:18:04,980
So we need a nested loop. I'm gonna come over here.

188
00:18:04,990 --> 00:18:07,390
I say double... I can use a different method here.

189
00:18:07,390 --> 00:18:11,500
Double max. I could make it an extremely small number.

190
00:18:13,210 --> 00:18:24,140
Okay, so let's set that max, and I'm gonna come and say for int i equals zero,

191
00:18:24,330 --> 00:18:39,040
i is less than row, i plus plus, open and close, and for int j equals zero, j is less than col,

192
00:18:42,240 --> 00:18:43,230
j plus plus,

193
00:18:46,730 --> 00:18:47,770
open and close.

194
00:18:48,890 --> 00:18:53,000
And I'm gonna say if

195
00:18:55,610 --> 00:19:15,180
input matrix i, j is greater than max, then max equals input matrix i, j.

196
00:19:15,210 --> 00:19:22,020
Like this. Once we've found the max, the maximum number in the matrix,

197
00:19:26,360 --> 00:19:33,460
we can come out here and normalize it. We say for int i equals zero,

198
00:19:33,700 --> 00:19:45,680
i is less than row, i plus plus, open and close, and this is a nested loop as well: for int j equals

199
00:19:45,680 --> 00:19:46,340
zero.

200
00:19:46,340 --> 00:19:53,960
j is less than col, j plus plus, open and close as well.

201
00:19:53,960 --> 00:20:08,090
And then output matrix index i, index j equals input matrix i, j

202
00:20:08,150 --> 00:20:11,770
divided by max. It should normalize for us.
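A sketch of the 2D normalization just walked through: one nested loop finds the maximum of the whole matrix, and a second divides every element by it. The signature uses C99 variable-length array parameters, which is an assumption about how "row by column" is declared, and starting max from the first element is a swapped-in choice rather than what the lesson uses. The sketch also assumes at least one row and column and a non-zero maximum.

```c
#include <stdint.h>

/* Divides every element of input_matrix by the largest element in the matrix. */
void normalize_data_2d(uint32_t row, uint32_t col,
                       double input_matrix[row][col],
                       double output_matrix[row][col])
{
    /* Assumption: start from the first element instead of a very small constant. */
    double max = input_matrix[0][0];

    /* First pass: find the maximum value in the whole matrix. */
    for (uint32_t i = 0; i < row; i++) {
        for (uint32_t j = 0; j < col; j++) {
            if (input_matrix[i][j] > max) {
                max = input_matrix[i][j];
            }
        }
    }

    /* Second pass: scale every element by the maximum (assumed non-zero). */
    for (uint32_t i = 0; i < row; i++) {
        for (uint32_t j = 0; j < col; j++) {
            output_matrix[i][j] = input_matrix[i][j] / max;
        }
    }
}
```

For a matrix with rows 2, 5, 1 and 8, 5, 8 the maximum is 8, so 2 maps to 0.25 and 8 maps to 1.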

203
00:20:12,200 --> 00:20:14,400
Right.

204
00:20:14,810 --> 00:20:19,440
And what I'm gonna do is let's see our initialization function.

205
00:20:19,910 --> 00:20:25,310
So this weight initialization function is 2D.

206
00:20:25,340 --> 00:20:34,500
I'm gonna write a 1D version of the weight initialization function. I'm gonna come over here and quickly

207
00:20:34,500 --> 00:20:37,330
do weight initialization 1D. Void

208
00:20:42,180 --> 00:20:43,930
weight initialization

209
00:20:47,330 --> 00:20:48,580
1D.

210
00:20:49,110 --> 00:20:49,500
Right.

211
00:20:49,500 --> 00:20:58,050
So we're simply going to take input vector, output vector... or we simply need the output vector here. We're initializing,

212
00:20:58,080 --> 00:21:08,290
so we need no input. We just need a place to store the random numbers: output vector, and the length,

213
00:21:08,730 --> 00:21:10,110
the length of the vector

214
00:21:12,890 --> 00:21:17,480
uint32_t len.

215
00:21:20,600 --> 00:21:24,910
Open and close like this.

216
00:21:25,350 --> 00:21:26,960
Right.

217
00:21:27,780 --> 00:21:36,260
And we can do it here. Let's see.

218
00:21:36,930 --> 00:21:39,670
And then for int

219
00:21:39,670 --> 00:21:42,230
J equals zero.

220
00:21:43,230 --> 00:21:46,590
j is less than the length, j

221
00:21:46,590 --> 00:21:56,040
plus plus, open and close, and we simply need a local variable here called double d rand.

222
00:21:56,440 --> 00:22:00,000
I'm gonna come down here and say

223
00:22:03,780 --> 00:22:04,500
There are better ways.

224
00:22:04,510 --> 00:22:06,020
d underscore rand.

225
00:22:06,420 --> 00:22:07,810
So d underscore rand

226
00:22:07,860 --> 00:22:08,550
equals

227
00:22:11,360 --> 00:22:19,820
the rand function modulo 10, and then d underscore rand equals d underscore rand divided by 10.

228
00:22:19,970 --> 00:22:30,110
This generates a random number between 0 and 1, and we say output vector index j equals

229
00:22:30,350 --> 00:22:33,450
d underscore rand, and our function is complete.
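The 1D weight initialization described above can be sketched like this. It takes only an output vector and a length, and fills the vector using rand() modulo 10 divided by 10, which yields one of 0.0, 0.1, up to 0.9. The function name follows the one settled on later in the lesson; the parameter names are assumptions, and seeding the generator is left to the caller.

```c
#include <stdint.h>
#include <stdlib.h>

/* Fills output_vector with pseudo-random weights in the range 0.0 to 0.9. */
void weight_random_initialization_1d(double *output_vector, uint32_t len)
{
    for (uint32_t j = 0; j < len; j++) {
        double d_rand = rand() % 10;  /* whole number from 0 to 9 */
        d_rand /= 10;                 /* scaled to 0.0, 0.1, ..., 0.9 */
        output_vector[j] = d_rand;
    }
}
```

Because rand() is never seeded here, the same sequence is produced on every run unless the caller invokes srand first.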

230
00:22:34,010 --> 00:22:39,820
So I'm gonna take the prototypes of these functions to the interface file over here

231
00:22:43,190 --> 00:22:46,250
open and close like this.

232
00:22:46,250 --> 00:22:56,740
And the last one, weight initialization: Control-C and come over here.

233
00:22:56,750 --> 00:22:59,990
Paste it here, semicolon here.

234
00:23:00,020 --> 00:23:03,380
Now we can go to our main function.

235
00:23:03,960 --> 00:23:10,090
Now let's see. Now we can simply say normalize data.

236
00:23:10,670 --> 00:23:17,920
We can use the 2D version by simply putting underscore 2d over here. Right.

237
00:23:18,350 --> 00:23:25,200
Of course, add a semicolon here, and I'm going to normalize this as well.

238
00:23:28,040 --> 00:23:45,860
Normalize data underscore 2d, and this is 1 by number of examples, and then the input is raw y. The output

239
00:23:45,860 --> 00:23:50,410
is train y. Right.

240
00:23:50,600 --> 00:23:51,000
Good.

241
00:23:53,140 --> 00:23:59,000
So once that's done, let's fetch training example one from our training set.

242
00:23:59,210 --> 00:24:02,000
I'm gonna simply say training example 1

243
00:24:13,320 --> 00:24:16,190
train x eg1, index 0,

244
00:24:16,220 --> 00:24:19,880
equals, and we fetch from train x

245
00:24:22,420 --> 00:24:29,480
0, 0. And then train x eg1,

246
00:24:35,890 --> 00:24:36,480
index 1,

247
00:24:40,660 --> 00:24:48,340
equals train x 1, 0.

248
00:24:51,930 --> 00:24:55,170
We've essentially selected 2 and 8.

249
00:24:55,180 --> 00:25:02,350
So that's training example one. Training example 2 is 5 and 5, training example 3 is 1 and 8. So that's how come

250
00:25:02,350 --> 00:25:14,080
we've navigated this way. Right. I'm gonna fetch the y value as well: train y eg1. This is a single value.

251
00:25:16,420 --> 00:25:23,380
It equals train y, index zero, index zero.

252
00:25:27,800 --> 00:25:30,770
Okay so once this is done

253
00:25:33,520 --> 00:25:40,680
we can check if we have the right data and can print.

254
00:25:40,710 --> 00:25:41,020
OK.

255
00:25:41,100 --> 00:25:41,480
What.

256
00:25:41,650 --> 00:25:43,900
Let's just quickly print. printf,

257
00:25:47,100 --> 00:25:51,520
train underscore x

258
00:25:51,850 --> 00:25:54,220
eg1 is

259
00:25:57,500 --> 00:26:00,400
percent f, percent f,

260
00:26:03,060 --> 00:26:03,450
and

261
00:26:07,680 --> 00:26:08,850
and then train

262
00:26:11,440 --> 00:26:12,280
underscore x eg1,

263
00:26:12,780 --> 00:26:13,550
index zero.

264
00:26:13,560 --> 00:26:26,680
And then train x eg1, index one. Right.

265
00:26:30,440 --> 00:26:31,190
printf...

266
00:26:35,600 --> 00:26:37,530
you can put another new line here

267
00:26:45,590 --> 00:26:50,630
and we can do the same for the y value: printf,

268
00:26:55,940 --> 00:26:56,850
train y

269
00:26:57,830 --> 00:27:01,590
eg1 is simply percent f.

270
00:27:02,550 --> 00:27:06,390
This is the train y

271
00:27:06,420 --> 00:27:08,430
eg1 value.

272
00:27:09,080 --> 00:27:09,670
Right.

273
00:27:09,680 --> 00:27:19,040
So I'm going to click here to build and then download it onto my board, and I'm gonna open my terminal program

274
00:27:25,830 --> 00:27:30,620
to reset my board, and this is training example one.

275
00:27:30,630 --> 00:27:35,180
It's this because it's a normalized value.

276
00:27:35,340 --> 00:27:43,170
This is the x value. The terminal's font size has gone back to its default state, so...

277
00:27:52,560 --> 00:27:54,860
So this is training example 1.

278
00:27:55,230 --> 00:27:57,510
And if you normalize these values here

279
00:28:02,570 --> 00:28:03,390
right.

280
00:28:03,470 --> 00:28:10,910
Normalization basically takes the maximum value, and then you divide each element

281
00:28:10,910 --> 00:28:12,520
by the maximum value.

282
00:28:12,530 --> 00:28:16,530
So if we take two and we divide it by eight, we get 0.25.

283
00:28:16,580 --> 00:28:20,240
If we take eight and divide it by eight we get one.

284
00:28:20,250 --> 00:28:21,330
That's for train x.

285
00:28:21,590 --> 00:28:28,340
How about the y value? The y value for our training example one is the highest in the vector,

286
00:28:28,370 --> 00:28:29,880
which is two hundred.

287
00:28:29,900 --> 00:28:33,490
So if you take two hundred and divide it by two hundred, you get one.

288
00:28:33,740 --> 00:28:34,100
Right.

289
00:28:34,130 --> 00:28:35,380
So we're on the right track.

290
00:28:37,850 --> 00:28:39,060
Okay.

291
00:28:39,380 --> 00:28:40,780
So we have the right data.

292
00:28:40,790 --> 00:28:43,820
Now let's initialize the synapse 0 and synapse 1 weights.

293
00:28:47,280 --> 00:28:48,270
Put a comment here.

294
00:28:56,700 --> 00:28:58,400
I'm going to upload the code afterwards.

295
00:28:58,410 --> 00:28:59,810
That's why I'm commenting it.

296
00:29:05,460 --> 00:29:05,950
Okay.

297
00:29:06,020 --> 00:29:11,120
This proves we have the right data, and then we are going to say over here:

298
00:29:19,450 --> 00:29:25,610
initialize synapse 0 and synapse 1 weights.

299
00:29:29,070 --> 00:29:30,810
I'm gonna say weight

300
00:29:31,200 --> 00:29:39,860
random initialization, and then this is number of...

301
00:29:42,630 --> 00:29:51,960
number of hidden nodes by number of features, and we want to store it in the synapse 0 matrix buffer.

302
00:29:53,850 --> 00:29:54,300
right

303
00:29:57,680 --> 00:30:00,600
and we can print this out to verify.

304
00:30:00,620 --> 00:30:02,510
I'll leave that up to you.

305
00:30:02,630 --> 00:30:05,510
I'm going to initialize synapse 1.

306
00:30:15,530 --> 00:30:16,640
The comment already exists.

307
00:30:16,640 --> 00:30:27,410
Sorry about that. Just copy and paste this, and I'm gonna use our function, initialization 1D. Initialization

308
00:30:27,410 --> 00:30:28,900
1D over here

309
00:30:33,270 --> 00:30:42,200
simply takes the output vector, which is synapse 1, and then the length, which is number of output nodes.

310
00:30:53,580 --> 00:30:58,770
There's the function. We can verify it takes the output vector and the length, and then it generates the random

311
00:30:58,770 --> 00:31:01,450
numbers into that.

312
00:31:01,710 --> 00:31:07,340
Why do we have this? Weight random initialization underscore 1d...

313
00:31:12,850 --> 00:31:15,920
um let's see the function name is different

314
00:31:18,920 --> 00:31:26,870
I'm gonna add random over here for uniformity: weight random initialization 1D. I'm gonna fix this

315
00:31:26,870 --> 00:31:33,570
in the .c file as well, and now this is fixed.

316
00:31:36,120 --> 00:31:36,540
Right.

317
00:31:38,610 --> 00:31:40,530
So we can print this out as well.

318
00:31:40,570 --> 00:31:46,410
So I encourage you to print this out to make sure it's sort of initialized accurately, and print

319
00:31:46,410 --> 00:31:47,330
this out.

320
00:31:47,400 --> 00:31:50,880
I'm not going to do that. I want to go ahead and compute the z1 value,

321
00:32:03,830 --> 00:32:06,870
and we can take a look at our formula for computing z1.

322
00:32:09,980 --> 00:32:15,460
We said z1 is simply multiplying the input x by the weight.

323
00:32:15,460 --> 00:32:16,210
Um yeah.

324
00:32:16,220 --> 00:32:21,480
Whether z1 or z2, the z value is the input multiplied by the weight.

325
00:32:21,650 --> 00:32:24,320
So I'm going to do that over here. On the slide we said z2.

326
00:32:24,400 --> 00:32:24,980
No problem

327
00:32:30,190 --> 00:32:31,200
I'll see.

328
00:32:33,360 --> 00:32:39,690
I'm gonna use the multiple input multiple output neural network function that we have multiple input

329
00:32:39,780 --> 00:32:41,950
multiple output.

330
00:32:41,970 --> 00:32:48,630
Then the first argument is the input vector, which is train x eg1.

331
00:32:49,420 --> 00:32:53,700
The reason we've taken a single training example is that we've not written the function to deal

332
00:32:53,700 --> 00:33:01,110
with the entire matrix, and you can continue working on that. And then the next argument, I'll say number

333
00:33:01,110 --> 00:33:01,800
of features

334
00:33:04,530 --> 00:33:06,570
and then we store it over here.

335
00:33:06,570 --> 00:33:12,440
This is the output vector. We call it z1 eg1.

336
00:33:13,510 --> 00:33:16,340
And then number of hidden nodes,

337
00:33:19,120 --> 00:33:26,230
then the weights stored in synapse 0 here. Right.

338
00:33:26,240 --> 00:33:34,280
So once we've done this we can um we can compute a 1 which is the activation

339
00:33:44,300 --> 00:33:48,860
and we simply use our vector sigmoid function, and then

340
00:33:52,070 --> 00:33:59,970
this is the source, and we want to store the output in the a1 eg1 vector. The length is number of

341
00:33:59,990 --> 00:34:00,710
hidden nodes.

342
00:34:04,900 --> 00:34:14,500
Once we've done this we can go on to use the output stored here to compute the next z value.

343
00:34:15,150 --> 00:34:18,460
So come over here and say compute...

344
00:34:27,630 --> 00:34:32,400
compute z2, and we do that by saying z2 eg1

345
00:34:32,730 --> 00:34:40,350
equals... we use the same kind of function as multiple input multiple output neural network: multiple input single

346
00:34:40,380 --> 00:34:41,540
output neural network,

347
00:34:41,550 --> 00:34:42,090
this time.

348
00:34:49,310 --> 00:34:59,730
And the input is the a1 eg1 vector, the weights are stored in the synapse 1 vector, and the length is

349
00:34:59,760 --> 00:35:02,880
number of hidden nodes

350
00:35:07,540 --> 00:35:16,250
I made a typo here. Once this is done we can go ahead and compute y hat, which is the same as

351
00:35:16,330 --> 00:35:16,820
a2,

352
00:35:29,730 --> 00:35:36,000
and that is simply calling our sigmoid function the single sigmoid function.

353
00:35:36,000 --> 00:35:41,850
So this is not the vector sigmoid, so we have to expose this function we wrote, that we thought we might

354
00:35:41,850 --> 00:35:44,760
not need to expose where is that where is it.

355
00:35:44,850 --> 00:35:49,680
Um this one here this is sigmoid of a single value put it over here

356
00:35:57,060 --> 00:35:57,630
right

357
00:36:04,970 --> 00:36:15,520
So to compute y hat we simply say y hat eg1 equals sigmoid,

358
00:36:16,040 --> 00:36:19,930
and then we pass z2 eg1 over here,

359
00:36:19,940 --> 00:36:22,820
like this.

360
00:36:22,820 --> 00:36:23,510
Right.

361
00:36:23,560 --> 00:36:28,040
So we are done we can print out the result.

362
00:36:28,730 --> 00:36:30,650
I'm gonna come over here and say printf,

363
00:36:34,950 --> 00:36:44,570
y hat eg1, and then percent f over here.

364
00:36:51,450 --> 00:36:53,160
Right.

365
00:36:53,610 --> 00:37:00,480
And let's print z2 as well: printf,

366
00:37:04,550 --> 00:37:09,150
this is going to print z2 eg1.

367
00:37:23,870 --> 00:37:25,130
Add some new lines,

368
00:37:29,190 --> 00:37:31,440
and then we can print.

369
00:37:32,900 --> 00:37:35,060
Yeah I would advise you print the rest out.

370
00:37:35,060 --> 00:37:39,340
I mean, I cannot spend the rest of the time just writing print statements.

371
00:37:39,350 --> 00:37:46,610
So yes, write a number of for loops, print them out as well, and you can go through the computation. You

372
00:37:46,610 --> 00:37:49,240
can in fact use pen and paper and go through it.

373
00:37:49,250 --> 00:37:56,470
If you find an anomaly you can send me a message but I'm confident it should be perfect.

374
00:37:56,470 --> 00:37:58,420
And before I reset, I'm gonna clear first.

375
00:38:04,280 --> 00:38:05,550
I'm gonna reset my board

376
00:38:09,990 --> 00:38:11,250
so that's what we have.

377
00:38:11,250 --> 00:38:14,730
We're still printing the training example. The y hat value

378
00:38:14,880 --> 00:38:18,480
we came up with is this one, and the training example is this.

379
00:38:18,500 --> 00:38:22,140
And let me comment out the training example.

380
00:38:24,840 --> 00:38:26,650
Comment this out too,

381
00:38:30,730 --> 00:38:31,750
download onto the board,

382
00:38:35,370 --> 00:38:40,500
go to the terminal, clear, and reset my board.

383
00:38:40,680 --> 00:38:50,330
So we had the z value as this, and this is the y hat value we got. Right. So this brings us to the

384
00:38:50,330 --> 00:38:57,050
end of the lesson but like I said um I would advise that you print each level of computation over here

385
00:38:57,140 --> 00:39:02,720
and go through it make sure you understand it and it corresponds with what we said in the theoretical

386
00:39:02,720 --> 00:39:03,260
class.

387
00:39:03,860 --> 00:39:08,910
If you find any anomaly or if there is something you do not understand then just let me know.

388
00:39:08,990 --> 00:39:14,990
And also I should point out this is for just training example one, meaning this is for 2, 8 and two

389
00:39:14,990 --> 00:39:16,670
hundred.

390
00:39:17,000 --> 00:39:17,400
Right.

391
00:39:18,740 --> 00:39:25,080
So we're not going to go into back propagation until we've learned other things with regards to

392
00:39:25,090 --> 00:39:28,600
calculus and other things about gradient descent.

393
00:39:28,700 --> 00:39:32,780
We have to build up our knowledge before we can continue.

394
00:39:32,960 --> 00:39:40,580
But ideally, after forward propagation we compute the difference between the y hat and the y value,

395
00:39:41,240 --> 00:39:47,570
and then we learn from there and go on to perform, you know, weight adjustments. Like we mentioned

396
00:39:47,570 --> 00:39:55,010
earlier, learning is simply reducing error, or reducing loss as it is often said.

397
00:39:55,010 --> 00:39:59,420
So that's the end of this particular lesson if you have any questions just let me know and I'll see

398
00:39:59,420 --> 00:40:00,050
you later.

399
00:40:00,050 --> 00:40:00,700
Have a nice day.