1
00:00:00,080 --> 00:00:05,780
In this session, we shall be looking at modeling and how to build simple machine learning models with

2
00:00:05,780 --> 00:00:06,530
TensorFlow.

3
00:00:06,530 --> 00:00:14,540
Now, in our specific case, our model turns out to be the function y = mx + c.

4
00:00:14,540 --> 00:00:16,160
So here's our model.

5
00:00:16,160 --> 00:00:19,790
This is actually what's inside the model block we have here.

6
00:00:19,790 --> 00:00:29,690
And what we want to do is pass this data into the model so that the model learns the

7
00:00:29,690 --> 00:00:35,720
right values for m and c, the ones which best represent our data.

8
00:00:35,720 --> 00:00:40,460
So let's suppose that we arbitrarily pick m to be zero.

9
00:00:40,490 --> 00:00:43,880
Let's initialize m to zero as an arbitrary starting value.

10
00:00:43,880 --> 00:00:47,660
And then we'll also initialize c to zero.

11
00:00:47,660 --> 00:00:53,990
In that case, if we have zero here, then we should remove this line.

12
00:00:53,990 --> 00:00:55,370
Let's remove this line.

13
00:00:55,580 --> 00:01:01,250
And now we will have a new line, which will instead be something like this.

14
00:01:01,250 --> 00:01:02,810
Well, actually, this is zero.

15
00:01:02,810 --> 00:01:07,010
This is the line y = 0: because m is zero and c is zero, y equals zero.

16
00:01:07,010 --> 00:01:12,800
So what we'd be saying is that this line should represent our data.

17
00:01:12,860 --> 00:01:20,540
But looking closely at our data, this line clearly doesn't represent it properly.

18
00:01:20,540 --> 00:01:22,280
Now let's change this again.

19
00:01:22,280 --> 00:01:25,550
Let's say m is, for example, one.

20
00:01:25,550 --> 00:01:28,670
So let's go back and set m to one.

21
00:01:28,670 --> 00:01:31,160
So here m is now one.

22
00:01:31,160 --> 00:01:34,910
If m is one, then y equals one times x plus zero.

23
00:01:34,910 --> 00:01:35,960
So y equals x.

24
00:01:35,960 --> 00:01:43,190
The line y = x looks something like this. We'd have a line like this one, which now does

25
00:01:43,190 --> 00:01:51,350
a much better job than the other line when it comes to modeling the data we have at hand.
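As a rough sketch of this comparison, the snippet below evaluates both candidate lines on a few made-up points. The points, the `predict` helper, and the `total_gap` measure are assumptions for illustration, not the data or code shown on screen.
```python
# Hypothetical (x, y) points that roughly follow y = x; not the lecture's data.
points = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.1), (4.0, 3.8)]
def predict(m, c, x):
    """Evaluate the linear model y = m * x + c."""
    return m * x + c
def total_gap(m, c):
    """Sum of absolute differences between predictions and observed y values."""
    return sum(abs(predict(m, c, x) - y) for x, y in points)
flat_gap = total_gap(0.0, 0.0)  # the line y = 0
diag_gap = total_gap(1.0, 0.0)  # the line y = x
print("y = 0 gap:", flat_gap)
print("y = x gap:", diag_gap)
```
On points like these, the gap for y = x comes out smaller than the gap for y = 0, which matches what the plot suggests by eye.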

26
00:01:51,350 --> 00:01:56,600
Now, this doesn't mean this is the best possible line, as there are other possibilities.

27
00:01:56,600 --> 00:01:57,740
We could have a line like this.

28
00:01:57,740 --> 00:01:59,360
We could have a line like this.

29
00:01:59,360 --> 00:02:01,340
We could even have lines like this.

30
00:02:01,340 --> 00:02:02,720
And so on and so forth.

31
00:02:02,720 --> 00:02:11,180
So the idea here is that we want to find the values of m and c which best represent our data.

32
00:02:11,180 --> 00:02:20,060
Now, it turns out that the m and c we're talking about here are what we call the weights and the biases

33
00:02:20,060 --> 00:02:21,170
of the model.

34
00:02:21,170 --> 00:02:24,590
So in this case, our m is the weight.

35
00:02:24,590 --> 00:02:27,770
And then c is the model's bias.

36
00:02:27,770 --> 00:02:36,650
And so if you hear that a model has, for example, 7 billion parameters, then it means that it has seven

37
00:02:36,650 --> 00:02:46,820
billion of these kinds of weights and biases, which are tuned so that the model represents

38
00:02:46,820 --> 00:02:48,620
the data properly.

39
00:02:48,620 --> 00:02:52,460
That said, let's get back to the code and create a simple model.

40
00:02:52,460 --> 00:02:55,460
So here we have our model, which comes from TensorFlow's

41
00:02:55,670 --> 00:03:00,470
Keras Sequential, that is, tf.keras.Sequential.

42
00:03:00,740 --> 00:03:02,360
And then we have this list.

43
00:03:02,360 --> 00:03:05,420
So the first thing we'll put in this list is the normalizer.

44
00:03:05,420 --> 00:03:11,510
We have our normalizer because, whatever values of x or whatever inputs we pass in, we want them

45
00:03:11,510 --> 00:03:12,260
to be normalized

46
00:03:12,260 --> 00:03:12,830
first.

47
00:03:12,830 --> 00:03:15,110
Remember, the normalizer is a layer.

48
00:03:15,110 --> 00:03:17,240
So, remember, you could check this.

49
00:03:17,240 --> 00:03:20,150
Here, we imported Normalization from the layers module.

50
00:03:20,150 --> 00:03:25,460
And then we adapted our normalizer to our data.

51
00:03:25,460 --> 00:03:30,710
Remember, we did this here, and then we adapted our normalizer to our inputs.

52
00:03:30,710 --> 00:03:37,580
And so when we get back here, you see we have this normalizer which is already adapted to our inputs.

53
00:03:37,580 --> 00:03:41,480
And then, from here, we have a Dense layer.

54
00:03:41,660 --> 00:03:49,550
It turns out that this Dense layer is simply the y = mx + c, which

55
00:03:49,550 --> 00:03:50,210
we spoke of.

56
00:03:50,210 --> 00:03:55,040
So you don't need to write mx + c or some complicated math formula.

57
00:03:55,040 --> 00:03:59,390
All you need to do is specify Dense, and that's fine.

58
00:03:59,390 --> 00:04:00,830
Now, Dense is a layer.

59
00:04:00,830 --> 00:04:05,930
So here, in the import, you also include Dense.

60
00:04:05,930 --> 00:04:07,820
And that should be fine.

61
00:04:07,820 --> 00:04:09,320
So let's get back.

62
00:04:09,320 --> 00:04:12,110
We have our sequential API.

63
00:04:12,230 --> 00:04:15,740
This Sequential API takes the normalizer and the Dense layer.

64
00:04:15,740 --> 00:04:20,150
And then we could run model.summary().

65
00:04:20,150 --> 00:04:22,760
So we check out our model summary.

66
00:04:22,760 --> 00:04:24,500
You see we have our inputs.

67
00:04:24,500 --> 00:04:25,790
We have our dense layer.

68
00:04:25,790 --> 00:04:28,430
We have the number of parameters.

69
00:04:28,430 --> 00:04:30,350
Here we have total parameters.

70
00:04:30,350 --> 00:04:32,420
We have non-trainable parameters.

71
00:04:32,420 --> 00:04:33,920
And then the trainable parameters.

72
00:04:33,920 --> 00:04:38,600
We also have the amount of memory this occupies.

73
00:04:38,600 --> 00:04:40,340
So this is just in bytes.
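A minimal sketch of the model described so far, assuming TensorFlow 2.x is installed. The random `features` array stands in for the real 1000-by-8 inputs, and the variable names are illustrative rather than the ones used in the notebook.
```python
import numpy as np
import tensorflow as tf
# Placeholder data: 1000 rows with 8 features each (random, for illustration only).
features = np.random.default_rng(0).normal(size=(1000, 8)).astype("float32")
# Adapt the normalizer so that it stores the per-feature mean and variance.
normalizer = tf.keras.layers.Normalization()
normalizer.adapt(features)
# Stack the layers in a list: normalize first, then apply y = mx + c.
model = tf.keras.Sequential([normalizer, tf.keras.layers.Dense(1)])
# Calling the model once on a small batch makes sure every layer is built.
_ = model(features[:2])
model.summary()
```
The summary should list the normalization layer and the Dense layer, together with the total, trainable, and non-trainable parameter counts.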

74
00:04:40,370 --> 00:04:45,830
Now, it should be noted that using the Sequential API is not the only way in which you could create

75
00:04:45,830 --> 00:04:48,650
machine learning models.

76
00:04:48,650 --> 00:04:53,630
With TensorFlow, you have the Sequential API.

77
00:04:53,660 --> 00:04:56,450
You also have the functional API.

78
00:04:57,020 --> 00:04:59,660
And then, as a third option, you could

79
00:05:00,070 --> 00:05:00,850
go through

80
00:05:00,850 --> 00:05:02,080
model subclassing.

81
00:05:03,040 --> 00:05:03,970
That is, model

82
00:05:03,970 --> 00:05:04,930
subclassing.

83
00:05:04,930 --> 00:05:06,820
So, subclassing.

84
00:05:07,570 --> 00:05:12,760
Now, getting back to the documentation, you could head over to TensorFlow and then Keras.

85
00:05:12,790 --> 00:05:15,490
Then you click here on Sequential.

86
00:05:15,490 --> 00:05:20,650
And basically, what this Sequential API takes in is some layers.

87
00:05:20,650 --> 00:05:22,330
That's it, plus a name.

88
00:05:22,330 --> 00:05:26,350
So you could get back here and then give this a name.

89
00:05:26,350 --> 00:05:28,690
Let's give this a name.

90
00:05:28,720 --> 00:05:37,270
As I said, it takes the layers and the name, so we could call this our first sequential API.

91
00:05:37,270 --> 00:05:38,140
Run that.

92
00:05:39,210 --> 00:05:42,270
And you notice how this name here changes.

93
00:05:42,270 --> 00:05:44,100
So let's get back to the documentation.

94
00:05:44,100 --> 00:05:46,710
We have layers: it takes the layers and a name.

95
00:05:46,710 --> 00:05:51,480
And then you'll also notice that the way the model is defined here is a bit different from what we just saw.

96
00:05:51,480 --> 00:05:58,350
So this is another way in which we could define or build up a model based on

97
00:05:58,350 --> 00:05:59,520
the sequential API.

98
00:05:59,520 --> 00:06:03,240
So let's create a new code cell below.

99
00:06:03,240 --> 00:06:08,070
And then here we have our model, which is a Sequential-based model.

100
00:06:08,070 --> 00:06:10,470
Then we have our normalizer.

101
00:06:10,470 --> 00:06:14,070
So that's our normalizer. Let's take this off.

102
00:06:14,550 --> 00:06:16,560
Then we have the dense layer.

103
00:06:17,280 --> 00:06:20,250
And then the number of outputs of the Dense layer is one.

104
00:06:20,250 --> 00:06:24,840
The reason why we set the number of outputs to one is simple.

105
00:06:25,080 --> 00:06:31,710
Remember, we have these inputs, and then we have a single output.

106
00:06:32,010 --> 00:06:34,020
So here we have eight inputs.

107
00:06:34,020 --> 00:06:35,340
Let's say one.

108
00:06:35,340 --> 00:06:38,970
Well, we have inputs one up to eight.

109
00:06:38,970 --> 00:06:42,000
So we go from 1 to 8 inputs.

110
00:06:42,000 --> 00:06:45,660
But the output or the number of outputs here is just one.

111
00:06:45,660 --> 00:06:52,350
So because it's just one, when you're creating the Dense layer you just need to specify the number

112
00:06:52,350 --> 00:06:56,190
of outputs your model is expected to produce.

113
00:06:56,190 --> 00:07:03,330
And so, getting back here, after adding all these layers we call model.summary() and we get

114
00:07:03,330 --> 00:07:04,680
that model summary.

115
00:07:04,680 --> 00:07:05,820
So that's fine.

116
00:07:05,820 --> 00:07:11,820
You see we have the exact same model using the add method.

117
00:07:11,820 --> 00:07:16,290
So instead of putting all these layers in this list you could make use of the add method.
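The `add` variant can be sketched as follows, again assuming TensorFlow 2.x. The random `features` array and the model name `first_sequential_model` are placeholders chosen for this example.
```python
import numpy as np
import tensorflow as tf
# Placeholder data: 1000 rows with 8 features each (random, for illustration only).
features = np.random.default_rng(0).normal(size=(1000, 8)).astype("float32")
normalizer = tf.keras.layers.Normalization()
normalizer.adapt(features)
# Start from an empty, named Sequential model and add the layers one by one.
model = tf.keras.Sequential(name="first_sequential_model")
model.add(normalizer)
# A single unit: eight inputs go in, one output comes out.
model.add(tf.keras.layers.Dense(1))
# Calling the model once on a small batch makes sure every layer is built.
_ = model(features[:2])
model.summary()
```
Both styles produce the same stack of layers, so choosing between the list and `add` is mostly a matter of taste.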

118
00:07:16,290 --> 00:07:22,980
So if you're wondering why we chose the sequential API instead of maybe the functional API or the model

119
00:07:22,980 --> 00:07:25,380
subclassing, the answer is simple.

120
00:07:25,380 --> 00:07:31,740
We have a model which is made by stacking up different layers.

121
00:07:31,740 --> 00:07:35,100
So here we could consider this to be our normalizer layer.

122
00:07:35,100 --> 00:07:36,420
Let's enlarge this.

123
00:07:36,420 --> 00:07:37,980
Now it takes up more space.

124
00:07:37,980 --> 00:07:42,840
So our model is made by stacking up our normalizer layer.

125
00:07:42,840 --> 00:07:46,200
Let's copy this and paste it here, then put that back.

126
00:07:46,200 --> 00:07:48,210
So we have our normalizer layer.

127
00:07:48,210 --> 00:07:50,610
And then we also have our dense layer.

128
00:07:50,610 --> 00:07:57,720
So because our model is composed of these two stacked layers, we could make use

129
00:07:57,720 --> 00:08:02,850
of the Sequential API, as this API suits these kinds of model configurations.

130
00:08:02,850 --> 00:08:07,110
So if we have, let's say, n layers... let's shrink this now.

131
00:08:07,110 --> 00:08:13,260
So we could have many layers stacked, and as long as they're stacked sequentially,

132
00:08:13,260 --> 00:08:18,450
you could make use of the Sequential API. For more complex configurations,

133
00:08:18,450 --> 00:08:23,460
you may want to work with the functional API or model subclassing, but we're going to look at those in

134
00:08:23,460 --> 00:08:24,840
some subsequent sections.

135
00:08:24,840 --> 00:08:28,980
So we have this up to, let's say, n layers.

136
00:08:29,310 --> 00:08:33,060
See, we have all those different layers, which could be stacked.

137
00:08:33,060 --> 00:08:38,130
And now we could make use of our sequential API.

138
00:08:38,160 --> 00:08:45,000
Now, if you look at this model summary, you'll notice that the Dense layer has nine parameters.

139
00:08:45,000 --> 00:08:47,970
The reason why we have these nine parameters is simple.

140
00:08:47,970 --> 00:08:52,170
We have an input here which has eight features.

141
00:08:52,680 --> 00:08:54,870
We are working with eight different features.

142
00:08:54,870 --> 00:08:57,300
Our input was 1000 by eight.

143
00:08:57,300 --> 00:09:00,210
And so here we have eight features.

144
00:09:00,210 --> 00:09:06,360
Now, with these eight features going into our model, we have this input, which we'll look

145
00:09:06,360 --> 00:09:07,200
at as x.

146
00:09:07,200 --> 00:09:11,070
We're going to ignore the batch dimension here.

147
00:09:11,070 --> 00:09:12,330
We ignore this dimension.

148
00:09:12,330 --> 00:09:16,860
We focus on this one, where we have eight features.

149
00:09:16,860 --> 00:09:24,420
And now, taking this x, you would find that we have a weight for each

150
00:09:24,420 --> 00:09:26,400
and every x here.

151
00:09:26,430 --> 00:09:27,510
Remember this is x.

152
00:09:27,510 --> 00:09:37,590
This x we have here could be rewritten as x1, x2, up to x8.

153
00:09:38,160 --> 00:09:42,960
You see that you could put this in a vector where we have x1 up to x8.

154
00:09:42,960 --> 00:09:52,710
And so when we have this input, we could simply do m1x1 + m2x2.

155
00:09:53,370 --> 00:09:54,090
This is,

156
00:09:54,090 --> 00:09:54,960
this is x1.

157
00:09:54,960 --> 00:09:57,000
This is x2, which comes from our inputs.

158
00:09:57,000 --> 00:10:03,690
So m2x2, then right up to

159
00:10:03,690 --> 00:10:09,240
m8x8.

160
00:10:09,240 --> 00:10:13,740
So all of this here represents our mx.

161
00:10:13,890 --> 00:10:15,840
And then the plus c

162
00:10:15,870 --> 00:10:17,910
here is the bias.

163
00:10:17,910 --> 00:10:20,580
So we'll have plus c.

164
00:10:20,580 --> 00:10:30,180
And now, for our output, we have y, where y = m1x1 + ... + m8x8 + c.

165
00:10:30,180 --> 00:10:38,460
So one thing you could notice already is that we have one, two, up to eight weights

166
00:10:38,460 --> 00:10:42,540
plus this bias, giving us nine parameters.
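The same count can be checked with a small pure-Python sketch. The weight, bias, and input values below are made up for illustration; only the structure (eight weights plus one bias) comes from the lecture.
```python
# Hypothetical weights, bias, and one 8-feature input row, for illustration only.
weights = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 0.25, 1.0]  # m1 ... m8
bias = 0.1  # c
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # x1 ... x8
# y = m1*x1 + m2*x2 + ... + m8*x8 + c
y = sum(m_i * x_i for m_i, x_i in zip(weights, x)) + bias
# One weight per input feature, plus a single bias.
num_parameters = len(weights) + 1
print("y =", y)
print("parameters =", num_parameters)
```
With 8 input features and 1 output, the count is 8 + 1 = 9, matching the Dense layer's row in the summary.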

167
00:10:42,630 --> 00:10:48,240
And that's why, when you get back here, you see we have nine

168
00:10:48,240 --> 00:10:55,860
parameters. For the normalization layer, since it has already been adapted to the dataset, you have

169
00:10:55,860 --> 00:10:57,450
non-trainable parameters.

170
00:10:57,450 --> 00:11:03,210
So these are the non-trainable parameters. We don't need to update them as we train because

171
00:11:03,210 --> 00:11:06,030
those parameters, that is, the mean

172
00:11:06,030 --> 00:11:11,280
and the standard deviation or the variance, have already been computed from the dataset.

173
00:11:11,280 --> 00:11:12,750
That's why they are non-trainable.

174
00:11:12,750 --> 00:11:19,650
These, on the other hand, are trainable, meaning that we want to update them as we train our model.
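To illustrate why the normalization statistics stay fixed, here is a pure-Python sketch of what adapting a normalizer amounts to. It assumes a tiny made-up dataset with two features and is not the Keras implementation itself.
```python
import math
# Made-up dataset: 4 rows, 2 features (the lecture's data has 8 features).
data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
columns = list(zip(*data))
# Computed once from the data, then held fixed: nothing here is learned in training.
means = [sum(col) / len(col) for col in columns]
variances = [
    sum((value - mean) ** 2 for value in col) / len(col)
    for col, mean in zip(columns, means)
]
def normalize(row):
    """Shift each feature by its mean and scale by its standard deviation."""
    return [
        (value - mean) / math.sqrt(variance)
        for value, mean, variance in zip(row, means, variances)
    ]
normalized = [normalize(row) for row in data]
print(means, variances)
```
Each normalized feature ends up centered on zero, and the stored means and variances never change afterwards, which is why a summary reports them as non-trainable.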

175
00:11:19,650 --> 00:11:25,620
Another nice utility function which we could use is TensorFlow's plot_model.

176
00:11:25,620 --> 00:11:27,720
So we could do TensorFlow,

177
00:11:27,870 --> 00:11:29,400
that is, tf,

178
00:11:29,640 --> 00:11:37,650
then Keras, then utils, then plot_model: tf.keras.utils.plot_model.

179
00:11:37,650 --> 00:11:39,420
We specify the model.

180
00:11:39,420 --> 00:11:43,380
Then, once we specify the model, we also specify the output file.

181
00:11:43,590 --> 00:11:52,200
So here we set to_file to my first model dot png.

182
00:11:52,410 --> 00:11:57,300
And then we set show_shapes to True.

183
00:11:57,900 --> 00:12:01,200
So we have to_file equal to that.

184
00:12:01,200 --> 00:12:03,090
Take that off and there we go.

185
00:12:03,090 --> 00:12:05,220
So let's run that again and see what we get.

186
00:12:06,090 --> 00:12:13,590
See, we now have this output, a nice-looking figure which summarizes our model.

187
00:12:13,590 --> 00:12:17,730
We now have this image right here, which we could download.
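The plotting call can be wrapped as in the sketch below. The helper name `save_model_diagram` and the default file name are assumptions made for this example, and the call relies on the optional pydot package and Graphviz being installed.
```python
import tensorflow as tf
def save_model_diagram(model, path="my_first_model.png"):
    """Write a diagram of `model` to `path`, including each layer's shapes."""
    # Assumption: the optional pydot package and Graphviz are available.
    return tf.keras.utils.plot_model(model, to_file=path, show_shapes=True)
```
Calling `save_model_diagram(model)` on a built Keras model should write the image file to the working directory, from where it can be downloaded.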

188
00:12:17,730 --> 00:12:18,900
So that's it.

189
00:12:18,930 --> 00:12:23,070
We have seen how to create these kinds of summaries.

190
00:12:23,160 --> 00:12:28,020
We understand what each and every statement we have here means.

191
00:12:28,020 --> 00:12:30,840
And then we've seen how to make these beautiful plots.

192
00:12:30,870 --> 00:12:35,820
Now, in the case where we didn't have a normalizer which already understands the data, given that

193
00:12:35,820 --> 00:12:40,200
it has been adapted to it, we would need to specify our input layer.

194
00:12:40,200 --> 00:12:44,070
So here we have the input layer and then we'll specify its shape.

195
00:12:44,070 --> 00:12:45,840
So here we give it eight.

196
00:12:45,990 --> 00:12:50,010
Notice how we ignore the batch dimension.

197
00:12:50,040 --> 00:12:55,020
See, we're ignoring this because it could take any value.

198
00:12:55,020 --> 00:12:59,730
We could work with data in batches of eight.

199
00:12:59,730 --> 00:13:03,780
We could work with data in much larger batches, and so on and so forth.

200
00:13:03,780 --> 00:13:09,510
And so because this varies depending on whatever situation we have to work

201
00:13:09,510 --> 00:13:13,860
with, we focus on the other dimensions.

202
00:13:13,860 --> 00:13:15,060
In this case we have eight.

203
00:13:15,060 --> 00:13:16,860
So there we have our input layer.

204
00:13:16,860 --> 00:13:20,910
Let's go ahead and import that InputLayer.

205
00:13:20,910 --> 00:13:21,930
Run that again.

206
00:13:22,140 --> 00:13:25,770
And then let's run this and then see what we get.

207
00:13:25,770 --> 00:13:28,140
So here we have InputLayer.

208
00:13:28,140 --> 00:13:28,920
We put a comma there.

209
00:13:28,920 --> 00:13:31,590
We separate the different layers with commas because that's a list.

210
00:13:31,890 --> 00:13:35,340
We get an error: unrecognized keyword argument shape.

211
00:13:35,340 --> 00:13:37,200
Well, this is actually input_shape.

212
00:13:37,200 --> 00:13:40,020
So let's change this to input_shape.

213
00:13:40,930 --> 00:13:42,100
And that's fine.

214
00:13:42,100 --> 00:13:43,030
So you see that?

215
00:13:43,030 --> 00:13:44,050
That works fine.
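A minimal sketch of the same model with an explicit input layer, assuming a TensorFlow 2.x release whose `InputLayer` accepts `input_shape`, as in the lecture; newer Keras releases name this argument `shape`. The random `features` array is a stand-in for the real data.
```python
import numpy as np
import tensorflow as tf
# Placeholder data: 1000 rows with 8 features each (random, for illustration only).
features = np.random.default_rng(0).normal(size=(1000, 8)).astype("float32")
normalizer = tf.keras.layers.Normalization()
normalizer.adapt(features)
model = tf.keras.Sequential([
    # Only the feature dimension is given; the batch dimension is left open.
    tf.keras.layers.InputLayer(input_shape=(8,)),
    normalizer,
    tf.keras.layers.Dense(1),
])
model.summary()
```
Because the batch dimension is left unspecified, the same model accepts batches of any size, as long as each row has eight features.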

216
00:13:44,050 --> 00:13:46,030
Now let's change this to nine.

217
00:13:46,030 --> 00:13:47,110
Run that again.

218
00:13:47,110 --> 00:13:53,470
And you see we have an error because the normalization layer expects our input shape to be eight.

219
00:13:53,980 --> 00:13:56,920
Change it back to eight, run that again, and we see our plot.

220
00:13:56,920 --> 00:13:58,600
Now notice how this plot has changed.

221
00:13:58,600 --> 00:14:02,020
Now we have eight instead of None as before.

222
00:14:02,020 --> 00:14:06,130
So let's take off the input layer and run that.

223
00:14:06,790 --> 00:14:13,210
You see, before we had None, None, but now we have None, eight when we add this

224
00:14:13,210 --> 00:14:14,170
input layer.

225
00:14:14,170 --> 00:14:22,990
So this utils function understands that we now have inputs which are of shape None by eight,

226
00:14:22,990 --> 00:14:24,340
or batch by eight.

227
00:14:24,340 --> 00:14:26,860
So that's it for the section on modeling.

228
00:14:26,860 --> 00:14:32,110
We shall now move on to our next section which is that of error measurement.