1
00:00:00,780 --> 00:00:07,960
‫In the last lecture, we saw how to create, how to compile and how to train our classification model

2
00:00:08,220 --> 00:00:09,780
‫in TensorFlow Keras.

3
00:00:11,590 --> 00:00:17,170
‫In this video, we will build and train a regression model using Keras.

4
00:00:18,860 --> 00:00:26,510
‫For this, we will be using a very popular regression dataset, the California housing dataset.

5
00:00:28,560 --> 00:00:33,300
‫This dataset is available in the sklearn datasets module.

6
00:00:35,140 --> 00:00:42,460
‫The objective here is to predict house prices using eight different independent variables.

7
00:00:43,540 --> 00:00:44,980
‫So let's get started.

8
00:00:45,790 --> 00:00:52,080
‫First, we are importing some basic libraries, such as NumPy, pandas and Matplotlib.

9
00:00:54,920 --> 00:00:57,770
‫Then we are importing TensorFlow and Keras.

10
00:01:00,250 --> 00:01:08,050
‫And then, since this data is available in sklearn datasets, we are also importing California housing

11
00:01:08,230 --> 00:01:09,570
‫from sklearn datasets.

12
00:01:10,930 --> 00:01:15,760
‫And we are saving this dataset in a variable called housing.

13
00:01:18,090 --> 00:01:23,670
‫I also want to share one small shortcut to access the help of any function.

14
00:01:23,910 --> 00:01:32,310
‫If you click between the parentheses of any function, then hold the Shift key and press Tab,

15
00:01:32,670 --> 00:01:39,900
‫that is, if you hit Shift plus Tab, it will open the help or documentation of that function.

16
00:01:41,260 --> 00:01:46,530
‫If you hit shift and tab two times, it will expand the documentation.

17
00:01:48,050 --> 00:01:50,990
‫You can see here that this function returns

18
00:01:51,080 --> 00:01:54,380
‫different attributes, such as .data.

19
00:01:54,780 --> 00:01:57,260
‫It will give us the independent variables.

20
00:01:57,490 --> 00:02:00,600
‫.target will give us the dependent variable.

21
00:02:01,760 --> 00:02:05,270
‫And feature_names will give us the names of the features.

22
00:02:07,130 --> 00:02:08,810
‫Let's just close this.

23
00:02:09,590 --> 00:02:16,850
‫So if you have any doubt regarding any function, just click between its parentheses and hit Shift plus Tab.

24
00:02:19,410 --> 00:02:24,330
‫I have already listed some of the information about this dataset.

25
00:02:25,920 --> 00:02:29,840
‫In this dataset, there are around 20,000 records.

26
00:02:30,780 --> 00:02:33,240
‫There are eight independent variables.

27
00:02:34,230 --> 00:02:37,930
‫The first variable is MedInc.

28
00:02:38,730 --> 00:02:41,910
‫This is the median income in the particular block

29
00:02:41,970 --> 00:02:43,090
‫where the house is located.

30
00:02:44,100 --> 00:02:46,940
‫Then we have the second variable, HouseAge.

31
00:02:47,430 --> 00:02:50,550
‫This is the median house age in that block.

32
00:02:51,570 --> 00:02:55,560
‫Then we have average rooms, which is the average number of rooms.

33
00:02:56,010 --> 00:03:01,050
‫And then we have average bedrooms for average number of bedrooms.

34
00:03:01,710 --> 00:03:03,660
‫Next, we have the population variable.

35
00:03:04,080 --> 00:03:05,700
‫That is the block population.

36
00:03:07,680 --> 00:03:10,260
‫Then we have average occupancy.

37
00:03:10,470 --> 00:03:12,650
‫That is the average house occupancy.

38
00:03:13,260 --> 00:03:21,090
‫And then the latitude and longitude of that block. Using these eight independent variables,

39
00:03:21,510 --> 00:03:24,540
‫we want to predict the value of the house.

40
00:03:25,020 --> 00:03:27,810
‫The values are in units of one hundred thousand dollars.

41
00:03:28,080 --> 00:03:30,660
‫So if our y variable is 5,

42
00:03:30,720 --> 00:03:32,850
‫that means the value of that house is

43
00:03:33,360 --> 00:03:34,980
‫five hundred thousand dollars.

44
00:03:37,370 --> 00:03:38,960
‫So this is our dataset.

45
00:03:39,440 --> 00:03:44,150
‫We have these eight independent variables and one target variable, price.

46
00:03:45,650 --> 00:03:50,960
‫If you want some more detail about this dataset, you can click on this documentation link.

47
00:03:52,700 --> 00:03:57,550
‫This will open the official sklearn documentation of this dataset. Here

48
00:03:58,090 --> 00:04:01,220
‫you will get to know about all the parameters that you can pass

49
00:04:02,040 --> 00:04:05,360
‫and what is included in this dataset.

50
00:04:06,620 --> 00:04:07,910
‫Let's just go back.

51
00:04:10,990 --> 00:04:15,160
‫So this housing is in the form of a dictionary.

52
00:04:17,650 --> 00:04:20,950
‫One of its key-value pairs is feature_names.

53
00:04:21,860 --> 00:04:25,100
‫So let's look at the feature names first.

54
00:04:26,680 --> 00:04:31,120
‫You can see these are the eight variable names that we have discussed already.

55
00:04:33,720 --> 00:04:41,950
‫Now, to access the independent variables, we have to use housing.data, and to access the dependent variable,

56
00:04:42,330 --> 00:04:44,540
‫we have to use housing.target.
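The dictionary-style object described above can be sketched as follows. This is only an illustration: to avoid a download, it builds a stand-in `Bunch` filled with random numbers (an assumption), whereas the real object would come from `sklearn.datasets.fetch_california_housing()`. The feature names are the ones scikit-learn uses for this dataset.

```python
import numpy as np
from sklearn.utils import Bunch

feature_names = ["MedInc", "HouseAge", "AveRooms", "AveBedrms",
                 "Population", "AveOccup", "Latitude", "Longitude"]

# Stand-in object with random values (assumption, so this runs offline).
# The real data would come from sklearn.datasets.fetch_california_housing().
rng = np.random.default_rng(0)
housing = Bunch(
    data=rng.normal(size=(100, 8)),          # independent variables
    target=rng.uniform(0.5, 5.0, size=100),  # house value in units of $100,000
    feature_names=feature_names,
)

print(housing.feature_names)  # the eight variable names
X = housing.data              # attribute-style access
y = housing["target"]         # dictionary-style access works as well
print(X.shape, y.shape)
```

Both access styles return the same underlying arrays, which is why the lecture can describe `housing` as a dictionary while still writing `housing.data`.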

57
00:04:47,430 --> 00:04:56,190
‫So in these lines of code, we are first splitting our data into train-full and test datasets.

58
00:04:57,450 --> 00:05:04,890
‫Then we are further dividing this train-full dataset into X train and X validation datasets.

59
00:05:08,200 --> 00:05:16,240
‫We are importing train_test_split from sklearn model selection, and then we are using

60
00:05:16,330 --> 00:05:19,240
‫the train_test_split function to divide our data.

61
00:05:20,860 --> 00:05:24,760
‫We are not giving any additional parameter for test size.

62
00:05:25,330 --> 00:05:31,080
‫That's because by default, the test size is 25 percent of the total data.

63
00:05:32,200 --> 00:05:40,120
‫So 25 percent of the total data, which is around 20,000 rows, will go into the test set.

64
00:05:40,630 --> 00:05:45,780
‫And then the remaining 75 percent will go into the training set.

65
00:05:47,160 --> 00:05:49,320
‫Out of that 75 percent,

66
00:05:49,920 --> 00:05:56,890
‫again, 25 percent will go into the validation set and the remaining 75 percent will go into our training set.
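The two-stage split can be sketched like this. The arrays are random placeholders with 20,640 rows (the size of the California housing data in scikit-learn), and `random_state=42` is an assumption added only for repeatability.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data with the same shape as the California housing features.
rng = np.random.default_rng(42)
X = rng.normal(size=(20640, 8))
y = rng.normal(size=20640)

# First split: 75% train-full, 25% test (test_size defaults to 0.25).
X_train_full, X_test, y_train_full, y_test = train_test_split(
    X, y, random_state=42)

# Second split: 75% of train-full for training, 25% for validation.
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train_full, y_train_full, random_state=42)

print(X_train.shape, X_valid.shape, X_test.shape)
# (11610, 8) (3870, 8) (5160, 8)
```

These sizes match the rough figures quoted later in the lecture: about 11,600 training rows, about 4,000 validation rows and about 5,000 test rows.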

67
00:05:58,270 --> 00:05:59,800
‫Let's run this.

68
00:06:02,700 --> 00:06:11,430
‫The next step is to preprocess our data, and we will be using the StandardScaler from sklearn to standardize

69
00:06:11,430 --> 00:06:11,930
‫our data.

70
00:06:14,180 --> 00:06:20,490
‫In standardizing, we subtract the mean of each variable from its individual values.

71
00:06:21,020 --> 00:06:30,080
‫And then we also divide by the standard deviation, because at the end we want all the variables to have a mean of

72
00:06:30,080 --> 00:06:33,080
‫zero and a variance of one.
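As a small worked example of standardizing, the sketch below uses a made-up vector of eight numbers. It subtracts the mean and divides by the standard deviation (the square root of the variance), which is what yields a mean of zero and a variance of one.

```python
import numpy as np

# Made-up values for one variable; mean is 5.0 and standard deviation is 2.0.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Standardize: subtract the mean, then divide by the standard deviation.
z = (x - x.mean()) / x.std()

print(z)
print("mean:", z.mean(), "variance:", z.var())
```

Dividing by the variance instead of the standard deviation would not give unit variance unless the variance already happened to be one.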

73
00:06:34,340 --> 00:06:42,230
‫This is a standard procedure when creating any machine learning model. The steps here are very simple.

74
00:06:42,530 --> 00:06:47,270
‫First, we are importing StandardScaler from sklearn preprocessing.

75
00:06:48,320 --> 00:06:53,900
‫Then we are creating the scaler object using the StandardScaler class.

76
00:06:55,150 --> 00:06:56,780
‫Then we are fitting

77
00:06:56,780 --> 00:07:00,140
‫this scaler object using our X_train data.

78
00:07:01,920 --> 00:07:09,230
‫So on our X_train data, this scaler will find the values to subtract, the means, and the values to divide by,

79
00:07:09,230 --> 00:07:10,020
‫the standard deviations.

80
00:07:10,620 --> 00:07:12,480
‫Then we will use those values,

81
00:07:12,660 --> 00:07:18,120
‫or rather this scaler object, to standardize our validation and test sets as well.

82
00:07:19,550 --> 00:07:22,200
‫Just to repeat, we are fitting

83
00:07:22,260 --> 00:07:24,780
‫this scaler object on our training data.

84
00:07:25,320 --> 00:07:29,070
‫And we are transforming our validation and test sets using

85
00:07:29,070 --> 00:07:33,150
‫this scaler object that we have fitted on our X_train data.

86
00:07:35,840 --> 00:07:40,190
‫To fit, we will be using the fit_transform method,

87
00:07:40,440 --> 00:07:44,700
‫with X_train as the parameter. To transform,

88
00:07:44,760 --> 00:07:51,990
‫we will be using the .transform method of this scaler object, and we will be passing the relevant datasets

89
00:07:51,990 --> 00:07:52,220
‫here.
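The fit-on-train, transform-everything pattern can be sketched as below. The three arrays are random placeholders, and the `_scaled` names are an assumption made so that the originals stay available for comparison; the lecture overwrites the original names instead.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Placeholder splits with a non-zero mean and non-unit spread.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=10.0, scale=3.0, size=(200, 8))
X_valid = rng.normal(loc=10.0, scale=3.0, size=(50, 8))
X_test = rng.normal(loc=10.0, scale=3.0, size=(50, 8))

scaler = StandardScaler()

# fit_transform learns the means and standard deviations from the training
# data only, then applies them to that same data.
X_train_scaled = scaler.fit_transform(X_train)

# transform reuses the training statistics; nothing is re-fitted here.
X_valid_scaled = scaler.transform(X_valid)
X_test_scaled = scaler.transform(X_test)

print(scaler.mean_[:3], scaler.scale_[:3])
```

Reusing the training statistics means the validation and test sets will have means close to, but not exactly, zero, which is expected.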

90
00:07:55,720 --> 00:07:59,210
‫So let's create the standardized datasets.

91
00:08:00,430 --> 00:08:03,400
‫We are saving these objects under their original names.

92
00:08:03,760 --> 00:08:08,110
‫So we are replacing the original X_train with the standardized version of X_train,

93
00:08:09,280 --> 00:08:14,370
‫the original X validation set with the standardized version of the X validation set,

94
00:08:14,470 --> 00:08:17,420
‫and the same for the test set as well. Let's just

95
00:08:17,830 --> 00:08:22,340
‫run this. If you want some more information on this scaling,

96
00:08:22,690 --> 00:08:25,900
‫you can always refer to the sklearn documentation

97
00:08:26,100 --> 00:08:27,640
‫about StandardScaler.

98
00:08:29,380 --> 00:08:31,990
‫The next step is to set random seeds.

99
00:08:33,830 --> 00:08:37,760
‫This is to generate the same result every time we run this model.
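Setting the seeds might look like the sketch below. The seed value 42 is an assumption; the lecture does not say which value is used.

```python
import numpy as np
import tensorflow as tf

# Fix the NumPy and TensorFlow random number generators so that weight
# initialization and shuffling are repeatable between runs.
np.random.seed(42)
tf.random.set_seed(42)
```

Some sources of nondeterminism, such as certain GPU operations, may still cause small differences between runs.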

100
00:08:42,220 --> 00:08:49,630
‫Now, as I said earlier, our initial dataset had around 20,000 rows or records.

101
00:08:50,440 --> 00:08:55,060
‫Now let us see the shape of our X_train dataset.

102
00:08:57,560 --> 00:09:04,580
‫Here you can see we have eight columns and around eleven thousand six hundred records

103
00:09:05,480 --> 00:09:05,960
‫in our X_train dataset

104
00:09:06,020 --> 00:09:06,980
‫.

105
00:09:08,950 --> 00:09:17,710
‫We should have around five thousand records in our X_test dataset and around 4,000 in the validation set.

106
00:09:20,820 --> 00:09:26,220
‫Now, let's create the structure for our regression neural network.

107
00:09:28,890 --> 00:09:31,590
‫Here we will be first having an input layer.

108
00:09:32,960 --> 00:09:37,520
‫Then we will be having the first dense layer with 30 neurons.

109
00:09:38,180 --> 00:09:42,920
‫Then we want to create a second dense layer with another 30 neurons.

110
00:09:43,760 --> 00:09:51,620
‫And then since this is a regression problem, we will be having a single output neuron without any

111
00:09:51,620 --> 00:09:52,730
‫activation function.

112
00:09:54,620 --> 00:09:59,840
‫A single neuron, since we want a continuous value as our output.

113
00:10:01,370 --> 00:10:04,060
‫Again, we will be using the sequential api.

114
00:10:05,060 --> 00:10:10,040
‫We are saving this structure or our model as model.

115
00:10:12,550 --> 00:10:14,800
‫And then for the first layer

116
00:10:16,850 --> 00:10:17,630
‫we are writing

117
00:10:17,690 --> 00:10:19,730
‫keras.layers.Dense.

118
00:10:20,360 --> 00:10:25,950
‫Here, in the parentheses, we have to provide the number of neurons, which is thirty.

119
00:10:26,280 --> 00:10:31,710
‫Then, as discussed in our theory lectures, we will be using ReLU as the activation function.

120
00:10:32,450 --> 00:10:40,010
‫And then, since this is our first hidden layer, we need to provide the input shape. Since the number

121
00:10:40,010 --> 00:10:43,630
‫of independent variables in our data is eight,

122
00:10:44,400 --> 00:10:47,240
‫we will be using an input shape equal to 8.

123
00:10:52,130 --> 00:10:54,860
‫You can also write it like this: input shape

124
00:10:55,220 --> 00:11:02,680
‫equal to X_train.shape, taking the second element onward of that shape.

125
00:11:03,860 --> 00:11:09,140
‫This way you don't have to worry about changing this number every time you change your dataset.

126
00:11:09,650 --> 00:11:17,870
‫You can just write X_train.shape and it will automatically get the number of variables from the

127
00:11:17,870 --> 00:11:20,900
‫shape attribute of our X_train object.
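To make the slicing concrete, here is a tiny sketch with a placeholder array of the same shape as the training set. Slicing the shape tuple from index 1 drops the row count and keeps only the per-sample shape.

```python
import numpy as np

# Placeholder array: 11,610 rows and 8 columns, like the training features.
X_train = np.zeros((11610, 8))

print(X_train.shape)      # (11610, 8)
print(X_train.shape[1:])  # (8,)

input_shape = X_train.shape[1:]  # stays correct if the number of columns changes
```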

128
00:11:23,680 --> 00:11:29,050
‫So this is the structure of our first dense layer. We will create the

129
00:11:29,140 --> 00:11:31,330
‫second dense layer in a similar fashion.

130
00:11:31,780 --> 00:11:34,950
‫We will be using keras.layers.Dense,

131
00:11:35,380 --> 00:11:39,580
‫then the number of neurons in the parentheses, which is 30,

132
00:11:40,030 --> 00:11:42,300
‫and the activation function is ReLU.

133
00:11:44,070 --> 00:11:49,700
‫Similarly, for the output layer, we will be using keras.layers.Dense.

134
00:11:50,130 --> 00:11:56,550
‫And since this is a regression problem, we will be using a single neuron without any activation function.
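The architecture described above could be written roughly as follows. This is a sketch under the assumption that Keras is imported from TensorFlow; newer Keras versions may warn that an explicit input layer is preferred over `input_shape`, but the argument is still accepted.

```python
from tensorflow import keras

n_features = 8  # number of independent variables

model = keras.models.Sequential([
    # First hidden layer: 30 neurons, ReLU, and the input shape.
    keras.layers.Dense(30, activation="relu", input_shape=(n_features,)),
    # Second hidden layer: another 30 neurons with ReLU.
    keras.layers.Dense(30, activation="relu"),
    # Output layer: one neuron, no activation, for a continuous value.
    keras.layers.Dense(1),
])

model.summary()
```

The parameter counts are 8 × 30 + 30 = 270 for the first layer, 30 × 30 + 30 = 930 for the second and 30 + 1 = 31 for the output, for a total of 1,231.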

135
00:11:58,670 --> 00:11:59,570
‫Just run this.

136
00:12:01,580 --> 00:12:08,000
‫And again, one important thing: you can add comments using the hash symbol inside the cells.

137
00:12:09,560 --> 00:12:16,980
‫Python will execute only this part of the code and will not execute anything that comes after a

138
00:12:16,980 --> 00:12:20,050
‫hash. The hash marks the start of a comment.

139
00:12:21,670 --> 00:12:30,700
‫So now we have created the structure or architecture of our neural network. Now, just to confirm and view

140
00:12:30,700 --> 00:12:31,310
‫the structure,

141
00:12:31,670 --> 00:12:35,860
‫we can call the summary method. Just write model dot

142
00:12:40,890 --> 00:12:47,490
‫summary. Here you will get the information about the structure that we have created.

143
00:12:47,640 --> 00:12:50,280
‫So we have the first dense layer with 30 neurons.

144
00:12:51,330 --> 00:12:53,460
‫We have the second dense layer with 30 neurons.

145
00:12:53,940 --> 00:12:58,350
‫And lastly, we have a single output layer with one neuron.

146
00:13:00,450 --> 00:13:02,030
‫This is what we wanted.

147
00:13:03,180 --> 00:13:06,680
‫The next step should be to compile this model.

148
00:13:09,280 --> 00:13:14,830
‫Again, the compile method works similarly for both classification and regression models.

149
00:13:15,970 --> 00:13:19,900
‫First, we have to specify the loss. In classification,

150
00:13:19,930 --> 00:13:21,380
‫we were using cross-entropy.

151
00:13:22,360 --> 00:13:29,970
‫But here, since we are running a regression, we have to use mean squared error, also known as MSE.

152
00:13:31,840 --> 00:13:34,140
‫The second parameter is optimizer.

153
00:13:36,010 --> 00:13:41,200
‫Again, here also we are using SGD, stochastic gradient descent.

154
00:13:42,370 --> 00:13:49,870
‫And here we are also providing the learning rate. By default, the value of the learning rate is zero point

155
00:13:49,870 --> 00:13:52,740
‫zero one, and to change it,

156
00:13:53,740 --> 00:13:54,850
‫you can just write

157
00:13:56,150 --> 00:13:58,330
‫the new value in the parentheses.
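A compile call along these lines might look like the sketch below. The learning rate of `1e-3` is an assumption chosen for illustration, since the lecture does not state the value it passes; the small model is rebuilt here only so the snippet runs on its own.

```python
from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=(8,)),
    keras.layers.Dense(30, activation="relu"),
    keras.layers.Dense(1),
])

model.compile(
    loss="mean_squared_error",                           # MSE for regression
    optimizer=keras.optimizers.SGD(learning_rate=1e-3),  # assumed value; default is 0.01
    metrics=["mae"],                                     # optional extra metric
)
```

Passing the optimizer as an object rather than the string `"sgd"` is what makes it possible to override the default learning rate.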

158
00:14:00,540 --> 00:14:04,090
‫We have already discussed what is learning rate in our theory lecture.

159
00:14:05,080 --> 00:14:08,330
‫So if you have any doubts, just revisit that lecture.

160
00:14:09,940 --> 00:14:13,270
‫And then the next parameter that we are passing is metrics.

161
00:14:13,600 --> 00:14:15,250
‫This is an optional parameter.

162
00:14:17,490 --> 00:14:20,060
‫In classification, we were using accuracy.

163
00:14:21,080 --> 00:14:31,490
‫But in regression, we can use mean absolute error, or MAE. It is the average of the absolute differences between

164
00:14:31,490 --> 00:14:34,040
‫the predicted values and the actual values,

165
00:14:35,060 --> 00:14:39,960
‫whereas the mean squared error is the average of the squares of those differences.
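The difference between the two quantities can be checked by hand on a few made-up numbers (the values below are illustrative only, not results from the lecture).

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.0])  # made-up actual values
y_pred = np.array([2.5, 5.0, 4.0])  # made-up predictions

errors = y_pred - y_true            # [-0.5, 0.0, 2.0]

mae = np.mean(np.abs(errors))       # (0.5 + 0.0 + 2.0) / 3
mse = np.mean(errors ** 2)          # (0.25 + 0.0 + 4.0) / 3

print("MAE:", mae)
print("MSE:", mse)
```

Because the errors are squared, the single large miss of 2.0 dominates the MSE much more than it dominates the MAE.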

166
00:14:41,390 --> 00:14:49,160
‫So we are calculating both: mean squared error as our loss function, and MAE as the metric we additionally

167
00:14:49,160 --> 00:14:50,060
‫want to calculate.

168
00:14:51,480 --> 00:14:59,370
‫Again, just remember, if you want to look at the documentation or help, just click inside any of

169
00:14:59,370 --> 00:15:00,840
‫the parentheses

170
00:15:01,940 --> 00:15:07,550
‫and press Shift plus Tab. You will get this kind of documentation.

171
00:15:07,880 --> 00:15:12,850
‫And here you can see that by default the learning rate value is 0.01.

172
00:15:13,970 --> 00:15:17,420
‫So in classification, we did not provide any learning rate.

173
00:15:17,750 --> 00:15:22,530
‫So the learning rate that was used there was zero point zero one.

174
00:15:24,260 --> 00:15:27,430
‫But you can always change these values according to your need.

175
00:15:29,450 --> 00:15:30,550
‫Let's run this.

176
00:15:31,360 --> 00:15:33,140
‫So we have compiled our model.

177
00:15:35,640 --> 00:15:40,740
‫The next step is to train our model using the training data.

178
00:15:42,240 --> 00:15:44,340
‫The method or the process is the same.

179
00:15:44,940 --> 00:15:48,760
‫We are creating another object, model_history, for training.

180
00:15:49,530 --> 00:15:57,930
‫Then we are using model.fit, and we are passing our training dataset, the number of epochs and the validation

181
00:15:57,980 --> 00:15:59,730
‫dataset that we have created.
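A minimal training call could be sketched as below. The data is random placeholder data and only 3 epochs are run so the snippet finishes quickly; both are assumptions, as the lecture trains on the scaled housing data for 20 epochs.

```python
import numpy as np
from tensorflow import keras

# Placeholder training and validation data (random, for illustration only).
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(256, 8)), rng.normal(size=(256, 1))
X_valid, y_valid = rng.normal(size=(64, 8)), rng.normal(size=(64, 1))

model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=(8,)),
    keras.layers.Dense(30, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(loss="mean_squared_error",
              optimizer=keras.optimizers.SGD(learning_rate=1e-3),
              metrics=["mae"])

# fit returns a History object; validation metrics are computed every epoch.
model_history = model.fit(X_train, y_train, epochs=3,
                          validation_data=(X_valid, y_valid), verbose=0)

print(sorted(model_history.history))  # names of the recorded metrics
```

The `history` attribute is a plain dictionary with one list per metric and one entry per epoch.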

182
00:16:00,890 --> 00:16:02,950
‫Just run this statement.

183
00:16:05,860 --> 00:16:06,250
‫Again.

184
00:16:08,740 --> 00:16:14,980
‫Just like the classification model, you will get the loss value, which is the mean squared error.

185
00:16:15,910 --> 00:16:20,380
‫You will get the MAE value, the mean absolute error.

186
00:16:21,340 --> 00:16:25,960
‫And similarly, you will get these two values for your validation set as well

187
00:16:26,000 --> 00:16:26,380
‫.

188
00:16:30,070 --> 00:16:37,150
‫And you can see that the loss on both training set and validation set is decreasing with each epoch.

189
00:16:39,550 --> 00:16:43,900
‫Now we have these values for our training and validation set.

190
00:16:44,680 --> 00:16:49,600
‫We can also evaluate the performance of this trained model on our test set.

191
00:16:50,860 --> 00:16:55,120
‫And we are going to use the same method as we did with the classification model.

192
00:16:55,540 --> 00:16:58,140
‫We'll call model.evaluate,

193
00:16:58,450 --> 00:17:00,860
‫and then we will pass our test set.

194
00:17:01,860 --> 00:17:02,500
‫Run this.

195
00:17:04,990 --> 00:17:10,180
‫You can see that on our test dataset the loss is zero point three.

196
00:17:10,420 --> 00:17:19,870
‫That is the MSE, or mean squared error, and the MAE, mean absolute error, is zero point four four nine three.

197
00:17:22,420 --> 00:17:29,200
‫Now, just like in classification, we can call model_history.history. That will give us the

198
00:17:29,320 --> 00:17:32,860
‫values of all these metrics in the form of a dictionary.

199
00:17:33,800 --> 00:17:40,210
‫Run this. Here you will get the loss and MAE on the training dataset

200
00:17:40,300 --> 00:17:40,990
‫.

201
00:17:41,130 --> 00:17:43,200
‫And the validation loss and validation MAE.

202
00:17:43,890 --> 00:17:53,410
‫The beauty of this is that we can plot this dictionary just like we did for classification, and

203
00:17:53,410 --> 00:17:55,630
‫that will show us how our training

204
00:17:55,630 --> 00:18:03,160
‫loss and validation loss are changing with each epoch, and whether we have achieved convergence or

205
00:18:03,160 --> 00:18:03,540
‫not.
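Plotting the history dictionary can be sketched as follows. The numbers in `history` are illustrative placeholders, not results from the lecture; in practice the dictionary would be `model_history.history`. The `Agg` backend line is an assumption that lets the snippet run without a display and can be dropped inside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values standing in for model_history.history.
history = {
    "loss":     [0.80, 0.55, 0.47, 0.43, 0.41],
    "mae":      [0.65, 0.54, 0.50, 0.48, 0.47],
    "val_loss": [0.60, 0.50, 0.45, 0.42, 0.40],
    "val_mae":  [0.57, 0.52, 0.49, 0.47, 0.46],
}

# One line per metric, with the epoch number on the x-axis.
ax = pd.DataFrame(history).plot(figsize=(8, 5))
ax.set_xlabel("epoch")
ax.grid(True)
plt.savefig("learning_curves.png")
```

If the validation curves are still sloping downward at the last epoch, training for more epochs is likely to help; a flat tail suggests convergence.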

206
00:18:04,770 --> 00:18:05,830
‫Let's run this.

207
00:18:07,970 --> 00:18:14,480
‫So you can see we have the loss values and the mae values for our training and validation

208
00:18:14,480 --> 00:18:16,100
‫set plotted on this graph.

209
00:18:16,880 --> 00:18:24,920
‫And one thing to notice is that this graph is still going down, meaning that if we run some more epochs,

210
00:18:25,700 --> 00:18:31,190
‫this will further decrease the losses and improve the accuracy of our model.

211
00:18:33,220 --> 00:18:40,140
‫So this is one way to tell whether you have achieved convergence or not, or whether you have to increase

212
00:18:40,140 --> 00:18:41,650
‫your epoch value or not.

213
00:18:42,700 --> 00:18:46,440
‫You have to look at the validation loss and validation MAE values.

214
00:18:47,620 --> 00:18:49,090
‫So this is the validation loss.

215
00:18:49,420 --> 00:18:53,410
‫And you can clearly see that it is going down.

216
00:18:55,490 --> 00:18:59,630
‫So to improve the accuracy, we can rerun this code to

217
00:19:02,060 --> 00:19:04,210
‫run it for 20 more epochs.

218
00:19:09,140 --> 00:19:11,240
‫So just go there.

219
00:19:13,310 --> 00:19:20,780
‫Now, one important thing about Keras is that it keeps the weights and bias values in memory.

220
00:19:21,020 --> 00:19:27,850
‫So if you just rerun this whole statement again, it will not train the model from scratch.

221
00:19:28,190 --> 00:19:32,720
‫Instead, it will resume training from this position.

222
00:19:34,790 --> 00:19:40,960
‫So running this statement two times is similar to running the statement once with 40 epochs.
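The resume-on-rerun behavior can be demonstrated with a small sketch. The data is synthetic (a linear function of random inputs, an assumption so the model has something learnable), and 5 epochs per call with a learning rate of `1e-2` are chosen only to keep the run short.

```python
import numpy as np
from tensorflow import keras

# Synthetic, learnable data: y is a linear function of X.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))
y = X @ rng.normal(size=(8, 1))
X_train, y_train = X[:256], y[:256]
X_valid, y_valid = X[256:320], y[256:320]
X_test, y_test = X[320:], y[320:]

model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=(8,)),
    keras.layers.Dense(30, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(loss="mean_squared_error",
              optimizer=keras.optimizers.SGD(learning_rate=1e-2),
              metrics=["mae"])

first = model.fit(X_train, y_train, epochs=5,
                  validation_data=(X_valid, y_valid), verbose=0)
weights_after_first = [w.copy() for w in model.get_weights()]
mse_1, mae_1 = model.evaluate(X_test, y_test, verbose=0)

# Calling fit again does not reset the weights; training continues from
# where the first call stopped.
second = model.fit(X_train, y_train, epochs=5,
                   validation_data=(X_valid, y_valid), verbose=0)
mse_2, mae_2 = model.evaluate(X_test, y_test, verbose=0)

print("first call, epoch 1 loss: ", first.history["loss"][0])
print("second call, epoch 1 loss:", second.history["loss"][0])
print("test MSE after each call: ", mse_1, mse_2)
```

To start over from freshly initialized weights, the model has to be rebuilt (and compiled again) before calling `fit`.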

223
00:19:42,660 --> 00:19:45,420
‫If we just re run it one more time.

224
00:19:48,020 --> 00:19:50,450
‫You can see that earlier the loss values were

225
00:19:51,600 --> 00:19:55,160
‫around point seven or point eight for the first epoch

226
00:19:55,660 --> 00:19:57,730
‫and then gradually decreasing.

227
00:19:58,240 --> 00:20:02,170
‫But now we have started from the 21st epoch.

228
00:20:07,620 --> 00:20:11,190
‫Last time, the loss value on our test set was zero point three zero.

229
00:20:11,790 --> 00:20:16,020
‫Let's see whether we have improved this loss value or not.

230
00:20:18,000 --> 00:20:23,310
‫You can see the loss has decreased from zero point three to zero point two five.

231
00:20:25,230 --> 00:20:32,260
‫So our hypothesis was correct: the model had not converged in 20 epochs.

232
00:20:33,000 --> 00:20:38,360
‫There was room for improvement, and we have now trained the model for 20 more epochs.

233
00:20:38,460 --> 00:20:40,370
‫That is a total of 40 epochs.

234
00:20:43,630 --> 00:20:45,300
‫And we can see this graph.

235
00:20:46,850 --> 00:20:48,160
‫You can just.

236
00:20:49,050 --> 00:20:51,240
‫Focus on this validation loss line.

237
00:20:51,870 --> 00:20:56,910
‫Earlier, it was around 0.4. Rerun this.

238
00:20:59,300 --> 00:21:02,780
‫There is a slight decrease in validation loss.

239
00:21:03,020 --> 00:21:06,320
‫Now you can see that the line has flattened out.

240
00:21:06,890 --> 00:21:11,060
‫This means we have achieved convergence on this model.

241
00:21:12,440 --> 00:21:14,360
‫So not just with regression;

242
00:21:14,780 --> 00:21:16,880
‫if you are running a classification model as well,

243
00:21:18,050 --> 00:21:22,640
‫just look at this graph to identify whether you have achieved convergence or not.

244
00:21:24,250 --> 00:21:30,740
‫Now, to predict the values on a new dataset, you can always use the .predict method.

245
00:21:31,400 --> 00:21:34,310
‫So your object name, then the .predict method,

246
00:21:34,760 --> 00:21:36,330
‫and then the new dataset.

247
00:21:37,310 --> 00:21:38,830
‫I don't have any new dataset.

248
00:21:38,960 --> 00:21:44,000
‫So I'm just taking a sample of the first three rows of my X dataset

249
00:21:44,120 --> 00:21:53,420
‫and considering it as my new dataset, then saving the results in y predicted values, using the

250
00:21:53,520 --> 00:21:54,070
‫model dot

251
00:21:54,080 --> 00:21:59,870
‫predict method to predict the values. These are the predicted values using this model.
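Prediction on a few "new" rows can be sketched as below. The data is random placeholder data, the model is trained for a single epoch only so the snippet is quick, and the names `X_new` and `y_pred` are assumptions for illustration.

```python
import numpy as np
from tensorflow import keras

# Placeholder data (random, for illustration only).
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(128, 8)), rng.normal(size=(128, 1))
X_test = rng.normal(size=(32, 8))

model = keras.models.Sequential([
    keras.layers.Dense(30, activation="relu", input_shape=(8,)),
    keras.layers.Dense(30, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(loss="mean_squared_error",
              optimizer=keras.optimizers.SGD(learning_rate=1e-3))
model.fit(X_train, y_train, epochs=1, verbose=0)

# Treat the first three rows as if they were new, unseen data.
X_new = X_test[:3]
y_pred = model.predict(X_new, verbose=0)

print(y_pred.shape)  # one predicted value per row
print(y_pred)
```

Any genuinely new data must go through the same fitted scaler as the training data before being passed to `predict`.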

252
00:22:01,760 --> 00:22:04,580
‫That's all for this lecture. In the next lecture,

253
00:22:04,640 --> 00:22:08,820
‫we will be looking at the functional API of Keras.

254
00:22:09,830 --> 00:22:10,220
‫Thank you.