1
00:00:00,780 --> 00:00:07,960
In the last lecture, we saw how to create, compile, and train our classification model

2
00:00:08,260 --> 00:00:12,730
in TensorFlow Keras. In this video,

3
00:00:12,820 --> 00:00:19,500
we will build and train a regression model using Keras. For this,

4
00:00:19,520 --> 00:00:29,380
we will be using a very popular regression dataset, the California housing dataset. This dataset

5
00:00:29,830 --> 00:00:35,100
is available in the sklearn datasets library.

6
00:00:35,110 --> 00:00:42,460
The objective here is to predict the prices of homes using different independent variables.

7
00:00:43,540 --> 00:00:45,690
So let's get started.

8
00:00:45,820 --> 00:00:56,210
First, we are importing some basic libraries such as NumPy, pandas, and Matplotlib. Then we are importing

9
00:00:56,220 --> 00:01:06,860
TensorFlow and Keras. And then, since this data is available in the sklearn datasets, we are also importing

10
00:01:06,860 --> 00:01:14,720
California housing from sklearn datasets, and we are saving this dataset in another variable,

11
00:01:15,320 --> 00:01:18,090
housing.

12
00:01:18,090 --> 00:01:23,940
I also want to share one small shortcut to access the help of any function.

13
00:01:23,940 --> 00:01:31,980
So if you just click between the parentheses of any function, then hold the Shift key and press

14
00:01:31,980 --> 00:01:32,580
Tab.

15
00:01:32,700 --> 00:01:40,910
So if you hit Shift plus Tab, it will open the help or documentation of that function.

16
00:01:41,280 --> 00:01:49,170
If you hit Shift+Tab two times, it will expand the documentation. You can see here.

17
00:01:49,340 --> 00:01:54,830
This function will return different attributes, such as data.

18
00:01:54,830 --> 00:02:02,520
.data will give us the independent variables, .target will give us the dependent variable, and

19
00:02:02,560 --> 00:02:09,590
feature_names will give us the details of the features. Let's just close this.

20
00:02:09,590 --> 00:02:16,500
So if you have any doubt regarding any function, just click between the parentheses and hit Shift+Tab.

21
00:02:16,550 --> 00:02:25,920
So I have already listed some of the information about this dataset.

22
00:02:25,920 --> 00:02:30,620
So in this dataset there are around twenty thousand records.

23
00:02:30,780 --> 00:02:38,270
There are eight independent variables. The first variable is MedInc.

24
00:02:38,720 --> 00:02:43,160
This is the median income in the particular block where the house is located.

25
00:02:44,090 --> 00:02:47,410
Then we have a second variable, HouseAge.

26
00:02:47,450 --> 00:02:51,550
This is the median house age in that block.

27
00:02:51,560 --> 00:02:59,510
Then we have AveRooms, which is the average number of rooms, and then we have AveBedrms for

28
00:02:59,510 --> 00:03:01,670
the average number of bedrooms.

29
00:03:01,700 --> 00:03:07,670
Next we have the Population variable, which is the block population.

30
00:03:07,670 --> 00:03:10,490
Then we have AveOccup.

31
00:03:10,490 --> 00:03:19,590
That is the average house occupancy. And then the latitude and longitude of that house block. Using these eight

32
00:03:19,620 --> 00:03:21,480
independent variables,

33
00:03:21,480 --> 00:03:24,970
we want to predict the value of the house.

34
00:03:25,020 --> 00:03:28,060
The values are in hundreds of thousands of dollars.

35
00:03:28,110 --> 00:03:30,710
So suppose the y variable for a row is five.

36
00:03:30,720 --> 00:03:39,390
That means the value of that house is five hundred thousand dollars. So this is our dataset.

37
00:03:39,440 --> 00:03:45,190
We have these independent variables and one target variable, the price.

38
00:03:45,650 --> 00:03:52,340
If you want some more detail about this dataset, you can click on this documentation link.

39
00:03:52,700 --> 00:03:57,940
This will open the official sklearn documentation of this dataset. Here

40
00:03:58,100 --> 00:04:06,360
you will get to know about all the parameters that you can give and what is included in this dataset.

41
00:04:06,520 --> 00:04:07,970
Let's go back.

42
00:04:11,000 --> 00:04:20,970
So this housing object is in the form of a dictionary, and we have one key-value pair named feature_names.

43
00:04:21,860 --> 00:04:25,950
So let's just look at the feature names first.

44
00:04:26,690 --> 00:04:35,280
You can see these are the variable names that we have discussed already. Now, to access the independent

45
00:04:35,280 --> 00:04:35,870
data,

46
00:04:36,060 --> 00:04:42,200
we have to use housing.data, and to access the dependent data,

47
00:04:42,330 --> 00:04:44,450
we have to use housing.target.

48
00:04:47,430 --> 00:04:58,110
So in this line of code, we are splitting our data first into train-full and test datasets. Then we

49
00:04:58,110 --> 00:05:07,540
are further dividing this train-full dataset into the X_train and X validation datasets.

50
00:05:08,210 --> 00:05:16,850
We will be using train_test_split from sklearn model_selection, and then we will use the train_test_split

51
00:05:17,090 --> 00:05:20,840
method to divide our data.

52
00:05:20,840 --> 00:05:25,320
We are not giving any additional parameter for the test size.

53
00:05:25,310 --> 00:05:32,110
That's because by default the test size is 25 percent of the total data.

54
00:05:32,210 --> 00:05:42,590
So 25 percent of the total data, which is around 20,000 rows, will go into the test set, and then the remaining

55
00:05:42,590 --> 00:05:52,430
75 percent will go into the train-full set. Out of that 75 percent, again 25 percent will go into the validation

56
00:05:52,430 --> 00:05:57,010
set, and the remaining 75 percent will go into our training set.
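
To make the split arithmetic concrete, here is a small sketch in plain Python. It assumes the full dataset has 20,640 rows (the lecture only says "around 20,000") and that the test share is rounded up to a whole row; the helper name split_sizes is made up for this illustration.

```python
import math

def split_sizes(n_rows, test_fraction=0.25):
    """Return (rows kept, rows held out) for one split. Hypothetical helper."""
    n_held_out = math.ceil(n_rows * test_fraction)
    return n_rows - n_held_out, n_held_out

total_rows = 20640  # assumed row count of the California housing data

# First split: full data into train-full and test.
train_full, test = split_sizes(total_rows)
# Second split: train-full into train and validation.
train, valid = split_sizes(train_full)

print(train, valid, test)
```

These sizes line up with the rough figures quoted later in the lecture: about 11,600 training rows, about 4,000 validation rows, and about 5,000 test rows.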

57
00:05:58,280 --> 00:05:59,780
Let's run this.

58
00:06:02,690 --> 00:06:11,450
The next step is to preprocess our data, and we will be using the StandardScaler from sklearn to standardize

59
00:06:11,450 --> 00:06:15,670
our data. In standardizing,

60
00:06:15,980 --> 00:06:24,080
we subtract the mean of each variable from its individual values, and then we also divide by the

61
00:06:24,080 --> 00:06:34,280
standard deviation, because at the end we want all the variables to have mean zero and variance one.

62
00:06:34,340 --> 00:06:42,520
This is a standard procedure when creating any machine learning model. The steps here are very simple.

63
00:06:42,560 --> 00:06:51,110
First, we are importing StandardScaler from sklearn preprocessing. Then we are creating a scaler object

64
00:06:51,920 --> 00:07:00,170
using the StandardScaler class. Then we are fitting this scaler object using our X_train data.

65
00:07:01,910 --> 00:07:09,440
So on our X_train data, this scaler will find the values to subtract, the means, and the values to divide by, the

66
00:07:09,440 --> 00:07:10,330
standard deviations.

67
00:07:10,640 --> 00:07:17,870
Then we will use those values, or this scaler object, to standardize our validation and test sets as

68
00:07:17,870 --> 00:07:22,280
well. Just to repeat, we are fitting

69
00:07:22,280 --> 00:07:28,540
this scaler object on our training data, and we are transforming our validation and test sets

70
00:07:28,550 --> 00:07:36,190
using this scaler object that we have fitted on our X_train data. To fit,

71
00:07:37,110 --> 00:07:44,780
we will be using the fit_transform method with X_train as the parameter. To transform,

72
00:07:44,780 --> 00:07:52,210
we will be using the .transform method of this scaler object, and we will be passing the relevant datasets here.
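
The fit-on-train, transform-everything-else idea can be sketched without any library. This is only an illustration in plain Python, not the StandardScaler code from the lecture; the function names and the income values are invented, and the population standard deviation is assumed as the divisor.

```python
from statistics import fmean, pstdev

def fit_scaler(column):
    """Learn the mean and standard deviation from the training column only."""
    return fmean(column), pstdev(column)

def transform(column, mean, std):
    """Standardize any column using statistics learned elsewhere."""
    return [(value - mean) / std for value in column]

# Made-up values for one feature, just for illustration.
train_income = [2.0, 4.0, 6.0, 8.0]
valid_income = [5.0, 9.0]

# Fit on the training data only.
mean, std = fit_scaler(train_income)

# Reuse the same mean and std for training and validation data.
train_scaled = transform(train_income, mean, std)
valid_scaled = transform(valid_income, mean, std)

print(train_scaled)
print(valid_scaled)
```

The training column ends up with mean zero and unit variance, while the validation column is shifted and scaled by the training statistics, so it will generally not be exactly zero-mean.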

73
00:07:55,740 --> 00:08:00,330
So let's create the standardized datasets.

74
00:08:00,410 --> 00:08:03,700
We are saving these objects under their original names only.

75
00:08:03,780 --> 00:08:08,970
So we are replacing the original X_train with the standardized version of X_train,

76
00:08:08,970 --> 00:08:15,330
our original X validation set with the standardized version of the X validation set, and the same for

77
00:08:15,380 --> 00:08:17,160
the test set as well.

78
00:08:17,420 --> 00:08:24,760
Let's run this. If you want some more information about this scaling, you can always go to the sklearn

79
00:08:24,790 --> 00:08:29,230
documentation for StandardScaler.

80
00:08:29,370 --> 00:08:33,700
The next step is to set random seeds.

81
00:08:33,840 --> 00:08:37,740
This is to generate the same result every time we run this model.

82
00:08:42,220 --> 00:08:42,800
Now,

83
00:08:42,820 --> 00:08:50,410
as I said earlier, our initial dataset had around twenty thousand rows or records.

84
00:08:50,410 --> 00:08:57,530
Now let us see the shape of the X_train dataset.

85
00:08:57,560 --> 00:09:06,980
Here you can see we have eight columns and around eleven thousand six hundred records in our X_train dataset.

86
00:09:08,950 --> 00:09:20,330
We should have around five thousand records in our X test dataset and around 4,000 in the validation set.

87
00:09:20,780 --> 00:09:29,360
Now let's create the structure for our regression neural network. Here

88
00:09:29,390 --> 00:09:37,970
we will first have an input layer, then we will have the first dense layer with 30 neurons.

89
00:09:38,180 --> 00:09:43,410
Then we want to create a second dense layer with another 30 neurons.

90
00:09:43,760 --> 00:09:52,220
And then, since this is a regression problem, we will have a single output neuron without any activation

91
00:09:52,220 --> 00:09:54,060
function.

92
00:09:54,620 --> 00:10:03,520
A single neuron, since we want a continuous value as our output. Again, we will be using the Sequential

93
00:10:03,610 --> 00:10:17,950
API. We are saving this structure, or our model, as model, and then for the first layer we are writing

94
00:10:17,960 --> 00:10:22,430
keras.layers.Dense. Here, in the parentheses,

95
00:10:22,430 --> 00:10:26,270
we have to provide the number of neurons, which is 30.

96
00:10:26,320 --> 00:10:32,080
Then, as discussed in our theory lectures, we will be using ReLU as the activation function.

97
00:10:32,450 --> 00:10:40,280
And then, since this is our first layer, we need to provide the input shape. Since the number of

98
00:10:40,400 --> 00:10:44,230
independent variables in our data is eight,

99
00:10:44,290 --> 00:10:51,860
we will be using input_shape equal to eight.

100
00:10:52,130 --> 00:11:00,260
You can also write it like this: input_shape equal to X_train.shape, and then taking the second and

101
00:11:00,260 --> 00:11:03,820
later elements as our input shape.

102
00:11:03,860 --> 00:11:09,550
This way you don't have to worry about changing this number every time you change your dataset.

103
00:11:09,650 --> 00:11:17,870
You can just write X_train.shape, and it will automatically get the number of variables from the

104
00:11:17,870 --> 00:11:23,580
shape attribute of our X_train object.
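
Here is a tiny sketch of what "taking the second and later elements of the shape" means. It uses plain tuples in place of a real array, and both shapes are assumed values for illustration (11,610 rows with 8 columns, and an invented second dataset with 13 columns).

```python
# A shape is a tuple of (number of rows, number of columns).
x_train_shape = (11610, 8)  # assumed shape of the training features

# Slicing from index 1 drops the row count and keeps the per-row shape.
input_shape = x_train_shape[1:]
n_features = input_shape[0]
print(input_shape, n_features)

# The same expression adapts to a different dataset without editing a number.
other_shape = (500, 13)  # hypothetical dataset with 13 features
other_input_shape = other_shape[1:]
print(other_input_shape)
```

The slice returns a one-element tuple, which is why the feature count never has to be typed by hand.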

105
00:11:23,680 --> 00:11:29,170
So this is the structure of our first dense layer. We will create the

106
00:11:29,170 --> 00:11:31,570
second dense layer in a similar fashion.

107
00:11:31,780 --> 00:11:39,610
We will be using keras.layers.Dense, and then the number of neurons in the parentheses, which is 30,

108
00:11:40,030 --> 00:11:46,270
and the activation function as ReLU. Similarly, for the output layer,

109
00:11:46,410 --> 00:11:50,080
we will be using keras.layers.Dense.

110
00:11:50,130 --> 00:11:56,550
And since this is a regression problem, we will be using a single neuron without any activation function.

111
00:11:58,670 --> 00:12:00,870
Let's just run this.

112
00:12:01,580 --> 00:12:09,590
And again, one important thing: you can comment using the hash symbol inside the cells.

113
00:12:09,590 --> 00:12:17,050
So Python will execute only this part of the code, and it will not execute anything that comes after the

114
00:12:17,060 --> 00:12:21,630
hash, as the hash is for starting the comment.

115
00:12:21,700 --> 00:12:30,280
So now we have created the structure, or architecture, of our neural network. To look at

116
00:12:30,280 --> 00:12:31,640
and view this structure,

117
00:12:31,680 --> 00:12:35,800
we can call the summary method, that is, model.summary.

118
00:12:40,890 --> 00:12:47,490
Here you will get the information about the structure that we have created.

119
00:12:47,670 --> 00:12:51,050
So we have the first dense layer with 30 neurons.

120
00:12:51,320 --> 00:12:53,890
We have the second dense layer with 30 neurons.

121
00:12:53,940 --> 00:13:00,450
And lastly, we have a single output layer with one neuron.
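
A model summary typically also lists a parameter count for each layer. As a rough sketch of where those numbers come from, the snippet below assumes every dense layer has one weight per input-neuron pair plus one bias per neuron; the helper name is invented for this illustration.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit. Hypothetical helper."""
    return n_inputs * n_units + n_units

# Layer widths from the lecture: 8 inputs, two hidden layers of 30, 1 output.
layer_sizes = [8, 30, 30, 1]

# Pair each layer with the one before it to get (inputs, units).
counts = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total = sum(counts)

print(counts, total)
```

Under that assumption, the three dense layers hold 270, 930, and 31 trainable parameters.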

122
00:13:00,480 --> 00:13:02,520
This is what we want.

123
00:13:03,180 --> 00:13:09,750
The next step is to compile this model. Again,

124
00:13:09,790 --> 00:13:15,870
the compile method works similarly for both classification and regression models.

125
00:13:16,000 --> 00:13:19,930
First we have to specify the loss. In classification,

126
00:13:19,930 --> 00:13:27,490
we were using cross-entropy, but here, since we are running a regression, we have to use mean squared error,

127
00:13:28,540 --> 00:13:36,380
also known as MSE. The second parameter is the optimizer. Again,

128
00:13:36,580 --> 00:13:41,960
here also we are using SGD, stochastic gradient descent.

129
00:13:42,370 --> 00:13:50,260
And here we are also providing the learning rate. By default, the value of the learning rate is zero point zero

130
00:13:50,260 --> 00:13:53,670
one, and to change it,

131
00:13:53,740 --> 00:13:57,990
you can just write the new value in the parentheses.

132
00:13:58,040 --> 00:14:05,070
We have already discussed what the learning rate is in our theory lecture.

133
00:14:05,080 --> 00:14:08,350
So if you have any doubts, just revisit the theory lecture.

134
00:14:09,930 --> 00:14:11,080
And then the next

135
00:14:11,070 --> 00:14:13,590
parameter that we are passing is metrics.

136
00:14:13,590 --> 00:14:18,680
This is an optional parameter. In classification,

137
00:14:18,680 --> 00:14:25,190
we were using accuracy, but in regression we can use mean absolute

138
00:14:25,200 --> 00:14:35,540
error, or MAE. Absolute error is the absolute difference between the predicted value and the actual value, whereas

139
00:14:35,720 --> 00:14:36,770
the squared

140
00:14:36,770 --> 00:14:40,860
error is the square of that difference.

141
00:14:41,390 --> 00:14:49,160
So we are calculating both mean squared error as the loss function and MAE as the metric we additionally

142
00:14:49,160 --> 00:14:50,030
want to calculate.
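
To see the difference between the two measures on actual numbers, here is a minimal sketch in plain Python. The target and prediction values are made up for illustration and are not results from the lecture.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Illustrative house values in units of $100,000.
y_true = [5.0, 2.0, 3.0]
y_pred = [4.0, 2.5, 3.0]

mse = mean_squared_error(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
print(mse, mae)
```

Because the target is in units of $100,000, an MAE of 0.5 here would mean the predictions are off by about $50,000 on average, while the MSE is in squared units and penalizes the largest miss more heavily.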

143
00:14:51,490 --> 00:15:00,250
Again, just remember: if you want to look at the documentation or help, just click inside the parentheses

144
00:15:00,580 --> 00:15:07,910
and press Shift plus Tab. You will get this kind of documentation.

145
00:15:07,920 --> 00:15:13,700
And here you can see that by default the learning rate value is 0.01.

146
00:15:14,000 --> 00:15:21,230
So in classification we did not provide any learning rate, so the learning rate that was used there

147
00:15:21,410 --> 00:15:23,220
was 0.01.

148
00:15:24,260 --> 00:15:28,940
But you can always change these values according to your need.

149
00:15:29,300 --> 00:15:39,430
Let us run this. So we have compiled our model. The next step is to train our model

150
00:15:39,690 --> 00:15:44,760
using the training data. The method, or the process, is the same.

151
00:15:44,940 --> 00:15:49,480
We are creating another object, model_history, for the training.

152
00:15:49,530 --> 00:15:57,930
Then we are using model.fit, and we are passing our training dataset, the number of epochs, and the validation

153
00:15:57,940 --> 00:16:00,640
dataset that we have created.

154
00:16:00,980 --> 00:16:02,940
Let's run this statement.

155
00:16:05,880 --> 00:16:10,870
Again, just like the classification model,

156
00:16:11,080 --> 00:16:15,640
you will get the loss value, which is the mean squared error.

157
00:16:15,910 --> 00:16:20,870
You will get the MAE, or mean absolute error.

158
00:16:21,340 --> 00:16:26,390
And similarly, you will get these two values for your validation set. Let's run this,

159
00:16:30,080 --> 00:16:39,520
and you can see that the loss on both the training set and the validation set is decreasing with each epoch.

160
00:16:39,540 --> 00:16:44,600
Now we have these values for our training and validation sets.

161
00:16:44,700 --> 00:16:52,500
We can also evaluate the performance of this trained model on our test set, and we are going to use the same

162
00:16:52,500 --> 00:16:58,350
method as we did with the classification model. We will call model.evaluate,

163
00:16:58,470 --> 00:17:00,630
and then we will pass our test dataset.

164
00:17:00,650 --> 00:17:10,420
Let's run this. You can see that on our test data the loss is zero point three.

165
00:17:10,420 --> 00:17:22,760
That is the MSE, or mean squared error, and the MAE, or mean absolute error, is 0.4493. Now,

166
00:17:22,770 --> 00:17:30,520
just as in classification, we can call model_history.history. That will give us the values of all these

167
00:17:30,520 --> 00:17:40,770
metrics in the form of a dictionary. Here you will get the loss and MAE on the training dataset,

168
00:17:41,100 --> 00:17:48,710
and the validation loss and validation MAE. The beauty of this is that we can plot this dictionary on a graph,

169
00:17:49,210 --> 00:17:57,300
just like we did for classification, and that will show us how our training loss and validation loss

170
00:17:57,390 --> 00:18:03,490
are changing with each epoch and whether we have achieved convergence or not.

171
00:18:04,930 --> 00:18:14,480
Let's run this. So you can see we have the loss values and the MAE values for our training and validation

172
00:18:14,480 --> 00:18:23,180
sets plotted on this graph. One thing to notice is that this graph is still going down, meaning that if we

173
00:18:23,180 --> 00:18:32,520
run some more epochs, this will further decrease the losses and improve the accuracy of our model.

174
00:18:33,230 --> 00:18:39,830
So this is one way to tell whether you have achieved convergence or not, or whether you have

175
00:18:39,830 --> 00:18:46,410
trained your model completely or not. You have to look at this validation loss and validation MAE value.

176
00:18:47,340 --> 00:18:57,080
So this is the validation loss, and you can clearly see that it is going down. So to improve the accuracy,

177
00:18:57,560 --> 00:19:04,200
we can rerun this code to train it for twenty more epochs.

178
00:19:09,140 --> 00:19:19,670
So let's just go there. Now, one important thing about Keras: Keras keeps the weights and biases values

179
00:19:19,880 --> 00:19:21,020
in memory.

180
00:19:21,020 --> 00:19:29,480
So if you just rerun this fit statement again, it will not train the model from scratch, but it will

181
00:19:29,570 --> 00:19:34,250
start training the model from this position.

182
00:19:34,790 --> 00:19:42,090
So if we run this statement two times, that is similar to running this statement for forty epochs.
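
The idea that a second call to fit continues from the weights already in memory can be sketched with a toy model. This is not Keras; it is a hypothetical one-weight linear model trained with plain full-batch gradient descent on made-up data, written only to illustrate the behaviour.

```python
class TinyLinearModel:
    """Toy model y = w * x whose weight persists between fit calls."""

    def __init__(self):
        self.w = 0.0

    def fit(self, xs, ys, epochs, learning_rate=0.01):
        history = []
        for _ in range(epochs):
            # Gradient of the mean squared error with respect to w.
            grad = sum(2 * (self.w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
            self.w -= learning_rate * grad
            loss = sum((self.w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
            history.append(loss)
        return history

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # made-up data with true weight 2

# Two calls of 20 epochs on the same object.
resumed = TinyLinearModel()
first_run = resumed.fit(xs, ys, epochs=20)
second_run = resumed.fit(xs, ys, epochs=20)

# One call of 40 epochs on a fresh object.
single = TinyLinearModel()
single.fit(xs, ys, epochs=40)

print(resumed.w, single.w)
print(first_run[0], first_run[-1], second_run[0])
```

In this toy setting the second run starts with a lower loss than the first run ended with, and the final weight matches a single 40-epoch run. With a real Keras model and shuffled mini-batches, the match would only be approximate.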

183
00:19:42,680 --> 00:19:53,470
If we just rerun it one more time, you can see that earlier the loss values were around 0.7 or

184
00:19:53,480 --> 00:19:58,010
0.8 for the first epoch and then gradually decreased.

185
00:19:58,240 --> 00:20:02,100
But now we have started from the twentieth epoch onward.

186
00:20:07,630 --> 00:20:11,800
Last time, the loss value on our test set was 0.30.

187
00:20:11,810 --> 00:20:22,480
Let's see whether we have improved this loss value or not. And see, the loss has decreased from 0.30 to

188
00:20:22,480 --> 00:20:25,160
0.25.

189
00:20:25,210 --> 00:20:34,720
So our hypothesis was correct that the model had not converged in 20 epochs. There was room for improvement,

190
00:20:34,960 --> 00:20:38,470
and we trained the whole model for 20 more epochs.

191
00:20:38,470 --> 00:20:40,570
That is a total of forty epochs.

192
00:20:43,610 --> 00:20:50,820
And we can see this graph again. Just focus on this validation

193
00:20:50,820 --> 00:21:02,800
loss. Earlier it was around 0.4. After we reran this, there is a slight decrease in the validation loss.

194
00:21:03,020 --> 00:21:06,880
Now you can see that the line has flattened out.

195
00:21:06,890 --> 00:21:12,440
This means we have achieved convergence on this model.
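
Reading convergence off the graph can also be approximated with a simple numeric check. The sketch below is one possible rule of thumb, not something shown in the lecture: the function name, the window of 5 epochs, the tolerance of 0.01, and both loss curves are assumptions chosen for illustration.

```python
def has_stopped_improving(val_losses, window=5, tolerance=0.01):
    """True if validation loss improved by less than `tolerance`
    over the last `window` epochs. Hypothetical rule of thumb."""
    if len(val_losses) < window + 1:
        return False  # not enough history to judge
    recent = val_losses[-(window + 1):]
    return (recent[0] - recent[-1]) < tolerance

# Made-up validation loss curves.
still_falling = [0.80, 0.60, 0.50, 0.45, 0.41, 0.38, 0.36]
flattened = [0.40, 0.30, 0.262, 0.260, 0.259, 0.258, 0.258, 0.257]

print(has_stopped_improving(still_falling))
print(has_stopped_improving(flattened))
```

A check like this only mirrors what the eye sees on the plot; the right window and tolerance depend on the problem and on how noisy the validation loss is.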

196
00:21:12,440 --> 00:21:14,690
So not just with regression.

197
00:21:14,780 --> 00:21:20,930
If you are running a classification model as well, just look at this graph to identify whether you have

198
00:21:20,990 --> 00:21:22,680
achieved convergence or not.

199
00:21:24,290 --> 00:21:30,820
Now, to predict values on a new dataset, you can always use the predict method.

200
00:21:31,370 --> 00:21:34,390
So your object name, and then .predict,

201
00:21:34,760 --> 00:21:37,170
and then the new dataset.

202
00:21:37,310 --> 00:21:45,110
I don't have any new dataset, so I am just taking a sample of the first three values of my X test set and considering

203
00:21:45,200 --> 00:21:54,470
it as my new dataset, and then saving the results as the predicted y values, using the model.predict

204
00:21:54,530 --> 00:21:57,010
method to predict the values.

205
00:21:57,340 --> 00:22:01,710
These are the values predicted using this model.

206
00:22:01,760 --> 00:22:04,670
That's all for this lecture. In the next lecture,

207
00:22:04,670 --> 00:22:09,720
we will be looking at the functional API of Keras.

208
00:22:09,830 --> 00:22:10,210
Thank you.