1
00:00:00,560 --> 00:00:10,370
‫In the last video, we used data augmentation techniques to increase our validation accuracy to 82 to

2
00:00:10,370 --> 00:00:11,090
‫83%.

3
00:00:12,700 --> 00:00:21,700
In this video, we will use the VGG16 model architecture to further increase our validation accuracy above

4
00:00:21,700 --> 00:00:22,780
‫90%.

5
00:00:26,350 --> 00:00:34,480
VGG16 was the runner-up in the 2014 ILSVRC competition.

6
00:00:35,970 --> 00:00:43,800
The problem statement of that competition was to categorize millions of pictures into a thousand different

7
00:00:43,800 --> 00:00:44,730
‫categories.

8
00:00:46,350 --> 00:00:51,090
‫The pictures were of animals, humans, etcetera.

9
00:00:51,820 --> 00:00:57,460
And the categories were different animal species and many others.

10
00:00:58,980 --> 00:01:09,540
So you can consider the problem we are trying to solve as a subset of the original 2014 ILSVRC competition

11
00:01:09,540 --> 00:01:10,320
‫data.

12
00:01:12,950 --> 00:01:15,320
‫As we discussed in our theory lecture.

13
00:01:15,860 --> 00:01:24,020
We can use the convolutional part of these pre-trained model architectures in our own problems.

14
00:01:26,810 --> 00:01:35,930
These models consist of two parts: a convolutional base and then a fully connected neural network base.

15
00:01:36,170 --> 00:01:42,950
The convolutional base is used to identify features from the images.

16
00:01:43,790 --> 00:01:48,740
The fully connected neural network base is used to classify those features.

17
00:01:51,690 --> 00:02:00,780
So for any similar kind of problem, we can easily use a pre-trained convolutional base to extract features

18
00:02:00,780 --> 00:02:09,210
from our images, and then we can add two or three fully connected layers to classify the

19
00:02:09,210 --> 00:02:11,760
output of this conv base.

20
00:02:13,180 --> 00:02:14,320
‫In this video.

21
00:02:14,320 --> 00:02:15,670
The idea is the same.

22
00:02:15,790 --> 00:02:20,770
We will use the conv base of the VGG16 model.

23
00:02:21,640 --> 00:02:29,260
‫And then we will add one fully connected hidden layer and one output layer to classify the features

24
00:02:29,290 --> 00:02:32,890
extracted from our VGG16 conv base.

25
00:02:35,580 --> 00:02:36,990
‫So let's start.

26
00:02:37,350 --> 00:02:42,750
First, we will create two objects: a train generator and a validation generator.

27
00:02:42,990 --> 00:02:49,410
We have already used the same generators in our previous cases.

28
00:02:50,190 --> 00:02:52,500
So we are using the same settings.

29
00:02:52,500 --> 00:02:58,830
We are using a rescaling factor of 1/255 to convert our RGB values

30
00:02:59,410 --> 00:03:04,780
From the range 0 to 255 to the range 0 to 1.

31
00:03:05,080 --> 00:03:12,340
Then we have rotation range, width shift, height shift, shear range, zoom range and horizontal flip.

32
00:03:12,820 --> 00:03:15,310
These are used to create augmented data.

33
00:03:16,650 --> 00:03:23,820
‫And our target size of images is 150 by 150 and we are using a batch size of 20.
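As a rough sketch, the generator settings described here could look like the following, assuming the Keras ImageDataGenerator API. Only the 1/255 rescaling, the 150 by 150 target size and the batch size of 20 come from the video; the augmentation magnitudes, class_mode="binary" and the directory arguments are assumptions for illustration.
```python
# Stated in the video: rescale, target size and batch size.
# The augmentation magnitudes below are ASSUMED for illustration only.
AUGMENT_KWARGS = dict(
    rescale=1.0 / 255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
)
# class_mode="binary" is assumed because the problem has two classes.
FLOW_KWARGS = dict(target_size=(150, 150), batch_size=20, class_mode="binary")
#
def make_generators(train_dir, validation_dir):
    """Build the train and validation generators (directory paths are hypothetical)."""
    # Imported lazily so the settings above can be inspected without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    train_datagen = ImageDataGenerator(**AUGMENT_KWARGS)
    # Validation images are only rescaled, never augmented.
    test_datagen = ImageDataGenerator(rescale=1.0 / 255)
    train_generator = train_datagen.flow_from_directory(train_dir, **FLOW_KWARGS)
    validation_generator = test_datagen.flow_from_directory(validation_dir, **FLOW_KWARGS)
    return train_generator, validation_generator
```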

34
00:03:24,480 --> 00:03:32,880
‫We have already discussed this in detail in our previous videos, so we are not going to discuss this

35
00:03:32,880 --> 00:03:33,450
‫here.

36
00:03:34,460 --> 00:03:39,500
‫Just like in our previous case, we have around 2000 images.

37
00:03:40,130 --> 00:03:43,190
‫For training and 1000 images for validation.

38
00:03:45,380 --> 00:03:49,970
Now, the second step is to create the architecture for our model.

39
00:03:51,190 --> 00:03:59,500
Our idea is to first use the conv base of VGG16 and then add two dense layers.

40
00:04:03,300 --> 00:04:10,260
So to use the conv base of VGG16, you can import it directly from Keras.

41
00:04:10,440 --> 00:04:15,570
There is no need to manually build all the conv layers in that base.

42
00:04:15,600 --> 00:04:26,220
So to import VGG16, you can just write from tensorflow.keras.applications import VGG16 and then give

43
00:04:26,220 --> 00:04:28,350
‫these three different parameters.

44
00:04:32,060 --> 00:04:40,310
So we are creating our conv_base object using VGG16, and these are the three parameters that

45
00:04:40,310 --> 00:04:41,390
‫we are passing.

46
00:04:41,870 --> 00:04:45,140
‫First, we need to provide weights.

47
00:04:46,440 --> 00:04:54,450
So in any convolutional neural network, first we provide randomized weights, and then our convolutional

48
00:04:54,450 --> 00:04:57,300
‫network tries to optimize those weights.

49
00:04:58,990 --> 00:04:59,920
Since

50
00:05:00,520 --> 00:05:09,070
VGG16 was used in that competition, we can use the final weights of that model.

51
00:05:11,970 --> 00:05:17,490
So to use those weights, we have to write weights='imagenet'.

52
00:05:17,520 --> 00:05:22,590
ImageNet is the dataset used in the ILSVRC competition.

53
00:05:25,320 --> 00:05:30,480
So to use pre-trained weights, we just have to write weights='imagenet'.

54
00:05:30,630 --> 00:05:36,260
And then there were two parts of that VGG16 model.

55
00:05:36,270 --> 00:05:41,970
First was the conv base and then the fully connected neural network base.

56
00:05:42,180 --> 00:05:49,800
We only want the conv base from that model since conv bases are reusable.

57
00:05:49,830 --> 00:05:56,000
‫Those are mainly used to extract features and not to categorize the images.

58
00:05:56,010 --> 00:06:02,340
So we will only be using the conv base, and we only need to import the conv base.

59
00:06:02,490 --> 00:06:07,740
That's why we are using include_top=False.

60
00:06:09,720 --> 00:06:16,410
If you want to import the whole model along with the fully connected dense layers, then you have to

61
00:06:16,410 --> 00:06:17,700
change it to True.

62
00:06:19,460 --> 00:06:26,720
But in our case, since we are only importing the convolutional base, we are providing False here.

63
00:06:29,480 --> 00:06:33,230
The next parameter is the input shape.

64
00:06:34,650 --> 00:06:39,390
The input shape of our images is 150 by 150 by 3.

65
00:06:39,420 --> 00:06:43,170
‫That's why we are providing this tuple here.
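A minimal sketch of this import step, assuming the tensorflow.keras.applications API. The three arguments are the ones named in the video; the helper name load_conv_base is hypothetical.
```python
# The three arguments named in the video.
VGG16_KWARGS = dict(
    weights="imagenet",         # reuse the weights learned on ImageNet
    include_top=False,          # drop the fully connected classifier
    input_shape=(150, 150, 3),  # height, width, RGB channels
)
#
def load_conv_base(**overrides):
    """Return the VGG16 convolutional base (downloads the weights on first use)."""
    # Imported lazily so the settings above can be inspected without TensorFlow.
    from tensorflow.keras.applications import VGG16
    return VGG16(**{**VGG16_KWARGS, **overrides})
```
Typical usage would be conv_base = load_conv_base() followed by conv_base.summary() to list the layers.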

66
00:06:43,710 --> 00:06:45,210
‫Let's run this.

67
00:06:45,930 --> 00:06:51,120
So we have imported our conv base from the VGG16 model.

68
00:06:53,870 --> 00:06:55,790
‫Now to look at this.

69
00:06:56,000 --> 00:06:58,940
You can just write conv_base.summary().

70
00:07:02,070 --> 00:07:11,760
If you run this, you will get details of all the layers of this VGG16 conv base.

71
00:07:14,820 --> 00:07:17,610
‫Now as we discussed in our theory lecture.

72
00:07:18,240 --> 00:07:22,800
VGG16 has these convolutional blocks.

73
00:07:22,890 --> 00:07:24,770
‫So here you can see.

74
00:07:24,810 --> 00:07:26,550
‫First convolutional block.

75
00:07:26,580 --> 00:07:28,620
‫Then second convolutional block.

76
00:07:28,920 --> 00:07:32,460
‫In each block there are multiple conv layers.

77
00:07:32,460 --> 00:07:40,440
So in the first and second blocks there are two conv layers and then a max pooling layer. In the third, fourth

78
00:07:40,440 --> 00:07:41,820
and fifth blocks

79
00:07:41,910 --> 00:07:47,250
‫There are three conv layers and then a max pooling layer.
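The block structure above is enough to reproduce the figures in the summary by hand. The sketch below assumes the standard VGG16 layout (3x3 kernels with same padding, channel widths 64, 128, 256, 512, 512, and a 2x2 max pooling at the end of each block), which is not spelled out in the video.
```python
# (number of conv layers, output channels) for the five VGG16 blocks.
BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
#
def conv_base_stats(size=150, in_channels=3):
    """Return (parameter count, output feature-map shape) of the conv base."""
    params = 0
    for n_layers, out_channels in BLOCKS:
        for _ in range(n_layers):
            # 3x3 kernel weights plus one bias per output channel.
            params += 3 * 3 * in_channels * out_channels + out_channels
            in_channels = out_channels
        size //= 2  # each block ends with 2x2 max pooling
    return params, (size, size, in_channels)
#
params, shape = conv_base_stats()
print(params, shape)  # 14714688 (4, 4, 512)
```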

80
00:07:52,080 --> 00:08:01,710
So in a way, by importing VGG16, we avoided creating this many layers, and we have already imported

81
00:08:01,710 --> 00:08:04,760
‫the final weights of that model.

82
00:08:04,770 --> 00:08:10,080
So there is no need to start from random weights and optimize them.

83
00:08:10,650 --> 00:08:16,080
‫We already have the final model weights with us in this model.

84
00:08:18,290 --> 00:08:25,880
Now the next step is to add a fully connected dense layer and an output layer on top of this conv base.

85
00:08:27,440 --> 00:08:31,670
‫Now this is similar to creating any CNN model.

86
00:08:33,140 --> 00:08:35,510
‫We just have to create our model first.

87
00:08:35,840 --> 00:08:44,360
We are using models.Sequential(), and then, just like you add any other layer, you can add the conv

88
00:08:44,360 --> 00:08:46,400
‫base that we have imported.

89
00:08:47,300 --> 00:08:50,480
So we will write model.add.

90
00:08:50,480 --> 00:08:56,930
And here you can just write the variable in which we have stored this VGG16 base.

91
00:08:57,290 --> 00:08:59,900
So our variable name was conv_base.

92
00:09:00,230 --> 00:09:03,740
So first we can add this conv_base.

93
00:09:04,130 --> 00:09:11,610
Next we have to use a Flatten layer and then include a fully connected dense layer and then an output layer.

94
00:09:11,630 --> 00:09:19,610
So first we are adding a Flatten layer, then a dense layer with 256 neurons and then an output layer

95
00:09:19,610 --> 00:09:21,050
‫with a single neuron.

96
00:09:22,130 --> 00:09:28,280
The activation is ReLU in the dense layer and the activation is sigmoid in the output layer.
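The stacking described here can be sketched as follows, assuming the tensorflow.keras layers and models modules. The helper name build_model is hypothetical, and conv_base is whatever object holds the imported VGG16 base.
```python
# Head described in the video: one hidden dense layer and one output neuron.
HIDDEN_UNITS, HIDDEN_ACTIVATION = 256, "relu"
OUTPUT_UNITS, OUTPUT_ACTIVATION = 1, "sigmoid"
#
def build_model(conv_base):
    """Stack Flatten, Dense(256, relu) and Dense(1, sigmoid) on the conv base."""
    # Imported lazily so the settings above can be inspected without TensorFlow.
    from tensorflow.keras import layers, models
    model = models.Sequential()
    model.add(conv_base)         # pre-trained feature extractor
    model.add(layers.Flatten())  # feature maps to a flat vector
    model.add(layers.Dense(HIDDEN_UNITS, activation=HIDDEN_ACTIVATION))
    model.add(layers.Dense(OUTPUT_UNITS, activation=OUTPUT_ACTIVATION))
    return model
```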

97
00:09:30,920 --> 00:09:36,230
‫You can run this and then you can look at the model summary as well.

98
00:09:37,670 --> 00:09:43,970
So this is our model summary. Our first layer is vgg16.

99
00:09:44,570 --> 00:09:51,440
We have around 14 million trainable parameters in this vgg16 layer.

100
00:09:51,680 --> 00:09:53,540
‫Then we have a flatten layer.

101
00:09:53,570 --> 00:09:57,980
‫Then we have a dense layer with around 2 million trainable parameters.

102
00:09:58,190 --> 00:10:03,560
‫And then finally an output layer with 257 trainable parameters.

103
00:10:03,890 --> 00:10:09,110
The total number of trainable parameters in our model is around 16 million.
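These counts can be checked with a little arithmetic. The sketch assumes the conv base outputs a 4 by 4 by 512 feature map for 150 by 150 inputs and holds 14,714,688 parameters, which are the standard figures for VGG16 without its top.
```python
CONV_BASE_PARAMS = 14_714_688         # VGG16 without its top
FLATTENED = 4 * 4 * 512               # conv base output for 150x150 inputs
dense_params = FLATTENED * 256 + 256  # weights plus biases of the hidden layer
output_params = 256 * 1 + 1           # weights plus bias of the output neuron
total = CONV_BASE_PARAMS + dense_params + output_params
print(dense_params, output_params, total)  # 2097408 257 16812353
```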

104
00:10:09,620 --> 00:10:16,670
Now, as I told you earlier, we are using the final weights of the VGG16 model.

105
00:10:17,840 --> 00:10:22,970
‫So the weights are already optimized in this vgg16 layer.

106
00:10:23,420 --> 00:10:30,170
‫Now if you don't want to train those weights, you can just freeze that layer.

107
00:10:30,200 --> 00:10:35,840
To freeze it, you can use conv_base.trainable = False.

108
00:10:36,050 --> 00:10:46,170
In that case, the trainable parameter count here will drop to zero and our model will not try to optimize

109
00:10:46,170 --> 00:10:48,020
‫the weights of this layer.

110
00:10:48,030 --> 00:10:56,160
‫In that way, we can significantly reduce the number of trainable parameters in our model and significantly

111
00:10:56,160 --> 00:10:58,260
‫improve our execution time.

112
00:11:00,080 --> 00:11:04,160
So if you run this conv_base.trainable = False,

113
00:11:04,460 --> 00:11:10,970
‫Our number of trainable parameters will reduce from 16 million to just 2.1 million.
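A small sketch of the freezing step and of where the 2.1 million figure comes from. The helper name freeze is hypothetical; it works on any object with a trainable attribute, such as a Keras layer or model.
```python
def freeze(layer):
    """Mark a Keras layer or model as non-trainable and return it."""
    # Do this before model.compile(); a later change needs a recompile to take effect.
    layer.trainable = False
    return layer
#
# Parameters left to train once the conv base is frozen:
# the hidden dense layer plus the single output neuron.
remaining = (4 * 4 * 512 * 256 + 256) + (256 + 1)
print(remaining)  # 2097665
```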

114
00:11:12,980 --> 00:11:19,430
‫But here we are not running this and we are training all the 16 million parameters.

115
00:11:19,670 --> 00:11:25,730
But in case you want to save time, you can run this conv_base.trainable = False.

116
00:11:31,360 --> 00:11:34,420
‫Now the next step is to compile our model.

117
00:11:36,320 --> 00:11:40,490
We will be using a loss function of binary cross-entropy.

118
00:11:40,790 --> 00:11:50,330
That is because we have two classes. Then we are using RMSprop as our optimizer and a learning rate of 2 times

119
00:11:50,330 --> 00:11:51,920
10 to the power of minus 5.

120
00:11:53,150 --> 00:12:01,190
We are using a somewhat smaller learning rate here because we want to fine-tune our already trained

121
00:12:01,190 --> 00:12:01,790
‫model.

122
00:12:01,970 --> 00:12:10,940
So the weights of these convolutional layers are already optimized, and we just want to adjust them in

123
00:12:10,940 --> 00:12:13,730
‫little steps according to our problem.

124
00:12:14,090 --> 00:12:20,390
So we are fine-tuning it; we are not training it from randomly assigned weights.

125
00:12:20,420 --> 00:12:23,240
‫We can use a smaller learning rate.

126
00:12:23,930 --> 00:12:28,070
That's why we are using 2 times 10 to the power of minus 5.

127
00:12:28,400 --> 00:12:32,210
And the metric we want to calculate is accuracy.
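The compile step could be sketched like this, assuming the tensorflow.keras optimizers module. The helper name compile_model is hypothetical, and older Keras versions spell the learning_rate argument as lr.
```python
LEARNING_RATE = 2e-5  # 2 times 10 to the power of minus 5
#
def compile_model(model):
    """Compile for binary classification with a small fine-tuning step size."""
    # Imported lazily so the constant above can be inspected without TensorFlow.
    from tensorflow.keras import optimizers
    model.compile(
        loss="binary_crossentropy",
        optimizer=optimizers.RMSprop(learning_rate=LEARNING_RATE),
        metrics=["accuracy"],
    )
    return model
```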

128
00:12:33,860 --> 00:12:38,870
So training these models will take somewhere between 8 and 10 hours.

129
00:12:39,080 --> 00:12:45,770
‫So it is better to use callbacks to save our model after each epoch.

130
00:12:47,780 --> 00:12:55,310
So we are creating our checkpoint callback and we are saving our model after each epoch.

131
00:12:57,680 --> 00:13:04,970
You can also use the save_best_only parameter here if you don't want to save 30 different models.

132
00:13:05,900 --> 00:13:13,580
And if you give save_best_only=True, it will save only the model with the best validation score.
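A hedged sketch of such a checkpoint callback, assuming the tensorflow.keras ModelCheckpoint class. The file name pattern and the choice of val_loss as the monitored metric are assumptions; the video does not show them.
```python
# Hypothetical file name pattern; Keras fills in {epoch} when saving.
CHECKPOINT_PATH = "vgg16_epoch_{epoch:02d}.h5"
#
def make_checkpoint(save_best_only=False):
    """Create a callback that saves the model at the end of each epoch."""
    # Imported lazily so the pattern above can be inspected without TensorFlow.
    from tensorflow.keras.callbacks import ModelCheckpoint
    return ModelCheckpoint(
        filepath=CHECKPOINT_PATH,
        monitor="val_loss",             # assumed metric for deciding "best"
        save_best_only=save_best_only,  # True: skip epochs that did not improve
    )
#
print(CHECKPOINT_PATH.format(epoch=3))  # vgg16_epoch_03.h5
```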

133
00:13:16,250 --> 00:13:19,580
‫Now the next step is to fit the training data.

134
00:13:21,410 --> 00:13:29,420
This step is similar to last time. We will use fit_generator with the train generator and steps per

135
00:13:29,420 --> 00:13:36,110
‫epoch and we will also give validation generator and validation steps for our validation data.

136
00:13:36,680 --> 00:13:43,100
And here we are also providing the callback just to save our model after each epoch.
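As a sketch of the training call: the step counts follow from the image counts and batch size given earlier. Recent Keras versions accept generators directly in fit, so fit is used here in place of fit_generator; the helper name train is hypothetical.
```python
TRAIN_IMAGES, VALIDATION_IMAGES, BATCH_SIZE, EPOCHS = 2000, 1000, 20, 30
steps_per_epoch = TRAIN_IMAGES // BATCH_SIZE        # 100 batches cover the training set
validation_steps = VALIDATION_IMAGES // BATCH_SIZE  # 50 batches cover the validation set
#
def train(model, train_generator, validation_generator, callbacks=()):
    """Fit the model on the generators and return the History object."""
    return model.fit(
        train_generator,
        steps_per_epoch=steps_per_epoch,
        epochs=EPOCHS,
        validation_data=validation_generator,
        validation_steps=validation_steps,
        callbacks=list(callbacks),
    )
```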

137
00:13:43,580 --> 00:13:49,490
‫So I have already executed this and these are the results.

138
00:13:51,700 --> 00:13:58,450
So you can see the validation accuracies are in the range of 90 to 97%.

139
00:13:59,080 --> 00:14:08,710
So at the end of 30 epochs, we were getting a training accuracy of around 98% and a validation accuracy

140
00:14:08,710 --> 00:14:10,660
‫of 98% as well.

141
00:14:13,630 --> 00:14:18,820
‫You can see each epoch is taking around 15 minutes to train.

142
00:14:19,330 --> 00:14:25,120
So just remember, it may take 8 to 10 hours to train our model.

143
00:14:28,570 --> 00:14:34,990
‫Now let's look at how accuracies and losses are changing with each epoch.

144
00:14:38,480 --> 00:14:41,820
‫The orange line here is for training accuracy.

145
00:14:41,840 --> 00:14:44,690
‫The red line here is for validation accuracy.

146
00:14:44,690 --> 00:14:52,160
And similarly, we have validation loss in green and training loss in blue.

147
00:14:55,030 --> 00:15:06,040
You can see that validation accuracy is oscillating between 97 and 98%, and there is no further improvement

148
00:15:06,040 --> 00:15:10,780
‫in accuracy as we move from lower epoch value to higher epoch value.

149
00:15:10,780 --> 00:15:17,800
So we can say that we have achieved convergence in our model and it is not possible to further improve

150
00:15:17,800 --> 00:15:22,270
this validation accuracy by increasing the number of epochs.

151
00:15:23,830 --> 00:15:31,270
So compare the validation accuracy with our last CNN model. In the last CNN model, we were getting

152
00:15:31,270 --> 00:15:33,820
‫a maximum accuracy of 84%.

153
00:15:34,330 --> 00:15:45,040
But here, by using the pre-trained VGG16 model, we are achieving 97 to 98% validation accuracy.

154
00:15:46,450 --> 00:15:52,540
And it is very easy to train our model using these pre-trained models.

155
00:15:55,150 --> 00:16:00,090
So there is no need to create your own conv base.

156
00:16:01,360 --> 00:16:06,520
You can just use any one of these pre-trained conv bases

157
00:16:06,550 --> 00:16:12,730
If the problem statement is somewhat similar to the ImageNet problem statement.

158
00:16:17,440 --> 00:16:21,610
‫Now I am also saving this history variable into a CSV file.

159
00:16:21,760 --> 00:16:27,470
There is no need to do this step here.
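For anyone who does want to keep the training history, here is a minimal sketch using only the standard csv module. It assumes a dictionary of per-epoch lists like the one Keras exposes as history.history; the function name and file name are hypothetical.
```python
import csv
#
def save_history(history_dict, path):
    """Write one row per epoch from a dict of equal-length metric lists."""
    keys = list(history_dict)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["epoch"] + keys)
        # zip(*columns) turns the per-metric lists into per-epoch rows.
        for i, row in enumerate(zip(*(history_dict[k] for k in keys)), start=1):
            writer.writerow([i, *row])
```
A call would look like save_history(history.history, "history.csv").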

160
00:16:27,490 --> 00:16:33,910
Until now, we were only calculating the accuracies on our validation set.

161
00:16:35,470 --> 00:16:42,640
‫But now it's time to use our test set to see how this model performs on our test set.

162
00:16:45,030 --> 00:16:50,430
‫Now we have to follow the same steps to evaluate our model performance.

163
00:16:51,720 --> 00:16:55,920
Again, we will be using a test generator.

164
00:16:58,660 --> 00:17:01,570
‫So we are creating another generator.

165
00:17:01,660 --> 00:17:03,730
‫We are calling it test generator.

166
00:17:06,190 --> 00:17:08,890
We are using test_datagen.

167
00:17:09,070 --> 00:17:12,760
‫This is the same object we used for validation as well.

168
00:17:12,970 --> 00:17:20,290
So here in this object we are just rescaling our data from 0 to 255 to 0 to 1.

169
00:17:20,500 --> 00:17:27,040
And then we are using flow_from_directory, and here we are providing the test directory instead of the validation

170
00:17:27,040 --> 00:17:27,780
‫directory.

171
00:17:27,790 --> 00:17:30,010
‫So our test generator is ready.

172
00:17:31,120 --> 00:17:39,880
Now, normally if we have data in the form of an array or DataFrame, we use the evaluate method.

173
00:17:39,970 --> 00:17:48,520
But since our data is flowing from a directory, we have to use

174
00:17:48,520 --> 00:17:49,270
evaluate_generator.

175
00:17:51,310 --> 00:17:53,710
‫So there is a simple pattern for it.

176
00:17:53,740 --> 00:17:55,660
We were using fit_generator.

177
00:17:56,680 --> 00:18:04,060
Similarly, for evaluation we are using evaluate_generator, and here also we have to provide our test generator

178
00:18:04,060 --> 00:18:06,880
‫object and the number of steps.

179
00:18:07,480 --> 00:18:09,640
‫We have a batch size of 20.

180
00:18:09,670 --> 00:18:13,890
We have a total test set size of around 1000 images.

181
00:18:13,900 --> 00:18:21,370
That's why we need 50 steps: 1000 divided by 20 equals 50.

182
00:18:21,490 --> 00:18:26,100
‫So we will be able to cover all our test images in 50 steps.

183
00:18:26,110 --> 00:18:31,850
So if you run this, just like the evaluate method, you will get two values.

184
00:18:31,850 --> 00:18:35,360
The first is the loss value and the second is the accuracy value.
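The evaluation step can be sketched the same way. Recent Keras versions accept generators directly in evaluate, so evaluate is used here in place of evaluate_generator; the helper name is hypothetical.
```python
TEST_IMAGES, BATCH_SIZE = 1000, 20
test_steps = TEST_IMAGES // BATCH_SIZE  # 1000 / 20 = 50 batches cover the test set
#
def evaluate(model, test_generator):
    """Return (loss, accuracy) over the whole test set."""
    # With metrics=["accuracy"], Keras returns the loss first, then the accuracy.
    loss, accuracy = model.evaluate(test_generator, steps=test_steps)
    return loss, accuracy
```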

185
00:18:35,600 --> 00:18:41,360
‫And here you can see that the accuracy we are getting is around 97%.

186
00:18:43,950 --> 00:18:51,090
So just to recap, we started with a simple convolutional model.

187
00:18:51,390 --> 00:18:56,340
At that time, we were getting an accuracy of around 73 to 74%.

188
00:18:56,970 --> 00:19:04,080
Then we used data augmentation techniques to create additional training data and avoid overfitting.

189
00:19:04,260 --> 00:19:09,840
In that case, we were getting an accuracy of around 83 to 84%.

190
00:19:11,790 --> 00:19:17,250
And in this case, we used a pre-trained VGG16 model

191
00:19:19,150 --> 00:19:20,500
‫For our problem.

192
00:19:21,210 --> 00:19:26,820
‫And in this case, we are getting an accuracy of around 97%.

193
00:19:27,420 --> 00:19:36,900
‫So we have increased our validation accuracy from 73% to 98% during this project.

194
00:19:38,130 --> 00:19:40,020
‫That's all for this project.

195
00:19:40,050 --> 00:19:40,830
‫Thank you.

