1
00:00:00,210 --> 00:00:06,000
Hello, everyone, and welcome to this new session in which we are going to look at data augmentation.

2
00:00:06,810 --> 00:00:13,560
Previously, we had seen how to load our dataset from this dataset directory right here, and then we

3
00:00:13,560 --> 00:00:20,760
trained a model which performed very well on the training data but didn't perform as well on

4
00:00:20,760 --> 00:00:21,930
the validation data.

5
00:00:21,930 --> 00:00:29,400
And then we were able to evaluate this model on different evaluation metrics like the accuracy, top-k

6
00:00:29,430 --> 00:00:32,160
accuracy and the confusion matrix.

7
00:00:32,190 --> 00:00:35,670
In this session, we are going to focus on data augmentation.

8
00:00:35,670 --> 00:00:41,940
That is, we're going to see the effect of augmenting our data artificially without actually getting

9
00:00:41,940 --> 00:00:50,130
to add an element to this dataset right here, and then see how this affects our model's performance

10
00:00:50,400 --> 00:00:51,450
in the session.

11
00:00:51,450 --> 00:01:02,010
We'll see how to augment our data, like the data right here, and see how this technique of data augmentation

12
00:01:02,010 --> 00:01:06,090
helps in making the model perform even better.

13
00:01:06,120 --> 00:01:12,420
Now, we have looked at data augmentation previously, but if there's one thing you have to note about

14
00:01:12,420 --> 00:01:17,850
data augmentation, it is simply the fact that it promotes diversity in your dataset.

15
00:01:17,850 --> 00:01:22,320
And so if you have data like this one here, let's open this.

16
00:01:23,200 --> 00:01:25,480
We consider this original data.

17
00:01:25,480 --> 00:01:27,020
So this is our original data.

18
00:01:27,040 --> 00:01:32,500
This is the data we actually gather and then we have this brightness here.

19
00:01:32,500 --> 00:01:39,880
So we modify this data, or this image's brightness, and we obtain this other data point, and then

20
00:01:39,880 --> 00:01:45,100
we modify from this, we rotate this and we obtain this other data point.

21
00:01:45,100 --> 00:01:48,880
You see that it's the exact same image which has been modified.

22
00:01:48,880 --> 00:01:55,060
And so now the model doesn't only get used to seeing this image right here, but now it could also see this

23
00:01:55,060 --> 00:01:57,040
one, this one.

24
00:01:57,910 --> 00:02:06,580
And so data augmentation is this method or this technique for promoting robustness in models, hence

25
00:02:06,580 --> 00:02:13,450
fighting overfitting as now the model can see different versions of certain data points.

26
00:02:13,630 --> 00:02:18,120
So let's close this up and get back to the code here.

27
00:02:18,130 --> 00:02:19,450
Let's get back up here.

28
00:02:19,450 --> 00:02:21,070
We have the augmentation.

29
00:02:21,070 --> 00:02:23,890
We're going to simply get back to this here.

30
00:02:23,890 --> 00:02:25,690
We had that augmentation.

31
00:02:25,690 --> 00:02:33,070
We looked at this already in the previous section so you could get back and try to understand exactly

32
00:02:33,070 --> 00:02:34,300
how this is carried out.

33
00:02:34,300 --> 00:02:40,480
Previously, we had seen that we could carry out the augmentation by using these kinds of TensorFlow

34
00:02:40,510 --> 00:02:42,520
image methods like this one.

35
00:02:42,520 --> 00:02:46,570
You see, you could rotate the image, you could flip left-right, you could adjust the saturation,

36
00:02:46,690 --> 00:02:49,000
or you could use Keras layers.

37
00:02:49,000 --> 00:02:50,650
We are going to use Keras layers.

38
00:02:50,650 --> 00:02:56,290
And then also we saw how to use the Albumentations library, which is this amazing library which permits

39
00:02:56,290 --> 00:03:00,910
us to carry out data augmentation on different tasks, not only classification, very easily.

40
00:03:00,910 --> 00:03:02,740
So you could check on that video.

41
00:03:02,770 --> 00:03:08,560
Now, that said, let's simply copy this out here, then we paste it here.

42
00:03:08,560 --> 00:03:10,030
So we have this.

43
00:03:10,600 --> 00:03:12,490
We have these augment layers.

44
00:03:12,490 --> 00:03:16,390
We have random rotation, random flip, random contrast.

45
00:03:16,390 --> 00:03:22,750
Let's actually get back here to these layers and then add the random contrast.

46
00:03:22,750 --> 00:03:25,450
So we have random contrast.
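A minimal sketch of the three augmentation layers named above, assuming TensorFlow 2.x and its `tf.keras.layers` preprocessing layers. The name `augment_layers`, the image size, and every factor value are illustrative assumptions, not necessarily the ones used in the session.

```python
import tensorflow as tf

# Stack of Keras preprocessing layers applied one after the other.
# All factor values are placeholders chosen for illustration only.
augment_layers = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(factor=(-0.25, 0.25)),
    tf.keras.layers.RandomFlip(mode="horizontal"),
    tf.keras.layers.RandomContrast(factor=0.1),
])

# Dummy batch standing in for real images: 4 RGB images of 64x64 pixels.
images = tf.random.uniform((4, 64, 64, 3), maxval=255.0)

# training=True activates the random layers; in inference mode they
# return their input unchanged.
augmented = augment_layers(images, training=True)
print(augmented.shape)
```

Each layer can be added or removed on its own, which makes it easy to test one strategy at a time.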

47
00:03:25,480 --> 00:03:31,810
Okay, so we have that, and you can feel free to get back to the documentation on the TensorFlow Keras

48
00:03:31,840 --> 00:03:32,700
layers.

49
00:03:32,710 --> 00:03:40,150
Let's get to Keras layers and then check out these different augmentation strategies.

50
00:03:40,150 --> 00:03:46,150
Here you can see random brightness, contrast, crop, flip, height, rotation, translation, random width,

51
00:03:46,180 --> 00:03:51,010
random zoom, and you could check out the documentation in case you have any doubts.

52
00:03:51,010 --> 00:03:59,300
So you could just have this here and you see how to use each and every one of these augmentation strategies.

53
00:03:59,320 --> 00:04:02,440
Now, getting back to the code, we have these three here.

54
00:04:02,560 --> 00:04:11,380
Now, what is generally done is you could actually run one and then test how it helps in

55
00:04:11,380 --> 00:04:13,690
making the model perform better.

56
00:04:13,690 --> 00:04:18,010
And then you could add the other and see whether it helps and so on and so forth.

57
00:04:18,010 --> 00:04:26,230
So it's kind of like it's not a fixed kind of method where you just have some fixed methods or some

58
00:04:26,230 --> 00:04:31,330
fixed augmentation strategies, which we just place in this order sequentially.

59
00:04:31,330 --> 00:04:33,520
And then it will always work magically.

60
00:04:33,520 --> 00:04:39,400
Generally, you will have to test different strategies out and then see

61
00:04:39,400 --> 00:04:43,210
which one works well for the data you're working with.

62
00:04:44,410 --> 00:04:45,850
Now we have this set.

63
00:04:45,850 --> 00:04:48,970
Let's run this cell right here.

64
00:04:49,120 --> 00:04:51,310
We get an error that this is not defined.

65
00:04:51,850 --> 00:04:55,720
Let's get back here and be sure that we run this cell.

66
00:04:55,720 --> 00:04:56,980
Let's run the cell.

67
00:04:57,310 --> 00:04:58,810
Normally, that should be fine.

68
00:04:59,230 --> 00:05:04,240
Get back and we run this one here and that's fine.

69
00:05:04,240 --> 00:05:05,410
So we have that.

70
00:05:05,410 --> 00:05:10,930
And then now at the level of this dataset preparation, we're going to include this. We're going to

71
00:05:10,930 --> 00:05:11,620
do the mapping.

72
00:05:11,620 --> 00:05:18,340
So we have our augment layers. Note that we're not going to do this for the validation dataset;

73
00:05:18,340 --> 00:05:20,710
we're going to only do this for the training data set.

74
00:05:20,860 --> 00:05:27,580
We have these augment layers, and then we're going to specify the number of parallel calls.

75
00:05:27,580 --> 00:05:34,570
So we have num_parallel_calls, and this is obtained via AUTOTUNE.

76
00:05:34,570 --> 00:05:36,640
So we have that automatically.

77
00:05:37,120 --> 00:05:44,590
Then before we move on, we're going to copy this out here and then simply paste this here.

78
00:05:44,920 --> 00:05:54,220
So we create this augment layer function, which takes this and then outputs the augmented image and the

79
00:05:54,220 --> 00:05:55,000
label.

80
00:05:55,090 --> 00:05:56,470
We're not going to have this here.

81
00:05:56,470 --> 00:05:57,700
So let's just take this off.

82
00:05:57,700 --> 00:05:59,200
We have just this image.

83
00:05:59,650 --> 00:06:00,490
So that's it.

84
00:06:01,030 --> 00:06:06,100
We have these augment layers, and then we define this function as the augment layer.
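The mapping step described here could look roughly like the sketch below, assuming TensorFlow 2.x. The names `augment_layers`, `augment_layer`, `train_dataset`, and `val_dataset` are assumptions made for illustration, and the random tensors stand in for the real data.

```python
import tensorflow as tf

augment_layers = tf.keras.Sequential([
    tf.keras.layers.RandomFlip(mode="horizontal"),
])

def augment_layer(image, label):
    # Augment only the image; the label passes through unchanged.
    return augment_layers(image, training=True), label

# Dummy data standing in for the real training and validation sets.
images = tf.random.uniform((8, 32, 32, 3))
labels = tf.one_hot([0, 1, 2, 0, 1, 2, 0, 1], depth=3)
train_dataset = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)
val_dataset = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)

# Optional precaution: run once eagerly so the layers are built
# before tf.data traces the mapping function.
_ = augment_layers(images[:1], training=True)

# Augmentation is mapped onto the training data only.
training_dataset = (
    train_dataset
    .map(augment_layer, num_parallel_calls=tf.data.AUTOTUNE)
    .prefetch(tf.data.AUTOTUNE)
)
# The validation data is left untouched.
validation_dataset = val_dataset.prefetch(tf.data.AUTOTUNE)

for batch_images, batch_labels in training_dataset.take(1):
    print(batch_images.shape, batch_labels.shape)
```

Passing `tf.data.AUTOTUNE` lets `tf.data` choose the level of parallelism at runtime instead of fixing a number by hand.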

85
00:06:06,100 --> 00:06:10,330
So right here, let's take this one off and run this again.

86
00:06:11,680 --> 00:06:13,870
With this, we're now set to train.

87
00:06:13,870 --> 00:06:15,820
So let's go ahead and retrain.

88
00:06:15,820 --> 00:06:22,540
Our model training is now complete, and we can see that the model doesn't perform as well

89
00:06:22,670 --> 00:06:25,730
as it used to without augmentation.

90
00:06:25,730 --> 00:06:28,250
So here we did augmentation.

91
00:06:28,250 --> 00:06:33,770
The model performs even worse compared to when there was no augmentation.

92
00:06:33,770 --> 00:06:35,360
Let's run this here.

93
00:06:35,360 --> 00:06:41,140
So you could compare this with the previous results we got, or the previous evaluation.

94
00:06:41,150 --> 00:06:47,060
You see, we go from 75% to 54%, and then we go from 90% to 83%.

95
00:06:47,930 --> 00:06:54,080
To understand the reason why we have this drop in performance, let's look at this visualization of

96
00:06:54,080 --> 00:06:55,010
our data set.

97
00:06:55,220 --> 00:07:04,100
We'll see that after carrying out the rotation, we have images which are rotated at very unusual angles.

98
00:07:04,100 --> 00:07:07,790
So you see like this image here is unusual.

99
00:07:08,600 --> 00:07:17,750
This angle too looks very unusual as compared to the kind of data we would have in our test or validation

100
00:07:17,750 --> 00:07:18,320
set.

101
00:07:18,500 --> 00:07:28,070
So we have to ensure that when carrying out this random operation, we limit the angle at which we could

102
00:07:28,070 --> 00:07:29,680
carry out this rotation.

103
00:07:29,690 --> 00:07:36,890
So that said, if we have a face like this or let's say we have an input image like this, we should

104
00:07:36,890 --> 00:07:44,570
limit this rotation such that this image cannot be rotated at, say, 180 degrees.

105
00:07:44,570 --> 00:07:45,830
So let's have this here.

106
00:07:45,830 --> 00:07:47,060
So you see that better.

107
00:07:47,060 --> 00:07:49,970
We have this, this and this.

108
00:07:49,970 --> 00:07:52,280
So we do not want these kinds of rotations.

109
00:07:52,280 --> 00:07:59,900
What we want is rotations where we have the face a bit tilted like this and that's it.

110
00:07:59,900 --> 00:08:03,140
So we want these kinds of rotations, but not this type.

111
00:08:03,140 --> 00:08:08,720
So to solve this problem, what we're going to do is we're going to get back to the documentation that's

112
00:08:08,720 --> 00:08:09,770
random rotation.

113
00:08:09,770 --> 00:08:12,080
We check out this factor right here.

114
00:08:12,080 --> 00:08:14,990
We see that the value we put here

115
00:08:15,080 --> 00:08:24,560
takes us from -20% of 2π. Now, π is 180 degrees: π is in radians, and converting it to degrees, we have π, which

116
00:08:24,560 --> 00:08:26,030
is 180 degrees.

117
00:08:26,030 --> 00:08:29,300
So we have two times 180 which is 360.

118
00:08:29,300 --> 00:08:36,830
So when you say 0.2 or -0.2, what you're in fact saying is 20% or -20% of 360 degrees.

119
00:08:36,830 --> 00:08:39,350
So, let's get back to the code.

120
00:08:39,350 --> 00:08:47,390
So, let's add this cell here. When you have 0.25 here, what you have in fact is 0.25

121
00:08:47,390 --> 00:08:52,280
times 360. Running this, you see you get 90 degrees.

122
00:08:52,280 --> 00:08:57,800
So this means that if you had an image which was already somehow tilted, this image will end up in

123
00:08:57,800 --> 00:09:04,880
a very unusual position where the face will look something like this.

124
00:09:05,360 --> 00:09:10,490
And so what we'll do here is we are going to limit this rotation.

125
00:09:10,490 --> 00:09:15,860
So we're going to use 0.025, for example, for this.

126
00:09:15,860 --> 00:09:25,760
So let's limit that, and we'll go from -0.025. With 0.025 here, that will be nine degrees.
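The arithmetic above can be checked with a short sketch. It relies only on the fact stated in the session that the rotation factor is a fraction of a full turn (2π, or 360 degrees); the helper name `factor_to_degrees` is hypothetical.

```python
def factor_to_degrees(factor):
    """Convert a rotation factor (fraction of a full turn) to degrees."""
    return factor * 360.0

# 0.2 is the documentation example, 0.25 the value that produced the
# unusual images, and 0.025 the tighter limit chosen afterwards.
for factor in (0.2, 0.25, 0.025):
    print(f"factor {factor} -> about {factor_to_degrees(factor):.1f} degrees")
# factor 0.2 -> about 72.0 degrees
# factor 0.25 -> about 90.0 degrees
# factor 0.025 -> about 9.0 degrees
```

A range of (-0.025, 0.025) therefore keeps every rotation within roughly nine degrees on either side of the original orientation.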

127
00:09:25,760 --> 00:09:34,400
So limiting this to nine degrees and then going from -0.025 to 0.025 simply means that if you have this

128
00:09:34,400 --> 00:09:40,880
axis here and then you have the face like this, let's put the mouth in this.

129
00:09:40,880 --> 00:09:52,760
Then after rotation you can only go nine degrees in this direction or nine

130
00:09:52,760 --> 00:09:53,810
degrees in this direction.

131
00:09:53,810 --> 00:09:54,800
So you have a limit.

132
00:09:54,800 --> 00:10:03,050
So you can only pick a random value between negative nine degrees and nine degrees in this other direction.

133
00:10:03,050 --> 00:10:06,740
So you can only go this way nine degrees or this way nine degrees.

134
00:10:06,740 --> 00:10:10,910
So that said, at the extreme we'll have a face like this.

135
00:10:10,910 --> 00:10:12,080
That's after rotation.

136
00:10:12,080 --> 00:10:14,270
So we'll go from this blue to this red.

137
00:10:14,300 --> 00:10:17,240
Or you could also get something like this.

138
00:10:17,840 --> 00:10:24,080
So this is what we are going to get after rotation, unlike with the case of 90 degrees, where if you

139
00:10:24,080 --> 00:10:28,640
have a face which is already... let's change this, let's delete this.

140
00:10:28,640 --> 00:10:36,590
If we have a face which is already, say, tilted like this, we have a face tilted like this after

141
00:10:36,590 --> 00:10:38,300
rotating 90 degrees.

142
00:10:38,300 --> 00:10:46,640
What you have now, let's have this 90 degrees here, will be a face tilted this

143
00:10:46,640 --> 00:10:47,120
way.

144
00:10:47,540 --> 00:10:53,090
And this is a very unusual position when taking an image.

145
00:10:53,090 --> 00:10:54,800
So when taking a photo.

146
00:10:54,800 --> 00:10:57,680
So the images in the validation or test

147
00:10:57,680 --> 00:10:59,960
set wouldn't look like this.

148
00:10:59,960 --> 00:11:03,410
And so that's why we actually limited this here.

149
00:11:03,710 --> 00:11:12,800
So let's get back to our code now and run this. Let's run this training data and then let's visualize

150
00:11:12,810 --> 00:11:13,730
our data set.

151
00:11:15,720 --> 00:11:16,530
There we go.

152
00:11:16,560 --> 00:11:22,080
As you can see, you do not have images which are upside down as we had before.

153
00:11:22,350 --> 00:11:23,740
And that's it.

154
00:11:23,760 --> 00:11:25,470
So we have this now.

155
00:11:25,470 --> 00:11:29,310
We now go ahead and retrain our model and see what we get.

156
00:11:30,420 --> 00:11:37,260
After training over 20 epochs, here are the results we get: you see that we go up to 78%.

157
00:11:37,260 --> 00:11:39,050
So this is the highest we get.

158
00:11:39,060 --> 00:11:43,080
And when we run the evaluation, you see here we have this.

159
00:11:43,560 --> 00:11:51,210
When we run the evaluation, what we get is 77.8% for the accuracy, and the top-k accuracy is 91%.

160
00:11:51,210 --> 00:11:58,960
So we've improved compared to what we had before or what we had before the data augmentation.

161
00:11:58,980 --> 00:12:03,900
Now we are going to use another data augmentation strategy, which is CutMix.

162
00:12:03,930 --> 00:12:11,730
Now, CutMix isn't like these other augmentation strategies, where we just modify a single

163
00:12:11,730 --> 00:12:17,640
image. With CutMix, as we have seen in the previous sections, we actually combine two images.
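The session does not show its CutMix code at this point, so the following is only a sketch of the standard CutMix formulation: a patch of the second image is pasted onto the first, and the two labels are mixed in proportion to the area each image keeps. The function names, the image size of 256, and the Beta parameters are hypothetical.

```python
import math
import random

def cutmix_box(im_size, lam, rng):
    """Pick a square patch covering roughly (1 - lam) of an im_size x im_size image."""
    side = int(im_size * math.sqrt(1.0 - lam))
    center_y = rng.randrange(im_size)
    center_x = rng.randrange(im_size)
    # Clip the patch so that it stays inside the image.
    y1 = max(center_y - side // 2, 0)
    x1 = max(center_x - side // 2, 0)
    y2 = min(center_y + side // 2, im_size)
    x2 = min(center_x + side // 2, im_size)
    return y1, x1, y2, x2

def cutmix_labels(label_1, label_2, box, im_size):
    """Mix two one-hot labels in proportion to the area each image keeps."""
    y1, x1, y2, x2 = box
    patch_fraction = ((y2 - y1) * (x2 - x1)) / (im_size * im_size)
    lam = 1.0 - patch_fraction  # share of image 1 that survives
    return [lam * a + (1.0 - lam) * b for a, b in zip(label_1, label_2)]

rng = random.Random(0)
lam = rng.betavariate(0.2, 0.2)  # hypothetical Beta parameters
box = cutmix_box(256, lam, rng)
mixed = cutmix_labels([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], box, 256)
print(box, mixed)
```

The label weight is recomputed from the clipped box rather than taken from the sampled value, so the label matches the pixels that were actually replaced.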

164
00:12:17,640 --> 00:12:23,790
So what we're going to do here is we're going to simply copy out this code and then put it out in this

165
00:12:24,870 --> 00:12:26,070
code base right here.

166
00:12:26,100 --> 00:12:29,760
Now, if you're new to CutMix data augmentation,

167
00:12:29,760 --> 00:12:33,690
You could check out the previous sections where we treat this in detail.

168
00:12:33,690 --> 00:12:36,780
So let's get here, and there we go.

169
00:12:36,780 --> 00:12:41,730
We're going to apply CutMix and see the effect it has after training our model.

170
00:12:41,730 --> 00:12:45,020
So we have that augmentation in here.

171
00:12:45,030 --> 00:12:55,410
Let's add a cell and have here the CutMix augmentation code cell, and let's paste all that

172
00:12:55,680 --> 00:12:56,940
part of the code.

173
00:12:57,450 --> 00:13:02,640
We also paste this part where we have training dataset 1 and training dataset 2, and then

174
00:13:02,640 --> 00:13:05,070
we create this mixed dataset.

175
00:13:05,070 --> 00:13:10,380
So here we have the augment layer.

176
00:13:10,470 --> 00:13:15,600
Let's take this off, and here we have the augment layer.

177
00:13:15,600 --> 00:13:19,020
So we carry out augmentation for these two separately.

178
00:13:19,020 --> 00:13:24,840
And then once we have these two here... we have the shuffling already.

179
00:13:24,840 --> 00:13:30,210
So we just do the mapping, we do the mapping, and that's what we're saying.

180
00:13:30,210 --> 00:13:34,890
Once we have these two here, we're going to combine them into this one mixed dataset.
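One way to build such a mixed dataset, sketched under the assumption that the two datasets are paired with `tf.data.Dataset.zip`. The dataset names are illustrative, and `tf.image.random_flip_left_right` is only a stand-in for the session's augmentation function.

```python
import tensorflow as tf

def augment_layer(image, label):
    # Stand-in for the session's augmentation function (assumption).
    return tf.image.random_flip_left_right(image), label

# Dummy data standing in for the real training set.
images = tf.random.uniform((8, 32, 32, 3))
labels = tf.one_hot([0, 1, 2, 0, 1, 2, 0, 1], depth=3)
train_dataset = tf.data.Dataset.from_tensor_slices((images, labels))

# Two independently shuffled and augmented views of the same data.
train_dataset_1 = train_dataset.shuffle(buffer_size=8).map(
    augment_layer, num_parallel_calls=tf.data.AUTOTUNE)
train_dataset_2 = train_dataset.shuffle(buffer_size=8).map(
    augment_layer, num_parallel_calls=tf.data.AUTOTUNE)

# Each element is ((image_1, label_1), (image_2, label_2)).
mixed_dataset = tf.data.Dataset.zip((train_dataset_1, train_dataset_2))

for (image_1, label_1), (image_2, label_2) in mixed_dataset.take(1):
    print(image_1.shape, label_1.shape, image_2.shape, label_2.shape)
```

Because each view is shuffled on its own, every element pairs two (usually different) images, which is what a two-image strategy such as CutMix needs as input.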

181
00:13:34,980 --> 00:13:38,130
So that's it, let's have that.

182
00:13:38,130 --> 00:13:41,840
And then from here, we now build our training dataset.

183
00:13:41,850 --> 00:13:46,500
So let's take this here, let's take this up.

184
00:13:47,520 --> 00:13:48,330
There we go.

185
00:13:48,330 --> 00:13:55,650
This is the preparation, and we're now going to comment this part out, so we could comment this one.

186
00:13:55,650 --> 00:13:58,170
And we have a validation data set.

187
00:13:58,170 --> 00:13:59,070
That's fine.

188
00:13:59,070 --> 00:14:01,320
Everything looks fine and that's okay.

189
00:14:01,320 --> 00:14:03,360
So we have this set.

190
00:14:03,360 --> 00:14:06,300
Let's run this now for our CutMix.

191
00:14:06,300 --> 00:14:15,120
You could always try the Mixup or the Cutout augmentation strategies and see how it betters or how

192
00:14:15,120 --> 00:14:17,550
it improves your model's performance.

193
00:14:17,550 --> 00:14:19,530
So let's have this.

194
00:14:19,530 --> 00:14:20,790
Let's run this.

195
00:14:21,240 --> 00:14:22,050
There we go.

196
00:14:22,050 --> 00:14:30,660
There's our CutMix here, and we combine the two datasets, and then once they're combined, we now apply

197
00:14:30,660 --> 00:14:32,400
the CutMix augmentation.

198
00:14:34,170 --> 00:14:37,260
Here we have this size not defined.

199
00:14:39,090 --> 00:14:43,860
Actually, okay, this should be in our configuration.

200
00:14:43,860 --> 00:14:46,620
So we should have this configuration.

201
00:14:47,610 --> 00:14:50,760
We run this cell now, and run that again.

202
00:14:50,760 --> 00:14:56,130
We run the CutMix cell, and we run the cells, and now everything should be fine.

203
00:14:56,130 --> 00:14:56,970
So there we go.

204
00:14:56,970 --> 00:15:07,110
We have all this fine, and then we get to validation, run that training data set and validation data

205
00:15:07,110 --> 00:15:07,860
sets.

206
00:15:08,430 --> 00:15:09,090
So that's it.

207
00:15:09,090 --> 00:15:17,970
Let's go ahead and retrain our model and see what we get. After training for 20 epochs, we notice that

208
00:15:17,970 --> 00:15:27,720
unlike previously, where the training accuracy went to about 99% while the validation accuracy was about 77%,

209
00:15:27,720 --> 00:15:31,680
here the training accuracy is about 80%.

210
00:15:31,680 --> 00:15:35,310
And the validation, let's scroll this way.

211
00:15:35,340 --> 00:15:41,010
For the validation, the highest we have here is 78%.

212
00:15:41,250 --> 00:15:48,960
So this shows us clearly that the model isn't overfitting because the training and the validation data

213
00:15:48,960 --> 00:15:52,770
set are both evolving in a similar manner.
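The comparison made here comes down to the gap between training and validation accuracy. A small sketch using the approximate figures quoted in the session; the helper name `generalization_gap` and the run labels are hypothetical.

```python
def generalization_gap(train_acc, val_acc):
    """Training accuracy minus validation accuracy, in percentage points."""
    return train_acc - val_acc

# Approximate accuracies (in percent) quoted in the session.
runs = {
    "before CutMix": (99.0, 77.0),
    "with CutMix": (80.0, 78.0),
}
for name, (train_acc, val_acc) in runs.items():
    gap = generalization_gap(train_acc, val_acc)
    print(f"{name}: gap of {gap:.0f} points")
# before CutMix: gap of 22 points
# with CutMix: gap of 2 points
```

A large gap is the usual sign of overfitting, while two curves that stay close together suggest the model still has room to keep learning.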

214
00:15:53,040 --> 00:15:55,560
So let's scroll down here and look at these

215
00:15:55,560 --> 00:15:59,610
curves you see here. Look at the accuracy plot.

216
00:15:59,610 --> 00:16:04,230
Before what we had was something like this.

217
00:16:04,620 --> 00:16:09,600
We had the training and yeah, we had one.

218
00:16:10,140 --> 00:16:14,640
And then the validation was like this.

219
00:16:15,380 --> 00:16:20,870
But now we have these two curves which are evolving in a similar manner, though sometimes we have those

220
00:16:20,870 --> 00:16:22,970
kinds of peaks anyway.

221
00:16:25,130 --> 00:16:30,710
Clearly, our model isn't overfitting, so what we'll do is we're going to train for more epochs.

222
00:16:30,710 --> 00:16:37,940
So that said, let's go ahead and retrain this model for more epochs.

223
00:16:37,940 --> 00:16:42,290
So we're going to modify the number of epochs.

224
00:16:42,290 --> 00:16:46,010
So here, what we can do simply is just keep training.

225
00:16:46,010 --> 00:16:47,750
So we train for 20 more epochs.

226
00:16:47,750 --> 00:16:50,810
So let's run this and then we'll wait for 20 more epochs.
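"Just keep training" works because calling `fit` again on the same Keras model continues from its current weights. A minimal sketch with a tiny hypothetical model and random data, using 2 epochs per call instead of 20 to keep it quick:

```python
import tensorflow as tf

# Tiny stand-in model and data; the real model and dataset differ.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

x = tf.random.uniform((16, 4))
y = tf.one_hot(tf.random.uniform((16,), maxval=3, dtype=tf.int32), depth=3)

# First round of training.
history_1 = model.fit(x, y, epochs=2, verbose=0)

# Second round: starts from the weights left by the first round.
# initial_epoch only shifts the epoch counter used for logging.
history_2 = model.fit(x, y, initial_epoch=2, epochs=4, verbose=0)

print(len(history_1.history["loss"]), len(history_2.history["loss"]))
```

Setting `initial_epoch` is optional; simply calling `fit` again with `epochs=2` would also train for two more epochs, with the counter restarting at 1.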

227
00:16:50,840 --> 00:16:53,480
Training is now complete, and here are the results we get.

228
00:16:53,480 --> 00:16:59,920
As you could see, the validation accuracy starts to stagnate around this.

229
00:16:59,930 --> 00:17:05,310
And then after evaluation we obtain an accuracy of 79.71%.

230
00:17:05,330 --> 00:17:12,680
But what's interesting to note here is the fact that our training accuracy still has this value

231
00:17:12,680 --> 00:17:14,540
of 86.25%.

232
00:17:14,540 --> 00:17:21,260
This is unlike previously, where our model was overfitting and we had an accuracy of about 99%.

233
00:17:21,260 --> 00:17:22,160
So that's it.

234
00:17:22,160 --> 00:17:24,070
We have that we've evaluated.

235
00:17:24,080 --> 00:17:28,820
We see these values and then test on these values.

236
00:17:28,820 --> 00:17:34,130
Here we see that we have 14 out of 16 images predicted correctly.

237
00:17:34,130 --> 00:17:37,100
And here's our confusion matrix right here.

238
00:17:37,100 --> 00:17:45,080
See that clearly this model performs best or performs better than all the other previous models.