1
00:00:00,150 --> 00:00:05,490
Hello, everyone, and welcome to this new session in which we're going to implement the CutMix data

2
00:00:05,490 --> 00:00:12,000
augmentation strategy with TensorFlow. The CutMix data augmentation strategy, though based on the

3
00:00:12,000 --> 00:00:17,250
fact that we are combining two different samples, is different from mixup.

4
00:00:17,250 --> 00:00:23,520
Instead, with CutMix data augmentation, we are going to take a random patch from one of the samples

5
00:00:23,520 --> 00:00:29,010
and attach it to the other sample while modifying the labels accordingly.

6
00:00:29,010 --> 00:00:36,000
We've looked at how to implement data augmentation with TensorFlow and also how to implement more advanced

7
00:00:36,000 --> 00:00:40,710
data augmentation strategies like mixup. In this section,

8
00:00:40,710 --> 00:00:45,840
we'll look at the CutMix data augmentation strategy.

9
00:00:45,840 --> 00:00:53,520
If we suppose that we have these two images, Image one and image two, we are going to randomly crop

10
00:00:53,520 --> 00:00:55,020
a part of this image.

11
00:00:55,020 --> 00:01:01,950
So, just as you can see in this output here, they randomly cropped a section from this image and

12
00:01:01,950 --> 00:01:11,400
then attached that to this other image, such that what you have in the output is this one with this patch.

13
00:01:11,400 --> 00:01:16,920
So this is how the CutMix data augmentation is implemented.

14
00:01:17,730 --> 00:01:26,340
If we take this example where we have this cat and this dog right here, we'll try to randomly crop a part

15
00:01:26,340 --> 00:01:36,420
from this dog and then attach that patch at the same position on this cat image here. To do this cropping

16
00:01:36,420 --> 00:01:38,940
operation, we go to TensorFlow's tf.image module.

17
00:01:38,940 --> 00:01:40,620
Let's have this.

18
00:01:40,620 --> 00:01:44,850
We have tf.image, and then we have crop_to_bounding_box.

19
00:01:44,850 --> 00:01:50,880
So we click on this and then we have this definition.

20
00:01:51,510 --> 00:01:56,880
We see the arguments: we pass in the image, and then we're going to specify

21
00:01:56,880 --> 00:02:00,480
the offset height, offset width, target height and target width.

22
00:02:00,510 --> 00:02:03,600
Now, let's explain what all this means.

23
00:02:04,050 --> 00:02:09,690
If you have an image like this one, then, just as it's given here, the offset height is the vertical coordinate

24
00:02:09,690 --> 00:02:13,590
of the top left corner of the bounding box in the image.

25
00:02:14,730 --> 00:02:21,930
And so this means that if we randomly select, for example, this box right here, let's suppose we

26
00:02:21,940 --> 00:02:31,890
randomly selected this box. Then our offset height will be this distance; that is, our reference

27
00:02:31,890 --> 00:02:37,050
is this top left corner and our offset height will be this distance here.

28
00:02:37,560 --> 00:02:44,040
This distance and our offset width will be this other distance.

29
00:02:44,400 --> 00:02:49,260
So that's how we have this offset height and the offset width.

30
00:02:51,270 --> 00:02:54,120
And then the target height is the height of the bounding box.

31
00:02:54,120 --> 00:03:01,350
So here we have this bounding box's height, and the target width is the width of the bounding box.

32
00:03:02,520 --> 00:03:09,960
So once you provide this, it will be able to automatically crop out this zone from the image.
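
As a rough sketch of the cropping call just described: the snippet below assumes a 224 by 224 image of random pixels standing in for the dog photo. The names IM_SIZE and image_2 and the offset and target values are illustrative assumptions, not values taken from the session's notebook.

```python
import tensorflow as tf

IM_SIZE = 224  # assumed square image side, for illustration only
# Stand-in for a real photo: random pixel values in [0, 1).
image_2 = tf.random.uniform((IM_SIZE, IM_SIZE, 3))

# offset_height / offset_width: row and column of the box's top left corner.
# target_height / target_width: height and width of the box to cut out.
crop = tf.image.crop_to_bounding_box(
    image_2,
    offset_height=20,
    offset_width=100,
    target_height=100,
    target_width=98,
)

print(crop.shape)  # (100, 98, 3)
```

The result is the same region that slicing image_2[20:120, 100:198] would select.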

33
00:03:11,100 --> 00:03:15,990
So coming back to the code, we are going to add this other extra subplot.

34
00:03:15,990 --> 00:03:20,730
Let's have this other subplot we have.

35
00:03:20,730 --> 00:03:22,500
Let's paste this code first.

36
00:03:23,040 --> 00:03:24,720
We could just copy this out.

37
00:03:24,720 --> 00:03:27,960
We have this and then we have the subplot.

38
00:03:27,960 --> 00:03:32,430
Third position, what we'll be doing is we're not going to be having this.

39
00:03:32,430 --> 00:03:35,370
We're going to have image, let's call that image three.

40
00:03:35,370 --> 00:03:38,280
And then what we'll be doing here is we'll be making use of this method.

41
00:03:38,280 --> 00:03:45,090
So we have this method and then we'll specify this offset height, offset width, target height, target

42
00:03:45,090 --> 00:03:45,480
width.

43
00:03:45,480 --> 00:03:50,700
Let's suppose that our offset height is, let's say 20.

44
00:03:50,700 --> 00:03:58,170
So here we have 20, and then, say, 15, so we have 20, 15.

45
00:03:58,170 --> 00:04:07,740
And then let's suppose this target height is 100, and here, say, 98.

46
00:04:07,770 --> 00:04:14,370
Okay, so we have this specified, and then we're left with passing the image. The image we'll be using is image

47
00:04:14,370 --> 00:04:17,250
two, this image here, and that will be it.

48
00:04:17,250 --> 00:04:24,140
So here we take this off, and then we could simply show this. We paste this here, and that's fine.

49
00:04:24,150 --> 00:04:25,650
Now we could take this off.

50
00:04:26,070 --> 00:04:27,420
Okay, so we have this.

51
00:04:27,420 --> 00:04:30,540
We can now plot this out and see what we get.

52
00:04:31,350 --> 00:04:32,010
Okay.

53
00:04:32,010 --> 00:04:40,740
As you could see, what we obtain here is a cropping of this zone.

54
00:04:40,740 --> 00:04:43,830
You see that we have something around this.

55
00:04:43,830 --> 00:04:46,080
We crop out this zone.

56
00:04:46,080 --> 00:04:47,460
So actually it takes more of this.

57
00:04:47,460 --> 00:04:48,690
So it's something like this.

58
00:04:49,050 --> 00:04:54,240
Now we could modify this, like we could shift the height and the width so we could actually maintain

59
00:04:54,240 --> 00:04:57,150
the height but shift the width, so that we could get more of the dog.

60
00:04:57,150 --> 00:04:58,560
So let's do just that.

61
00:04:58,560 --> 00:04:59,610
We shift the width.

62
00:05:00,140 --> 00:05:06,020
Let's say we take 100. We run that, and here is what we get.

63
00:05:06,020 --> 00:05:08,840
You see that we get the dog's face this time around.

64
00:05:09,050 --> 00:05:13,840
And basically this is how we crop out a region from this image.

65
00:05:13,850 --> 00:05:21,650
Now, once we've done this cropping, we want to create another image which is made of only this crop

66
00:05:21,650 --> 00:05:26,510
while the remaining zones are actually left out.

67
00:05:26,540 --> 00:05:32,870
Now, to do that, we'll make use of this other method, which is pad_to_bounding_box.

68
00:05:33,110 --> 00:05:38,860
So here we have tf.image.pad_to_bounding_box. We'll copy it out and you'll see how it works.

69
00:05:38,870 --> 00:05:41,270
So here we have this tf.image.pad_to_bounding_box.

70
00:05:41,270 --> 00:05:42,680
We're going to create another plot.

71
00:05:42,680 --> 00:05:44,870
So let's increase this number of plots.

72
00:05:45,170 --> 00:05:51,260
We have four here, four here and four here, and let's paste this out first.

73
00:05:51,260 --> 00:05:52,700
So we have that.

74
00:05:52,700 --> 00:05:58,040
We copy these pieces out and then create this fourth plot.

75
00:05:58,160 --> 00:06:07,160
So we have this fourth plot, and now what we'll do is just copy this here and paste it out.

76
00:06:07,490 --> 00:06:11,510
So here we paste this out, and then let's get back.

77
00:06:11,990 --> 00:06:15,830
And then what we pass in as image now is this cropped image.

78
00:06:15,830 --> 00:06:18,020
So let's actually copy this.

79
00:06:18,020 --> 00:06:24,830
Let's say we have this crop, let's call it a crop, and then we have let's take this off.

80
00:06:25,790 --> 00:06:28,580
We have the crop; we paste it out.

81
00:06:29,000 --> 00:06:32,990
So here we have the crop, and now we're going to pass in the crop.

82
00:06:32,990 --> 00:06:35,870
So after let's look at this.

83
00:06:35,870 --> 00:06:43,910
So after we've all, we've had an error before, so let's do this so you could see better.

84
00:06:46,730 --> 00:06:47,060
Okay.

85
00:06:47,060 --> 00:06:53,420
So actually, what we're saying here is that we have this padding to be done: we're taking this image,

86
00:06:53,420 --> 00:07:02,480
and we want to pad it, like stamping it on another image which contains only zero pixels.

87
00:07:02,810 --> 00:07:04,490
So let's look at that.

88
00:07:05,570 --> 00:07:12,950
We have that. We have this pad: we pass in the crop, we have the offset height and width, and then

89
00:07:12,950 --> 00:07:16,010
the target height has to be given.

90
00:07:16,010 --> 00:07:23,450
So here we're going to put in the image size, because we want this to be padded on an image with these

91
00:07:23,450 --> 00:07:24,320
dimensions.

92
00:07:24,320 --> 00:07:26,210
So we have the image size there.

93
00:07:26,240 --> 00:07:29,240
Now we could run this and see what we get.

94
00:07:30,230 --> 00:07:31,550
We have this error.

95
00:07:31,790 --> 00:07:33,800
Let's modify that quickly.

96
00:07:34,150 --> 00:07:43,060
So here we have this taken off, and then we have 20, 100. Okay.

97
00:07:43,070 --> 00:07:47,900
So let's run that, and we should have some reasonable offset width again.

98
00:07:47,990 --> 00:07:49,550
Okay, we should take this off.

99
00:07:50,840 --> 00:07:55,910
Okay, so let's look at this, and there we go.

100
00:07:55,940 --> 00:07:58,680
As you could see, we have exactly what we expect.

101
00:07:58,700 --> 00:08:03,800
You see that we have roughly the same image, but we've kept only this crop.

102
00:08:03,800 --> 00:08:05,630
So that's what we actually want.
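
A minimal sketch of this crop-then-pad step, again assuming a 224 by 224 random image in place of the dog photo. IM_SIZE, image_2 and the box values are illustrative assumptions; the two tf.image calls are the ones named in this session.

```python
import tensorflow as tf

IM_SIZE = 224  # assumed square image side, for illustration only
image_2 = tf.random.uniform((IM_SIZE, IM_SIZE, 3))  # stand-in for the dog image

# Cut a 100 x 98 patch whose top left corner is at row 20, column 100.
crop = tf.image.crop_to_bounding_box(image_2, 20, 100, 100, 98)

# Stamp the patch back at the same offsets on an all-zero canvas.
# Here target_height / target_width are the size of the full canvas.
padded = tf.image.pad_to_bounding_box(crop, 20, 100, IM_SIZE, IM_SIZE)

print(padded.shape)  # (224, 224, 3)
```

Everything outside the box is zero, which is why the plot shows the patch on a black background.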

103
00:08:05,630 --> 00:08:13,250
We want to be able to take this crop and then add it to this

104
00:08:13,250 --> 00:08:14,120
image here.

105
00:08:14,120 --> 00:08:21,680
So what we want to do is take this image now and then add it to this image, so that we can

106
00:08:21,680 --> 00:08:26,180
create our data augmentation pipeline.

107
00:08:27,110 --> 00:08:29,780
Now, we could call this image four.

108
00:08:29,780 --> 00:08:36,260
So let's have image four equal to this here.

109
00:08:36,740 --> 00:08:41,450
Let's take this off. Image four: we paste that out.

110
00:08:41,450 --> 00:08:43,310
And then, once we paste that out, we can move on.

111
00:08:43,310 --> 00:08:46,430
Now let's also do another subplot.

112
00:08:46,430 --> 00:08:48,470
So we keep on doing the subplots.

113
00:08:48,740 --> 00:08:51,440
Let's copy this and paste it here.

114
00:08:51,920 --> 00:08:52,730
That's fine.

115
00:08:52,730 --> 00:08:57,530
We take the crop as we have the the aurora.

116
00:08:57,530 --> 00:09:02,480
We have this image four plus our initial image, image one.

117
00:09:02,720 --> 00:09:04,940
So let's have that, and then we plot it out.

118
00:09:05,720 --> 00:09:08,150
Okay, we get this plot, but it isn't very clear.

119
00:09:08,150 --> 00:09:10,730
So let's increase the figure size.

120
00:09:10,730 --> 00:09:14,840
Let's add the figure size here and that's it.

121
00:09:14,840 --> 00:09:16,190
So we run that again.

122
00:09:16,190 --> 00:09:18,260
And then let's look at this. Now it's

123
00:09:18,260 --> 00:09:21,290
clear. Okay, here's what we get.

124
00:09:21,290 --> 00:09:28,250
So as you could see here, we have this patch. It looks like something is working, but there is a

125
00:09:28,250 --> 00:09:33,070
problem as we have some sort of mixture of this and the initial image.

126
00:09:33,080 --> 00:09:40,430
Now this is logical, since when you do this addition, for this black region you have only

127
00:09:40,430 --> 00:09:41,060
the cat.

128
00:09:41,060 --> 00:09:47,000
But for this region you still have the part of this cat image.

129
00:09:47,000 --> 00:09:54,020
So what we need to do is we need to remove this part such that when you take this and add to this,

130
00:09:54,020 --> 00:10:01,280
it just fits in like a puzzle. Then, to crop out just this part from this image, we are going to use

131
00:10:01,280 --> 00:10:05,560
the same process we've used for this dog image right here.

132
00:10:05,570 --> 00:10:08,210
We simply copy out what we had already here.

133
00:10:08,210 --> 00:10:10,340
So we had this cropping.

134
00:10:10,370 --> 00:10:14,210
I could copy this out and paste it here.

135
00:10:14,510 --> 00:10:16,700
We run that and here is what we get.

136
00:10:16,700 --> 00:10:19,040
So you see, we have this crop again.

137
00:10:19,040 --> 00:10:25,820
Now we'll take this crop and then pad to bounding box, as we had done here previously.

138
00:10:25,970 --> 00:10:27,320
So let's copy this.

139
00:10:27,320 --> 00:10:29,180
And then what do we have?

140
00:10:29,330 --> 00:10:33,890
We have the seventh, and then we're going to pass in this crop cat.

141
00:10:33,890 --> 00:10:35,660
So we pass in the crop cat,

142
00:10:35,660 --> 00:10:37,400
and that's it.

143
00:10:38,270 --> 00:10:45,230
But we'll print out this image, so let's call this image five and here we have image five.

144
00:10:45,230 --> 00:10:48,500
We run that and this is actually what we get.

145
00:10:48,530 --> 00:10:49,220
Now.

146
00:10:49,220 --> 00:10:57,320
The aim is for us to be able to take this and subtract

147
00:10:58,310 --> 00:11:04,310
this image from it, such that we will be left with the full cat.

148
00:11:04,310 --> 00:11:06,560
Without this portion.

149
00:11:06,560 --> 00:11:08,000
Without this portion.

150
00:11:08,000 --> 00:11:10,250
This portion right here.

151
00:11:11,730 --> 00:11:20,400
So that said, if just here you do image one minus this image five, you would get this answer.

152
00:11:20,670 --> 00:11:25,560
So as you could see, you have this whole cat without this portion.

153
00:11:25,560 --> 00:11:27,370
And this is exactly what we want.

154
00:11:27,390 --> 00:11:33,870
Now that we have this part, we could add it up with what we wanted initially here.

155
00:11:33,870 --> 00:11:37,890
So we add it up with this image four.

156
00:11:38,010 --> 00:11:46,890
So here we have plus image four and we get the response and there we go.

157
00:11:46,890 --> 00:11:55,530
So we've completed this process of cutting out this portion from here and then fitting it onto this image

158
00:11:55,530 --> 00:11:56,130
one.
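
Putting the steps so far together, one possible sketch of the whole cut-and-paste looks like this. It assumes two 224 by 224 random images standing in for the cat (image_1) and the dog (image_2), and a fixed, hand-picked box. The names crop_1, pad_1, crop_2 and pad_2 are chosen here for readability.

```python
import tensorflow as tf

IM_SIZE = 224  # assumed square image side, for illustration only
image_1 = tf.random.uniform((IM_SIZE, IM_SIZE, 3))  # stand-in for the cat
image_2 = tf.random.uniform((IM_SIZE, IM_SIZE, 3))  # stand-in for the dog

# Hand-picked box: top left corner (row, column) and size (height, width).
off_h, off_w, box_h, box_w = 20, 100, 100, 98

# Patch from image_2, stamped on a zero canvas.
crop_2 = tf.image.crop_to_bounding_box(image_2, off_h, off_w, box_h, box_w)
pad_2 = tf.image.pad_to_bounding_box(crop_2, off_h, off_w, IM_SIZE, IM_SIZE)

# The same region of image_1, stamped on a zero canvas.
crop_1 = tf.image.crop_to_bounding_box(image_1, off_h, off_w, box_h, box_w)
pad_1 = tf.image.pad_to_bounding_box(crop_1, off_h, off_w, IM_SIZE, IM_SIZE)

# Remove the region from image_1, then fill the hole with the image_2 patch.
mixed = image_1 - pad_1 + pad_2
```

Inside the box, mixed holds the pixels of image_2; everywhere else it holds the pixels of image_1, so the two pieces fit together like a puzzle.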

159
00:11:57,090 --> 00:12:01,640
We now take off these other parts; we take this off.

160
00:12:03,500 --> 00:12:04,400
And there we go.

161
00:12:04,400 --> 00:12:12,260
We now have the crop, the image four, crop cat, image five, and then our final image.

162
00:12:12,710 --> 00:12:17,390
We copy out this mixup code we had done previously.

163
00:12:17,390 --> 00:12:19,070
So we get back to this.

164
00:12:20,480 --> 00:12:25,220
You see now, we could easily integrate this since we have image one and image two.

165
00:12:25,250 --> 00:12:26,810
We're just left with the labels.

166
00:12:26,870 --> 00:12:28,490
Let's just copy this out.

167
00:12:29,450 --> 00:12:30,080
Cut that.

168
00:12:30,080 --> 00:12:33,280
And then here we take this off.

169
00:12:33,290 --> 00:12:34,490
Paste it out.

170
00:12:35,600 --> 00:12:36,740
There we go.

171
00:12:37,430 --> 00:12:39,890
And we now have the output image.

172
00:12:40,190 --> 00:12:44,780
We now modify this variable name, such that crop is crop one.

173
00:12:44,780 --> 00:12:48,740
So here we have crop one instead of crop.

174
00:12:48,740 --> 00:12:50,840
And then here we have crop two.

175
00:12:51,200 --> 00:12:59,390
So everywhere we meet crop cat, we have crop two, here we have crop one and then this image two

176
00:12:59,390 --> 00:13:06,380
is maintained. From here we have image two; then the image four, we could turn this to pad one. So

177
00:13:06,380 --> 00:13:07,910
let's call this pad one.

178
00:13:09,260 --> 00:13:15,110
Here we have pad one, but this is actually pad two, since we're working with image two.

179
00:13:15,110 --> 00:13:17,330
So let's call this pad two and crop two.

180
00:13:17,330 --> 00:13:19,070
We have the crop two.

181
00:13:19,070 --> 00:13:23,480
Then from this we have crop one, and here image one.

182
00:13:23,480 --> 00:13:29,450
Here we have crop one, and here we have this image five; we'll call it pad one.

183
00:13:30,620 --> 00:13:31,850
Then coming right here.

184
00:13:31,850 --> 00:13:38,240
Image five, image five is pad one and then image four, pad two.

185
00:13:38,780 --> 00:13:40,250
So we have that done.

186
00:13:40,250 --> 00:13:44,510
Now we could focus on how to get these bounding boxes.

187
00:13:44,510 --> 00:13:50,360
So here we just picked this bounding box, but how do we get this bounding box?

188
00:13:50,360 --> 00:13:55,730
To get an answer to this question, we are going to refer to these formulas, which are given

189
00:13:55,730 --> 00:13:57,290
in the paper right here.

190
00:13:57,290 --> 00:14:05,930
We are told that r_x is drawn from the uniform distribution which takes parameters zero and the width,

191
00:14:05,930 --> 00:14:10,970
and then r_y is drawn from the uniform distribution which takes parameters zero and the height.

192
00:14:11,840 --> 00:14:22,490
Now, if we consider this image and, say, this box randomly picked, then r_x gives the center; that is, this

193
00:14:22,490 --> 00:14:27,170
distance from this, based on this origin, actually.

194
00:14:27,170 --> 00:14:34,580
So we have this vertical distance to this center, which is r_y, and then this horizontal distance,

195
00:14:34,580 --> 00:14:41,690
which is r_x. So that's how we obtain r_y and r_x, which are drawn from the uniform

196
00:14:41,690 --> 00:14:42,710
distribution.

197
00:14:42,950 --> 00:14:50,120
Now, to obtain r_w and r_h, we have this formula right here.

198
00:14:50,120 --> 00:14:53,180
r_w is defined from W, the width of the image.

199
00:14:53,180 --> 00:14:57,420
So we have W times the square root of one minus lambda.

200
00:14:57,440 --> 00:15:03,170
Recall how we obtained lambda with the mixup data augmentation strategy.

201
00:15:03,170 --> 00:15:06,350
This is exactly the same way we get lambda here.

202
00:15:06,350 --> 00:15:14,240
So here we have W times the square root of one minus lambda, and r_h equals H, the height of the image,

203
00:15:14,810 --> 00:15:17,030
times the square root of one minus lambda too.

204
00:15:17,120 --> 00:15:26,090
Now note that when you multiply these two, that is, if you take r_w times r_h, you will have the

205
00:15:26,090 --> 00:15:27,530
numerator right here.

206
00:15:27,980 --> 00:15:39,320
Let's have this: r_w times r_h gives you this numerator, and then if you multiply these two you will

207
00:15:39,320 --> 00:15:49,760
have W H times the square root of one minus lambda times the square root of one minus lambda.

208
00:15:50,180 --> 00:15:55,460
So if you multiply this by this, it gives you W H, and this square root of one minus lambda

209
00:15:55,460 --> 00:16:01,940
times this square root of one minus lambda gives you the square root of one minus lambda, squared.

210
00:16:02,570 --> 00:16:08,570
Okay, so this is equal to r_w times r_h.

211
00:16:09,740 --> 00:16:11,930
Now we'll see how they obtain this formula.

212
00:16:11,960 --> 00:16:23,430
You see here that since you have r_w r_h, you could divide here by W H and divide here by W H.

213
00:16:24,950 --> 00:16:27,890
This goes away and we're left with this.

214
00:16:28,040 --> 00:16:38,000
Obviously, if I have the square root of x times the square root of x, then it is equal to x, since I

215
00:16:38,000 --> 00:16:40,250
have the square root of x, squared.

216
00:16:40,460 --> 00:16:43,240
The square cancels the square root, which gives you x.

217
00:16:43,250 --> 00:16:45,620
So in this case our x is one minus lambda.

218
00:16:45,620 --> 00:16:49,370
So this is equal to one minus lambda.

219
00:16:49,370 --> 00:16:52,160
And that's how they obtained this relationship right here.
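
Written out, the relationship explained above is the following, with W and H the image width and height and lambda the mixing coefficient:

```latex
\begin{aligned}
r_x &\sim \mathrm{Unif}(0, W), \qquad r_y \sim \mathrm{Unif}(0, H),\\
r_w &= W\sqrt{1-\lambda}, \qquad r_h = H\sqrt{1-\lambda},\\
r_w\, r_h &= W H \,\sqrt{1-\lambda}\,\sqrt{1-\lambda} = W H\,(1-\lambda),\\
\frac{r_w\, r_h}{W H} &= 1-\lambda .
\end{aligned}
```

In other words, the patch covers a fraction of one minus lambda of the image area.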

220
00:16:52,790 --> 00:16:57,580
Now, that said, we have r_h... oh, we've taken this off.

221
00:16:57,590 --> 00:16:59,660
Let's go one step back.

222
00:17:00,020 --> 00:17:02,600
We have r_h. Let's take this.

223
00:17:02,600 --> 00:17:03,350
We have.

224
00:17:03,770 --> 00:17:07,070
Sorry, r_w, and then we have r_h.

225
00:17:07,070 --> 00:17:11,960
So this is r_w, the width, and then here is r_h.

226
00:17:11,960 --> 00:17:12,980
The height.

227
00:17:13,340 --> 00:17:14,870
Our height.

228
00:17:16,400 --> 00:17:24,500
But recall that what we have to pass into the method which permits us to crop out the zone, for example,

229
00:17:25,490 --> 00:17:35,090
is actually this point right here, the top left corner of this bounding box.

230
00:17:35,150 --> 00:17:39,710
So we are not going to use the center, but instead its top left corner.

231
00:17:39,740 --> 00:17:45,130
Now, how do we get the top left corner? To move from the center to the top left corner,

232
00:17:45,140 --> 00:17:52,130
we need to take into consideration the fact that we know this width, for example. Now,

233
00:17:52,130 --> 00:17:59,870
if we know this distance, and we have this distance from this to this point here, which is supposed

234
00:17:59,950 --> 00:18:04,490
to be the center, then we could obviously get this distance here.

235
00:18:04,970 --> 00:18:06,680
We could get this distance.

236
00:18:07,400 --> 00:18:11,090
To obtain this distance, we need to take all this.

237
00:18:12,140 --> 00:18:15,260
Minus just this portion.

238
00:18:16,650 --> 00:18:19,020
So we obtain this portion right here.

239
00:18:19,410 --> 00:18:26,180
Now, obtaining this portion is easy because we already know the width, r_w.

240
00:18:26,190 --> 00:18:28,260
So all we need to do is divide this by two.

241
00:18:28,290 --> 00:18:31,680
If we divide r_w by two, then we have this distance.

242
00:18:31,680 --> 00:18:39,960
And since we already have this distance, r_x, then we can obtain the coordinate, the x coordinate

243
00:18:39,960 --> 00:18:42,210
for this point right here.

244
00:18:42,630 --> 00:18:50,960
Now the next thing to do is to find r_h... sorry, we're going to find r_y,

245
00:18:50,970 --> 00:18:58,320
and we're going to use the same method we used to find r_x, based on the top left corner, for

246
00:18:58,320 --> 00:18:58,770
r_y.

247
00:18:58,770 --> 00:19:00,780
We know this distance.

248
00:19:01,440 --> 00:19:03,610
Now, do we know this distance?

249
00:19:03,630 --> 00:19:09,660
Yes, we know this distance because this is half of the box's total height, since we are at the center.

250
00:19:09,900 --> 00:19:17,010
Now, if we have this distance, then we could find this distance because this distance plus this distance

251
00:19:17,010 --> 00:19:19,560
gives us this distance.

252
00:19:19,560 --> 00:19:27,210
And so to get this distance, we need to take this distance minus this distance, to obtain this distance.

253
00:19:27,210 --> 00:19:31,530
And if we have this distance, then we have the y coordinate of this point right here.

254
00:19:32,910 --> 00:19:39,990
Recall that the reason why we go into all this is simply because the methods offered by TensorFlow consider

255
00:19:39,990 --> 00:19:45,990
that the bounding box coordinates are given based on this top left corner right here.

256
00:19:46,560 --> 00:19:54,090
So now we define this function box, which takes in lambda.

257
00:19:54,480 --> 00:20:02,490
And then what this does is it makes use of this uniform distribution to obtain r_x and r_y, and then

258
00:20:02,910 --> 00:20:06,000
makes use of lambda to obtain r_w and r_h.

259
00:20:06,120 --> 00:20:13,230
Now, getting back to our uniform distribution, we could copy this right here and then put this in

260
00:20:13,230 --> 00:20:13,860
our code.

261
00:20:13,860 --> 00:20:16,440
For now, let's keep the function aside.

262
00:20:16,440 --> 00:20:17,550
Let's just have this.

263
00:20:17,550 --> 00:20:20,130
So here we have our uniform distribution.

264
00:20:20,410 --> 00:20:23,130
Recall our low is zero.

265
00:20:23,160 --> 00:20:24,630
Let's take this back.

266
00:20:24,630 --> 00:20:29,280
Our low is zero and our high is the width.

267
00:20:29,280 --> 00:20:34,290
So here we have IM_SIZE, and we have that. Okay.

268
00:20:34,290 --> 00:20:43,330
So we've defined this, and then we have r_w, or rather r_x; that's r_x.

269
00:20:43,350 --> 00:20:53,070
Now we want to have r_y. Let's have it: r_x, r_y, and then we could simply copy this out.

270
00:20:53,250 --> 00:20:54,180
There we go.

271
00:20:54,180 --> 00:20:57,360
We have this, let's print this out.

272
00:20:57,360 --> 00:20:58,500
So we see what we get.

273
00:20:59,310 --> 00:21:02,370
Print out r_x, for example.

274
00:21:02,370 --> 00:21:03,030
There we go.

275
00:21:03,030 --> 00:21:06,720
We have r_x... IM_SIZE is not defined.

276
00:21:06,720 --> 00:21:07,830
We restarted the notebook,

277
00:21:07,830 --> 00:21:11,940
so let's get back to defining this IM_SIZE.

278
00:21:12,390 --> 00:21:13,230
We run it again.

279
00:21:13,230 --> 00:21:15,140
And this time around everything looks fine.

280
00:21:15,150 --> 00:21:20,040
Now we could draw a sample from this distribution, so let's have sample.

281
00:21:20,040 --> 00:21:23,220
We draw a single sample and see what we get.

282
00:21:23,220 --> 00:21:24,300
You see, we have that.

283
00:21:24,450 --> 00:21:29,340
We could take the zeroth element, and there we go.

284
00:21:29,340 --> 00:21:32,370
So now we're able to draw a sample from our distribution.

285
00:21:32,370 --> 00:21:34,170
We could do the same for r_y.

286
00:21:34,170 --> 00:21:42,030
So here basically we have this, and then what we want to do is to ensure that these coordinates are

287
00:21:42,030 --> 00:21:42,690
integers.

288
00:21:42,690 --> 00:21:44,490
So we could cast this.

289
00:21:45,330 --> 00:21:51,480
We can see that we have the dtype equal to int32.

290
00:21:51,510 --> 00:21:52,440
That's fine.

291
00:21:53,040 --> 00:21:55,560
Copy this out and paste it here.

292
00:21:56,130 --> 00:22:03,960
So we have that, and then we cast this too. Now we have r_x and r_y; let's just do the sampling directly.

293
00:22:03,960 --> 00:22:11,010
So we've taken this and then we do the sampling right here.

294
00:22:11,640 --> 00:22:17,910
Sample: take a single sample, take the zeroth element, and that's fine.

295
00:22:18,120 --> 00:22:21,960
We do the same here, paste it out, and everything looks okay.

296
00:22:21,960 --> 00:22:28,140
So here we could now print out r_x and r_y.

297
00:22:29,550 --> 00:22:30,480
Okay, it looks great.

298
00:22:30,480 --> 00:22:35,490
We now have our random r_x and r_y. Let's run again,

299
00:22:35,490 --> 00:22:37,020
so you could see the response.

300
00:22:37,770 --> 00:22:41,880
Now we have to obtain IM_SIZE times the square root of one minus lambda.

301
00:22:42,120 --> 00:22:48,030
So we suppose that we have lambda; we're actually going to use the same method we had used previously.

302
00:22:48,030 --> 00:23:00,030
So suppose here we have our lambda, and then right here, to obtain r_w, we have IM_SIZE times tf.math.sqrt

303
00:23:00,030 --> 00:23:07,740
of one minus lambda. Okay, one minus lambda.

304
00:23:07,740 --> 00:23:08,910
Okay, that's fine.

305
00:23:08,910 --> 00:23:15,060
Now we have r_h, our height. r_h is the same:

306
00:23:15,330 --> 00:23:17,220
IM_SIZE, one minus lambda.

307
00:23:17,250 --> 00:23:17,970
That's okay.

308
00:23:17,970 --> 00:23:24,260
So now we have r_w and r_h, r_x and r_y.

309
00:23:24,270 --> 00:23:36,420
What we'll have to do is to modify this r_x, so we will have r_x equal to r_x minus the width divided

310
00:23:36,420 --> 00:23:37,260
by two.

311
00:23:37,290 --> 00:23:42,090
Now we want to have a whole number, so we have that minus that divided by two.

312
00:23:43,470 --> 00:23:45,360
But the width is IM_SIZE.

313
00:23:45,360 --> 00:23:48,010
So we want to have this IM_SIZE.

314
00:23:48,030 --> 00:23:48,880
There we go.

315
00:23:48,900 --> 00:23:57,810
Now, if you don't get why we're using this, just get back to this image to obtain this distance.

316
00:23:57,810 --> 00:23:59,430
That's to obtain this coordinate.

317
00:23:59,430 --> 00:24:06,990
For this top left corner, we'll simply take this distance to the center, minus the width divided by two,

318
00:24:06,990 --> 00:24:08,590
and we do the same for the height.

319
00:24:08,610 --> 00:24:09,450
So that's it.

320
00:24:09,690 --> 00:24:11,370
Now we're doing that.

321
00:24:11,370 --> 00:24:14,580
We have r_x equal to r_x minus IM_SIZE divided by two.

322
00:24:14,610 --> 00:24:16,020
We repeat the same.

323
00:24:16,680 --> 00:24:20,100
And then we have it for r_y.

324
00:24:20,820 --> 00:24:28,560
Now, this is actually this width divided by two, the width of the box, not the width of the

325
00:24:28,560 --> 00:24:29,400
whole image.

326
00:24:29,400 --> 00:24:30,900
So we're making an error here.

327
00:24:31,050 --> 00:24:32,390
Let's get back to this.

328
00:24:32,400 --> 00:24:36,950
We're making use of this width, actually, not the width of the whole image.

329
00:24:36,960 --> 00:24:40,440
So let's get back and we'll put this code after this.

330
00:24:40,440 --> 00:24:51,940
That's because we're going to be making use of r_h and r_w. Now, here, this is r_w.

331
00:24:52,410 --> 00:24:53,370
That's fine.

332
00:24:53,730 --> 00:24:54,390
Okay.

333
00:24:54,390 --> 00:25:03,540
Now, here, we're going to make use of r_h. So we have r_h and r_y. Okay, so now we have

334
00:25:03,540 --> 00:25:13,830
r_x and r_y. We could now print these again and then also print out r_w and r_h.

335
00:25:14,790 --> 00:25:16,200
We get this error.

336
00:25:16,830 --> 00:25:25,200
We're told that in this computation right here, r_w is meant to be an int but was passed as a

337
00:25:25,200 --> 00:25:29,370
float, since after this computation here we would have that as a float.

338
00:25:29,460 --> 00:25:34,530
Now we could always print out r_w's dtype.

339
00:25:35,250 --> 00:25:37,170
Let's run that and see what we get.

340
00:25:37,380 --> 00:25:44,730
You see, we have a float, so we'll modify that and cast it to ensure that we have an int.

341
00:25:45,060 --> 00:25:45,990
There we go.

342
00:25:45,990 --> 00:25:49,140
We have dtype equal to,

343
00:25:49,140 --> 00:25:53,610
let's just copy this, tf.int32.

344
00:25:53,820 --> 00:26:00,000
So we paste this here and here as well, wrapped in the tf.cast, and that's fine.
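
The cast described above can be sketched as follows. This is only an illustration: the value of IM_SIZE, the fixed lam, and the name r_w_float are assumptions made for the example, and the width is taken as IM_SIZE times the square root of one minus lambda, which is the standard CutMix choice.

```python
import tensorflow as tf

IM_SIZE = 224            # assumed image side length, for illustration only
lam = tf.constant(0.75)  # assumed fixed lambda; in practice it is sampled

# Multiplying by a float32 square root gives a float32 result,
# which the cropping call would reject as a box dimension.
r_w_float = IM_SIZE * tf.math.sqrt(1.0 - lam)

# Casting to an integer type gives a usable box width.
r_w = tf.cast(r_w_float, dtype=tf.int32)

print(r_w_float.dtype, r_w.dtype)
```

The same cast would be applied to r_h.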

345
00:26:00,000 --> 00:26:05,870
So now we should get the expected output, and there we go.

346
00:26:05,880 --> 00:26:14,490
You see we have r_x, we have r_y, we have r_w, and we have r_h.

347
00:26:15,090 --> 00:26:22,260
When we run the cell several times, you will notice that you will have some negative values

348
00:26:22,260 --> 00:26:24,060
popping up from time to time.

349
00:26:24,240 --> 00:26:31,290
And this actually happens when, for example, we have our center at this level, but our width

350
00:26:31,290 --> 00:26:35,400
is so large that it doesn't fit in the image.

351
00:26:35,400 --> 00:26:37,440
So it goes in a negative direction.

352
00:26:37,440 --> 00:26:44,100
In this kind of situation, you could have our box going out

353
00:26:44,100 --> 00:26:45,090
of the image.

354
00:26:45,090 --> 00:26:51,960
And so we have to ensure that each time we create this box, we limit it to the image.

355
00:26:51,960 --> 00:26:54,720
So this box now let's take this off.

356
00:26:55,290 --> 00:26:59,310
This box now will be this one.

357
00:26:59,310 --> 00:27:05,160
And this other box right here will be this box.

358
00:27:05,430 --> 00:27:09,440
So we redraw this box, but it wouldn't go out of the image.

359
00:27:09,450 --> 00:27:12,420
Our aim is to ensure that we take this part off.

360
00:27:12,810 --> 00:27:14,340
Okay, so that's it.

361
00:27:14,340 --> 00:27:19,680
We are now going to clip these values programmatically by using the tf.clip_by_value method.

362
00:27:19,680 --> 00:27:24,330
So here we have tf.clip_by_value.

363
00:27:24,600 --> 00:27:25,620
There we go.

364
00:27:25,710 --> 00:27:26,760
We have that.

365
00:27:26,760 --> 00:27:30,720
And then once we pass this in, we're going to specify the range.

366
00:27:30,720 --> 00:27:36,450
So we're making sure that the values always fall in this range, zero to IM_SIZE.

367
00:27:36,870 --> 00:27:37,890
So that's fine.

368
00:27:38,400 --> 00:27:39,330
Paste that out.

369
00:27:39,330 --> 00:27:42,900
And we have this clip by value right here.

370
00:27:42,900 --> 00:27:45,180
So that's it for r_x and r_y.

371
00:27:45,180 --> 00:27:51,780
And now, after running, you should only have values that fall in that range. After clipping,

372
00:27:51,780 --> 00:28:01,410
what we have is this: if we had, say, initially this box right here, we now have this box.

373
00:28:01,980 --> 00:28:04,290
So this is what we get after clipping.
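
As a small worked example of that clipping step, here is a plain-Python sketch. The clip helper is a hypothetical stand-in for a clip-by-value call on a scalar, and the image size, center, and box size are assumed numbers chosen so that the box sticks out on the left.

```python
IM_SIZE = 224  # assumed image side length, for illustration only


def clip(value, low, high):
    """Plain-Python stand-in for clipping a scalar into [low, high]."""
    return max(low, min(value, high))


# A center close to the left edge combined with a wide box.
center_x, center_y = 20, 100
box_w, box_h = 120, 120

raw_x = center_x - box_w // 2  # 20 - 60 = -40, outside the image
raw_y = center_y - box_h // 2  # 100 - 60 = 40, inside the image

r_x = clip(raw_x, 0, IM_SIZE)  # pulled back to the image border
r_y = clip(raw_y, 0, IM_SIZE)  # already inside, so unchanged

print(raw_x, r_x, raw_y, r_y)
```

Here only the x coordinate moves, which is exactly the case where the usable width ends up smaller than the width we started with.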

374
00:28:05,520 --> 00:28:12,240
Of course, we get the coordinates of the top left corner, but the width in this particular case has

375
00:28:12,240 --> 00:28:12,960
changed.

376
00:28:12,960 --> 00:28:14,880
In another situation we could have,

377
00:28:15,020 --> 00:28:23,540
for example, this. In this case, the height changes because we no longer have

378
00:28:23,540 --> 00:28:31,640
this height, but now this new height. In a situation where we had a box like this, for

379
00:28:31,640 --> 00:28:35,870
example here, both the height and the width will have to change.

380
00:28:35,870 --> 00:28:43,790
And so, based on these modifications, we have to ensure that we actually pass the right width

381
00:28:43,790 --> 00:28:45,530
and the right height.

382
00:28:46,160 --> 00:28:53,960
And so what we could do is this: once we have the center, we subtract the width, or rather we subtract

383
00:28:54,080 --> 00:29:01,100
half of the width and half of the height, to obtain this. Recall we had the center, and we had x minus

384
00:29:01,100 --> 00:29:12,200
the width divided by two to obtain the x coordinate of this point and then y minus the height divided

385
00:29:12,200 --> 00:29:16,040
by two to obtain the Y coordinate of this point.

386
00:29:16,100 --> 00:29:22,040
Now what we'll do is we'll obtain this coordinate right here.

387
00:29:22,580 --> 00:29:29,960
But while applying the clip function. And so this means that if we had a box normally centered

388
00:29:29,960 --> 00:29:37,460
in the image like this, we would obtain this point, the top left corner, and then we'll

389
00:29:37,460 --> 00:29:51,740
obtain this point, the bottom right corner, just by using x plus w divided by two and y plus h divided by

390
00:29:51,740 --> 00:29:52,210
two.

391
00:29:52,220 --> 00:29:58,880
So this point has coordinates x plus w divided by two and y plus h divided by two.

392
00:29:59,060 --> 00:30:04,730
That said, after clipping we would have just this.

393
00:30:04,820 --> 00:30:12,800
But what's interesting about the method of clipping after getting this is that now we know exactly where

394
00:30:12,800 --> 00:30:16,880
this bottom left is found or rather this bottom right is found.

395
00:30:16,880 --> 00:30:22,100
And if we have this coordinate and we have this coordinate, then we could recalculate the

396
00:30:22,100 --> 00:30:24,560
width and the height. To recalculate the width

397
00:30:24,560 --> 00:30:27,740
and the height, it suffices to take, for example, for the width,

398
00:30:27,740 --> 00:30:38,150
this point's x coordinate minus this x coordinate right here, or minus the x coordinate at this point, because

399
00:30:38,150 --> 00:30:49,100
this point and this for the x remain constant, whereas for the Y axis, this point and this also remain

400
00:30:49,100 --> 00:30:49,790
constant.

401
00:30:49,790 --> 00:30:59,060
So that said, all we need to do now is to take this y minus this y to obtain the height, and then

402
00:30:59,060 --> 00:31:03,080
this x minus this x to obtain the width.

403
00:31:03,980 --> 00:31:07,910
So based on that, we will copy this out and paste.

404
00:31:07,910 --> 00:31:17,960
So here we have x bottom right and y bottom right.

405
00:31:18,290 --> 00:31:20,960
All we need to do is to change this and add a plus.

406
00:31:20,960 --> 00:31:23,870
And here we add a plus. We're clipping by value,

407
00:31:23,870 --> 00:31:32,300
so we'll always ensure that it's found in the image. Then, to obtain the final r_w:

408
00:31:32,540 --> 00:31:37,880
So r_w is now equal to y bottom right

409
00:31:37,880 --> 00:31:49,370
minus r_y, and then r_h will be equal to... wait,

410
00:31:49,370 --> 00:31:54,080
this is actually x: r_w is x bottom right minus r_x. Sorry.

411
00:31:54,080 --> 00:31:59,680
So for r_h we have y_b_r minus r_y.

412
00:31:59,690 --> 00:32:08,030
Okay, So now we've modified the way we calculate this based on the fact that the box may be out of

413
00:32:08,030 --> 00:32:14,900
the image, so that some modifications have to be made to the width and the height.

414
00:32:15,230 --> 00:32:26,870
Now, the next step we have to take is: if r_w equals zero, then we have to make sure that

415
00:32:26,870 --> 00:32:33,500
r_w becomes one, and then we repeat the same process.

416
00:32:33,500 --> 00:32:41,570
If r_h equals zero, then r_h becomes one.

417
00:32:41,570 --> 00:32:42,360
So that is it.

418
00:32:42,380 --> 00:32:47,480
Now we have r_x, r_y, r_w, and r_h.
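
Putting the steps so far together, a plain-Python sketch of the box computation might look like the one below. It is not the notebook code: the rng parameter, the local names c_x, c_y, x1, y1, x2, y2, and the use of the standard random module are assumptions for illustration, and max(..., 1) stands in for the "if it equals zero, make it one" check. Both corners are computed from the unclipped center before clipping.

```python
import math
import random

IM_SIZE = 224  # assumed image side length, for illustration only


def box(lam, im_size=IM_SIZE, rng=random):
    """Return (r_y, r_x, r_h, r_w) for a patch kept inside the image."""
    # Random center of the patch.
    c_x = rng.randrange(im_size)
    c_y = rng.randrange(im_size)

    # Nominal patch size, so that w * h / (W * H) is roughly 1 - lam.
    w = int(im_size * math.sqrt(1.0 - lam))
    h = int(im_size * math.sqrt(1.0 - lam))

    # Top left and bottom right corners, both clipped to the image.
    x1 = min(max(c_x - w // 2, 0), im_size)
    y1 = min(max(c_y - h // 2, 0), im_size)
    x2 = min(max(c_x + w // 2, 0), im_size)
    y2 = min(max(c_y + h // 2, 0), im_size)

    # Recompute width and height from the clipped corners,
    # and never let them drop to zero.
    r_w = max(x2 - x1, 1)
    r_h = max(y2 - y1, 1)
    return y1, x1, r_h, r_w


print(box(0.5, rng=random.Random(0)))
```

Because the width and height are derived from the clipped corners, the returned box fits inside the image whatever center and lambda are drawn.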

419
00:32:47,600 --> 00:32:50,630
That said, let's now create our box method.

420
00:32:50,630 --> 00:32:53,150
We have our box method right here.

421
00:32:53,330 --> 00:32:54,440
Let's have that.

422
00:32:54,440 --> 00:32:56,540
And this is what we return.

423
00:32:57,290 --> 00:32:59,030
So we return this.

424
00:32:59,030 --> 00:33:07,040
But note that here we take in the offset height, offset width, target height, and target width.

425
00:33:07,340 --> 00:33:14,690
And so this means that what we have to output in this method right here has to be r_y,

426
00:33:15,460 --> 00:33:15,730
r_x,

427
00:33:15,730 --> 00:33:17,760
and then r_h,

428
00:33:18,070 --> 00:33:19,300
r_w.

429
00:33:19,510 --> 00:33:20,710
So there we go.

430
00:33:20,710 --> 00:33:24,850
We return that and normally everything should be fine.
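
To see why the order y, x, h, w matters, here is a small TensorFlow sketch using tf.image.crop_to_bounding_box and tf.image.pad_to_bounding_box, which both take offset height, offset width, target height, target width. The tiny constant images and the fixed box values are assumptions chosen so the result is easy to check, and subtracting and adding the padded crops is shown as one way to paste the patch.

```python
import tensorflow as tf

IM_SIZE = 8  # deliberately tiny assumed size, for illustration only

image_1 = tf.zeros([IM_SIZE, IM_SIZE, 3])  # stands in for the base image
image_2 = tf.ones([IM_SIZE, IM_SIZE, 3])   # stands in for the patch source

# Hypothetical box in the order (offset_height, offset_width, height, width).
r_y, r_x, r_h, r_w = 2, 1, 3, 4

# Cut the same region out of both images and pad it back to full size.
crop_2 = tf.image.crop_to_bounding_box(image_2, r_y, r_x, r_h, r_w)
pad_2 = tf.image.pad_to_bounding_box(crop_2, r_y, r_x, IM_SIZE, IM_SIZE)
crop_1 = tf.image.crop_to_bounding_box(image_1, r_y, r_x, r_h, r_w)
pad_1 = tf.image.pad_to_bounding_box(crop_1, r_y, r_x, IM_SIZE, IM_SIZE)

# Remove the region from image_1 and drop in the region from image_2.
mixed = image_1 - pad_1 + pad_2

print(crop_2.shape, mixed.shape)
```

Swapping r_y with r_x, or r_h with r_w, would still run for a square box but would place the patch at the transposed position.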

431
00:33:24,850 --> 00:33:27,580
So we have this box method defined.

432
00:33:27,580 --> 00:33:32,800
We run that and we get this error: we should pass in our lambda.

433
00:33:34,180 --> 00:33:35,590
Let's go down.

434
00:33:35,590 --> 00:33:38,050
We have here lambda.

435
00:33:38,320 --> 00:33:41,500
Okay, so we've passed in lambda and that's fine.

436
00:33:42,190 --> 00:33:52,240
Now the next thing to do is get back to our mixup method, and, like we did before with this lambda, we

437
00:33:52,240 --> 00:33:53,530
actually could get this.

438
00:33:53,530 --> 00:33:57,610
We passed in the lambda, so we could take this off from here.

439
00:33:57,700 --> 00:34:05,890
So we run this again and then we'll define this lambda creation just right here we have Lambda.

440
00:34:06,370 --> 00:34:10,720
Let's go back, we have lambda and then now we have the box.

441
00:34:10,720 --> 00:34:14,410
So this box is going to produce this output we need here.

442
00:34:14,410 --> 00:34:28,870
Let's call it r_y and, in the order given in the documentation, r_y, r_x, r_h, r_w equal box of lambda.

443
00:34:29,590 --> 00:34:33,430
So now we have box of lambda and then we have these outputs here.

444
00:34:33,580 --> 00:34:44,320
Now, instead of passing this, we have r_y, r_x, r_h, and r_w. Let's take off those

445
00:34:44,320 --> 00:34:46,630
boxes which we fixed initially.

446
00:34:47,380 --> 00:34:48,580
We have this.

447
00:34:50,260 --> 00:34:57,190
Here we have r_y and r_x. Scroll down.

448
00:34:57,190 --> 00:34:57,940
That's r_x.

449
00:34:57,940 --> 00:34:58,570
Okay.

450
00:34:59,350 --> 00:35:01,120
Now we repeat the same process here.

451
00:35:01,120 --> 00:35:07,180
We just simply copy this out and paste here, paste it out.

452
00:35:07,180 --> 00:35:10,540
Then also we have r_y and r_x again.

453
00:35:10,540 --> 00:35:14,650
So r_y, r_x. Let's take that off.

454
00:35:14,830 --> 00:35:19,840
Okay, so that's what we have now, and the mixup method seems to be fine.

455
00:35:19,840 --> 00:35:25,360
So we could, we could get back, run this method and then run our mix up method again.

456
00:35:25,780 --> 00:35:33,790
Now as usual, we are going to mix up these two data sets, so it suffices to run this.

457
00:35:33,850 --> 00:35:38,440
And then for this one we are going to instead of using the mix up.

458
00:35:38,740 --> 00:35:45,880
So here we comment out this part, and then we have the map with cutmix.

459
00:35:45,880 --> 00:35:50,680
So here we have the cutmix method. We run this and everything should be fine.

460
00:35:51,520 --> 00:35:52,660
We get this error.

461
00:35:52,690 --> 00:35:54,640
Let's check on how we call this.

462
00:35:54,640 --> 00:35:55,960
Oh, we still called this mixup.

463
00:35:55,960 --> 00:35:57,480
This should be cutmix.

464
00:35:57,500 --> 00:35:59,680
Cutmix.

465
00:35:59,680 --> 00:36:00,700
We run that.

466
00:36:01,180 --> 00:36:01,990
Fine.

467
00:36:02,500 --> 00:36:07,870
Let's get back to this here and run it again.

468
00:36:08,470 --> 00:36:09,760
Okay, that looks fine.

469
00:36:10,120 --> 00:36:11,260
Train data set.

470
00:36:11,260 --> 00:36:13,000
Everything looks fine now.

471
00:36:13,000 --> 00:36:15,400
We'll try to plot out some values.

472
00:36:15,400 --> 00:36:18,610
So you see clearly what this looks like.

473
00:36:18,970 --> 00:36:28,240
Let's create a new code cell down here, paste this in, and we are going to show an image from our training

474
00:36:28,240 --> 00:36:28,890
data.

475
00:36:28,900 --> 00:36:29,950
And there we go.

476
00:36:29,950 --> 00:36:32,950
We can notice this patch. We could run this again.

477
00:36:33,250 --> 00:36:36,130
And this time around we even have a bigger patch.

478
00:36:36,130 --> 00:36:37,030
So that's it.

479
00:36:37,060 --> 00:36:41,890
We've seen how to come up with this data augmentation strategy.

480
00:36:42,190 --> 00:36:46,120
But we still have to deal with the label.

481
00:36:46,120 --> 00:36:52,630
So I think if we print out this label, let's print out a label here.

482
00:36:52,720 --> 00:36:56,530
See, when we print out the label, we just get all zeros.

483
00:36:56,530 --> 00:37:03,250
Anyways, let's go ahead and implement this section for the label.

484
00:37:03,550 --> 00:37:11,530
And to do that, if you could recall this formula, you have one minus lambda equals this.

485
00:37:11,530 --> 00:37:25,360
So this means that lambda is equal to one minus r_w r_h divided by W H. And the reason why we need to do

486
00:37:25,360 --> 00:37:35,710
this again is simply because of r_h and r_w: at this level, where we had these r_h and r_w modifications, all

487
00:37:35,710 --> 00:37:41,070
those clippings, we have to ensure that when creating the label, that condition is verified.

488
00:37:41,080 --> 00:37:43,930
So let's get back, let's check on this formula.

489
00:37:43,930 --> 00:37:46,300
Lambda equals one minus r_h

490
00:37:46,300 --> 00:37:51,040
r_w. So here again we have lambda equal to

491
00:37:51,460 --> 00:37:57,880
r_h times r_w, or r_w

492
00:37:57,880 --> 00:38:10,030
r_h. Anyway, we have this r_w r_h, then divided by W H, so here we have IM_SIZE times

493
00:38:10,030 --> 00:38:11,290
IM_SIZE.

494
00:38:11,320 --> 00:38:14,650
Okay, now this is one minus all of that.

495
00:38:15,380 --> 00:38:18,080
Then we apply the same formula as for the mixup.

496
00:38:18,080 --> 00:38:22,040
We have this lambda. We've now got lambda.

497
00:38:22,040 --> 00:38:23,720
We now pass this.

498
00:38:24,260 --> 00:38:27,550
We have label one and label two.

499
00:38:27,730 --> 00:38:29,180
Okay, so there we go.
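
The label arithmetic described above can be checked with a few lines of plain Python. The function name cutmix_label and the numbers are assumptions for illustration; the formula is the one just stated, lambda equal to one minus r_w r_h over W H, followed by the same weighted sum as for mixup.

```python
IM_SIZE = 224  # assumed image side length, for illustration only


def cutmix_label(label_1, label_2, r_w, r_h, im_size=IM_SIZE):
    """Mix two labels using the area of the patch that was actually pasted."""
    # Fraction of the image that still comes from the first image.
    lam = 1.0 - (r_w * r_h) / (im_size * im_size)
    # Same weighted sum as for mixup.
    return lam * label_1 + (1.0 - lam) * label_2


# A 112 x 112 patch covers a quarter of a 224 x 224 image,
# so the second label contributes a weight of 0.25.
print(cutmix_label(0.0, 1.0, r_w=112, r_h=112))
```

Recomputing lambda from the clipped r_w and r_h, rather than reusing the sampled value, keeps the label consistent with the patch that ended up in the image.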

500
00:38:29,180 --> 00:38:33,290
We have our cut mix and then let's get back.

501
00:38:33,290 --> 00:38:41,600
Actually, we will have to rerun this, run our cutmix, and then our training data would

502
00:38:41,600 --> 00:38:42,980
have to be recreated again.

503
00:38:42,980 --> 00:38:46,520
So we'll have to run these cells again.

504
00:38:47,330 --> 00:38:50,810
We run this and then recreate our training data set.

505
00:38:53,120 --> 00:38:58,130
We're getting this error, meaning that lambda is of type float64.

506
00:38:58,550 --> 00:39:07,490
We get back to this and then we do some casting to make this float32. Okay. That said, we run

507
00:39:07,490 --> 00:39:14,420
our cutmix. We run this.
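
The float64 issue can be reproduced in isolation. In this sketch the box values, the labels, and the name ratio are assumptions for illustration; the point is that dividing int32 tensors in TensorFlow yields float64, so the result has to be cast before it is combined with float32 labels.

```python
import tensorflow as tf

IM_SIZE = 224  # assumed image side length, for illustration only

# Hypothetical clipped box size, as int32 tensors.
r_w = tf.constant(56, dtype=tf.int32)
r_h = tf.constant(112, dtype=tf.int32)

# True division of int32 tensors produces a float64 tensor.
ratio = (r_w * r_h) / (IM_SIZE * IM_SIZE)

# Cast so that lambda matches the float32 labels below.
lam = tf.cast(1 - ratio, dtype=tf.float32)

label_1 = tf.constant(0.0)  # assumed float32 labels
label_2 = tf.constant(1.0)
label = lam * label_1 + (1 - lam) * label_2

print(ratio.dtype, lam.dtype, float(label))
```

Without the cast, the multiplication with the float32 labels fails with a dtype mismatch.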

508
00:39:14,420 --> 00:39:15,560
There we go.

509
00:39:15,590 --> 00:39:20,900
Here's cutmix. We run this again, and this is fine now.

510
00:39:20,900 --> 00:39:25,220
So we have our training data set and now we'll go ahead and visualize it.

511
00:39:26,150 --> 00:39:28,430
We have these double Nones right here.

512
00:39:28,430 --> 00:39:35,060
So it's preferable for us to actually rerun this from this point.

513
00:39:36,290 --> 00:39:42,980
Okay, So we run that from that point and then we get back to this.

514
00:39:42,980 --> 00:39:45,410
We run this one.

515
00:39:45,410 --> 00:39:46,160
That's fine.

516
00:39:46,160 --> 00:39:50,270
We can run this, and this should be okay now.

517
00:39:50,270 --> 00:39:50,990
Okay, that's fine.

518
00:39:50,990 --> 00:39:54,250
Now we have everything intact, so that's fine.

519
00:39:54,260 --> 00:40:02,470
We can now visualize our data, so you can see we have our patch, and then, unlike before, where we had

520
00:40:02,630 --> 00:40:09,890
zero or one, now we have values between zero and one as our labels.

521
00:40:09,890 --> 00:40:16,700
So let's now run this model compiling and training.

522
00:40:17,210 --> 00:40:20,660
After the training is completed, we obtain these results.

523
00:40:20,660 --> 00:40:24,020
We see here how this accuracy.

524
00:40:24,020 --> 00:40:30,140
Let's scroll down here we have this accuracy, which doesn't really change much.

525
00:40:30,140 --> 00:40:38,900
So it's just around 46, 48, and here it's 50%.

526
00:40:38,900 --> 00:40:42,770
So it's around this 45 to 50% range.

527
00:40:42,770 --> 00:40:45,590
And the loss too doesn't really change much.

528
00:40:45,590 --> 00:40:49,910
Now, this is due to the fact that the model is getting confused.

529
00:40:49,910 --> 00:40:56,300
And the reason why this model gets confused is because if you have, say, this uninfected cell right

530
00:40:56,300 --> 00:41:04,490
here and this parasitized cell, it's actually this portion which lets the model know that the cell

531
00:41:04,490 --> 00:41:05,900
is parasitized.

532
00:41:05,900 --> 00:41:15,920
And so when you come and crop this part right here and then attach it to this cell here, at this

533
00:41:15,920 --> 00:41:26,090
position, the actual part of this parasitized cell image which makes it parasitized isn't taken into

534
00:41:26,090 --> 00:41:27,020
consideration.

535
00:41:27,020 --> 00:41:35,030
And so the model gets confused, as now it doesn't really know how to differentiate between uninfected

536
00:41:35,030 --> 00:41:36,290
and parasitized.

537
00:41:36,290 --> 00:41:43,040
So again, this CutMix data augmentation strategy isn't well suited to our problem, though it could be

538
00:41:43,040 --> 00:41:45,500
applied in many other problems.

539
00:41:45,500 --> 00:41:49,370
We thank you for following along up to this point, and see you next time.
