1
00:00:00,050 --> 00:00:08,060
In this section, we've just seen how to edit existing data and create new data from

2
00:00:08,060 --> 00:00:10,670
that already existing data.

3
00:00:10,700 --> 00:00:17,420
Now this data may be data the model has seen already, or data the model has not yet seen.

4
00:00:17,540 --> 00:00:21,770
We're going to move to the next part where we are going to look at data augmentation.

5
00:00:21,770 --> 00:00:26,840
And we will end this section by creating a plugin with FiftyOne.

6
00:00:26,840 --> 00:00:28,880
That's essentially a FiftyOne plugin.

7
00:00:28,880 --> 00:00:35,570
So the idea of data augmentation here is: suppose we have an input image like this one.

8
00:00:35,570 --> 00:00:38,360
Then we edit this image.

9
00:00:38,360 --> 00:00:41,450
In this case you see how the boots are edited.

10
00:00:41,450 --> 00:00:42,440
You can see this.

11
00:00:42,440 --> 00:00:47,210
And then in this other example you can see that the coat is edited.

12
00:00:47,210 --> 00:00:55,100
So what we're saying here is we take an input pair like this one, which is essentially composed of

13
00:00:55,100 --> 00:00:57,500
the input image and its corresponding mask.

14
00:00:57,500 --> 00:01:01,400
And then we edit this image so that we have this new image.

15
00:01:01,400 --> 00:01:04,580
We also can create this other new image.

16
00:01:04,580 --> 00:01:07,310
And then we repeat the same mask.

17
00:01:07,310 --> 00:01:12,440
So we take this mask and then we simply repeat it.

18
00:01:12,440 --> 00:01:19,820
So now you see that we're going from one sample to three samples.

19
00:01:19,820 --> 00:01:25,400
So we triple the number of samples by simply editing this image.

20
00:01:25,730 --> 00:01:33,680
Now it should be noted that after editing this specific image right here, we could go ahead and

21
00:01:33,680 --> 00:01:40,280
carry out the editing on this one so we could end up with an image where we have the boot turned into

22
00:01:40,280 --> 00:01:43,370
red, the coat turned into green.

23
00:01:43,550 --> 00:01:50,570
Maybe the sunglasses changed into, let's say, red sunglasses, maybe the person's skin

24
00:01:50,900 --> 00:01:53,360
turned black, and so on and so forth.

25
00:01:53,360 --> 00:02:00,590
So we need not always edit from this original image. That said, this way of creating new samples or

26
00:02:00,590 --> 00:02:11,240
augmenting our data via editing like this permits us to create much better and more diverse augmentations,

27
00:02:11,240 --> 00:02:17,030
as now we could even change the background so we could decide and change this whole background.

28
00:02:17,030 --> 00:02:22,040
We change each and every region here, and we have a whole new image.

29
00:02:22,040 --> 00:02:27,650
We could even change the person's face and say, for example, we want a man.

30
00:02:27,650 --> 00:02:31,190
So now you'll be getting a different background.

31
00:02:31,190 --> 00:02:34,430
You get a man-like suit.

32
00:02:34,430 --> 00:02:40,640
Although a man could put on this kind of coat. You could change the hairstyle, and so on and so

33
00:02:40,640 --> 00:02:41,030
forth.

34
00:02:41,030 --> 00:02:46,640
Nonetheless, it should be noted that the shortcoming of this method,

35
00:02:46,640 --> 00:02:53,450
especially when dealing with segmentation, is that the newly edited image

36
00:02:53,450 --> 00:02:57,170
may not match the mask.

37
00:02:57,170 --> 00:03:01,700
So we may end up with a slightly noisy data set.

38
00:03:01,700 --> 00:03:10,580
That said, let's dive into the code and see how we could create our new data samples from

39
00:03:10,580 --> 00:03:12,860
our original data set.

40
00:03:12,860 --> 00:03:19,910
We can now get back to the code, and you should note that the code for the data augmentation is

41
00:03:19,910 --> 00:03:24,920
going to be very similar to that of the image editing, as seen here.

42
00:03:24,950 --> 00:03:32,720
The only difference is, unlike with the image editing, where we had

43
00:03:32,720 --> 00:03:35,660
the model which predicted the mask.

44
00:03:35,660 --> 00:03:39,410
That's our previously trained model, which predicted the mask.

45
00:03:39,500 --> 00:03:46,400
Here, we are going to make use of the already existing mask for a given data point, or for a given input

46
00:03:46,400 --> 00:03:46,760
image.

47
00:03:46,760 --> 00:03:52,070
So if we have an input image like this one, we'll make use of its corresponding mask, which is

48
00:03:52,070 --> 00:03:58,670
given in the data set, to generate this mask, which will then, along with a prompt, be passed

49
00:03:58,670 --> 00:04:03,170
into the inpainting model to create a new data point.

50
00:04:03,170 --> 00:04:05,690
So again here we have the same model ID.

51
00:04:05,690 --> 00:04:10,520
We have this create pipeline method, which is essentially the same one we've seen already.

52
00:04:10,520 --> 00:04:16,160
Then we have this generate inputs method, which is different from what we just saw in the image editing.

53
00:04:16,160 --> 00:04:22,880
And you will notice that the difference here lies in the fact that we take off this part, which

54
00:04:22,910 --> 00:04:23,990
includes the model.

55
00:04:23,990 --> 00:04:27,140
So all this resizing, we don't really need all this.

56
00:04:27,140 --> 00:04:34,940
As you see here, we resize, we normalize, we pass into the model, and then we resize again

57
00:04:35,120 --> 00:04:36,590
before creating our mask.

58
00:04:36,590 --> 00:04:41,720
But since our data set already comes with the masks, we take off all this.

59
00:04:41,720 --> 00:04:49,040
So getting back to data augmentation, scrolling down, you see here we have the image path and the

60
00:04:49,040 --> 00:04:53,360
mask path, which are passed in together with a mask ID.

61
00:04:53,510 --> 00:04:57,140
So in here we have the image path.

62
00:04:57,140 --> 00:04:59,690
You read that image path, and you have the source image.

63
00:04:59,920 --> 00:05:04,480
You have the mask, which is also read just as the image was read.

64
00:05:04,480 --> 00:05:11,230
This is read from our data set and specified as grayscale. Then the same processing

65
00:05:11,230 --> 00:05:18,850
we did previously to select a particular category is what we repeat here.

66
00:05:18,850 --> 00:05:20,140
So no change in the code.

67
00:05:20,140 --> 00:05:22,480
Then we convert this to PIL images.

68
00:05:22,480 --> 00:05:23,590
So that's it.

69
00:05:23,590 --> 00:05:29,560
Once that's done, the next step will again be similar to what we saw with image editing.

70
00:05:29,560 --> 00:05:34,930
So here under image editing, we simply make use of our pipeline.

71
00:05:34,930 --> 00:05:38,290
And then we pass in the prompt, the image, and the mask.

72
00:05:38,290 --> 00:05:40,630
And then we get our output image.

73
00:05:40,630 --> 00:05:42,910
So it's the same thing we're going to do here.

74
00:05:42,910 --> 00:05:44,320
Let's scroll down.

75
00:05:44,470 --> 00:05:47,320
Here we have the pipe, our pipeline.

76
00:05:47,320 --> 00:05:48,670
We pass the prompt.

77
00:05:48,670 --> 00:05:54,850
But now we have these extra parameters which we've added: the guidance scale, the number of inference

78
00:05:54,850 --> 00:05:56,740
steps, and the generator.

79
00:05:56,740 --> 00:05:59,170
Now, it should be noted that we could also do the same.

80
00:05:59,170 --> 00:06:02,470
We could also add this here in our image editing.

81
00:06:02,470 --> 00:06:09,400
So we could add the guidance scale.

82
00:06:09,820 --> 00:06:12,250
And we could also add the number of inference steps.

83
00:06:12,250 --> 00:06:15,250
So these three extra parameters have been added.

84
00:06:15,250 --> 00:06:22,840
Now this guidance scale tells the model how strongly it should adhere to the prompt which we've passed in

85
00:06:22,840 --> 00:06:23,290
here.

86
00:06:23,290 --> 00:06:31,090
And as we increase this guidance scale, the image follows the prompt more closely, though very high values can hurt image quality.

87
00:06:31,090 --> 00:06:36,610
Then for the number of inference steps, increasing the number of inference steps

88
00:06:36,610 --> 00:06:41,920
increases the image quality, while reducing it reduces the image quality.

89
00:06:41,920 --> 00:06:47,590
Nonetheless, Stable Diffusion works well even with a low number of inference steps.

90
00:06:47,590 --> 00:06:50,710
Then the generator is for reproducibility.

91
00:06:50,710 --> 00:06:56,920
So if you set your random seed to ten, then each and every time you generate

92
00:06:56,920 --> 00:06:58,690
the exact same image.

93
00:06:58,690 --> 00:07:00,280
So that's it.

94
00:07:00,280 --> 00:07:03,310
We have the usual image and the mask.

95
00:07:03,310 --> 00:07:08,740
Then one thing we want to do is add this number of images per prompt, meaning that

96
00:07:08,740 --> 00:07:15,610
every time we call this inpaint method, we could, for example, generate ten images and

97
00:07:15,610 --> 00:07:16,090
that's it.

98
00:07:16,090 --> 00:07:19,180
So we put all this in a list and then we return the value here.

99
00:07:19,180 --> 00:07:24,310
Here we're just taking out a single one from this.

100
00:07:24,310 --> 00:07:30,970
So we could take this off and retrieve all the edited images.

101
00:07:31,660 --> 00:07:37,960
Next we have this hash method, which essentially produces a value like this one.

102
00:07:37,960 --> 00:07:45,730
And the reason why we need this method is simply because here, when we create

103
00:07:46,330 --> 00:07:51,910
our extra data, let's say we have this extra data here.

104
00:07:51,910 --> 00:07:56,080
Let's say this is called 00033.

105
00:07:56,080 --> 00:07:59,860
Let's increase the size to, say, 100. Okay.

106
00:07:59,860 --> 00:08:02,110
So let's say this is called 003.

107
00:08:02,110 --> 00:08:08,320
Then in that case, if we want to create some extra samples, or augment our data, we could

108
00:08:08,320 --> 00:08:10,060
call this 003.

109
00:08:10,060 --> 00:08:15,970
Then an underscore, then the hash value we had from here. So we could copy this.

110
00:08:16,150 --> 00:08:18,880
Go back here and paste it.

111
00:08:18,880 --> 00:08:19,810
There we go.

112
00:08:19,810 --> 00:08:24,280
So this will be the file name for this new data.

113
00:08:24,280 --> 00:08:27,970
So we have this copied and there we go.

114
00:08:27,970 --> 00:08:31,390
Let's just change this a little so we could say this is E.

115
00:08:31,390 --> 00:08:32,020
There we go.

116
00:08:32,020 --> 00:08:35,470
So now we have these three samples with three different file names.

117
00:08:35,470 --> 00:08:38,200
So that's essentially why we need this hash.

118
00:08:38,530 --> 00:08:39,880
That's fine.

119
00:08:39,880 --> 00:08:45,970
We'll just keep this part and then go straight into this transform sample method.

120
00:08:46,390 --> 00:08:54,460
What the transform sample method does is it takes in a sample from our FiftyOne data set.

121
00:08:54,460 --> 00:09:00,970
It takes in the category ID, or the specific category, in this case coat.

122
00:09:00,970 --> 00:09:07,240
It takes in a prompt and then it modifies that sample based on the prompt.

123
00:09:07,240 --> 00:09:09,880
And the corresponding category.

124
00:09:09,880 --> 00:09:16,390
It then creates a new sample, which is added to the original FiftyOne data set.

125
00:09:16,390 --> 00:09:22,360
So looking at this method, we have the hash which we create.

126
00:09:22,360 --> 00:09:30,490
And then we have our pipeline, which we get using the create pipeline method

127
00:09:30,490 --> 00:09:32,590
and passing the model ID into the method.

128
00:09:32,590 --> 00:09:37,630
And then we generate our inputs, which will go into our inpainting model.

129
00:09:37,630 --> 00:09:40,030
So the way we generate our inputs is simple.

130
00:09:40,030 --> 00:09:43,810
Since we get a sample from our FiftyOne data set.

131
00:09:43,810 --> 00:09:48,610
Remember our FiftyOne data set, which we've seen already. So we could get back here.

132
00:09:48,610 --> 00:09:50,740
You see, this is our FiftyOne data set.

133
00:09:51,190 --> 00:09:56,380
Let's scroll down and then get back to the code.

134
00:09:56,380 --> 00:09:59,500
So as we were saying, let's...

135
00:09:59,860 --> 00:10:03,010
Let's just do dataset.head().

136
00:10:03,220 --> 00:10:04,330
dataset.head().

137
00:10:04,330 --> 00:10:06,910
So we could get some samples.

138
00:10:06,910 --> 00:10:07,810
There we go.

139
00:10:07,810 --> 00:10:09,940
You see, we have this sample.

140
00:10:09,940 --> 00:10:11,980
And then from here we could get the file path.

141
00:10:11,980 --> 00:10:18,430
So as we were saying, here we have this file path, which we pass into our generate inputs method, which

142
00:10:18,430 --> 00:10:20,860
will permit us to generate an image and a mask.

143
00:10:21,160 --> 00:10:25,060
Not forgetting the ground truth, or rather the mask path.

144
00:10:25,060 --> 00:10:28,540
So again here you see we have our given sample.

145
00:10:28,540 --> 00:10:31,390
We have the ground truth mask path.

146
00:10:31,420 --> 00:10:32,170
That's it.

147
00:10:32,170 --> 00:10:37,720
So we specify these two and then we obtain the inputs.

148
00:10:37,720 --> 00:10:43,240
So if we get back here, we had seen that to generate this we needed this image path.

149
00:10:43,240 --> 00:10:44,170
We need a mask path.

150
00:10:44,170 --> 00:10:45,460
And then this mask ID.

151
00:10:45,610 --> 00:10:54,310
Now this ID will depend on the specific category we want to modify when carrying out the augmentation

152
00:10:54,310 --> 00:10:55,150
or the editing.

153
00:10:55,150 --> 00:10:58,720
So here we have the select class.

154
00:10:58,720 --> 00:10:59,650
Select class.

155
00:10:59,650 --> 00:11:05,890
In this case, if it's coat, then label to ID will convert that into the specific ID.

156
00:11:05,890 --> 00:11:09,280
So if it's coat then we should get something like 13.

157
00:11:09,280 --> 00:11:10,270
So that's it.

158
00:11:10,270 --> 00:11:14,860
And then we pass these inputs into our augment method.

159
00:11:14,860 --> 00:11:16,510
So here we have our pipeline.

160
00:11:16,510 --> 00:11:17,410
We have the prompt.

161
00:11:17,410 --> 00:11:18,160
We have the image.

162
00:11:18,160 --> 00:11:19,060
We have the mask.

163
00:11:19,060 --> 00:11:20,500
We have the guidance scale.

164
00:11:20,500 --> 00:11:25,540
And we have the number of inference steps, which we pass into our inpaint method, which we've seen already.

165
00:11:25,540 --> 00:11:28,330
And now that generates the images.

166
00:11:28,330 --> 00:11:32,830
So we now have our newly generated images.

167
00:11:32,830 --> 00:11:35,710
The next step will be to add these into our data set.

168
00:11:35,710 --> 00:11:42,070
So we simply write this into our data set. Here we have the sample's file path.

169
00:11:42,070 --> 00:11:48,670
So if we have a file path like this one right here, or rather this one, let's say we have a file path

170
00:11:48,670 --> 00:11:49,930
like this one.

171
00:11:49,930 --> 00:11:51,010
Let's copy that.

172
00:11:51,790 --> 00:11:57,040
So we could better show you what exactly is going on here.

173
00:11:57,040 --> 00:12:02,350
If we have this file path, then we take off one, two, three, four characters.

174
00:12:02,350 --> 00:12:08,620
Then we add the underscore, we add the hash, and then we add .png.

175
00:12:08,650 --> 00:12:09,580
There we go.

176
00:12:09,580 --> 00:12:14,170
And then we write this output.

177
00:12:14,170 --> 00:12:21,790
The image which we get from our augment method, the edited, augmented image, is what is written to

178
00:12:21,790 --> 00:12:22,900
this file path.

179
00:12:22,900 --> 00:12:30,100
So now we will not only have our image 0018, we'll also have image 0018, underscore, the hash, .png.

180
00:12:30,130 --> 00:12:34,570
Now because we do this for the images, we must also do the same for the PNG masks.

181
00:12:34,570 --> 00:12:36,550
But for the PNG mask, it's just a copy.

182
00:12:36,550 --> 00:12:41,890
So we'll just copy what we have in this mask path.

183
00:12:42,220 --> 00:12:50,260
So we'll copy this mask into the same folder, but with the difference

184
00:12:50,260 --> 00:12:53,470
that we're going to change its file name.

185
00:12:53,470 --> 00:12:56,350
So we add this hash here, the same hash we used here.

186
00:12:56,350 --> 00:12:58,240
So let's take this off.

187
00:12:58,240 --> 00:12:58,960
That's it.

188
00:12:58,960 --> 00:13:02,260
And then we could display our output.

189
00:13:02,260 --> 00:13:10,150
And it doesn't suffice to only copy this new data into our data set.

190
00:13:10,420 --> 00:13:15,850
We also need to create a new FiftyOne sample.

191
00:13:15,850 --> 00:13:20,770
So to create this new sample here we have the file path for the image.

192
00:13:20,770 --> 00:13:26,440
It's going to be image 0018, underscore, the hash, .png.

193
00:13:26,440 --> 00:13:28,720
So that's it, the same one we have here.

194
00:13:28,720 --> 00:13:33,790
And then for the ground truth, we specify that it's a segmentation.

195
00:13:33,790 --> 00:13:38,410
So we expect to have a mask or a mask path.

196
00:13:38,410 --> 00:13:43,150
And so here we have our mask path which is the same path we had specified here.

197
00:13:43,150 --> 00:13:48,370
So now, having this, we are telling FiftyOne that we have this new sample.

198
00:13:48,370 --> 00:13:55,570
And then here we're going to simply just add this new sample into our original data set.

199
00:13:55,570 --> 00:14:00,340
So let's run this and then see what we get. After running, here is

200
00:14:00,340 --> 00:14:02,260
the output we obtain.

201
00:14:02,260 --> 00:14:04,570
You could see we have this green coat.

202
00:14:04,780 --> 00:14:06,730
That's fine.

203
00:14:06,730 --> 00:14:09,580
Let's go ahead and check our data set.

204
00:14:09,580 --> 00:14:11,920
Let's increase this so we could see this better.

205
00:14:11,920 --> 00:14:15,130
So you see we have this new mask added.

206
00:14:15,130 --> 00:14:17,650
And also we have this new image added.

207
00:14:17,740 --> 00:14:20,350
Click that and then you could check out the mask.

208
00:14:20,350 --> 00:14:25,540
So now at this point we have our image and its corresponding mask.

209
00:14:25,660 --> 00:14:29,350
Let's get back to FiftyOne.

210
00:14:29,350 --> 00:14:32,830
Let's get back to visualization and run this again.

211
00:14:33,310 --> 00:14:37,870
So we could see if this has been added to our FiftyOne data set.

212
00:14:37,870 --> 00:14:45,340
Scrolling down our data set, you will find that, first of all, here at the top, you have 78

213
00:14:45,340 --> 00:14:48,010
instead of 77 samples as before.

214
00:14:48,040 --> 00:14:51,490
If you keep scrolling, you will find this extra sample which we just added.

215
00:14:51,490 --> 00:14:54,310
So this is the sample which now has been modified.

216
00:14:54,310 --> 00:14:55,900
And this is what we get.

217
00:14:55,900 --> 00:14:59,650
So we click on that and let's see what we get.

218
00:15:00,030 --> 00:15:00,600
Okay.

219
00:15:00,600 --> 00:15:06,840
So you see we have the sample, you see the green coat and then you have this mask.

220
00:15:07,260 --> 00:15:14,040
Now what if we want to modify this edited sample such that we have red boots?

221
00:15:14,040 --> 00:15:17,490
We could simply get back here.

222
00:15:17,610 --> 00:15:19,740
So here we have boots.

223
00:15:19,740 --> 00:15:20,700
Boots?

224
00:15:20,940 --> 00:15:28,170
A photorealistic photo of a woman wearing a red, red colored, nice looking boot.

225
00:15:28,860 --> 00:15:29,640
Boot.

226
00:15:29,640 --> 00:15:33,900
All red, all red.

227
00:15:33,900 --> 00:15:35,910
Here.

228
00:15:35,910 --> 00:15:37,710
All red, high resolution.

229
00:15:37,710 --> 00:15:39,360
Okay, so that's fine.

230
00:15:39,360 --> 00:15:44,820
Now, since we want to update that specific sample, let's go.

231
00:15:44,820 --> 00:15:45,960
Let's get back here.

232
00:15:45,960 --> 00:15:48,960
dataset.head(), or tail().

233
00:15:48,960 --> 00:15:50,010
That's the end.

234
00:15:50,310 --> 00:15:51,570
Let's see what we get.

235
00:15:51,570 --> 00:15:53,880
We have all these different samples.

236
00:15:53,880 --> 00:16:01,710
Now we'll say: if our sample ID... I think, yeah, this is it here.

237
00:16:01,710 --> 00:16:06,060
If our sample ID is equal to this value.

238
00:16:06,660 --> 00:16:08,640
So if this is our sample ID.

239
00:16:11,390 --> 00:16:12,380
There we go.

240
00:16:12,380 --> 00:16:14,870
If this is our sample ID, let's take that off.

241
00:16:15,080 --> 00:16:18,920
If this is our sample ID then we are going to run this.

242
00:16:18,920 --> 00:16:22,280
So we're going to create a new sample and add it to our data set.

243
00:16:22,280 --> 00:16:25,130
We don't need a break this time around so that's fine.

244
00:16:25,130 --> 00:16:27,920
Let's run this again and see what we get.

245
00:16:27,920 --> 00:16:31,040
So if it is equal to that, then it should run that.

246
00:16:31,040 --> 00:16:36,950
So after running, you can see we have this new sample with red boots and with the green coat.

247
00:16:36,980 --> 00:16:43,910
Now this is reproducible by just specifying the exact same seed.

248
00:16:43,910 --> 00:16:49,310
Let's take this off, get back to our visualization and run this again.

249
00:16:49,310 --> 00:16:55,340
And re-visualize our FiftyOne data set.

250
00:16:56,190 --> 00:16:59,400
We've now moved on to 79 samples.

251
00:16:59,400 --> 00:17:03,510
That's up from 77 and 78, as we had previously.

252
00:17:03,540 --> 00:17:08,340
Let's scroll down to the bottom and click on this.

253
00:17:08,340 --> 00:17:14,910
And now you could see that this other sample has the boot changed and also the coat.

254
00:17:14,910 --> 00:17:24,150
So now we have this new sample, which is actually a modification of the previous modification.

255
00:17:24,150 --> 00:17:25,980
So this was a previous sample.

256
00:17:25,980 --> 00:17:28,590
We modified this and we got this one.

257
00:17:28,590 --> 00:17:30,210
Let's reduce that a bit.

258
00:17:30,210 --> 00:17:31,530
So you could see clearly.

259
00:17:31,530 --> 00:17:35,850
So as we're saying we modified this and then we got this new one.

260
00:17:35,850 --> 00:17:38,820
Now, though, we're getting blue instead of red.

261
00:17:38,820 --> 00:17:44,160
So let's modify the way we save our image.

262
00:17:44,160 --> 00:17:48,000
We're not going to use OpenCV to save our image here, since it expects BGR rather than RGB channel order.

263
00:17:48,000 --> 00:17:53,520
Instead, the image which will be saved is going to be the output.

264
00:17:53,520 --> 00:17:58,350
That's this output, which is in the PIL format, a PIL image.

265
00:17:58,350 --> 00:18:03,090
We save this directly, without converting to a NumPy array and saving with OpenCV.

266
00:18:03,090 --> 00:18:08,460
Let's take that off and then rerun the code, and we should get something better now.

267
00:18:08,460 --> 00:18:15,540
Now we could modify this seed.

268
00:18:15,540 --> 00:18:18,600
So let's change the seed to, say, 100.

269
00:18:18,600 --> 00:18:20,970
Let's run that again so we could get a different boot.

270
00:18:20,970 --> 00:18:21,840
There we go.

271
00:18:21,840 --> 00:18:23,610
We have this new output.

272
00:18:23,610 --> 00:18:29,280
Now let's go ahead and refresh this.

273
00:18:29,280 --> 00:18:33,240
So we get our new data set.

274
00:18:33,240 --> 00:18:36,300
This is our new sample which has been created.

275
00:18:36,510 --> 00:18:39,690
Let's go ahead and take this off.

276
00:18:39,690 --> 00:18:42,270
You see, we have these boots and that's fine.

277
00:18:42,270 --> 00:18:44,250
So that's it.

278
00:18:44,250 --> 00:18:50,340
As we had said already, one disadvantage of this method of data augmentation is that it produces noisy

279
00:18:50,340 --> 00:18:50,820
data.

280
00:18:50,820 --> 00:18:58,080
So the input image, for example, which had only this portion as a purse, now has this whole

281
00:18:58,080 --> 00:18:59,460
portion with a purse.

282
00:18:59,460 --> 00:19:03,210
So this labels this section wrongly.

283
00:19:03,810 --> 00:19:12,750
Now what if, instead of having to get to this section, run this code, and then have to

284
00:19:12,750 --> 00:19:19,740
get back here, refresh and get our new sample, we just make use of a plugin like this one, which

285
00:19:19,740 --> 00:19:22,560
will permit us to augment our data very easily.

286
00:19:22,560 --> 00:19:24,030
So let's see how that works.

287
00:19:24,030 --> 00:19:27,510
You see here we have "error occurred during operator execution".

288
00:19:27,810 --> 00:19:33,240
And the reason why we're getting this is simply because we haven't selected any image.

289
00:19:33,240 --> 00:19:36,540
So let's suppose that we want to augment this image.

290
00:19:36,540 --> 00:19:39,300
We click on a given image.

291
00:19:39,300 --> 00:19:43,800
Then we get back here, where we have browse operations.

292
00:19:43,800 --> 00:19:46,650
We get back there and select the plugin.

293
00:19:46,800 --> 00:19:50,610
Here we select this plugin. Number of augmentations per sample:

294
00:19:50,610 --> 00:19:52,650
Let's say we want to have five augmentations.

295
00:19:52,650 --> 00:19:55,620
We want to select a given class.

296
00:19:55,620 --> 00:20:01,890
Now the classes which we have here are only those found in the given image.

297
00:20:01,890 --> 00:20:09,450
So if we pick this image, you find that we have a vest, we have a T-shirt,

298
00:20:09,450 --> 00:20:16,800
we have shorts, we have skin, we have shoes, we have sunglasses, we have hair and then we have accessories.

299
00:20:16,800 --> 00:20:24,690
So if we get back to our plugin, again here among the

300
00:20:24,690 --> 00:20:31,650
classes we have back hair, shoes, shorts, skin, sunglasses, T-shirt, vest, and accessories.

301
00:20:31,650 --> 00:20:33,780
So let's get back to this.

302
00:20:33,780 --> 00:20:34,710
We have five.

303
00:20:34,710 --> 00:20:41,550
And then for the prompt, let's say we want to modify the vest.

304
00:20:41,550 --> 00:20:48,660
So let's get back here and take out this prompt which we had already used.

305
00:20:48,660 --> 00:20:51,120
So let's copy this prompt.

306
00:20:52,020 --> 00:20:56,070
Get back to visualization.

307
00:20:56,070 --> 00:21:00,990
Let's scroll down and then modify or include this prompt right here.

308
00:21:00,990 --> 00:21:03,060
So here, it's actually a man.

309
00:21:03,060 --> 00:21:05,010
So we'll say photorealistic.

310
00:21:05,250 --> 00:21:17,310
photo of a man wearing, let's say, a blue colored, nice looking, let's say,

311
00:21:17,310 --> 00:21:18,840
Gucci vest.

312
00:21:18,930 --> 00:21:19,620
That's fine.

313
00:21:19,620 --> 00:21:22,200
So that's fine.

314
00:21:22,230 --> 00:21:27,990
All blue, all blue, all blue, and high contrast.

315
00:21:27,990 --> 00:21:33,990
So you find that instead of having to go and manually check the different categories which are

316
00:21:33,990 --> 00:21:38,970
found in that given image, you can automatically generate them from here, or make use of those already

317
00:21:38,970 --> 00:21:39,660
generated.

318
00:21:39,660 --> 00:21:41,940
And then you specify a number of inference steps.

319
00:21:41,940 --> 00:21:43,770
So 50 is the default value.

320
00:21:43,800 --> 00:21:44,700
We'll set it.

321
00:21:44,700 --> 00:21:45,570
We'll leave it at that.

322
00:21:45,570 --> 00:21:48,450
The default guidance scale is seven.

323
00:21:48,690 --> 00:21:53,610
We could increase the guidance scale to, say, eight.

324
00:21:53,610 --> 00:21:55,230
Well we have steps of.

325
00:21:55,550 --> 00:21:58,040
So we take nine, and then a random seed of ten.

326
00:21:58,040 --> 00:22:00,470
So let's execute that and then see what we get.

327
00:22:00,710 --> 00:22:07,640
While it's still executing, we could check out our data, because it's already saved.

328
00:22:07,640 --> 00:22:09,440
So we scroll down.

329
00:22:09,440 --> 00:22:10,700
We get to this point.

330
00:22:10,700 --> 00:22:14,150
You see we have the original image.

331
00:22:14,420 --> 00:22:15,410
See right here.

332
00:22:15,440 --> 00:22:17,450
Let's reduce this so you could see better.

333
00:22:17,450 --> 00:22:18,920
We have the original image.

334
00:22:18,920 --> 00:22:23,840
And then we have these augmented samples.

335
00:22:23,840 --> 00:22:24,590
We chose five.

336
00:22:24,590 --> 00:22:28,700
So that's why you have one, two, three, four, five different samples.

337
00:22:28,730 --> 00:22:31,730
Now the only difference is that you have 0, 1, 2, 3, 4.

338
00:22:31,760 --> 00:22:32,420
That's it.

339
00:22:32,660 --> 00:22:35,120
So you see this one.

340
00:22:35,120 --> 00:22:37,070
That's a blue Gucci vest.

341
00:22:37,070 --> 00:22:38,090
That's fine.

342
00:22:38,480 --> 00:22:39,380
There we go.

343
00:22:39,380 --> 00:22:42,650
So this is the original, and these are the other samples.

344
00:22:43,160 --> 00:22:43,910
That's it.

345
00:22:43,910 --> 00:22:46,520
We have this other one, not really blue.

346
00:22:46,550 --> 00:22:50,060
Then we have this one that's blue.

347
00:22:50,480 --> 00:22:54,200
Although it doesn't occupy the exact same region as the original.

348
00:22:54,470 --> 00:22:55,520
So that's it.

349
00:22:55,520 --> 00:22:59,390
And then we have this other one that's blue.

350
00:22:59,390 --> 00:23:07,070
And then finally we have this other one, which looks more like what a Gucci vest would look

351
00:23:07,070 --> 00:23:08,240
like, and that's still blue.

352
00:23:08,240 --> 00:23:09,140
So that's fine.

353
00:23:09,140 --> 00:23:14,690
So these are some extra samples which we've added to our dataset.

354
00:23:14,690 --> 00:23:20,150
Now we are going to dive into how to create our own plugin like the one we just used.

355
00:23:20,150 --> 00:23:28,250
That actually simplifies our work and also helps others in the community easily

356
00:23:28,250 --> 00:23:30,770
create this, or augment their own dataset.

357
00:23:30,770 --> 00:23:38,120
So obviously a plugin could be used in many other cases, but in our own case we're

358
00:23:38,120 --> 00:23:40,220
using it for data augmentation.

359
00:23:40,220 --> 00:23:43,970
Now let's get back here where we'll see how to download plugins.

360
00:23:43,970 --> 00:23:46,280
So there are two plugins I'm downloading.

361
00:23:46,280 --> 00:23:48,320
Although I need just this one.

362
00:23:48,320 --> 00:23:55,160
So here, this plugin which we download is situated in this GitHub repo.

363
00:23:55,160 --> 00:23:56,720
And then here is its name.

364
00:23:56,720 --> 00:24:00,110
So here we have, in this GitHub repo, the SD augment plugin.

365
00:24:00,320 --> 00:24:03,440
And with this name we simply download that.

366
00:24:03,440 --> 00:24:06,230
And then we get to use it, as

367
00:24:06,230 --> 00:24:07,970
you saw us use it before.

368
00:24:07,970 --> 00:24:10,580
So let's get into the code.

369
00:24:10,580 --> 00:24:11,180
That's this.

370
00:24:11,330 --> 00:24:18,140
The code you have here is the exact same code you will be downloading to make

371
00:24:18,140 --> 00:24:20,750
use of the plugin, as we had seen already.

372
00:24:21,140 --> 00:24:26,060
So getting here, the only difference now is we have Neuralearn, and here we have Neuralearn.

373
00:24:26,060 --> 00:24:27,470
So that's the only difference.

374
00:24:27,470 --> 00:24:32,330
So we have these different files we find here.

375
00:24:32,330 --> 00:24:32,990
Let's increase this.

376
00:24:32,990 --> 00:24:34,130
So you could see better.

377
00:24:34,130 --> 00:24:39,740
So we have the assets; see, the assets folder just contains the icon.

378
00:24:40,040 --> 00:24:43,340
Then we have the init file.

379
00:24:43,340 --> 00:24:47,180
We have the .gitignore file.

380
00:24:47,180 --> 00:24:50,720
Well, we actually need just the init file.

381
00:24:50,720 --> 00:24:53,870
We need this fiftyone.yml file.

382
00:24:53,870 --> 00:24:56,690
We need the requirements.txt file.

383
00:24:56,690 --> 00:24:59,720
And then we have this Python utils file.

384
00:24:59,750 --> 00:25:03,200
Now let's start with the fiftyone.yml.

385
00:25:03,200 --> 00:25:09,260
So essentially here we're just describing our plugin and saying what it does.

386
00:25:09,260 --> 00:25:13,340
So here we have the FiftyOne version which we're using.

387
00:25:13,340 --> 00:25:15,590
We have the name, Neuralearn.

388
00:25:15,800 --> 00:25:19,730
SD augment; then we have the version of our plugin.

389
00:25:19,730 --> 00:25:21,980
So we could say 1.0.0.

390
00:25:21,980 --> 00:25:22,880
That's fine.

391
00:25:23,390 --> 00:25:27,110
Then we have the description: test out various augmentations.

392
00:25:27,110 --> 00:25:29,330
Well, these are not Albumentations.

393
00:25:29,330 --> 00:25:45,170
This is going to be: augment segmentation dataset using Stable Diffusion inpainting model.

394
00:25:45,170 --> 00:25:45,800
That's fine.

395
00:25:45,800 --> 00:25:49,640
So that's the description of our plugin.

396
00:25:50,800 --> 00:26:00,310
Take this off, and then we have the URL; that's this GitHub URL right here.

397
00:26:00,310 --> 00:26:02,320
And then we have different operators.

398
00:26:02,320 --> 00:26:06,760
Now we're going to check this out in our init method or rather in our init file.

399
00:26:06,760 --> 00:26:08,350
So we have this init.

400
00:26:08,350 --> 00:26:14,830
And in this init we'll see that at the end we have the different operators which we could register.

401
00:26:14,830 --> 00:26:22,330
So we're registering this SD, that's Stable Diffusion, augment operator in our plugin.

402
00:26:22,330 --> 00:26:32,170
Now for this Stable Diffusion augment, if we scroll up, we have this property, or rather we have this

403
00:26:32,170 --> 00:26:33,190
config method.

404
00:26:33,190 --> 00:26:39,550
And here we specify this name: augment with Stable Diffusion

405
00:26:39,790 --> 00:26:40,840
inpainting.

406
00:26:40,840 --> 00:26:45,550
If we get back to FiftyOne, this is the exact same name, which matches that.

407
00:26:45,550 --> 00:26:50,410
So we are looking at how to create our own plugin.

408
00:26:50,410 --> 00:26:55,180
Essentially, most of what we'll be doing will be in this init file.

409
00:26:55,180 --> 00:27:04,180
So let's just dive into how the plugin is created. Now, in this config here,

410
00:27:04,180 --> 00:27:06,430
we have this OperatorConfig.

411
00:27:06,430 --> 00:27:07,690
We specify the name.

412
00:27:07,690 --> 00:27:12,250
We have a label, we have a description and we set dynamic to true.

413
00:27:12,250 --> 00:27:13,630
Then we have the icon.

414
00:27:13,630 --> 00:27:16,600
So that's in our assets, icon.svg.

415
00:27:16,630 --> 00:27:17,290
That's it.

416
00:27:17,290 --> 00:27:18,910
And then we return this config.

417
00:27:18,910 --> 00:27:20,350
So that's it for the config.

418
00:27:20,350 --> 00:27:23,980
Now we have these two methods which we are using.

419
00:27:23,980 --> 00:27:28,690
That's this resolve_input method.

420
00:27:28,690 --> 00:27:34,330
And then we also have this execute method.

421
00:27:34,330 --> 00:27:37,270
We're not really using this resolve_delegation method.

422
00:27:37,270 --> 00:27:43,360
When we get back to the plugin, you will find that we have this interface right here, which makes

423
00:27:43,360 --> 00:27:45,310
our work really easy.

424
00:27:45,310 --> 00:27:48,670
But this interface needs to be pre-designed.

425
00:27:48,670 --> 00:27:55,810
And where we actually pre-design this interface is in this resolve_input method.

426
00:27:55,810 --> 00:27:58,150
So this acts like our front end.

427
00:27:58,150 --> 00:27:59,260
Let's reduce this.

428
00:27:59,260 --> 00:28:00,460
So you see that clearly.

429
00:28:00,460 --> 00:28:06,760
So this resolve_input method acts like our front end,

430
00:28:06,760 --> 00:28:12,880
while the execute method acts like our back end.

431
00:28:12,880 --> 00:28:17,440
In this resolve_input method we're going to create this form view.

432
00:28:17,440 --> 00:28:20,560
Now we'll specify the label and the description.

433
00:28:20,560 --> 00:28:24,430
So here we have the label, augment with Stable Diffusion inpainting, and the description:

434
00:28:24,430 --> 00:28:30,550
apply Stable Diffusion inpainting to an image-mask pair.

435
00:28:30,580 --> 00:28:35,140
Getting back here, we have: apply Stable Diffusion inpainting to the image-mask pair.

436
00:28:35,140 --> 00:28:35,830
So that's it.

437
00:28:35,830 --> 00:28:42,220
So just as in the code, we had the label, augment with Stable Diffusion inpainting.

438
00:28:42,220 --> 00:28:45,340
And here we also have augment with Stable Diffusion inpainting.

439
00:28:45,340 --> 00:28:46,540
So that's the label.

440
00:28:46,540 --> 00:28:51,850
And then the description is shown just right here.

441
00:28:51,850 --> 00:28:52,720
So that's it.

442
00:28:52,750 --> 00:28:54,700
We've created our form view.

443
00:28:54,880 --> 00:29:00,550
The next step will be to get all this data from the user.

444
00:29:00,550 --> 00:29:04,600
So if we get back to the code, you will find that here we have these inputs.

445
00:29:04,600 --> 00:29:06,640
We should specify this one.

446
00:29:06,640 --> 00:29:08,410
We specify it to be an int.

447
00:29:08,410 --> 00:29:12,100
So here we have the number of augmentations per sample.

448
00:29:12,100 --> 00:29:14,050
So that's number of arcs.

449
00:29:14,050 --> 00:29:16,660
This is very important because we will be using this in the back end.

450
00:29:16,660 --> 00:29:18,700
So you have to be careful with this.

451
00:29:18,700 --> 00:29:21,700
Then the label number of augmentations per sample.

452
00:29:21,700 --> 00:29:28,900
If you check this here, you'll see, well, this is it here: number of augmentations

453
00:29:28,900 --> 00:29:29,830
per sample.

454
00:29:29,830 --> 00:29:35,980
And then we have the description: the number of random augmentations to apply to each sample,

455
00:29:35,980 --> 00:29:38,080
which is exactly what we have here.

456
00:29:38,080 --> 00:29:39,610
So that's it.

457
00:29:39,610 --> 00:29:47,260
And then because it's an int, you have this option; you see, you put, let's

458
00:29:47,260 --> 00:29:49,750
say, two, or you could increase it.

459
00:29:49,960 --> 00:29:53,020
You see, it takes the integers.

460
00:29:53,020 --> 00:29:53,860
That's fine.

461
00:29:54,490 --> 00:29:56,080
Although it takes negatives: negative two.

462
00:29:56,110 --> 00:29:58,600
So we should fix this.

463
00:29:58,600 --> 00:30:02,050
Or you could try to fix this and make sure it takes only positives.

464
00:30:02,050 --> 00:30:03,190
Or you could take negatives.

465
00:30:03,190 --> 00:30:10,630
and then you return an error, in case you take a negative as input, from the

466
00:30:10,630 --> 00:30:11,410
back end.
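
A minimal sketch of the back-end check suggested here, assuming the form value arrives as a plain Python value. `validate_num_augs` is a hypothetical helper for illustration; it is not part of the plugin shown in the video or of FiftyOne.

```python
def validate_num_augs(value):
    """Return value unchanged if it is a positive integer, else raise."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"num_augs must be an int, got {type(value).__name__}")
    # Zero or negative counts make no sense for "augmentations per sample".
    if value < 1:
        raise ValueError(f"num_augs must be at least 1, got {value}")
    return value
```

Calling this at the top of the execute method would turn an input such as negative two into an explicit error instead of a silent no-op.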

467
00:30:11,410 --> 00:30:13,780
Now for our next input, we have the class.

468
00:30:13,780 --> 00:30:16,180
We're just going to skip this and get to the prompt.

469
00:30:16,180 --> 00:30:17,110
And then we'll get back to this.

470
00:30:17,110 --> 00:30:21,580
since this is a little more complicated compared to all the remaining inputs.

471
00:30:21,580 --> 00:30:23,980
So we have the prompt.

472
00:30:23,980 --> 00:30:25,060
You get back to the code.

473
00:30:25,060 --> 00:30:26,560
You see here we have prompt.

474
00:30:26,560 --> 00:30:27,940
Let's scroll through.

475
00:30:28,330 --> 00:30:29,830
All this here,

476
00:30:29,830 --> 00:30:31,840
we're not going to take this into consideration for now.

477
00:30:31,840 --> 00:30:33,460
So keep that aside.

478
00:30:33,460 --> 00:30:37,360
We have the prompt, which is a string, which makes sense.

479
00:30:37,360 --> 00:30:42,730
So here we have prompt, we have the label, and then we have the description.

480
00:30:42,730 --> 00:30:43,870
So it's required.

481
00:30:43,870 --> 00:30:45,190
So we'll set that to true.

482
00:30:45,910 --> 00:30:47,530
That's it.

483
00:30:47,740 --> 00:30:49,240
Now all these normally should be required.

484
00:30:49,890 --> 00:30:53,430
So we could modify this here

485
00:30:53,430 --> 00:30:55,170
and set required,

486
00:30:55,410 --> 00:30:58,980
required, set to true. Okay.

487
00:30:58,980 --> 00:31:00,330
So that's fine.

488
00:31:00,630 --> 00:31:02,280
Let's continue.

489
00:31:02,280 --> 00:31:07,260
We have the inference steps, the number of inference steps.

490
00:31:07,350 --> 00:31:12,810
Here you have an int and then the default value is 50.

491
00:31:13,290 --> 00:31:14,040
So that's it.

492
00:31:14,040 --> 00:31:15,990
We have the default value set to 50.

493
00:31:15,990 --> 00:31:17,310
And then the view.

494
00:31:17,310 --> 00:31:21,840
Now the view here is different from what we had with the prompt,

495
00:31:21,840 --> 00:31:26,850
and what we also had with the int, that is, with the number of augmentations.

496
00:31:26,850 --> 00:31:28,980
So this view here is different.

497
00:31:29,430 --> 00:31:33,420
The difference here is that we have a slider.

498
00:31:33,420 --> 00:31:39,150
So here we have this inference steps slider which we've created from here.

499
00:31:39,150 --> 00:31:41,760
But it's actually a slider view.

500
00:31:41,760 --> 00:31:49,410
So with the int here, with the number of augmentations, we had a field view,

501
00:31:49,410 --> 00:31:53,250
whereas here we have a slider view.

502
00:31:53,250 --> 00:31:56,160
Now we're going to specify the min value.

503
00:31:56,160 --> 00:31:57,390
Obviously we have a label.

504
00:31:57,390 --> 00:32:02,370
The value goes from a min of 50 to a max of 200, and the step is ten.

505
00:32:02,370 --> 00:32:07,020
So we go from 50 to 200 in steps of ten.

506
00:32:07,020 --> 00:32:13,710
So that's why when you get here and you increase, you see you go from 50 to 60, 70,

507
00:32:13,710 --> 00:32:15,660
and so on and so forth up to 200.

508
00:32:15,660 --> 00:32:17,760
So that's it.

509
00:32:17,760 --> 00:32:19,410
Um, that's it.

510
00:32:19,410 --> 00:32:20,880
We move to the next.

511
00:32:20,880 --> 00:32:24,210
We have the guidance scale, which is similar to what we've just seen.

512
00:32:24,210 --> 00:32:26,580
Now this is a slider again.

513
00:32:26,580 --> 00:32:30,780
And then we have min value one, max value 30, steps of two.

514
00:32:30,810 --> 00:32:31,770
So that's it.

515
00:32:31,770 --> 00:32:33,570
And the default is set to seven.

516
00:32:33,570 --> 00:32:37,530
That's why, if you notice here, our default is set to seven.

517
00:32:37,770 --> 00:32:38,520
See, it slides.

518
00:32:38,520 --> 00:32:41,280
We go through steps of two.

519
00:32:42,120 --> 00:32:42,870
That's fine.

520
00:32:42,870 --> 00:32:46,320
Then we have the next one, our random seed.

521
00:32:46,500 --> 00:32:50,700
So for random seed we go from 1 to 1000 with steps of one.

522
00:32:50,700 --> 00:32:51,900
The default is ten.

523
00:32:52,410 --> 00:32:53,220
That's why when we get here,

524
00:32:53,220 --> 00:32:56,490
again, you find the default to be ten. Okay.

525
00:32:56,490 --> 00:32:57,420
So that's it.
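
As a rough worked example of the three sliders just described, the positions each one can take can be enumerated from its min, max, and step. This sketch assumes a slider snaps to `min + k * step`; `slider_values` is a hypothetical helper, not a FiftyOne function.

```python
def slider_values(min_value, max_value, step):
    """List the positions min, min + step, ... that do not exceed max."""
    return list(range(min_value, max_value + 1, step))

# Ranges as stated in the walkthrough.
inference_steps = slider_values(50, 200, 10)   # default 50
guidance_scale = slider_values(1, 30, 2)       # default 7
random_seed = slider_values(1, 1000, 1)        # default 10

print(inference_steps[:3], inference_steps[-1])
print(guidance_scale)
```

Under that assumption the guidance scale grid holds only odd values, which is consistent with moving from the default of seven to nine in a single step.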

526
00:32:57,420 --> 00:32:59,040
Now getting to the class.

527
00:32:59,040 --> 00:33:02,340
Let's get back up to the class.

528
00:33:02,970 --> 00:33:04,830
Right here.

529
00:33:04,830 --> 00:33:06,900
Let's uncomment this part.

530
00:33:07,890 --> 00:33:08,280
This.

531
00:33:08,280 --> 00:33:11,370
Uncomment this part.

532
00:33:11,370 --> 00:33:15,210
Well let's get back and actually uncomment this.

533
00:33:15,210 --> 00:33:18,780
So we have this uncommented.

534
00:33:19,230 --> 00:33:20,430
There we go.

535
00:33:20,760 --> 00:33:26,460
Here, unlike the others, we actually have an enum.

536
00:33:26,460 --> 00:33:30,210
So it's an enumeration instead of a string or an int.

537
00:33:30,720 --> 00:33:34,110
And here we have these different class choices.

538
00:33:34,140 --> 00:33:42,210
Now remember, when you pick a specific image, the different classes you are shown will depend

539
00:33:42,210 --> 00:33:42,930
on that image.

540
00:33:42,930 --> 00:33:47,130
So here we have accessories, hair, necklace, pants, shoes, skin, vest,

541
00:33:47,130 --> 00:33:50,040
because of the specific image we have picked.

542
00:33:50,070 --> 00:33:52,350
Now let's say we pick this image here.

543
00:33:52,350 --> 00:33:53,700
It doesn't contain too many classes.

544
00:33:53,700 --> 00:33:57,330
So we'll take this and let's unselect this one.

545
00:33:57,330 --> 00:34:01,740
Also note that we are dealing with only one specific image.

546
00:34:01,800 --> 00:34:05,550
So we click on this one and then you see.

547
00:34:06,340 --> 00:34:10,930
You see here we have sunglasses, skin, shoes, hair, bodysuit, accessories.

548
00:34:10,930 --> 00:34:19,780
And this is simply because once we pick a sample, we go ahead and check out all the different

549
00:34:19,780 --> 00:34:22,120
categories which are found in this mask.

550
00:34:22,120 --> 00:34:25,210
So let's get back here and see how that's done.

551
00:34:25,210 --> 00:34:31,300
So when we select a given sample, you see here we have our target view.

552
00:34:31,570 --> 00:34:34,870
See, ctx.view.select: we select one.

553
00:34:34,870 --> 00:34:38,260
We have a given sample which we've selected.

554
00:34:38,380 --> 00:34:41,770
We then say, okay, for a given target view,

555
00:34:41,770 --> 00:34:43,300
normally we should select only one.

556
00:34:43,300 --> 00:34:47,650
So we're going to go through this target view,

557
00:34:47,650 --> 00:34:50,170
and then we break once we're done with the first one.

558
00:34:50,170 --> 00:34:55,240
So once we select our target view we go through the different samples.

559
00:34:55,240 --> 00:34:57,190
That's normally supposed to be one.

560
00:34:57,190 --> 00:35:01,210
If we pick just one then we obtain the mask.

561
00:35:01,210 --> 00:35:07,990
So for a given sample, we could get the mask path by doing sample.ground_truth.mask_path.

562
00:35:08,140 --> 00:35:10,960
With this, using OpenCV, we obtain the mask.

563
00:35:10,960 --> 00:35:15,670
And then we get the unique choices.

564
00:35:15,670 --> 00:35:19,660
Or you could say we could call these unique categories.

565
00:35:19,660 --> 00:35:25,450
So what we simply do is make use of NumPy's unique function, which takes in the mask.

566
00:35:25,450 --> 00:35:28,900
And then we convert the values we get.

567
00:35:28,900 --> 00:35:37,750
because obviously if we have, say, class zero, class 23 and class 38, for example, then

568
00:35:37,750 --> 00:35:42,070
we want to convert these into the corresponding labels.

569
00:35:42,070 --> 00:35:49,570
So we use id2label: we take in this ID, convert it, and then we have, for example, the

570
00:35:49,570 --> 00:35:50,470
background.

571
00:35:50,470 --> 00:35:56,290
But because we don't want the background, you will notice that here we take from index one to the

572
00:35:56,290 --> 00:35:58,330
end, because we always start with the background.

573
00:35:58,330 --> 00:36:00,550
Obviously for every image we will have the background.

574
00:36:00,550 --> 00:36:02,080
So that's it.

575
00:36:02,080 --> 00:36:07,270
And then we create this tuple which contains our unique choices.

576
00:36:07,270 --> 00:36:12,640
The reason why we're creating this unique tuple is because of the way the drop down type works.

577
00:36:12,640 --> 00:36:16,360
It actually takes in tuples.

578
00:36:16,360 --> 00:36:22,180
So that said, what we're doing here is specifying how we create these class choices.
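
The class lookup described above can be sketched without any imaging libraries. This is only an illustration: the walkthrough reads the mask with OpenCV and uses NumPy's unique, whereas the sketch below uses a nested list and `sorted(set(...))`, which yields the same ascending unique IDs. The class IDs 0, 23 and 38 come from the example in the narration; the labels mapped to 23 and 38 are made up for the demo.

```python
# Hypothetical id-to-label mapping; the real one comes from the dataset.
ID2LABEL = {0: "background", 23: "skin", 38: "vest"}

def unique_class_labels(mask, id2label):
    """Return the labels present in a 2-D mask as a tuple, skipping background."""
    # Collect each class ID once, in ascending order.
    unique_ids = sorted({pixel for row in mask for pixel in row})
    # Drop the first entry: ID 0 (background) sorts first whenever it is
    # present, which the walkthrough assumes is always the case.
    return tuple(id2label[i] for i in unique_ids[1:])

mask = [
    [0, 0, 23],
    [0, 38, 38],
    [23, 23, 0],
]
print(unique_class_labels(mask, ID2LABEL))  # ('skin', 'vest')
```

Note that the `[1:]` slice would drop a real class if a mask ever contained no background pixels, so filtering on the ID itself may be the safer choice.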

579
00:36:22,180 --> 00:36:22,900
That's it.

580
00:36:22,900 --> 00:36:24,730
We specify a label, Class.

581
00:36:24,730 --> 00:36:31,960
And then for all the different classes in our unique choices we are going to add a choice.

582
00:36:31,960 --> 00:36:37,150
Or we're going to add a specific class into those class choices.

583
00:36:37,150 --> 00:36:44,110
So now you see that it's only those which we got from the mask which we're going to add into these class

584
00:36:44,110 --> 00:36:44,800
choices.

585
00:36:44,800 --> 00:36:46,510
And now this is an enum.

586
00:36:46,510 --> 00:36:52,060
But the values we are going to have are obtained from these class choices.

587
00:36:52,060 --> 00:36:54,100
So we have class_choices.values().

588
00:36:54,100 --> 00:36:56,290
We have the default set to skin.

589
00:36:56,290 --> 00:36:59,650
And then the view is the class choices.

590
00:36:59,650 --> 00:37:02,110
So it's obviously this drop down view.

591
00:37:02,110 --> 00:37:07,360
So saying that the view is class choices is simply saying we want to have a drop down view as we have

592
00:37:07,360 --> 00:37:08,050
seen here.

593
00:37:08,050 --> 00:37:12,580
So here you see we have this drop down view unlike here where we have the slider view.

594
00:37:12,580 --> 00:37:14,200
So that's it.

595
00:37:14,200 --> 00:37:18,640
So for this one we have a field view,

596
00:37:18,760 --> 00:37:20,890
for this one we have a dropdown view, and for these

597
00:37:20,890 --> 00:37:22,570
other ones we have a slider view.

598
00:37:22,660 --> 00:37:27,730
We then return the property, which takes in the inputs and the view.

599
00:37:28,090 --> 00:37:28,930
So that's it.

600
00:37:28,930 --> 00:37:33,490
So all these inputs we have been creating here are what we now have in the form.

601
00:37:33,490 --> 00:37:34,450
That's fine.

602
00:37:34,450 --> 00:37:40,060
We move ahead to our execute method.

603
00:37:40,060 --> 00:37:45,430
So in the execute method, which plays the role of our back end, we're going to take all the information

604
00:37:45,430 --> 00:37:48,400
from the front end, which is our resolve_input method,

605
00:37:48,400 --> 00:37:50,980
and then carry out the augmentation.

606
00:37:50,980 --> 00:37:58,300
So for the number of images per prompt, you see, from ctx.params we get the number of augmentations.

607
00:37:58,570 --> 00:37:59,710
The default value is one.

608
00:37:59,710 --> 00:38:06,340
So if you get back here, you see we have num_augs.

609
00:38:06,580 --> 00:38:10,330
And then for the next one we have the selected class.

610
00:38:10,330 --> 00:38:11,770
You see the default value is skin.

611
00:38:11,770 --> 00:38:14,230
And then here we have class choices.

612
00:38:14,230 --> 00:38:19,270
So if we get to our class you see here we have class choices.

613
00:38:19,420 --> 00:38:26,830
And then we get to the prompt; here we have prompt, with the default "None provided". Then the guidance scale:

614
00:38:26,830 --> 00:38:28,540
The default value is seven.

615
00:38:28,570 --> 00:38:33,100
The name is guidance scale. Then the number of inference steps,

616
00:38:33,400 --> 00:38:37,150
and then we also have the random seed, which by default is ten.

617
00:38:37,150 --> 00:38:38,260
So that's it.
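
The pattern of reading each form value with a fallback default can be sketched with a plain dictionary standing in for the operator's parameters. The key names below are assumptions chosen for illustration and have to match whatever names the front end registered; the defaults are the ones mentioned in the walkthrough (the inference steps default is taken from the slider definition).

```python
def read_params(params):
    """Read the form values from a dict, falling back to the stated defaults."""
    return {
        "num_augs": params.get("num_augs", 1),
        "selected_class": params.get("class_choices", "skin"),
        "prompt": params.get("prompt", "None provided"),
        "guidance_scale": params.get("guidance_scale", 7),
        "num_inference_steps": params.get("num_inference_steps", 50),
        "seed": params.get("seed", 10),
    }

# Only two fields were filled in; the rest fall back to defaults.
settings = read_params({"num_augs": 5, "prompt": "a blue vest"})
print(settings)
```

This is why the input names matter: a typo in a key on either side silently yields the default instead of the value the user entered.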

618
00:38:38,260 --> 00:38:40,180
And then we have our target view.

619
00:38:40,180 --> 00:38:44,830
With our target view we are able to get a specific sample.

620
00:38:44,830 --> 00:38:52,450
So, let's get out of this; when you pick a selected sample like this one,

621
00:38:52,450 --> 00:38:56,260
we're saying that's our target view. Now here we have two,

622
00:38:56,290 --> 00:38:58,750
so our target view will be made up of these two samples.

623
00:38:58,750 --> 00:39:04,360
So if we take this off and then we select this, our target view is made up of only this

624
00:39:04,360 --> 00:39:04,990
sample.

625
00:39:04,990 --> 00:39:05,680
So.

626
00:39:05,880 --> 00:39:12,210
For the sample, or samples, in our target view, we're going to do the exact same thing we had seen

627
00:39:12,210 --> 00:39:13,110
already in the Colab.

628
00:39:13,110 --> 00:39:20,940
So here we call our transform sample method, which takes in the sample, the specific class, the

629
00:39:20,940 --> 00:39:26,430
prompt, the number of images per prompt, the guidance scale, the number of inference steps, and

630
00:39:26,430 --> 00:39:27,720
then the random seed.

631
00:39:27,720 --> 00:39:34,800
So we create these new samples. Now, unlike with the Colab, where we just picked the first one, here we

632
00:39:34,800 --> 00:39:36,390
could create multiple samples.

633
00:39:36,390 --> 00:39:42,330
So once we have these multiple samples, for each and every sample, we are going to add

634
00:39:42,330 --> 00:39:44,790
it to our dataset.

635
00:39:44,790 --> 00:39:48,270
So you see we add that sample to our dataset.

636
00:39:48,270 --> 00:39:52,830
Now the transform samples method isn't very different from what we had seen already.

637
00:39:52,830 --> 00:39:54,690
But we could take a quick look at that.

638
00:39:54,690 --> 00:39:59,940
So in our transform samples, obviously we have the sample, the selected class, the prompt, and so on and so

639
00:39:59,940 --> 00:40:00,600
forth.

640
00:40:00,630 --> 00:40:09,540
Then we go ahead and generate our inputs based on the sample's file path and mask path.

641
00:40:09,540 --> 00:40:18,630
Then from here we have the inpaint method, which creates our edited outputs.

642
00:40:18,630 --> 00:40:23,880
Then, here we have our list; for each and every one of

643
00:40:23,880 --> 00:40:24,990
these edited outputs,

644
00:40:24,990 --> 00:40:28,020
for every edited image, we are going to save it.

645
00:40:28,440 --> 00:40:32,040
And then we are going to create a new sample.

646
00:40:32,040 --> 00:40:34,230
So that's why we have this new samples list here.

647
00:40:34,230 --> 00:40:39,060
So we create a new FiftyOne sample, exactly as we had in the Colab.

648
00:40:39,060 --> 00:40:40,440
And that's fine.

649
00:40:40,440 --> 00:40:49,350
So we save the image, we copy the mask, and then we create a new FiftyOne sample.

650
00:40:49,350 --> 00:40:51,780
Then once we're done with this, we return the samples.

651
00:40:51,780 --> 00:40:57,030
And it's those samples which we use right here,

652
00:40:57,030 --> 00:41:03,390
and add to our dataset.

653
00:41:03,480 --> 00:41:04,620
So that's it.
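
The save, copy, and record loop just described can be sketched with the standard library alone. Everything named here is hypothetical: raw bytes stand in for the edited images, plain dictionaries stand in for FiftyOne samples, and this `create_hash` is only a guess at what a helper of that name might do (derive a short file name from the image content).

```python
import hashlib
import shutil
from pathlib import Path

def create_hash(data: bytes) -> str:
    """Derive a short, content-based file name stem (illustrative only)."""
    return hashlib.sha256(data).hexdigest()[:10]

def build_new_samples(edited_images, mask_path, out_dir):
    """Save each edited image, reuse the original mask, and record both paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    new_samples = []
    for image_bytes in edited_images:
        stem = create_hash(image_bytes)
        # 1. Save the edited image.
        image_path = out_dir / f"{stem}.png"
        image_path.write_bytes(image_bytes)
        # 2. Copy the unchanged mask so the new image has its own label file.
        new_mask_path = out_dir / f"{stem}_mask.png"
        shutil.copy(mask_path, new_mask_path)
        # 3. Record the pair; the plugin would build a FiftyOne sample here.
        new_samples.append(
            {"filepath": str(image_path), "mask_path": str(new_mask_path)}
        )
    return new_samples
```

The mask is copied rather than regenerated because inpainting only changes pixels inside the selected region, so the original segmentation is reused for every augmented image.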

654
00:41:04,860 --> 00:41:07,020
So that's fine.

655
00:41:07,020 --> 00:41:11,790
All the other functions which we call, like, for example, inpaint, are found in our utils file.

656
00:41:11,790 --> 00:41:14,520
So if we get here we have create pipeline.

657
00:41:14,520 --> 00:41:16,140
We have generate inputs.

658
00:41:16,140 --> 00:41:20,970
We have our inpaint, and then we have the create hash.

659
00:41:20,970 --> 00:41:21,990
So that's it.

660
00:41:21,990 --> 00:41:25,440
That's all we need to create a plugin.

661
00:41:25,440 --> 00:41:34,230
And so now we'll save all this, push this code to GitHub, and then

662
00:41:34,230 --> 00:41:36,630
download our plugin and make use of it.

663
00:41:36,630 --> 00:41:40,320
Let's say instead our default random seed should be a hundred.

664
00:41:40,320 --> 00:41:42,630
So instead of ten we want to change this to 100.

665
00:41:42,630 --> 00:41:44,700
So that's the only modification we will make.

666
00:41:44,700 --> 00:41:50,190
Let's get back here to our random seed. Okay.

667
00:41:50,190 --> 00:41:53,760
So we go from 1 to 1000 and our default is 100.

668
00:41:54,180 --> 00:41:56,880
Let's scroll down to where we had the random seed.

669
00:41:56,880 --> 00:41:58,380
Our default is 100.

670
00:41:58,380 --> 00:42:05,040
So we save that, and then we go ahead and push this to GitHub, and then download the plugin and make

671
00:42:05,040 --> 00:42:05,670
use of it.

672
00:42:05,670 --> 00:42:12,030
Once we push the code to GitHub we could get back here and then download the plugins.

673
00:42:12,180 --> 00:42:13,890
So let's change this name.

674
00:42:13,890 --> 00:42:16,050
We have Neuralearn.

675
00:42:16,050 --> 00:42:16,980
Neuralearn.

676
00:42:16,980 --> 00:42:24,240
That's fine. Plugin name: Neuralearn, neural rendering.

677
00:42:24,390 --> 00:42:26,190
And we run this cell.

678
00:42:26,190 --> 00:42:30,480
And then now we could go ahead and visualize our data.

679
00:42:30,510 --> 00:42:35,250
As expected we have our different samples right here.

680
00:42:35,280 --> 00:42:42,510
Now note that we have 77 samples because we restarted our session and because we are working with Colab.

681
00:42:42,510 --> 00:42:46,290
Each time you restart your session, all the data is deleted.

682
00:42:46,290 --> 00:42:52,950
Nonetheless, if you're working locally or on some cloud server, you wouldn't have this data deleted.

683
00:42:52,950 --> 00:42:55,710
Let's go ahead and check out our plugin.

684
00:42:55,710 --> 00:42:58,320
You see here we have augment with Stable Diffusion inpainting.

685
00:42:58,860 --> 00:43:06,750
If we get back up, you would see we have this other plugin right here, which is actually

686
00:43:06,840 --> 00:43:08,490
FiftyOne plugins.

687
00:43:08,490 --> 00:43:16,770
Now this plugin lets you manage other plugins, like this Neuralearn plugin we just created.

688
00:43:16,770 --> 00:43:24,990
So if you run this, you should now have the plugin manager added to our list of plugins.

689
00:43:24,990 --> 00:43:26,700
We'll need to refresh to have that.

690
00:43:26,700 --> 00:43:28,800
Anyways let's get back here.

691
00:43:28,800 --> 00:43:30,600
We have augment with Stable Diffusion.

692
00:43:30,600 --> 00:43:33,750
Remember, we need to select an image

693
00:43:33,750 --> 00:43:36,630
so we can carry out the augmentation.

694
00:43:36,630 --> 00:43:38,250
So let's select this one.

695
00:43:38,250 --> 00:43:41,130
And then we get back here we have augment.

696
00:43:41,130 --> 00:43:42,120
That's fine.

697
00:43:42,120 --> 00:43:48,030
Now notice how the random seed by default is 100 as we had modified it in the code.

698
00:43:48,030 --> 00:43:49,650
So that's fine.

699
00:43:49,950 --> 00:43:54,480
Our prompt... well, number of augmentations, let's say ten.

700
00:43:54,750 --> 00:43:59,070
The class: what we want to modify is the T-shirt.

701
00:43:59,610 --> 00:44:03,090
Then we have number of inference steps set to 50.

702
00:44:03,090 --> 00:44:05,610
Guidance scale, set to nine.

703
00:44:05,880 --> 00:44:07,740
And then we have the prompt.

704
00:44:07,770 --> 00:44:11,640
Well, let's go ahead and copy again the prompt we just used.

705
00:44:11,640 --> 00:44:16,980
So, the documentation... anyway, we had our prompt from here.

706
00:44:18,240 --> 00:44:19,170
Scroll down.

707
00:44:19,170 --> 00:44:20,340
We have.

708
00:44:21,660 --> 00:44:22,830
That's fine.

709
00:44:22,950 --> 00:44:25,650
So we'll replace boots with t-shirt.

710
00:44:26,340 --> 00:44:27,270
Then back to this.

711
00:44:27,480 --> 00:44:28,530
Um.

712
00:44:28,680 --> 00:44:30,210
And then peace out.

713
00:44:30,210 --> 00:44:34,890
So anyway, let's just copy that for now and look at the image.

714
00:44:34,890 --> 00:44:38,490
Well, let's take off the ground truth. Okay.

715
00:44:38,490 --> 00:44:39,120
So that's it.

716
00:44:39,120 --> 00:44:44,880
Suppose we want to change this t-shirt into one that is, say, black.

717
00:44:44,880 --> 00:44:46,740
So it matches with these boots.

718
00:44:46,890 --> 00:44:48,960
Let's click again on that.

719
00:44:49,080 --> 00:44:51,750
So we select here.

720
00:44:51,750 --> 00:45:02,910
And then we have our prompt: a photorealistic photo of a man wearing a black colored, nice looking

721
00:45:02,940 --> 00:45:08,310
t-shirt.

722
00:45:09,990 --> 00:45:13,050
All black.

723
00:45:14,710 --> 00:45:17,860
With a white logo.

724
00:45:19,950 --> 00:45:20,970
That's fine.

725
00:45:21,210 --> 00:45:22,290
Um.

726
00:45:22,290 --> 00:45:23,310
High resolution.

727
00:45:23,310 --> 00:45:24,900
Okay, so that's our prompt.

728
00:45:24,900 --> 00:45:26,940
We go ahead and pick our class.

729
00:45:27,330 --> 00:45:28,920
Yeah, we have t-shirt.

730
00:45:28,920 --> 00:45:29,790
That's fine.

731
00:45:29,970 --> 00:45:35,160
So let's run this, let's increase this and then let's go ahead and execute.

732
00:45:37,040 --> 00:45:37,910
Executing.

733
00:45:37,910 --> 00:45:42,890
We just wait for a while and then we'll visualize our dataset.

734
00:45:43,430 --> 00:45:49,940
So when we go ahead and browse operations, you see we have this manage plugins.

735
00:45:49,940 --> 00:45:53,720
We have build plugin component, build operator skeleton.

736
00:45:53,720 --> 00:45:56,960
And then you should have this install plugin.

737
00:45:56,960 --> 00:46:00,050
So we've already seen how to install the plugin.

738
00:46:00,110 --> 00:46:02,060
We did that using the command.

739
00:46:02,060 --> 00:46:04,700
So we don't need to go through this.

740
00:46:04,700 --> 00:46:08,810
But if you click on this you see you're asked to specify a GitHub repo.

741
00:46:08,810 --> 00:46:11,090
And then you could go ahead and install the plugin.

742
00:46:11,090 --> 00:46:15,710
So it's a plugin that lets you easily work with other plugins.

743
00:46:15,710 --> 00:46:23,000
Now if you click on this manage plugins, you see we have our Neuralearn plugin, and then

744
00:46:23,000 --> 00:46:24,230
we have a description here.

745
00:46:24,230 --> 00:46:26,480
So you could enable or disable the plugin.

746
00:46:26,480 --> 00:46:30,950
Then for the requirements you just need to select the specific plugin you're working with.

747
00:46:30,950 --> 00:46:37,550
And you can see all the different requirements, that is, all the different libraries which were found

748
00:46:37,550 --> 00:46:39,560
in our requirements.txt file.

749
00:46:39,560 --> 00:46:49,010
So that said, we could pick out, say, this image and then go ahead and augment.

750
00:46:49,010 --> 00:46:55,430
So let's click on augment, and let's say we want to augment this, say, three times.

751
00:46:55,430 --> 00:46:58,040
And then we change the class.

752
00:46:58,040 --> 00:47:00,920
Well, we have shirt and we have belt.

753
00:47:00,920 --> 00:47:02,450
We have pants.

754
00:47:02,840 --> 00:47:05,600
There we go. Okay, let's pick out shirt.

755
00:47:05,600 --> 00:47:08,090
And then we have um.

756
00:47:09,610 --> 00:47:18,520
Well, a photorealistic photo of a man wearing a... click on this... wearing a blue colored, nice looking

757
00:47:18,520 --> 00:47:19,690
Gucci vest.

758
00:47:20,470 --> 00:47:26,080
Well, let's say a photorealistic photo of a man wearing a blue colored, nice looking Gucci vest,

759
00:47:26,080 --> 00:47:27,790
all blue, high resolution.

760
00:47:27,790 --> 00:47:30,190
Okay, so we have that now.

761
00:47:30,190 --> 00:47:32,920
We could, um, pick out any random seed.

762
00:47:32,920 --> 00:47:39,520
Let's say we want to pick out a random seed of 9 or 15.

763
00:47:39,520 --> 00:47:40,420
That's fine.

764
00:47:40,780 --> 00:47:43,630
Number of inference steps is seven.

765
00:47:43,630 --> 00:47:44,530
That's fine.

766
00:47:44,830 --> 00:47:46,840
Guidance scale nine.

767
00:47:46,840 --> 00:47:49,390
And then let's go ahead and execute.

768
00:47:49,390 --> 00:47:55,090
Once we're done with executing we could now go ahead and check out the generated data.

769
00:47:55,120 --> 00:48:00,160
You can see the different new samples which are generated.

770
00:48:00,160 --> 00:48:04,150
You see it actually resembles the prompt.

771
00:48:04,150 --> 00:48:08,080
It's a blue shirt, though it's not very clear what's going on in this chest region.

772
00:48:08,080 --> 00:48:11,080
We still have something that looks like the prompt we put in.

773
00:48:11,080 --> 00:48:15,010
And then looking at the ground truth, you see the whole zone.

774
00:48:15,010 --> 00:48:17,560
We have that shirt. We move to the next one.

775
00:48:17,560 --> 00:48:20,920
Let's take off the ground truth so we can see the output here.

776
00:48:20,920 --> 00:48:22,900
This is quite fine. Ground truth.

777
00:48:22,900 --> 00:48:24,760
We have that now.

778
00:48:24,760 --> 00:48:30,910
As we have said already, there's a little bit of noise around this chest region again, because

779
00:48:30,910 --> 00:48:34,270
we expect the shirt to come right up like this.

780
00:48:34,270 --> 00:48:36,430
Anyways, we move to the next one.

781
00:48:37,060 --> 00:48:38,650
Let's take this off.

782
00:48:39,310 --> 00:48:40,870
Next sample.

783
00:48:43,370 --> 00:48:47,900
Well, we have a blue shirt, but it doesn't look very realistic.

784
00:48:47,900 --> 00:48:53,930
But at least it matches the ground truth mask.

785
00:48:54,410 --> 00:48:55,730
We move to the next.

786
00:48:56,890 --> 00:48:57,760
This next one.

787
00:48:57,760 --> 00:48:58,540
Take off the ground

788
00:48:58,540 --> 00:48:59,050
truth.

789
00:49:01,510 --> 00:49:08,560
We have the shirt, which still follows the prompt. It's not very realistic, but at least

790
00:49:08,710 --> 00:49:10,450
we have this blue shirt.

791
00:49:10,570 --> 00:49:17,590
Then we have the next one, which looks cool and also matches to some extent with the ground truth.

792
00:49:17,620 --> 00:49:18,490
So that's it.

793
00:49:18,520 --> 00:49:23,260
We've just seen how we can generate new samples for our dataset.
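
Because each generated image is checked against the same ground truth mask, the bookkeeping behind these new samples can be sketched as below. This is a minimal illustration in plain Python, assuming that every edited image reuses the original mask unchanged. The function name, the dictionary keys and the file paths are all hypothetical.

```python
def expand_with_augmentations(image_path, mask_path, augmented_image_paths):
    """Return the original sample plus one sample per augmented image.

    Every augmented image is paired with the original mask, on the
    assumption that the edit keeps the labeled regions in place.
    """
    samples = [{"image": image_path, "mask": mask_path, "augmented": False}]
    for path in augmented_image_paths:
        samples.append({"image": path, "mask": mask_path, "augmented": True})
    return samples


# Hypothetical paths: one original image and three generated variants.
samples = expand_with_augmentations(
    "data/man.png",
    "data/man_mask.png",
    ["data/man_aug_0.png", "data/man_aug_1.png", "data/man_aug_2.png"],
)
for sample in samples:
    print(sample["image"], "->", sample["mask"])
```

Samples where the generated image no longer fits the mask would have to be filtered out by hand, since nothing in this sketch checks that.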

794
00:49:23,260 --> 00:49:25,660
You could go ahead and play around with this plugin.

795
00:49:25,930 --> 00:49:32,170
Then see how you could retrain a model with a much larger dataset.

796
00:49:32,170 --> 00:49:36,970
So one good thing about this is that you could pick, for example, this one.

797
00:49:36,970 --> 00:49:41,110
You see we pick this one and then we go ahead and augment.

798
00:49:41,110 --> 00:49:44,650
And then we could say, for example, that we want 20 samples.

799
00:49:44,650 --> 00:49:50,350
So you could increase the number of samples as much as you want.

800
00:49:50,710 --> 00:49:55,180
Now let's see what happens when we change the person's skin.

801
00:49:55,180 --> 00:49:57,430
So let's get to the prompt.

802
00:49:57,430 --> 00:50:07,780
We have a photorealistic photo of an Indian man wearing

803
00:50:07,780 --> 00:50:09,760
a red colored, or...

804
00:50:09,760 --> 00:50:11,500
Well, let's say wearing a blue colored.

805
00:50:11,500 --> 00:50:13,480
Let's just say an Indian man.

806
00:50:13,570 --> 00:50:17,680
So let's take this off. High resolution.

807
00:50:18,880 --> 00:50:19,630
Okay.

808
00:50:19,630 --> 00:50:22,630
Let's say a photorealistic photo of an Indian man.

809
00:50:22,630 --> 00:50:28,390
High resolution. Then number of inference steps 50, guidance scale nine, random seed...

810
00:50:28,390 --> 00:50:30,760
Let's take ten.

811
00:50:31,270 --> 00:50:34,000
We have our newly generated samples.

812
00:50:34,000 --> 00:50:35,320
We have a couple of them.

813
00:50:35,320 --> 00:50:39,250
You see how we go from the black man to an Indian man.

814
00:50:39,250 --> 00:50:45,760
And one thing you could note here is the fact that sometimes the model generates the image with

815
00:50:45,760 --> 00:50:47,710
a long sleeve instead of a short sleeve.

816
00:50:47,710 --> 00:50:54,100
So you may want to delete some of the samples where we have the long sleeve.

817
00:50:54,100 --> 00:50:59,320
So, like this one, we could delete this, delete this, delete this sample.

818
00:50:59,320 --> 00:51:02,530
And this one, this one and this one.

819
00:51:02,530 --> 00:51:06,820
So we're left with only the samples where the man is wearing a short sleeve.

820
00:51:06,820 --> 00:51:10,420
Anyway, once you select these, you just get back here.

821
00:51:10,420 --> 00:51:12,010
You get back here.

822
00:51:12,940 --> 00:51:19,420
Anyway, once you select these, you just get back here, and then you type delete and you have delete

823
00:51:19,420 --> 00:51:20,800
selected samples.

824
00:51:20,800 --> 00:51:25,600
So we execute that. With all the long-sleeve samples deleted,

825
00:51:25,630 --> 00:51:26,920
here's what we're left with.
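
The same cleanup can be done from Python instead of the operator browser. The sketch below assumes a FiftyOne `Dataset` and an App `Session`: `session.selected` holds the IDs of the samples currently selected in the App, and `Dataset.delete_samples()` removes samples by ID. The helper name `delete_selected_samples` is made up for this illustration.

```python
def delete_selected_samples(dataset, selected_ids):
    """Delete the given sample IDs from a FiftyOne-style dataset.

    `dataset` is expected to expose `delete_samples(ids)`, as
    `fiftyone.core.dataset.Dataset` does. Returns how many IDs were passed
    on for deletion.
    """
    ids = list(selected_ids)
    if ids:  # nothing selected: leave the dataset untouched
        dataset.delete_samples(ids)
    return len(ids)


# Typical use in a notebook, assuming `dataset` and `session` already exist:
#   removed = delete_selected_samples(dataset, session.selected)
#   session.refresh()  # reload the App view after the deletion
```

Passing the dataset in as an argument keeps the helper easy to try out on a stand-in object before pointing it at real data, since the deletion cannot be undone.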

826
00:51:26,920 --> 00:51:31,000
You can clearly notice the change in skin tone and even race.

827
00:51:31,000 --> 00:51:32,920
We've come to the end of the section.

828
00:51:32,920 --> 00:51:34,840
Thank you for sticking around up to this point.

829
00:51:34,840 --> 00:51:42,400
If you want to continue learning, you could head over to neuralearn.ai and keep improving your deep learning

830
00:51:42,400 --> 00:51:43,180
skills.
