1
00:00:00,080 --> 00:00:06,650
Hi there and welcome to this session in which we shall practically train our GAN to produce images like

2
00:00:06,650 --> 00:00:07,540
this one here.

3
00:00:07,550 --> 00:00:11,360
We'll start with the imports and then we'll move on to prepare our data.

4
00:00:11,390 --> 00:00:16,760
The dataset we shall be using is the CelebA dataset, which stands for CelebFaces

5
00:00:16,800 --> 00:00:17,450
Attributes.

6
00:00:17,450 --> 00:00:18,200
Dataset.

7
00:00:18,230 --> 00:00:24,830
Now, this is over 200,000 images of celebrities with 40 binary attribute annotations.

8
00:00:24,860 --> 00:00:27,780
Let's open up some of this here.

9
00:00:27,800 --> 00:00:31,100
You can view some of these images. Let's open this file up.

10
00:00:31,100 --> 00:00:32,980
And there we go.

11
00:00:32,990 --> 00:00:35,360
You see we have these faces right here.

12
00:00:35,360 --> 00:00:44,120
And so what we'll be doing is to train our discriminator alongside our generator, such that

13
00:00:44,120 --> 00:00:51,290
our generator can generate images of faces which are realistic enough to be

14
00:00:51,290 --> 00:00:55,730
able to fool the discriminator into thinking that they are actually real faces.

15
00:00:55,760 --> 00:01:03,150
This notebook is provided by Jessica Li on Kaggle and can be downloaded, so let's go straight away

16
00:01:03,150 --> 00:01:09,300
to download this dataset and then start with our GAN modelling.

17
00:01:09,300 --> 00:01:14,960
In order to download a dataset from Kaggle, we'll need this kaggle.json file right here.

18
00:01:14,970 --> 00:01:21,210
Now this kaggle.json file can be obtained from Kaggle by going to your account and then creating

19
00:01:21,210 --> 00:01:22,680
a new token.

20
00:01:22,680 --> 00:01:29,880
So once you have that, you come right here and click on copy command, which, when you paste

21
00:01:29,880 --> 00:01:36,270
it here, gives you kaggle datasets download, and you have the username of the person who uploaded

22
00:01:36,270 --> 00:01:38,370
this dataset to the Kaggle platform.

23
00:01:38,370 --> 00:01:40,920
And then you have the dataset name right here.

24
00:01:40,920 --> 00:01:44,910
But before carrying out this dataset download, we'll start by installing Kaggle.

25
00:01:44,940 --> 00:01:49,800
We'll make this directory, and we'll copy this kaggle.json into this directory.

26
00:01:49,800 --> 00:01:57,810
Then we can go ahead and download the dataset using this API command which we downloaded

27
00:01:57,810 --> 00:02:04,170
or which we copied rather, and then we can unzip this into some dataset folder or directory which we

28
00:02:04,170 --> 00:02:04,950
specify.

29
00:02:04,950 --> 00:02:09,120
So that said, let's simply run this cell and everything should move on.

30
00:02:09,120 --> 00:02:10,890
Well, let's take this off.

31
00:02:11,070 --> 00:02:17,850
As you can see, the dataset has been downloaded and now we're extracting the files into this dataset

32
00:02:17,850 --> 00:02:19,110
folder right here.

33
00:02:19,140 --> 00:02:24,600
Now that we have this successfully extracted into our dataset folder, as we could see, we specify

34
00:02:24,600 --> 00:02:28,440
the batch size, the image shape and the learning rate.

35
00:02:28,470 --> 00:02:36,360
Now from here, let's run the cell and then we move on to create our tf.data dataset.

36
00:02:36,360 --> 00:02:42,210
So here we have let's call this dataset and then we specify this path.

37
00:02:42,210 --> 00:02:47,220
Now to get this path, you could click open right here and then you see if you click open, this is

38
00:02:47,220 --> 00:02:50,820
going to take a while since we have over 200,000 different images here.

39
00:02:50,820 --> 00:02:58,320
So let's just leave that, and then we copy this path. This is the path that you specify. Now, once you specify

40
00:02:58,320 --> 00:03:02,520
that, you have a label mode, which is None.

41
00:03:02,760 --> 00:03:06,000
Um, we have the image size which was specified already.

42
00:03:06,000 --> 00:03:09,030
We have the batch size.

43
00:03:09,150 --> 00:03:17,430
Anyway, we could have the batch size here, but it doesn't matter, as we'll see shortly.

44
00:03:17,430 --> 00:03:19,200
Anyway, we have that.

45
00:03:19,200 --> 00:03:26,040
And then from here we run the cell. You see, we get an error: TensorFlow not

46
00:03:26,040 --> 00:03:26,930
defined.

47
00:03:26,940 --> 00:03:28,770
Let's run this.

48
00:03:29,580 --> 00:03:30,330
Oops.

49
00:03:30,330 --> 00:03:31,890
Um, that's fine.

50
00:03:32,310 --> 00:03:33,270
Uh, next.

51
00:03:33,270 --> 00:03:34,140
This.

52
00:03:34,170 --> 00:03:36,480
We've run this already, and then we run this.

53
00:03:36,480 --> 00:03:37,710
Now this should be fine.

54
00:03:37,710 --> 00:03:40,470
Let's check out our dataset.

55
00:03:40,470 --> 00:03:48,810
So as you can see, we have 202,599 files belonging to one class and our dataset has been batched.

56
00:03:48,810 --> 00:03:50,310
So we have a batch dataset.

57
00:03:50,340 --> 00:03:52,050
You can see the shape right here.

58
00:03:52,050 --> 00:03:54,780
We have 64 by 64 by three images.

59
00:03:54,810 --> 00:03:59,940
Now, the default is 256 by 256.

60
00:03:59,940 --> 00:04:03,090
So if we, if we do not specify this, let's see what we get.

61
00:04:03,240 --> 00:04:04,680
Let's take this off.

62
00:04:05,520 --> 00:04:13,290
Um, take that off and run this again and check out on the image size you see here.

63
00:04:13,290 --> 00:04:16,590
When we don't specify anything, you have 256 by 256.

64
00:04:16,620 --> 00:04:20,070
Okay, let's get back and run this again.

65
00:04:20,070 --> 00:04:23,760
Then we move on to preprocess our data.

66
00:04:23,760 --> 00:04:30,420
So right here, what we're going to do is we're going to make sure this data lies between -1 and 1.

67
00:04:30,420 --> 00:04:36,390
And so that's why we're having here the image divided by 127.5 minus one.

68
00:04:36,390 --> 00:04:42,510
And so this means that any value we get between 0 and 255, let's say, for example, we have the value

69
00:04:42,510 --> 00:04:53,460
255, we'll take 255 divided by 127.5, which is two then minus one, which gives us one.

70
00:04:53,580 --> 00:04:57,240
So that's how we preprocess these images.
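
A minimal Python sketch of the scaling just described, assuming pixel values in the range 0 to 255. The function name scale_pixel is hypothetical and does not come from the notebook; in the session the same arithmetic is applied to whole image tensors rather than single values.

```python
def scale_pixel(value: float) -> float:
    """Map a pixel intensity from the range [0, 255] to the range [-1, 1]."""
    return value / 127.5 - 1.0


# The darkest, middle and brightest intensities land on -1, 0 and 1.
for value in (0, 127.5, 255):
    print(value, "->", scale_pixel(value))
```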

71
00:04:57,240 --> 00:04:59,550
And then after pre-processing, we're going to.

72
00:04:59,760 --> 00:05:05,430
But because we need to reshuffle, or rather because we need to drop the remainder,

73
00:05:05,430 --> 00:05:10,380
we unbatch and then we use the batching of tf.data.

74
00:05:10,410 --> 00:05:15,900
Then from here we carry out some prefetching for a more efficient way of loading the data.

75
00:05:15,930 --> 00:05:20,190
Now from here you can visualize a single element in our dataset.

76
00:05:20,220 --> 00:05:20,730
Here we go.

77
00:05:20,730 --> 00:05:24,060
We have for d in our train dataset.

78
00:05:24,090 --> 00:05:25,710
Let's take a single element.

79
00:05:25,740 --> 00:05:30,180
We could print out its shape.

80
00:05:30,180 --> 00:05:31,710
And there we go.

81
00:05:31,710 --> 00:05:37,950
Now we have the shape, which is 128 by 64, by 64 by three as expected.

82
00:05:37,950 --> 00:05:42,030
And we could go ahead and visualize some elements here.

83
00:05:42,030 --> 00:05:44,490
So let's visualize four elements.

84
00:05:44,600 --> 00:05:46,950
Um, we could increase this definitely.

85
00:05:46,950 --> 00:05:50,220
So we should visualize four elements of this for now.

86
00:05:50,220 --> 00:05:58,620
Now here we have the subplot, plt.imshow, and we can turn off the axis.

87
00:05:58,620 --> 00:06:01,710
So let's run that and then see what we get.

88
00:06:02,940 --> 00:06:06,120
So here we have these four different images.

89
00:06:06,120 --> 00:06:07,770
Let's reduce this a little.

90
00:06:08,800 --> 00:06:16,390
Now, one thing we could also do is modify this value of our array.

91
00:06:16,420 --> 00:06:22,090
Now, the reason why we want to modify this is because this value ranges between -1 and 1.

92
00:06:22,090 --> 00:06:27,790
whereas plt.imshow here takes in values in the range 0 to 1.

93
00:06:27,790 --> 00:06:33,220
So we're going to modify this such that we move from the range -1 to 1 to the range 0 to 1.

94
00:06:33,520 --> 00:06:40,180
And to do that, we need to take whatever value we have in this range, add one to it, and then divide

95
00:06:40,180 --> 00:06:40,990
by two.

96
00:06:41,110 --> 00:06:44,980
So let's take this off and get back here.

97
00:06:44,980 --> 00:06:48,250
We just have plus one, then divided by two.

98
00:06:49,150 --> 00:06:50,110
There we go.

99
00:06:50,110 --> 00:06:51,280
Let's run that again.

100
00:06:52,460 --> 00:06:53,360
And there we go.

101
00:06:53,390 --> 00:07:00,380
You see, the images are now much clearer and you do not get the messages which we were getting

102
00:07:00,380 --> 00:07:01,220
previously.
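
As a rough sketch of the display fix described above: add one, then divide by two, so values in the range -1 to 1 land in the range 0 to 1. The name to_display_range is hypothetical; in the session the expression is applied directly to the image array before plotting.

```python
def to_display_range(value: float) -> float:
    """Map a value from the range [-1, 1] back to the range [0, 1] for plotting."""
    return (value + 1.0) / 2.0


# The two ends and the midpoint of the normalized range.
for value in (-1.0, 0.0, 1.0):
    print(value, "->", to_display_range(value))
```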

103
00:07:01,940 --> 00:07:07,520
Now we'll go ahead with the modeling and we're going to use this same architecture presented in the

104
00:07:07,520 --> 00:07:08,810
DCGAN paper.

105
00:07:08,810 --> 00:07:18,350
So right here we have this 100 dimensional latent vector, and then this is projected and reshaped into

106
00:07:18,350 --> 00:07:22,450
this 4x4 by 1024 tensor.

107
00:07:22,460 --> 00:07:31,100
And then from here we apply the upsampling, which is actually the Conv2DTranspose, to then get this other

108
00:07:31,100 --> 00:07:32,120
vector right here.

109
00:07:32,150 --> 00:07:38,570
Notice how we're getting from 4 by 4 to 8 by 8, and then from here we again repeat the same process,

110
00:07:38,570 --> 00:07:43,820
16 by 16, 32 by 32, and then finally 64 by 64.

111
00:07:43,820 --> 00:07:52,170
Also notice that while the size of the outputs keeps increasing, from 4 to 8, 16, 32, 64,

112
00:07:52,200 --> 00:07:53,790
the depth is reduced.

113
00:07:53,790 --> 00:07:59,220
And so we go from 1024 to 512 to 256 to 128.

114
00:07:59,220 --> 00:08:01,580
And finally we have three.
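
The progression of shapes described above can be traced with a few lines of Python. This is only a sketch: it assumes each upsampling step doubles the height and width (stride 2 with "same" padding), and the function name generator_shapes is hypothetical.

```python
def generator_shapes(start_size=4, depths=(1024, 512, 256, 128, 3), stride=2):
    """List (height, width, depth) after the projection and each upsampling step."""
    shapes = []
    size = start_size
    for index, depth in enumerate(depths):
        if index > 0:
            # Every transposed convolution multiplies the spatial size by the stride.
            size *= stride
        shapes.append((size, size, depth))
    return shapes


for shape in generator_shapes():
    print(shape)
```

The spatial size grows from 4 to 64 while the depth shrinks from 1024 to 3.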

115
00:08:01,590 --> 00:08:08,250
So we get back to the code and we specify our latent dimension, which is equal to 100.

116
00:08:08,290 --> 00:08:10,650
Let's rerun this cell right here.

117
00:08:10,770 --> 00:08:12,150
There we go.

118
00:08:12,150 --> 00:08:14,160
And then that should be fine.

119
00:08:14,160 --> 00:08:18,810
And then we go ahead and build our generator.

120
00:08:18,810 --> 00:08:24,120
So here we have our generator, which we'll build with a sequential model.

121
00:08:24,120 --> 00:08:32,970
We have tf.keras.Sequential, and then we start to pass in our different layers.

122
00:08:32,970 --> 00:08:41,910
So here we have our Input layer, which has a shape of the latent dim, or latent dimension.

123
00:08:41,910 --> 00:08:47,460
So here we have latent dimension, there we go.

124
00:08:47,460 --> 00:08:48,360
And that's fine.

125
00:08:48,360 --> 00:08:49,740
So that's our first layer.

126
00:08:49,740 --> 00:08:51,810
And then the next is our dense layer.

127
00:08:51,810 --> 00:08:53,880
So this is our projection.

128
00:08:53,880 --> 00:09:04,950
So we project this such that the output has 4 times 4 times the latent dim

129
00:09:06,080 --> 00:09:08,050
number of output units.

130
00:09:08,060 --> 00:09:10,200
So just as we had seen in the paper.

131
00:09:10,220 --> 00:09:15,590
Now, once we have this, we move to the next layer, which is going to be the reshape layer.

132
00:09:15,590 --> 00:09:22,250
So we go ahead and reshape, because at this point, with a latent dim of a hundred,

133
00:09:22,250 --> 00:09:26,030
we have 16 times 100, that is 1,600 outputs.

134
00:09:26,060 --> 00:09:31,940
Now we reshape this such that it is a three dimensional tensor.

135
00:09:31,940 --> 00:09:36,590
So here we have 4x4 by latent.

136
00:09:37,420 --> 00:09:38,830
Latent dim.

137
00:09:39,010 --> 00:09:39,880
And that is it.

138
00:09:40,910 --> 00:09:41,600
There we go.

139
00:09:41,600 --> 00:09:43,220
So this is our next layer.

140
00:09:43,220 --> 00:09:44,990
Reshape. From the reshape,

141
00:09:44,990 --> 00:09:52,940
we go ahead to do the upsampling with the Conv2DTranspose. As in the paper, we have

142
00:09:52,940 --> 00:10:04,330
Conv2DTranspose, and then we have 512 filters, which represent the number of output channels,

143
00:10:04,340 --> 00:10:10,730
then the kernel size, kernel size equal to four.

144
00:10:11,060 --> 00:10:15,110
Now if we get back to the paper here, let's get back to the paper.

145
00:10:15,110 --> 00:10:23,000
You will see that the kernel size isn't necessarily exactly equal to four, but one very important rule

146
00:10:23,000 --> 00:10:30,080
to follow when picking out the kernel size is that the kernel size has to be divisible by the number

147
00:10:30,080 --> 00:10:30,890
of strides.

148
00:10:30,890 --> 00:10:37,580
So when we pick kernel size equal to four, we could have the strides equal to two.

149
00:10:38,660 --> 00:10:45,320
And the reason why we generally want that the kernel size is divisible by the number of strides is simply

150
00:10:45,320 --> 00:10:47,180
because of the

151
00:10:48,230 --> 00:10:54,110
poor quality of outputs which get generated by the generator when this isn't the case.

152
00:10:54,110 --> 00:11:00,440
So always ensure that we have the kernel size divisible by the number of strides.
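
One way to see why this rule helps is to count, in one dimension and with no padding, how many kernel taps contribute to each output position of a transposed convolution. This is an illustrative sketch, not code from the notebook, and the function name overlap_counts is hypothetical. Uneven counts in the interior are commonly linked to checkerboard-like artifacts in generated images.

```python
def overlap_counts(n_inputs: int, kernel_size: int, stride: int) -> list:
    """Count the kernel taps landing on each output position of a 1D transposed convolution."""
    length = (n_inputs - 1) * stride + kernel_size
    counts = [0] * length
    for i in range(n_inputs):
        for j in range(kernel_size):
            # Input position i writes its kernel taps starting at output position i * stride.
            counts[i * stride + j] += 1
    return counts


print(overlap_counts(4, kernel_size=4, stride=2))  # interior positions are covered evenly
print(overlap_counts(4, kernel_size=3, stride=2))  # interior positions alternate
```

With kernel size 4 and stride 2 the interior positions all receive two contributions, whereas with kernel size 3 and stride 2 they alternate between one and two.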

153
00:11:00,470 --> 00:11:10,280
Now from here on we go on to apply batch normalization as suggested in the paper and also in the tips

154
00:11:10,280 --> 00:11:11,060
and tricks.

155
00:11:11,060 --> 00:11:13,580
GitHub repo.

156
00:11:13,580 --> 00:11:20,020
So here we have batch norm, and then after the batch norm, we have our LeakyReLU.

157
00:11:20,030 --> 00:11:25,600
Now, the LeakyReLU takes in a value of 0.2.

158
00:11:25,610 --> 00:11:30,200
So here we have 0.2 and that's it for this first part.

159
00:11:30,200 --> 00:11:35,630
So we have this first block here which you could see in the paper this very first conv layer.

160
00:11:35,630 --> 00:11:40,850
Now, once we have this here, it can be repeated again.

161
00:11:40,850 --> 00:11:44,750
So we just copy this and then paste it out,

162
00:11:44,750 --> 00:11:46,640
But modifying this depth.

163
00:11:46,640 --> 00:11:49,800
So here we have 256 and then again.

164
00:11:49,800 --> 00:11:50,490
Oops.

165
00:11:50,520 --> 00:11:52,830
Here, get back.

166
00:11:54,390 --> 00:11:56,940
Um, and then here we paste this out.

167
00:11:56,940 --> 00:12:02,040
And then here finally we have 128.

168
00:12:02,250 --> 00:12:03,960
Okay, so we have that.

169
00:12:04,440 --> 00:12:06,600
For now, we're not going to apply any dropout.

170
00:12:06,600 --> 00:12:11,250
You could always feel free to apply that and see the kind of results you will get.

171
00:12:11,250 --> 00:12:15,180
So here we have this and then now we have that.

172
00:12:15,180 --> 00:12:16,890
We already said that.

173
00:12:16,920 --> 00:12:19,440
Let's get back to the paper.

174
00:12:19,440 --> 00:12:22,350
We have the first, the second and the third conv layer.

175
00:12:22,350 --> 00:12:26,550
Now this final conv layer is to get an output which is like an image.

176
00:12:26,550 --> 00:12:29,520
So we have 64 by 64 by three.

177
00:12:29,880 --> 00:12:31,890
And so let's get back here.

178
00:12:32,160 --> 00:12:35,880
We will have no LeakyReLU or whatever.

179
00:12:35,880 --> 00:12:36,510
Like that.

180
00:12:36,510 --> 00:12:40,650
We'll just copy this and paste it out here.

181
00:12:40,890 --> 00:12:45,090
Okay, so we have that, and now we just have the Conv2DTranspose.

182
00:12:45,120 --> 00:12:47,760
We have an activation, which is a.

183
00:12:48,450 --> 00:12:55,450
So after the strides here, we specify the activation.

184
00:12:55,450 --> 00:13:00,280
And this activation is the tanh activation.

185
00:13:00,280 --> 00:13:02,770
So we just have tanh and that's it.

186
00:13:02,890 --> 00:13:05,620
Padding equals same.

187
00:13:06,460 --> 00:13:08,740
We also copy paste this out here.

188
00:13:08,740 --> 00:13:10,660
So here we have padding same.

189
00:13:10,660 --> 00:13:13,450
And right here we have padding same.

190
00:13:13,450 --> 00:13:14,860
Okay, so that's it.

191
00:13:14,860 --> 00:13:17,230
Uh, that should be it for the generator.

192
00:13:17,230 --> 00:13:21,910
I guess with respect to what we had in the tips and tricks.

193
00:13:21,940 --> 00:13:28,630
And also, right here, you can see we were told to use ReLU activation in the generator for all layers

194
00:13:28,630 --> 00:13:30,780
except for the output, which uses tanh.

195
00:13:30,820 --> 00:13:34,180
We'll call this model the generator.

196
00:13:34,180 --> 00:13:38,860
So we have our generator model, and then let's run that.

197
00:13:38,860 --> 00:13:41,980
And the next thing we want to do is summarize this.

198
00:13:41,980 --> 00:13:43,390
So let's get the summary.

199
00:13:44,020 --> 00:13:50,260
We check this out here and you see we're getting this here instead of three.

200
00:13:50,260 --> 00:13:53,350
So let's go ahead and modify this right here.

201
00:13:53,740 --> 00:13:55,390
This should be three.

202
00:13:55,690 --> 00:14:00,100
Let's run that again and get the summary.

203
00:14:00,790 --> 00:14:03,400
Okay, so this is what we have then.

204
00:14:03,400 --> 00:14:07,120
Now we can move ahead to our discriminator.

205
00:14:07,120 --> 00:14:11,530
So instead of generator right here, we have discriminator.

206
00:14:11,530 --> 00:14:18,370
And the input we're going to have here is going to be 64 by 64 by three.

207
00:14:18,400 --> 00:14:20,470
So it's actually the image shape.

208
00:14:20,470 --> 00:14:21,970
So im shape,

209
00:14:22,630 --> 00:14:24,520
zero index,

210
00:14:24,550 --> 00:14:25,540
im shape,

211
00:14:26,310 --> 00:14:35,340
There we go, by three. And then instead of the Conv2DTranspose layers, we'll be using the Conv2D layer.

212
00:14:35,340 --> 00:14:43,620
So here we have Conv2D, and then the depth increases here instead of decreasing as we had with

213
00:14:43,620 --> 00:14:44,760
the generator.

214
00:14:44,760 --> 00:14:47,520
So here we go from 64.

215
00:14:47,520 --> 00:14:52,010
So we start with 64 and then we move on to 128 and so on and so forth.

216
00:14:52,020 --> 00:14:58,320
For now, let's take this off since we're dealing with the Conv2D, and then again we have the kernel size which

217
00:14:58,320 --> 00:15:00,750
is divisible by the number of strides. We have

218
00:15:00,750 --> 00:15:03,870
the LeakyReLU, as we've seen already in the tips and tricks.

219
00:15:04,140 --> 00:15:07,470
Let's take this off here, but we'll still make use of the batch norm.

220
00:15:07,470 --> 00:15:14,730
So let's paste this out, and then we have batch normalization.

221
00:15:15,030 --> 00:15:17,430
We have the batch normalization.

222
00:15:17,790 --> 00:15:18,990
There we go.

223
00:15:19,500 --> 00:15:23,910
Here it's Conv2D, but we increase the depth.

224
00:15:23,910 --> 00:15:25,690
So we have 128.

225
00:15:25,720 --> 00:15:33,100
Now that we have this depth increase, we just simply copy this out and paste it for the next layers.

226
00:15:33,100 --> 00:15:39,250
So for the next blocks, because we consider this to be a block and this is a block and this a block,

227
00:15:39,550 --> 00:15:46,540
now we move on to 256, with batch norm and LeakyReLU still, and that's it.

228
00:15:46,570 --> 00:15:52,300
Now for the final or for the last conv layer, let's just paste this out here.

229
00:15:52,330 --> 00:15:53,620
Let's take this off.

230
00:15:54,370 --> 00:15:57,640
We have this last conv layer right here.

231
00:15:58,300 --> 00:16:00,220
We'll give it a depth of one.

232
00:16:00,220 --> 00:16:08,530
And then, recall that our discriminator is a usual classifier which takes

233
00:16:08,530 --> 00:16:16,090
in the 64 by 64 by three input and then outputs a single value, whether a one or a zero or a value

234
00:16:16,090 --> 00:16:17,650
between 0 and 1 actually.

235
00:16:17,650 --> 00:16:19,180
So it outputs a single value here.

236
00:16:19,180 --> 00:16:25,990
So at this point, we should be thinking of using some dense layer and then specifying that this output

237
00:16:26,020 --> 00:16:28,450
is going to have only one unit.

238
00:16:29,110 --> 00:16:29,800
Okay.

239
00:16:29,800 --> 00:16:31,390
So let's take this off.

240
00:16:31,390 --> 00:16:35,440
And then from here we could flatten.

241
00:16:35,440 --> 00:16:40,300
So we flatten what we get as output from the Conv2D layer.

242
00:16:40,300 --> 00:16:46,450
And then after flattening, we have our Dense layer with one unit.

243
00:16:46,510 --> 00:16:55,480
See, here we have just one output, and then the activation is sigmoid.

244
00:16:55,480 --> 00:16:56,710
So that's it.

245
00:16:56,710 --> 00:16:58,630
Recall the sigmoid.

246
00:16:59,380 --> 00:17:06,670
With a sigmoid we have values or inputs from negative infinity to positive infinity, which have been

247
00:17:06,670 --> 00:17:10,300
mapped into the range 0 to 1.

248
00:17:11,140 --> 00:17:13,570
And that's exactly what we need right here.

249
00:17:13,570 --> 00:17:15,130
So we have that.

250
00:17:15,280 --> 00:17:16,960
Um, that's fine.

251
00:17:16,960 --> 00:17:18,520
Let's take this off now.

252
00:17:19,540 --> 00:17:25,210
Now, before we move on, it should be noted that unlike the sigmoid, which maps values between 0 and

253
00:17:25,210 --> 00:17:31,060
1, the tanh function maps values between -1 and 1.

254
00:17:31,060 --> 00:17:33,970
So this is from 0 to 1.

255
00:17:33,970 --> 00:17:40,210
And then the tanh maps values between -1 and 1.

256
00:17:40,930 --> 00:17:46,810
And that's what we use in the final layer for the generator, and now for

257
00:17:46,810 --> 00:17:49,150
the discriminator, we're using the sigmoid.
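
A small sketch of the two output ranges compared above, using only Python's math module. The sigmoid helper is written out by hand here for illustration; in the models themselves these are the built-in Keras activations.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# tanh squashes the same inputs into the open interval (-1, 1).
for x in (-5.0, 0.0, 5.0):
    print(x, round(sigmoid(x), 4), round(math.tanh(x), 4))
```

The sigmoid output can be read as a score between 0 and 1 for the discriminator, while tanh matches the -1 to 1 range of the preprocessed images for the generator.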

258
00:17:49,390 --> 00:17:49,930
Okay.

259
00:17:49,930 --> 00:17:51,010
So we have that.

260
00:17:51,010 --> 00:17:51,840
Understood.

261
00:17:51,850 --> 00:17:56,320
Now we have here our discriminator.

262
00:17:56,560 --> 00:17:56,980
Okay.

263
00:17:56,980 --> 00:17:57,790
So we have that.

264
00:17:57,790 --> 00:18:02,950
Let's run this cell and then finally we're going to have our summary.

265
00:18:03,610 --> 00:18:05,740
Discriminator summary.

266
00:18:06,160 --> 00:18:08,200
Let's run this and then see what we get.

267
00:18:08,650 --> 00:18:09,610
There we go.

268
00:18:09,610 --> 00:18:11,080
We have our summary.

269
00:18:11,320 --> 00:18:13,300
Um, everything looks fine.

270
00:18:13,300 --> 00:18:16,840
And now we could go ahead and start with our training.

271
00:18:17,050 --> 00:18:21,970
And just like we had done previously, we're going to override the train step.

272
00:18:21,970 --> 00:18:29,770
So here we have our model, which we had built previously, where we override this train step right

273
00:18:29,770 --> 00:18:30,220
here.

274
00:18:30,220 --> 00:18:36,210
And with this we're able to make use of methods like the model.fit.

275
00:18:36,220 --> 00:18:44,800
So here, let's get back, we have again our model.

276
00:18:45,130 --> 00:18:51,520
This GAN model is made of a discriminator and a generator.

277
00:18:51,520 --> 00:18:57,160
So let's replace the encoder and decoder by the discriminator and the generator respectively.

278
00:18:57,490 --> 00:19:13,450
Then here we have our discriminator and generator: self.generator and self.discriminator.

279
00:19:13,690 --> 00:19:16,160
We can modify the compile method.

280
00:19:16,180 --> 00:19:19,990
Let's, uh, modify this compile method.

281
00:19:20,020 --> 00:19:27,850
The compile method actually will take in the optimizer for the discriminator, the optimizer for the

282
00:19:27,850 --> 00:19:31,150
generator, and then the loss function.

283
00:19:31,150 --> 00:19:42,380
So we have the d optimizer, then the g optimizer, and then the loss function.

284
00:19:42,830 --> 00:19:45,710
So that's our compile method.

285
00:19:45,710 --> 00:19:46,730
And then.

286
00:19:47,300 --> 00:19:53,780
We also go ahead and define our discriminator loss metric and our generator loss metric.

287
00:19:54,260 --> 00:19:58,640
We've taken all these three and then we've put them out here.

288
00:19:58,670 --> 00:20:00,130
Okay, so that's it.

289
00:20:00,140 --> 00:20:09,380
Now let's have our d loss metric and our g loss metric.

290
00:20:09,770 --> 00:20:13,970
And then from here we move on to the train step. For the training,

291
00:20:13,970 --> 00:20:21,830
Let's recall that we have a discriminator, we have a generator and then this generator takes in fake

292
00:20:21,830 --> 00:20:22,730
data.

293
00:20:22,790 --> 00:20:30,320
Or rather, it takes in a vector, this noise here, and then generates fake data.

294
00:20:30,320 --> 00:20:37,520
So it takes noise, generates fake data, and then this fake data is passed on to

295
00:20:37,520 --> 00:20:45,050
the discriminator, which says whether it's a 1 or 0 or gives the value between 0 and 1.

296
00:20:45,230 --> 00:20:46,010
Okay.

297
00:20:46,400 --> 00:20:51,570
We also have our real data right here, which is also going to be passed into our discriminator

298
00:20:51,570 --> 00:20:54,840
and it's also going to give a value between 0 and 1.

299
00:20:54,840 --> 00:20:58,980
Now it should be noted that we will start with training the discriminator.

300
00:20:58,980 --> 00:21:02,310
And when we're training the discriminator, we're going to freeze the generator.

301
00:21:02,320 --> 00:21:04,470
That is, we do not update its parameters.

302
00:21:04,470 --> 00:21:12,480
So we're going to just update the weights of the discriminator, just like we had seen previously.

303
00:21:13,200 --> 00:21:18,150
So the first thing we want to do here is to get our noise.

304
00:21:18,180 --> 00:21:21,870
The noise is tf.random.normal.

305
00:21:22,080 --> 00:21:29,640
So we have normal, and since we have a normal distribution, we want to specify its shape.

306
00:21:29,640 --> 00:21:34,080
Now, the shape of this here would be, um.

307
00:21:34,080 --> 00:21:37,560
The shape will be our latent dim.

308
00:21:37,560 --> 00:21:38,310
Now let's.

309
00:21:38,310 --> 00:21:41,760
Let's have our latent dim, which we've defined already.

310
00:21:41,760 --> 00:21:44,280
So here we will have latent dimension.

311
00:21:44,280 --> 00:21:50,640
Now, given that we will be working in batches, we'll need to add the batch dimension.

312
00:21:50,640 --> 00:21:54,630
So here we have batch size by latent dim.

313
00:21:54,630 --> 00:22:04,230
Now to obtain the batch size, all we need to do here is set batch size equal to tf.shape of our x batch,

314
00:22:05,100 --> 00:22:10,710
and then we take the zeroth value.
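
To make the noise shape concrete, here is a plain-Python sketch that draws a batch size by latent dim grid of standard normal values. The function name sample_noise and the seed argument are hypothetical; in the session the sampling is done by TensorFlow on tensors, not on Python lists.

```python
import random


def sample_noise(batch_size: int, latent_dim: int, seed=None) -> list:
    """Return batch_size rows, each holding latent_dim samples from a standard normal."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)] for _ in range(batch_size)]


noise = sample_noise(batch_size=128, latent_dim=100, seed=0)
print(len(noise), len(noise[0]))
```

Each row is one latent vector, so a batch of 128 real images is paired with 128 noise vectors of length 100.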

315
00:22:10,710 --> 00:22:16,770
Now from here we have the batch size, we have the noise, and then we're ready to feed this into a

316
00:22:16,770 --> 00:22:19,740
generator and then obtain the fake data.

317
00:22:19,770 --> 00:22:24,540
We then also make use of the real data, which is basically this here, because this is our real data.

318
00:22:24,540 --> 00:22:31,620
Remember, our dataset is made up of over 200,000 different images of faces of celebrities.

319
00:22:31,620 --> 00:22:37,590
And what we've done is we've broken this up into batches of 128.

320
00:22:37,590 --> 00:22:43,770
So for every batch, we're going to take this X batch here, which is basically the real data.

321
00:22:43,770 --> 00:22:50,670
And then we're also going to make use of this noise to generate fake data and then train our discriminator.

322
00:22:51,750 --> 00:23:02,400
And so with that, we have our fake data, or let's say fake images. Fake images equals self.generator,

323
00:23:02,400 --> 00:23:08,070
which takes in the noise or let's just say some random noise.

324
00:23:08,850 --> 00:23:14,820
Here we have random noise, some random noise vector actually.

325
00:23:14,880 --> 00:23:20,940
Okay, so now we have our fake images, and we also

326
00:23:20,940 --> 00:23:21,840
have the real.

327
00:23:21,840 --> 00:23:24,960
So we can now dive into training the discriminator.

328
00:23:25,080 --> 00:23:33,540
Now, it should be noted that the discriminator loss function will take in the output from here.

329
00:23:33,570 --> 00:23:40,980
That is, the output from the real data, and compare it with one, and then take the output from the fakes

330
00:23:40,980 --> 00:23:42,450
and compare it with zero.

331
00:23:42,450 --> 00:23:46,680
So we'll take the output from the real, let's call this R,

332
00:23:47,300 --> 00:23:48,680
and compare it with one.

333
00:23:48,680 --> 00:23:54,490
And then we take the output from the fakes, let's call it F, and then compare it with zero.

334
00:23:54,500 --> 00:23:58,610
So getting back into the code right here, let's change this.

335
00:23:58,610 --> 00:24:03,050
Let's call this real images.

336
00:24:03,050 --> 00:24:08,000
And here we have real images.

337
00:24:08,000 --> 00:24:08,720
There we go.

338
00:24:08,720 --> 00:24:17,870
So here we want to have the predicted output, or better still, let's say real predictions.

339
00:24:17,870 --> 00:24:20,090
So here are our real predictions here.

340
00:24:20,090 --> 00:24:27,620
Now, the real predictions are obtained from our discriminator.

341
00:24:27,620 --> 00:24:33,380
And then this discriminator actually takes in real images.

342
00:24:33,380 --> 00:24:37,520
So we're representing what's going on right here.

343
00:24:37,520 --> 00:24:43,310
So we have the real images, which go into our discriminator, and then we output the real predictions, which

344
00:24:43,310 --> 00:24:46,100
we'll then compare with the value one.

345
00:24:46,100 --> 00:24:47,580
So that's it.

346
00:24:47,610 --> 00:24:49,830
We can now go ahead and compute the loss.

347
00:24:49,830 --> 00:24:58,590
So we'll have the discriminator loss for the real images equal to our loss function, which we are going to pass in.

348
00:24:58,590 --> 00:25:03,300
And then this loss function takes in the real predictions.

349
00:25:03,750 --> 00:25:06,660
And the ones.

350
00:25:06,680 --> 00:25:14,160
So here we will have both the real predictions and the real labels.

351
00:25:14,520 --> 00:25:18,180
The real labels, as I've said already, are the ones.

352
00:25:18,180 --> 00:25:19,920
So it's basically this one here.

353
00:25:19,950 --> 00:25:26,700
Since we have batches of images, we have several ones, as many as the batch size.

354
00:25:26,700 --> 00:25:32,400
So here we have the real labels. There we go.

355
00:25:32,400 --> 00:25:43,230
We have real_labels equal to tf.ones, and then we specify its size or, better still, its shape.

356
00:25:43,230 --> 00:25:49,500
So here we have the shape, which is batch size by one.

357
00:25:49,530 --> 00:25:51,990
See that batch size by one.

358
00:25:51,990 --> 00:25:57,300
And the reason why we have batch size by one is simply because we have an output which takes in just

359
00:25:57,300 --> 00:26:06,670
one single value, and this output, for every real label, will be equal to one, and we have

360
00:26:06,670 --> 00:26:07,510
the batch size.

361
00:26:07,540 --> 00:26:09,880
Okay, so we have that now.

362
00:26:09,880 --> 00:26:13,870
The next thing we want to have here are the fake labels.

363
00:26:13,870 --> 00:26:18,610
Now the fake labels are going to be these zeros right here.

364
00:26:18,610 --> 00:26:27,130
So we have, oh, sorry, we have tf.zeros, and then we have batch size and one.

365
00:26:27,130 --> 00:26:28,450
Okay, so that's it.

366
00:26:28,450 --> 00:26:34,840
Now we have our real labels and we have our fake labels and we will take the real labels.

367
00:26:34,840 --> 00:26:36,340
We'll take the real predictions.

368
00:26:36,340 --> 00:26:41,440
That is, we get what the model thinks about a particular input.

369
00:26:41,440 --> 00:26:43,330
That's the classification.

370
00:26:43,330 --> 00:26:46,060
And then compare it with the real labels.

371
00:26:46,060 --> 00:26:54,340
The real labels are ones because we expect that the model should take in a real input and then know that

372
00:26:54,340 --> 00:26:59,560
it is a real input, and that means that it should output a one when it takes in real data.

373
00:26:59,560 --> 00:27:05,980
And if it outputs a value different from one, then the loss is going to be greater than zero.

374
00:27:05,980 --> 00:27:09,190
Whereas if it's exactly equal to one, the loss is going to be zero.

375
00:27:09,190 --> 00:27:11,590
And our aim here is to minimize this loss.

376
00:27:11,680 --> 00:27:19,450
Now we have that and the next thing we want to do is repeat this, but this time around for the fake

377
00:27:19,450 --> 00:27:20,410
predictions.

378
00:27:20,920 --> 00:27:23,710
So here we have fake predictions.

379
00:27:23,710 --> 00:27:30,310
Now in the first step we have real data going into the discriminator, and in the other we have fake data

380
00:27:30,310 --> 00:27:31,570
going into the discriminator.

381
00:27:31,570 --> 00:27:34,330
So here we have fake predictions.

382
00:27:34,330 --> 00:27:40,600
And this time around it doesn't take the real images; it takes the generated images.

383
00:27:40,600 --> 00:27:46,240
So we have self.generator, and it takes in, oh, noise actually.

384
00:27:46,240 --> 00:27:53,170
So it should take in the random noise, which is this one right here.

385
00:27:53,170 --> 00:27:58,240
So it takes in random noise and then it outputs fake images.

386
00:27:58,690 --> 00:28:03,400
But since we've defined this already here, we could just make use of it here.

387
00:28:03,400 --> 00:28:08,350
So here we have fake images and there we go.

388
00:28:08,350 --> 00:28:15,070
So here we have the discriminator, which takes in the fake images and then gives us the fake predictions.

389
00:28:15,070 --> 00:28:17,800
And we're going to compare this fake prediction with zero.

390
00:28:17,800 --> 00:28:24,430
So we expect that the fake predictions should be zero; if not, the loss is not going to be equal

391
00:28:24,430 --> 00:28:24,880
to zero.

392
00:28:24,880 --> 00:28:28,570
So here instead of real labels, we have fake labels.

393
00:28:28,570 --> 00:28:36,400
So we're comparing zero with what the model is going to predict, that is, the output of the discriminator.

394
00:28:36,910 --> 00:28:39,700
Uh, here we have discriminator of fake.

395
00:28:39,730 --> 00:28:41,340
We have here fake.

396
00:28:42,010 --> 00:28:43,000
Uh, that's it.

397
00:28:43,120 --> 00:28:44,440
I think that's okay.

398
00:28:44,440 --> 00:28:47,950
We have the same loss function; it's actually the binary cross-entropy loss.

399
00:28:47,950 --> 00:28:57,160
And once we have this, we could define the loss to be equal to the real loss plus

400
00:28:57,160 --> 00:28:59,410
the fake loss.

401
00:29:00,300 --> 00:29:04,980
Since that's basically a combination of these two losses right here.
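As a rough sketch of the discriminator loss just described, here is the binary cross-entropy written out in plain Python. The helper name `bce`, the variable names, and the sample prediction values are made up for illustration; in the notebook the loss comes from Keras and the predictions come from the discriminator.

```python
import math

def bce(labels, predictions, eps=1e-7):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, predictions):
        p = min(max(p, eps), 1.0 - eps)  # clip so log() never receives 0
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(labels)

# Illustrative discriminator outputs for a batch of 4 (not from the video).
real_predictions = [0.9, 0.8, 0.95, 0.7]   # ideally all close to 1
fake_predictions = [0.1, 0.3, 0.05, 0.2]   # ideally all close to 0

real_labels = [1.0] * len(real_predictions)  # real images are compared with one
fake_labels = [0.0] * len(fake_predictions)  # fake images are compared with zero

d_loss_real = bce(real_labels, real_predictions)
d_loss_fake = bce(fake_labels, fake_predictions)
d_loss = d_loss_real + d_loss_fake  # the total is the sum of the two parts
print(d_loss_real, d_loss_fake, d_loss)
```

The closer the real predictions are to one and the fake predictions are to zero, the closer both terms get to zero.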

402
00:29:05,400 --> 00:29:12,120
Now, before we move on, if you recall in the tips and tricks, we saw the label smoothing.

403
00:29:12,150 --> 00:29:16,690
Now here we have our labels, our real and our fake labels.

404
00:29:16,710 --> 00:29:17,550
Let's separate this.

405
00:29:17,550 --> 00:29:23,580
And now what we're going to do is instead of taking a one, we'll take values around one.

406
00:29:23,580 --> 00:29:34,620
So we add 0.25 times some random value between -1 and 1.

407
00:29:34,620 --> 00:29:41,910
So basically what we're saying here is we want to take this one and then add to it a value

408
00:29:41,910 --> 00:29:49,510
in the range of -0.25 to 0.25.

409
00:29:49,530 --> 00:29:55,230
So this means that now, instead of having the label fixed at one, we'll have the label

410
00:29:55,260 --> 00:29:59,780
in a range around one, because 1 - 0.25 is 0.75.

411
00:29:59,800 --> 00:30:04,780
So it will be between 0.75 and 1.25 instead of just one.

412
00:30:04,780 --> 00:30:10,330
So that's what our label will be now. And then for the zeros, since we wouldn't want to have

413
00:30:10,330 --> 00:30:17,610
negative values, we'll take the zero plus some random value between 0 and 0.25.

414
00:30:17,620 --> 00:30:24,520
So instead of zero, we'll have some random value between 0 and 0.25.

415
00:30:24,550 --> 00:30:26,590
Okay, so that's said.

416
00:30:26,620 --> 00:30:29,980
What we'll do now is we'll get right here.

417
00:30:30,010 --> 00:30:32,140
We have tf.random.uniform,

418
00:30:32,170 --> 00:30:40,540
and then we specify the minval, which is negative one, and then the maxval, which is one.

419
00:30:40,540 --> 00:30:46,090
Now specifying this means we're going from -1 to 1 and then multiplying by 0.25 means we're going from

420
00:30:46,090 --> 00:30:49,480
-0.25 to 0.25.

421
00:30:49,510 --> 00:30:50,770
So that's basically it.

422
00:30:51,460 --> 00:30:53,650
And then also we specify its shape.

423
00:30:53,650 --> 00:30:58,770
So we have the batch size by one.

424
00:30:58,780 --> 00:30:59,530
There we go.

425
00:30:59,530 --> 00:31:04,270
Now we're just going to copy this out and then paste this right here.

426
00:31:04,270 --> 00:31:08,230
So we don't want to have negative numbers, so we start from zero instead.

427
00:31:08,230 --> 00:31:14,320
So here we have 0 to 1, but by default the values are already from 0 to 1.

428
00:31:14,320 --> 00:31:16,120
So we could take this off.
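The label ranges described above can be checked with a small plain-Python sketch. It uses the standard `random` module in place of `tf.random.uniform`, and `batch_size = 8` is an arbitrary illustrative value rather than the one used in the notebook.

```python
import random

batch_size = 8  # illustrative value, not the one used in the video

# Real labels: 1 plus 0.25 times a value in [-1, 1],
# so every label lands between 0.75 and 1.25.
real_labels = [1.0 + 0.25 * random.uniform(-1.0, 1.0) for _ in range(batch_size)]

# Fake labels: 0 plus 0.25 times a value in [0, 1],
# so every label lands between 0 and 0.25 and never goes negative.
fake_labels = [0.0 + 0.25 * random.uniform(0.0, 1.0) for _ in range(batch_size)]

print(real_labels)
print(fake_labels)
```

Each list has one label per image in the batch, which corresponds to the batch size by one shape used for the labels.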

429
00:31:16,330 --> 00:31:17,770
Okay, so that's it.

430
00:31:17,780 --> 00:31:19,570
Everything looks fine.

431
00:31:20,320 --> 00:31:23,800
I think everything is done for our discriminator.

432
00:31:24,220 --> 00:31:26,590
Um, now we have this.

433
00:31:26,590 --> 00:31:27,820
Let's take this off.

434
00:31:28,270 --> 00:31:30,850
Uh, we have our loss, and that's it.

435
00:31:31,330 --> 00:31:33,910
Okay, so that's basically it.

436
00:31:33,940 --> 00:31:37,030
We now move on to our partial derivatives here.

437
00:31:37,030 --> 00:31:44,200
We take in our loss, and then we're going to update the discriminator.

438
00:31:44,200 --> 00:31:51,310
So here we have self.discriminator.trainable_weights.

439
00:31:51,460 --> 00:31:59,220
Then the optimizer is the one which we had specified already right here.

440
00:31:59,230 --> 00:32:06,720
So here, instead of this optimizer, we have our d_optimizer, which we're going to specify.

441
00:32:06,730 --> 00:32:13,570
So the d_optimizer takes in the partial derivatives and our trainable weights.

442
00:32:13,570 --> 00:32:16,510
Again, here we are training only the discriminator.

443
00:32:16,510 --> 00:32:19,600
So we have discriminator.

444
00:32:19,630 --> 00:32:20,470
That's it.

445
00:32:20,470 --> 00:32:21,670
So here we go.

446
00:32:21,670 --> 00:32:28,570
We have our model, our GAN, the discriminator, the trainable weights, and we repeat the same process

447
00:32:28,570 --> 00:32:29,110
here.

448
00:32:29,140 --> 00:32:32,260
Now, this is self.

449
00:32:32,260 --> 00:32:33,790
Okay, so that's it.

450
00:32:34,060 --> 00:32:35,470
Uh, this should be fine now.

451
00:32:35,470 --> 00:32:40,990
And then in the next step we're going to do the same, but for the generator.

452
00:32:41,140 --> 00:32:48,250
So what we do now is again sample some random noise here.

453
00:32:48,250 --> 00:32:53,590
We'll copy this code out and then paste it out after this.

454
00:32:54,580 --> 00:32:55,150
Okay.

455
00:32:55,150 --> 00:32:55,840
There we go.

456
00:32:55,840 --> 00:32:58,660
So we have this random noise right here.

457
00:33:00,110 --> 00:33:01,400
Random noise.

458
00:33:02,120 --> 00:33:04,940
And then from this random noise.

459
00:33:06,210 --> 00:33:15,570
We have our fake images: self.generator, which takes in the random noise.

460
00:33:16,810 --> 00:33:20,050
Okay, now we're going to have the same again.

461
00:33:20,050 --> 00:33:26,140
So we're going to make use of the gradient tape as we've done already with the discriminator.

462
00:33:26,140 --> 00:33:29,140
So paste that out here.

463
00:33:29,140 --> 00:33:33,670
And then here we'll be working with the generator.

464
00:33:34,090 --> 00:33:37,060
Um, this is discriminator, discriminator.

465
00:33:37,870 --> 00:33:39,430
Let's go back discriminator.

466
00:33:39,460 --> 00:33:43,090
Okay, so as we've said, let's just comment this.

467
00:33:43,090 --> 00:33:57,010
Let's write out generator, and then right here we have the discriminator.

468
00:33:57,430 --> 00:33:59,290
Okay, so that's the discriminator.

469
00:33:59,290 --> 00:34:01,060
And now for the generator.

470
00:34:01,060 --> 00:34:07,300
So as we were saying, we have these fake images, we have our random noise and we made use of the random

471
00:34:07,300 --> 00:34:09,640
noise to generate the fake image.

472
00:34:09,640 --> 00:34:12,190
But this time around, we want to fool the discriminator.

473
00:34:12,190 --> 00:34:21,110
So instead of expecting our fake labels to be zeros as we had here this time around, our fake labels

474
00:34:21,110 --> 00:34:25,760
will be ones, and we obviously don't have anything to do with the real data.

475
00:34:25,760 --> 00:34:28,790
So let's take this off and then get back here.

476
00:34:29,030 --> 00:34:36,290
Now we have the fake labels, which are equal to one, and it happens that they've actually

477
00:34:36,290 --> 00:34:36,920
been flipped.

478
00:34:36,920 --> 00:34:43,130
So let's get down here and then paste this out here.

479
00:34:43,130 --> 00:34:50,510
So here we have flipped_fake_labels, which are actually ones instead of zeros.

480
00:34:50,660 --> 00:34:52,310
Remember, we've seen this already.

481
00:34:52,310 --> 00:34:57,470
So we have that and we're not going to do any label smoothing right here.

482
00:34:57,470 --> 00:34:58,910
So we have that.

483
00:34:58,910 --> 00:35:06,380
And the next thing we want to do is start with our recording of the gradients.

484
00:35:07,190 --> 00:35:09,470
Now here, let's take this off.

485
00:35:09,680 --> 00:35:12,170
We have our fake predictions.

486
00:35:12,170 --> 00:35:14,900
We have, um, here we go.

487
00:35:14,900 --> 00:35:17,690
Discriminator takes in the fake images and that's it.

488
00:35:17,690 --> 00:35:25,490
Now here we have the flipped_fake_labels, and then we have our fake predictions.

489
00:35:25,940 --> 00:35:27,200
Fake predictions.

490
00:35:27,230 --> 00:35:30,590
I guess we made an error here.

491
00:35:30,590 --> 00:35:33,650
This is actually fake predictions.

492
00:35:33,650 --> 00:35:35,810
We're comparing the fake predictions with the fake labels.

493
00:35:35,810 --> 00:35:41,300
And then here we're comparing the fake predictions with the flipped fake labels.
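To see why the generator is trained against flipped labels, here is a small plain-Python sketch of the binary cross-entropy for a single fake prediction. The helper `bce_single` and the value 0.01 are made up for illustration and are not taken from the notebook.

```python
import math

def bce_single(label, prediction, eps=1e-7):
    """Binary cross-entropy for one label and one predicted probability."""
    p = min(max(prediction, eps), 1.0 - eps)  # clip to keep log() finite
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

# Suppose the discriminator is confident that a generated image is fake.
fake_prediction = 0.01  # illustrative value

# Flipped label (1): the generator gets a large loss when it is caught.
g_loss_flipped = bce_single(1.0, fake_prediction)

# Unflipped label (0): the loss is already near zero, so minimizing it
# would reward the generator for being recognized as fake.
g_loss_unflipped = bce_single(0.0, fake_prediction)

print(g_loss_flipped, g_loss_unflipped)
```

With the flipped label the loss is large exactly when the discriminator catches the fake, so minimizing it pushes the generator toward images that the discriminator scores as real.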

494
00:35:41,300 --> 00:35:42,290
So that's it.

495
00:35:42,830 --> 00:35:44,570
Um, that should be fine.

496
00:35:44,570 --> 00:35:45,650
Here we have G.

497
00:35:45,980 --> 00:35:47,900
So this is our loss.

498
00:35:48,320 --> 00:35:49,490
G loss.

499
00:35:49,490 --> 00:35:52,100
And there's nothing like loss, fake or loss real.

500
00:35:52,100 --> 00:35:54,890
We just have g_loss, and that should be it.

501
00:35:55,040 --> 00:35:56,450
Um, partial derivatives.

502
00:35:56,450 --> 00:36:00,200
So, generator. That's it.

503
00:36:00,620 --> 00:36:01,460
Generator.

504
00:36:01,460 --> 00:36:04,220
And then here we have updating our generator.

505
00:36:04,220 --> 00:36:09,650
So we're not updating the parameters of the discriminator here.

506
00:36:09,680 --> 00:36:11,720
Now here we have g_optimizer.

507
00:36:11,750 --> 00:36:12,470
That's it.

508
00:36:12,800 --> 00:36:19,340
Okay, so if that's okay, we now have to update the different states.

509
00:36:19,340 --> 00:36:28,940
So we have the d_loss_metric, we update the state and we pass in the d_loss, and then we repeat the same

510
00:36:28,940 --> 00:36:30,080
for the g_loss.

511
00:36:30,080 --> 00:36:31,550
That's the generator loss.

512
00:36:31,550 --> 00:36:35,330
So here we have G and then here we have G.

513
00:36:35,750 --> 00:36:46,610
Then for our loss, we have g_loss, the g_loss_metric result.

514
00:36:46,880 --> 00:36:47,810
That's it.

515
00:36:47,930 --> 00:36:55,040
And then here we have our d_loss: self,

516
00:36:57,210 --> 00:37:01,710
the d_loss_metric, and then result.

517
00:37:03,140 --> 00:37:05,870
Okay, let's now run the cell.

518
00:37:05,870 --> 00:37:08,670
And normally everything should work fine.

519
00:37:08,690 --> 00:37:12,770
Now we move on to define the number of epochs.

520
00:37:12,770 --> 00:37:15,400
So we're going to work for 20 epochs.

521
00:37:15,440 --> 00:37:17,840
And then let's get back here.

522
00:37:18,920 --> 00:37:27,860
We define our GAN, which is the GAN we've just defined, and then it takes in the discriminator and

523
00:37:27,860 --> 00:37:30,170
the generator which we defined already.

524
00:37:30,650 --> 00:37:34,730
From here we go ahead and compile the model.

525
00:37:34,730 --> 00:37:44,630
So we have gan.compile, and then we specify the optimizer, the d_optimizer, which is

526
00:37:44,720 --> 00:37:50,450
the Adam optimizer: optimizers, and then Adam.

527
00:37:50,990 --> 00:37:58,970
Now this Adam optimizer will come with a learning rate, which we specified already at

528
00:37:58,970 --> 00:38:01,160
the beginning: learning rate equal to

529
00:38:02,790 --> 00:38:06,210
two times ten to the negative four, which is specified at the beginning.

530
00:38:06,210 --> 00:38:12,780
So we have a learning rate and then beta_1, beta_1 equals 0.5.

531
00:38:12,810 --> 00:38:16,040
Now we repeat the same for the generator.

532
00:38:16,050 --> 00:38:17,570
Let's get back here.

533
00:38:17,580 --> 00:38:20,420
Copy that and then paste it out here.

534
00:38:20,430 --> 00:38:22,290
So this is for our generator.

535
00:38:22,290 --> 00:38:22,800
Now.

536
00:38:22,800 --> 00:38:25,650
Now we notice that there's no major difference.

537
00:38:25,650 --> 00:38:26,820
Actually, there's no difference.

538
00:38:26,820 --> 00:38:29,550
We just use the same optimizer.

539
00:38:30,090 --> 00:38:31,260
Now we have that.

540
00:38:31,260 --> 00:38:34,620
The next thing we want to do is pass in our loss function.

541
00:38:34,620 --> 00:38:42,180
So we have our loss function, which is equal to the binary cross-entropy loss.

542
00:38:42,360 --> 00:38:51,660
So we have losses.BinaryCrossentropy, and that's it.

543
00:38:52,380 --> 00:38:53,550
So we have this set.

544
00:38:53,580 --> 00:39:01,560
We could run this now and then go ahead to train our model by calling the gan.fit method,

545
00:39:01,920 --> 00:39:03,590
or just the model.fit method.

546
00:39:03,600 --> 00:39:11,310
So we have history equal to gan.fit, and then we pass in our train dataset.

547
00:39:11,610 --> 00:39:20,010
We'll start with, say, ten, or let's take just 100 elements first, and then here the

548
00:39:20,010 --> 00:39:24,390
number of epochs equal to our epochs, which we've defined already.

549
00:39:24,390 --> 00:39:26,850
And then we have some callback.

550
00:39:26,850 --> 00:39:28,740
So we'll make use of this callback.

551
00:39:28,740 --> 00:39:35,610
And you already see the advantage of overriding the train_step method, as now we could just

552
00:39:36,000 --> 00:39:40,170
define our callback and then pass it in here and the job is done.

553
00:39:40,170 --> 00:39:47,820
So we will have this callback which is going to show us the generated images at the end of an epoch.

554
00:39:47,820 --> 00:39:51,060
So let's call it ShowImage.

555
00:39:51,060 --> 00:39:55,210
And it's going to take in our latent dim.

556
00:39:55,990 --> 00:39:57,970
Okay, so that's it.

557
00:39:58,090 --> 00:39:59,320
Everything looks fine.

558
00:39:59,320 --> 00:40:03,880
Now, the next thing we want to do is define this show image callback right here.

559
00:40:04,090 --> 00:40:06,190
Now, here, let's.

560
00:40:06,190 --> 00:40:09,340
Let's define this callback just above here.

561
00:40:09,340 --> 00:40:18,310
So we have our show image callback, and then we get the latent dimension and then we specify an epoch.

562
00:40:18,310 --> 00:40:23,100
And so at the end of every epoch, we are going to run this code right here.

563
00:40:23,110 --> 00:40:25,390
Now, what's going on here is simple.

564
00:40:25,390 --> 00:40:28,570
We have our model and then we have the generator.

565
00:40:28,570 --> 00:40:36,700
We take in some random noise, we pass it into our generator, and then we try to see what the model

566
00:40:36,700 --> 00:40:37,690
is generating.

567
00:40:37,690 --> 00:40:45,550
So this means that initially, when we start training, we have some output, some fake data generated

568
00:40:45,550 --> 00:40:52,780
by the generator and then after an epoch, we want to see what the model is generating as we keep on

569
00:40:52,780 --> 00:40:56,770
training our whole model.

570
00:40:56,770 --> 00:41:03,640
So this is very important, as it lets us debug and understand what's going on.

571
00:41:03,640 --> 00:41:11,140
So this means that in a case where the model is, say, for example, generating the same kinds

572
00:41:11,140 --> 00:41:20,260
of output, or generating some outputs which are clearly not the type of output we would expect to

573
00:41:20,260 --> 00:41:25,690
get or whose distribution is very far away from that of the real data.

574
00:41:25,720 --> 00:41:27,610
Then we will have to take some measures.

575
00:41:27,610 --> 00:41:36,160
So it's very important to work with these kinds of callbacks, as they permit us to debug our whole

576
00:41:36,160 --> 00:41:37,750
model training process.

577
00:41:37,750 --> 00:41:41,590
So that said, we just have to specify the figure size here.

578
00:41:41,590 --> 00:41:46,810
And then what we're doing is creating these different subplots; we see we have 64.

579
00:41:46,810 --> 00:41:48,700
You could reduce this or you could increase it.

580
00:41:48,700 --> 00:41:49,540
This depends on you.

581
00:41:49,540 --> 00:41:51,730
You could just pick whatever you want to pick here.

582
00:41:51,730 --> 00:41:56,070
So you could generate a certain number of images.

583
00:41:56,340 --> 00:41:57,900
Here n equals six.

584
00:41:57,900 --> 00:42:00,910
So we're generating six by six, that's 36 images.

585
00:42:00,910 --> 00:42:02,850
So we could change this to 36.

586
00:42:02,850 --> 00:42:08,280
And then now for each and every subplot, we're going to show the image.

587
00:42:08,310 --> 00:42:09,210
See this out?

588
00:42:09,210 --> 00:42:10,650
This out comes from here.

589
00:42:10,650 --> 00:42:13,410
So it's from the generator and that's it.

590
00:42:13,410 --> 00:42:18,720
Now we're going to save this figure in some directory, which we could then visualize.

591
00:42:19,840 --> 00:42:21,480
So let's modify this.

592
00:42:21,490 --> 00:42:24,310
Let's take here to be 36.

593
00:42:24,890 --> 00:42:26,170
Um, that's fine.

594
00:42:26,170 --> 00:42:27,400
Let's run this.

595
00:42:27,430 --> 00:42:28,930
We have that show image.

596
00:42:28,930 --> 00:42:29,890
That's fine.

597
00:42:30,340 --> 00:42:30,700
Um.

598
00:42:30,700 --> 00:42:32,470
Everything looks fine now.

599
00:42:32,470 --> 00:42:40,210
Let's go ahead and start with the training so we could even reduce this to just ten so that we could

600
00:42:40,480 --> 00:42:46,450
be able to notice any errors quickly and then now train on the full dataset.

601
00:42:48,120 --> 00:42:50,980
We're getting this error: unexpected keyword argument.

602
00:42:51,000 --> 00:42:57,470
This should actually be minval; let's get back here and write it without the underscore.

603
00:42:57,480 --> 00:42:59,130
So this is it.

604
00:42:59,130 --> 00:43:02,430
Here you have this mean vowel and here we have max vowel.

605
00:43:02,460 --> 00:43:07,500
Now you could feel free to check out the documentation and you should find the exact syntax.

606
00:43:07,500 --> 00:43:10,470
So let's run this again and then start with the training.

607
00:43:12,480 --> 00:43:21,000
Oh, we're getting this error where we're told that no gradients are provided for any variable.

608
00:43:21,000 --> 00:43:25,800
And when you look at this, you'll notice that we have the Conv2DTranspose layers.

609
00:43:26,140 --> 00:43:29,610
This means that most probably this error is coming from the generator.

610
00:43:29,610 --> 00:43:33,240
So let's get back here and make sure everything is okay.

611
00:43:33,600 --> 00:43:36,350
Uh, one thing we notice here already is that this should be g.

612
00:43:36,360 --> 00:43:36,810
So.

613
00:43:36,810 --> 00:43:40,950
Yeah, it should be G And what do we have again here?

614
00:43:41,220 --> 00:43:43,500
Um, yeah, it looks fine.

615
00:43:43,500 --> 00:43:46,320
So let's run this again and then see what we get.

616
00:43:47,420 --> 00:43:53,370
We still get this error, which again shows that it's coming from the generator.

617
00:43:53,390 --> 00:43:57,170
We get back here and what do we notice?

618
00:43:57,200 --> 00:44:05,450
We notice here that we compute these fake images outside the scope of our gradient tape.

619
00:44:05,450 --> 00:44:11,180
So what we have to do instead is call this directly in here.

620
00:44:11,510 --> 00:44:17,480
So we want to update the parameters of the generator.

621
00:44:17,480 --> 00:44:22,610
So it has to be in this gradient tape scope and not outside.

622
00:44:22,610 --> 00:44:24,260
So let's take this off.

623
00:44:24,710 --> 00:44:25,880
Take that off.

624
00:44:25,880 --> 00:44:32,000
And one question you may be asking yourself is, why is it possible to do this here, that is, have

625
00:44:32,000 --> 00:44:34,220
this outside the tape for the discriminator, but not for the generator?

626
00:44:34,220 --> 00:44:37,100
And the simple answer is that, for the discriminator step,

627
00:44:37,130 --> 00:44:41,450
the generator isn't updated, so it doesn't matter if it's in here or not.

628
00:44:41,480 --> 00:44:45,170
Whereas for the generator it actually matters.

629
00:44:45,170 --> 00:44:47,460
So we have to make sure it's in here.

630
00:44:47,460 --> 00:44:48,450
So that's it.
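The fix described here can be reproduced on a toy example. The sketch below assumes TensorFlow 2 is installed and uses two tiny stand-in models, not the generator and discriminator from the video; all variable names and sizes are illustrative. It shows that when the generator is called outside the tape, the gradients with respect to its weights come back as None, while calling it inside the tape yields actual gradients.

```python
import tensorflow as tf

# Hypothetical stand-in models, far smaller than the ones in the video.
generator = tf.keras.Sequential(
    [tf.keras.Input(shape=(8,)), tf.keras.layers.Dense(4)]
)
discriminator = tf.keras.Sequential(
    [tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(1, activation="sigmoid")]
)
loss_fn = tf.keras.losses.BinaryCrossentropy()

random_noise = tf.random.normal((16, 8))
flipped_fake_labels = tf.ones((16, 1))

# Problem: the generator runs before the tape starts recording,
# so the tape has no path from the loss back to the generator weights.
fake_images = generator(random_noise)
with tf.GradientTape() as tape:
    g_loss = loss_fn(flipped_fake_labels, discriminator(fake_images))
grads_outside = tape.gradient(g_loss, generator.trainable_weights)

# Fix: call the generator inside the tape scope.
with tf.GradientTape() as tape:
    fake_images = generator(random_noise)
    g_loss = loss_fn(flipped_fake_labels, discriminator(fake_images))
grads_inside = tape.gradient(g_loss, generator.trainable_weights)

print([g is None for g in grads_outside])
print([g is None for g in grads_inside])
```

Passing a list of None gradients to an optimizer is what produces the "no gradients provided for any variable" error seen in the video.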

631
00:44:48,450 --> 00:44:51,270
Let's run this again and then see what we get.

632
00:44:51,810 --> 00:44:52,590
So that's it.

633
00:44:52,620 --> 00:44:53,310
Training has started.

634
00:44:53,310 --> 00:44:56,550
We're told: no such file or directory, generated, and all of that.

635
00:44:56,550 --> 00:45:05,340
Anyway, let's make this directory, generated, and then we run this again.

636
00:45:07,350 --> 00:45:08,250
There we go.

637
00:45:08,280 --> 00:45:09,990
Things seem to be working fine.

638
00:45:10,350 --> 00:45:13,480
We see that it's struggling to generate some images.

639
00:45:13,500 --> 00:45:15,540
Let's open up our generated directory here.

640
00:45:15,630 --> 00:45:17,490
You see, we have the different files.

641
00:45:17,520 --> 00:45:21,240
Let's check out on, say, the 18th output.

642
00:45:22,510 --> 00:45:23,740
Uh, you see this already?

643
00:45:23,740 --> 00:45:26,320
So you see it's struggling already to produce.

644
00:45:26,350 --> 00:45:27,730
You see this image here?

645
00:45:27,880 --> 00:45:30,640
This one here already looks somewhat like

646
00:45:30,940 --> 00:45:34,720
a human face, though it's still struggling a lot.

647
00:45:34,720 --> 00:45:36,340
So we have that.

648
00:45:36,340 --> 00:45:36,700
Let's.

649
00:45:36,700 --> 00:45:39,070
Let's check out on, say, the 19th.

650
00:45:39,790 --> 00:45:41,200
Oh, there we go.

651
00:45:41,350 --> 00:45:42,580
Okay, so that's it.

652
00:45:42,580 --> 00:45:44,800
Let's close this.

653
00:45:45,040 --> 00:45:47,800
Hopefully there's no issue with connection.

654
00:45:48,160 --> 00:45:49,570
Uh, let's get back up.

655
00:45:49,570 --> 00:45:57,670
And then if those images aren't very visible, you can retrain and take many more samples.

656
00:45:57,670 --> 00:46:00,520
So here we're dealing with, uh, 100.

657
00:46:00,520 --> 00:46:01,570
We take a hundred.

658
00:46:01,570 --> 00:46:05,080
And then after, let's, uh, check this out.

659
00:46:05,080 --> 00:46:08,230
After six epochs, you see the kind of results we get.

660
00:46:08,260 --> 00:46:11,410
You see the images we get already look like humans.

661
00:46:11,440 --> 00:46:14,140
Now let's modify this code.

662
00:46:14,140 --> 00:46:21,160
Let's stop the training, and then let's modify the code such that we do not flip the labels.

663
00:46:21,460 --> 00:46:23,210
Um, there we go.

664
00:46:23,210 --> 00:46:29,750
We could get back to these flipped fake labels, and then instead of ones, we just take the fake labels.

665
00:46:29,750 --> 00:46:31,700
So let's, let's take zeros.

666
00:46:31,790 --> 00:46:38,180
So if we have this, we're going to compare it with when we actually do the flipped labeling.

667
00:46:38,180 --> 00:46:46,550
So let's rerun the cells, and then we start with the training.

668
00:46:47,530 --> 00:46:48,850
After nine epochs.

669
00:46:48,850 --> 00:46:51,790
We can now check out those generated images.

670
00:46:51,790 --> 00:46:58,630
Let's open up the zeroth, the third, the fifth, and then, say, the eighth.

671
00:46:59,110 --> 00:47:00,820
Um, let's see what we get here.

672
00:47:00,880 --> 00:47:01,480
See?

673
00:47:02,480 --> 00:47:03,300
It is.

674
00:47:04,910 --> 00:47:10,550
Our generator is experiencing vanishing gradients, and that's why we're getting these kinds of horrible

675
00:47:10,550 --> 00:47:11,390
outputs.

676
00:47:11,390 --> 00:47:14,540
So let's again stop the training.

677
00:47:14,540 --> 00:47:21,290
Let's stop the training, and then let's get back to our models.

678
00:47:21,530 --> 00:47:26,030
What we'll do now is, instead of using the leaky ReLU, we'll just use the ReLU.

679
00:47:26,030 --> 00:47:30,710
So let's change this activation and set it to ReLU.

680
00:47:31,130 --> 00:47:32,180
There we go.

681
00:47:32,180 --> 00:47:38,930
Let's simply paste this out everywhere, take off the leaky ReLU, and then see the kind of output

682
00:47:38,960 --> 00:47:45,260
we would get in case we used the ReLU itself instead of the leaky ReLU.

683
00:47:45,680 --> 00:47:46,910
Um, let's take that off.

684
00:47:46,910 --> 00:47:48,080
Everything looks fine.

685
00:47:48,080 --> 00:47:48,560
Now.

686
00:47:48,560 --> 00:47:52,130
Let's get back into the generator and then repeat the same.

687
00:47:52,130 --> 00:47:53,950
So paste this out here.

688
00:47:54,170 --> 00:47:55,640
ReLU activation.

689
00:47:56,060 --> 00:47:58,250
Take off

690
00:47:58,250 --> 00:48:04,130
the leaky ReLU, take off the leaky ReLU, and then rerun this again.

691
00:48:04,130 --> 00:48:05,720
Um, this should be fine.

692
00:48:05,720 --> 00:48:08,330
Now let's run that and then see what we get.

693
00:48:09,140 --> 00:48:09,740
Okay.

694
00:48:09,740 --> 00:48:12,290
Training has been going on for a while now.

695
00:48:12,290 --> 00:48:20,030
We could open up the zeroth, let's say open up the second, the fifth, the seventh, um, and then

696
00:48:20,030 --> 00:48:22,640
the 10th so we could look at this.

697
00:48:22,670 --> 00:48:23,840
Now, what do you notice?

698
00:48:23,840 --> 00:48:27,860
You see we're getting practically nothing as output right here.

699
00:48:27,860 --> 00:48:33,980
And that's simply because we're using the ReLU instead of the leaky ReLU.

700
00:48:34,520 --> 00:48:40,250
And as you can see, sparsity isn't great for generating images, so you have to be very careful with

701
00:48:40,250 --> 00:48:40,700
that.

702
00:48:40,700 --> 00:48:42,380
Now, let's stop this training.

703
00:48:42,770 --> 00:48:48,680
Let's stop the training and then get back to what we had before.

704
00:48:49,250 --> 00:48:51,200
We have this.

705
00:48:51,200 --> 00:48:52,490
Yeah, this should be fine.

706
00:48:52,910 --> 00:48:55,070
Now, the discriminator.

707
00:48:55,070 --> 00:49:00,290
And then right here, we get back again and.

708
00:49:00,980 --> 00:49:03,650
Oops, let's get back.

709
00:49:03,650 --> 00:49:06,600
Then we restart the training again and then see what we get.

710
00:49:06,810 --> 00:49:09,270
Now, that training has been going on for a while.

711
00:49:09,270 --> 00:49:11,400
Let's go ahead and open this up.

712
00:49:11,400 --> 00:49:12,090
One.

713
00:49:12,660 --> 00:49:20,340
Let's take zero, two, for example, five, seven, and say eight.

714
00:49:20,340 --> 00:49:22,930
So let's check out what the model is outputting.

715
00:49:22,950 --> 00:49:25,980
You see, there we go.

716
00:49:26,220 --> 00:49:29,940
We can see that with the normal ReLU, it isn't doing that badly either.

717
00:49:29,940 --> 00:49:36,930
So in this specific example, using the Leaky ReLU maybe isn't that necessary.

718
00:49:37,050 --> 00:49:42,030
Okay, so now we've looked at the effect of not using this Leaky ReLU.

719
00:49:42,210 --> 00:49:48,510
You could also take off the batch norm and see how that affects the kinds of outputs you get from

720
00:49:48,510 --> 00:49:49,530
the generator.

721
00:49:49,830 --> 00:49:51,510
Here, we're getting an error.

722
00:49:51,960 --> 00:49:53,550
Let's run this again.

723
00:49:54,180 --> 00:49:55,290
Yeah, okay, that's fine.

724
00:49:55,290 --> 00:50:01,230
Now, let's go ahead and restart the training, but this time around with the full dataset.

725
00:50:01,230 --> 00:50:04,920
So let's take this off and then start with the training.

726
00:50:04,950 --> 00:50:12,570
After training for over 20 epochs, here are the kinds of outputs we shall be getting. In our Deep Learning

727
00:50:12,570 --> 00:50:20,100
for Image Generation course, we delve deep into how to create much higher quality outputs like

728
00:50:20,100 --> 00:50:26,430
this one, for example, which was created with a diffusion model, or this other one which we created

729
00:50:26,430 --> 00:50:27,570
with a ProGAN.

730
00:50:27,600 --> 00:50:33,870
That's it for this section on image generation with the variational autoencoders and the generative

731
00:50:33,870 --> 00:50:35,460
adversarial neural networks.