1
00:00:00,270 --> 00:00:01,590
Hi and welcome back.

2
00:00:01,890 --> 00:00:08,100
In this section, we'll talk about how we can start training GANs and understand the process and challenges

3
00:00:08,610 --> 00:00:09,660
in training GANs.

4
00:00:09,870 --> 00:00:10,920
So let's get started.

5
00:00:12,120 --> 00:00:17,490
So firstly, training GANs is notoriously difficult compared to training ordinary neural networks.

6
00:00:27,140 --> 00:00:28,400
Hi and welcome back.

7
00:00:28,880 --> 00:00:31,640
In this section, we'll talk about training GANs.

8
00:00:32,120 --> 00:00:37,100
We'll go into detail on the process and the challenges you can experience when training GANs.

9
00:00:37,400 --> 00:00:38,570
So let's get started.

10
00:00:40,220 --> 00:00:46,550
So firstly, training GANs is notoriously difficult compared to ordinary neural networks, because ordinary neural networks

11
00:00:46,550 --> 00:00:47,330
are quite simple.

12
00:00:47,330 --> 00:00:49,040
It's one network.

13
00:00:49,340 --> 00:00:50,900
We have a simple loss function.

14
00:00:50,900 --> 00:00:54,110
We use gradient descent and backpropagation to reduce our loss.

15
00:00:54,560 --> 00:00:58,640
Generally, I mean, yes, it's complex, but it's actually quite simple compared to GANs.

16
00:00:59,090 --> 00:01:04,730
And that's because in GANs, every weight change can actually change the entire balance of our dynamic

17
00:01:04,730 --> 00:01:10,760
system, because remember, we have two adversarial systems working against each other,

18
00:01:10,890 --> 00:01:12,290
however you want to think about it.

19
00:01:12,830 --> 00:01:16,580
And so in this case, we're not trying to just minimize loss.

20
00:01:17,000 --> 00:01:20,900
We're trying to find an equilibrium between both opposing networks.

21
00:01:21,620 --> 00:01:28,910
So training stops, or basically we end training, when the discriminator cannot tell real data from

22
00:01:29,120 --> 00:01:32,690
fake data. That's essentially how we even know when to stop.

23
00:01:33,290 --> 00:01:35,990
So you can see this leads to a lot of problems in GANs.

24
00:01:35,990 --> 00:01:42,170
So you basically have to babysit the training process and try to experiment a lot with parameters to

25
00:01:42,170 --> 00:01:44,210
make sure we can achieve this equilibrium.

26
00:01:45,950 --> 00:01:47,930
So let's talk about the training process.

27
00:01:48,860 --> 00:01:55,430
So firstly, we start with a noise vector. In this training process, in this example, I'm assuming we're

28
00:01:55,430 --> 00:01:58,030
training a GAN to generate images.

29
00:01:58,070 --> 00:02:01,490
So we use an image vector like this here to represent the noise.

30
00:02:03,050 --> 00:02:08,820
Next, we input this noise into a generator network to generate some synthetic data.

31
00:02:08,840 --> 00:02:15,050
So initially, in the first training phase of this training process, the generator just has random

32
00:02:15,050 --> 00:02:16,070
weight initialization.

33
00:02:16,070 --> 00:02:18,770
So it's going to produce a random-looking noise image here.

34
00:02:19,460 --> 00:02:20,660
So what's next?

35
00:02:21,230 --> 00:02:27,200
So next, we take some sample data from a real dataset and mix it with some of our generated synthetic

36
00:02:27,200 --> 00:02:27,620
data.

37
00:02:28,010 --> 00:02:33,380
So in this combination of data here, we have some real data and some synthetic data.
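
As a rough illustration of the mixing step just described, here is a minimal sketch in plain Python. The function name `make_mixed_batch`, the label convention (1 for real, 0 for synthetic), and the toy two-number "images" are all assumptions made for this example; the lesson itself does not specify them.

```python
import random

def make_mixed_batch(real_samples, fake_samples, seed=0):
    """Label real samples 1 and synthetic samples 0, then shuffle them together.

    Hypothetical helper for illustration; the label convention is an assumption.
    """
    labeled = [(x, 1) for x in real_samples] + [(x, 0) for x in fake_samples]
    rng = random.Random(seed)
    rng.shuffle(labeled)  # the discriminator should not see real and fake in a fixed order
    samples = [x for x, _ in labeled]
    labels = [y for _, y in labeled]
    return samples, labels

# Toy stand-ins for images: each "image" is just a short list of pixel values.
real = [[0.9, 0.8], [1.0, 0.7]]
fake = [[0.1, 0.5], [0.3, 0.2]]
samples, labels = make_mixed_batch(real, fake)
```

The labels are the ground truth the discriminator is later scored against.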

38
00:02:34,400 --> 00:02:36,000
Next, what do we do?

39
00:02:36,350 --> 00:02:42,760
We train the discriminator to classify whether each sample in this mixed dataset is real or fake.

40
00:02:42,770 --> 00:02:49,390
So it has to identify what's real and what's fake out of it, and then it uses the ground-truth

41
00:02:49,400 --> 00:02:52,340
labels afterward to update its weights accordingly.

42
00:02:53,390 --> 00:02:57,170
Next, we use that information to train the generator.

43
00:02:57,740 --> 00:03:01,640
We make some more random noise vectors and create more synthetic data.

44
00:03:02,300 --> 00:03:06,470
And with the weights of the discriminator frozen, as in this step here,

45
00:03:06,470 --> 00:03:12,200
we use the feedback from the discriminator to now update the weights of our generator.

46
00:03:12,800 --> 00:03:14,090
So let's think about this.

47
00:03:14,120 --> 00:03:17,510
So firstly, our discriminator.

48
00:03:17,510 --> 00:03:19,010
So that's this guy right here.

49
00:03:19,640 --> 00:03:25,100
It basically identifies, from that mixed dataset, what's real and what's fake.

50
00:03:25,610 --> 00:03:30,020
So then, here, it updates its weights according to the ground truth of that.

51
00:03:30,560 --> 00:03:37,630
So next, in this next phase, the generator generates some more fake data,

52
00:03:37,970 --> 00:03:39,650
and mixes it with the real data.

53
00:03:40,070 --> 00:03:42,280
However, we freeze the weights of the discriminator.

54
00:03:42,310 --> 00:03:49,130
Now, because it has just had its weights updated, we take those images here and get its feedback as well.

55
00:03:49,550 --> 00:03:55,400
And then once it gives us feedback, if it's able to identify most of the synthetic data as fake, the

56
00:03:55,410 --> 00:04:01,580
generator now has to update its weights accordingly so that it can improve in the next iteration.
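
The two alternating phases described above can be sketched on a deliberately tiny problem. Everything here is an assumption made for illustration and is not the code used in the course: the generator is a one-dimensional line `g(z) = a*z + b`, the discriminator is a single logistic unit `D(x) = sigmoid(w*x + c)`, the "real data" is drawn from a Gaussian centred on 4.0, and the gradients of the binary cross-entropy loss are written out by hand. The names `discriminator_step`, `generator_step`, and `train_toy_gan` are hypothetical.

```python
import math
import random

def sigmoid(s):
    s = max(-60.0, min(60.0, s))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-s))

def discriminator_step(w, c, real, fake, lr):
    """Phase 1: update D(x) = sigmoid(w*x + c) against ground-truth labels."""
    gw = gc = 0.0
    for x in real:                      # label 1: loss is -log D(x)
        d = sigmoid(w * x + c)
        gw += -(1.0 - d) * x
        gc += -(1.0 - d)
    for x in fake:                      # label 0: loss is -log(1 - D(x))
        d = sigmoid(w * x + c)
        gw += d * x
        gc += d
    n = len(real) + len(fake)
    return w - lr * gw / n, c - lr * gc / n

def generator_step(a, b, w, c, noise, lr):
    """Phase 2: update g(z) = a*z + b using D's feedback; w and c stay frozen."""
    ga = gb = 0.0
    for z in noise:                     # generator loss is -log D(g(z))
        d = sigmoid(w * (a * z + b) + c)
        ga += -(1.0 - d) * w * z
        gb += -(1.0 - d) * w
    n = len(noise)
    return a - lr * ga / n, b - lr * gb / n

def train_toy_gan(steps=200, batch=16, lr=0.05, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0                     # generator parameters
    w, c = 0.1, 0.0                     # discriminator parameters
    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]      # assumed real data
        fake = [a * rng.gauss(0.0, 1.0) + b for _ in range(batch)]
        w, c = discriminator_step(w, c, real, fake, lr)
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]     # fresh noise vectors
        a, b = generator_step(a, b, w, c, noise, lr)
    return (a, b), (w, c)

(gen_a, gen_b), (disc_w, disc_c) = train_toy_gan()
```

Note that `generator_step` only reads `w` and `c` and never returns new values for them, which is what freezing the discriminator amounts to in this sketch. No claim is made about where this toy run ends up; the point is the order of the two updates.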

57
00:04:03,370 --> 00:04:06,490
So this is how it looks, this is how the iterative process looks.

58
00:04:06,880 --> 00:04:15,250
So this is the first MNIST output that the generator generates, on the first epoch, and you can see, generally,

59
00:04:15,250 --> 00:04:16,600
most of this looks quite bad.

60
00:04:16,610 --> 00:04:19,630
This is the MNIST dataset that we're looking at.

61
00:04:20,320 --> 00:04:24,580
And then you can see in the second epoch, things sort of look a bit better. So you can see this looks

62
00:04:24,580 --> 00:04:27,330
like a four, this looks like a zero.

63
00:04:27,340 --> 00:04:30,040
This looks like a two, but it's still not good.

64
00:04:30,190 --> 00:04:32,830
And most of this, the discriminator might say it's fake.

65
00:04:33,640 --> 00:04:34,690
Then, on the third epoch,

66
00:04:34,690 --> 00:04:35,860
you can see it gets better.

67
00:04:35,860 --> 00:04:43,810
So you can see how this feedback loop of passing information across the networks actually leads to improved

68
00:04:44,290 --> 00:04:50,140
generated results here that look like they could plausibly have come from the original dataset.

69
00:04:51,220 --> 00:04:51,910
It's pretty cool.

70
00:04:52,390 --> 00:04:54,460
So are there any issues with GANs?

71
00:04:55,060 --> 00:04:55,540
Yes.

72
00:04:55,660 --> 00:04:57,310
Look carefully at this image.

73
00:04:58,390 --> 00:05:02,120
See that weird smudging effect here?

74
00:05:03,040 --> 00:05:07,930
That's because GANs can understand and generate images like this quite well.

75
00:05:08,260 --> 00:05:15,390
Once they learn the representation of a face, they can apply their random-noise-driven generative

76
00:05:15,670 --> 00:05:20,890
knowledge to the embedded learnings of that generator network and generate faces that fit.

77
00:05:21,010 --> 00:05:27,520
However, because sometimes faces appear in groups, it will generate weird-looking faces like this.

78
00:05:27,520 --> 00:05:33,130
This guy looks slightly like he has a skin burn and his face is, like, peeling off.

79
00:05:33,790 --> 00:05:40,960
So there are issues, and those issues basically point to the fact that GANs still don't fully understand

80
00:05:40,960 --> 00:05:45,460
what they're learning, which speaks back to Richard Feynman's famous quote.

81
00:05:46,240 --> 00:05:52,360
However, they do learn a lot, as you can see from this discriminator and generator setup.

82
00:05:52,570 --> 00:05:58,120
They actually do learn a lot in representing and creating new images, which you'll see in the practical

83
00:05:58,120 --> 00:06:00,070
experiments we'll do after this lesson.

84
00:06:01,180 --> 00:06:04,930
But first, let's talk about some general challenges in training GANs.

85
00:06:05,470 --> 00:06:10,810
Firstly, because it's a dynamic system, achieving equilibrium is quite difficult.

86
00:06:11,470 --> 00:06:17,860
Also, GANs are computationally demanding to train, and they require a lot of tweaking of

87
00:06:17,860 --> 00:06:24,310
hyperparameters, initialization, changing the number of hidden layers, trying different activation functions,

88
00:06:24,310 --> 00:06:26,710
using batch normalization in different layers.

89
00:06:27,220 --> 00:06:29,660
So it's quite a messy process.

90
00:06:29,680 --> 00:06:35,980
There are some ground rules and rule-of-thumb strategies to train

91
00:06:35,980 --> 00:06:36,580
your GANs.

92
00:06:37,150 --> 00:06:39,130
However, it's still very difficult.

93
00:06:39,910 --> 00:06:44,650
Also, bad initialization can cause the discriminator loss to go close to zero.

94
00:06:45,160 --> 00:06:48,340
You don't want your discriminator loss to go close to zero.

95
00:06:48,670 --> 00:06:54,340
You want the discriminator to stay at roughly 0.5 accuracy, which means that it can't tell the difference between

96
00:06:54,340 --> 00:06:55,150
real and synthetic.

97
00:06:55,330 --> 00:06:56,740
It's basically indistinguishable.
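
To put numbers on "can't tell the difference", here is a small sketch. It assumes the discriminator is scored with binary cross-entropy, which is a common choice but is not named in the lesson; the helper `bce` is hypothetical. Under that assumption, a discriminator that outputs 0.5 for every sample has an average loss of ln 2, about 0.693, while a discriminator that is winning easily has a loss near zero.

```python
import math

def bce(p, label):
    """Binary cross-entropy for one prediction p in (0, 1) and a 0/1 label."""
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

# Discriminator that cannot tell real (label 1) from synthetic (label 0):
# it outputs 0.5 either way, so the average loss is ln 2.
confused_loss = 0.5 * (bce(0.5, 1) + bce(0.5, 0))

# Discriminator that is winning easily: 0.99 on real, 0.01 on synthetic.
confident_loss = 0.5 * (bce(0.99, 1) + bce(0.01, 0))
```

So a discriminator loss collapsing toward zero is the warning sign, and a loss hovering near ln 2 is what indistinguishable data looks like under this particular loss.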

98
00:06:57,400 --> 00:07:03,580
So one solution is to purposely make your discriminator learn slower than the generator.

99
00:07:04,120 --> 00:07:09,730
However, this too can be hard to balance, as a good generator depends on a good discriminator, so

100
00:07:09,730 --> 00:07:13,300
you can see the problems that arise in training GANs.

101
00:07:14,320 --> 00:07:16,240
And then there's something called mode collapse.

102
00:07:16,720 --> 00:07:22,600
This happens when, regardless of the noise input fed into the generator, the generated output varies

103
00:07:22,600 --> 00:07:23,440
very little.

104
00:07:23,860 --> 00:07:30,280
It occurs when a small set of images look good to the discriminator and get scored better than the other

105
00:07:30,280 --> 00:07:30,820
images.

106
00:07:31,270 --> 00:07:34,630
The GAN simply learns to reproduce those images over and over.

107
00:07:35,020 --> 00:07:39,790
Sort of analogous to overfitting at that point, so mode collapse is something you need to be careful

108
00:07:39,790 --> 00:07:40,060
of.
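
One rough way to watch for mode collapse is to feed many different noise vectors through the generator and measure how spread out the outputs are. The sketch below is only an illustration: `mean_pairwise_distance` is a hypothetical helper, and the two "generators" are hand-written stand-ins rather than trained networks, one that responds to its noise input and one that barely does.

```python
import math
import random

def mean_pairwise_distance(outputs):
    """Average Euclidean distance between every pair of generated samples."""
    total, pairs = 0.0, 0
    for i in range(len(outputs)):
        for j in range(i + 1, len(outputs)):
            total += math.dist(outputs[i], outputs[j])
            pairs += 1
    return total / pairs

def healthy_generator(z):
    # Stand-in for a generator whose output really depends on the noise.
    return [2.0 * z[0], z[1] + 1.0]

def collapsed_generator(z):
    # Stand-in for a collapsed generator: almost the same output for any noise.
    return [0.5 + 0.001 * z[0], 0.5]

rng = random.Random(0)
noise = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(50)]

healthy_spread = mean_pairwise_distance([healthy_generator(z) for z in noise])
collapsed_spread = mean_pairwise_distance([collapsed_generator(z) for z in noise])
```

A spread that shrinks toward zero while the noise inputs stay varied suggests the generator is reproducing the same few outputs. What counts as "too small" depends on the data, so any threshold would have to be chosen per project.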

109
00:07:41,530 --> 00:07:46,700
Next, we'll take a look at some practical use cases for GANs.

110
00:07:46,720 --> 00:07:52,000
We'll go over the many ways GANs are applied in the industry, so stay tuned for that.

111
00:07:52,210 --> 00:07:52,690
Thank you.