1
00:00:00,180 --> 00:00:01,380
Hi and welcome back.

2
00:00:02,010 --> 00:00:08,250
Now that you have an intuitive understanding, at least at a high level, of how Siamese networks work

3
00:00:08,580 --> 00:00:14,400
and what they're used for, we can now talk about how you can begin training your very own Siamese

4
00:00:14,400 --> 00:00:14,970
network.

5
00:00:15,480 --> 00:00:16,530
So let's get started.

6
00:00:16,590 --> 00:00:18,900
So firstly, let's do a quick recap.

7
00:00:19,410 --> 00:00:24,930
Remember, the goal of a Siamese network is to classify whether two inputs, like these two images here,

8
00:00:25,350 --> 00:00:26,730
are similar or not.

9
00:00:27,150 --> 00:00:32,910
And to do that, we basically have to create an encoding using the conv layers of identical networks

10
00:00:32,910 --> 00:00:33,600
here and here.

11
00:00:34,230 --> 00:00:39,180
And then we can feed those into a differencing layer to compare them using Euclidean or cosine distance.

12
00:00:39,600 --> 00:00:43,290
And then we use a loss function here to output a similarity score.

13
00:00:44,340 --> 00:00:51,900
So in order to train a Siamese network, we basically have to create a paired dataset.

14
00:00:52,170 --> 00:00:54,210
Basically, I'll explain what that is to you.

15
00:00:54,960 --> 00:00:57,600
We firstly start by creating image pairs.

16
00:00:57,600 --> 00:00:58,650
So what is an image

17
00:00:58,660 --> 00:01:04,080
pair? Well, firstly, we need to create pairs where the images are of the same class, like a positive

18
00:01:04,080 --> 00:01:04,710
pair here.

19
00:01:05,400 --> 00:01:09,220
And then also pairs that are negative, where the images are from different classes.

20
00:01:09,330 --> 00:01:13,500
So we can see they're in the same class here, but they're in different classes here.

21
00:01:13,890 --> 00:01:17,520
So this pair becomes a dataset entry, where this is the label here.

22
00:01:18,300 --> 00:01:21,660
And likewise, this is the pair we are training on.

23
00:01:22,110 --> 00:01:25,320
And this is its label. Quite simple, isn't it?
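
The pairing step described above can be sketched in plain Python. This is only an illustrative sketch, not the lesson's own code: the function name `make_pairs`, the one-positive-plus-one-negative sampling scheme, and the convention that label 1 means "same class" and 0 means "different class" are all assumptions.

```python
import random
from collections import defaultdict


def make_pairs(samples, labels, seed=0):
    """Build one positive and one negative pair for every sample.

    Returns a list of ((sample_a, sample_b), pair_label) tuples.
    pair_label is 1 for a same-class pair and 0 for a different-class
    pair (an assumed convention). Needs at least two classes.
    """
    rng = random.Random(seed)

    # Group sample indices by class so partners can be looked up quickly.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    classes = list(by_class)

    pairs = []
    for idx, label in enumerate(labels):
        # Positive pair: a partner from the same class
        # (random choice, so it may occasionally be the sample itself).
        pos_idx = rng.choice(by_class[label])
        pairs.append(((samples[idx], samples[pos_idx]), 1))

        # Negative pair: a partner drawn from any other class.
        neg_label = rng.choice([c for c in classes if c != label])
        neg_idx = rng.choice(by_class[neg_label])
        pairs.append(((samples[idx], samples[neg_idx]), 0))
    return pairs
```

With images, `samples` would hold the image arrays and `labels` their class ids; the two halves of each pair are then fed to the two sub-networks.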

24
00:01:26,280 --> 00:01:32,700
So let's talk about how we build this network. We build a CNN that outputs a feature encoding,

25
00:01:32,700 --> 00:01:36,090
or embedding, using a fully connected layer at the end.

26
00:01:36,270 --> 00:01:38,070
That's a dense layer, in case you forgot.

27
00:01:38,640 --> 00:01:43,740
We build the sister network, which will have the exact same architecture, hyperparameters, and

28
00:01:43,740 --> 00:01:47,250
weights as the initial CNN we created here.

29
00:01:48,060 --> 00:01:53,880
We then build a differencing layer to calculate the Euclidean distance or cosine distance between the

30
00:01:53,880 --> 00:01:56,220
outputs of the two Siamese sub-networks.
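
As a rough illustration of what the differencing layer computes, here are the two distances written in plain Python on small embedding vectors. The function names are hypothetical, and a real model would compute the same quantities on tensors inside the network.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """One minus the cosine similarity of two non-zero embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)
```

A small distance means the two sub-networks produced similar embeddings, and a large one means they differ.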

31
00:01:57,150 --> 00:02:02,880
The final layer is a fully connected layer with a single node using a sigmoid activation function to

32
00:02:02,880 --> 00:02:04,800
output the similarity score.

33
00:02:05,820 --> 00:02:11,340
And lastly, we compile the model using one of the loss functions we discussed: contrastive, triplet,

34
00:02:11,340 --> 00:02:12,120
or binary.

35
00:02:12,510 --> 00:02:15,600
However, I would stick to contrastive or triplet in most cases.
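
For reference, one common form of the contrastive loss for a single pair looks like this in plain Python. It is a sketch under assumptions: the label convention (1 means similar, 0 means different) and the default margin of 1.0 are illustrative choices, and some formulations also scale each term by one half.

```python
def contrastive_loss(distance, is_similar, margin=1.0):
    """Contrastive loss for one pair, given the embedding distance.

    Similar pairs are penalised for being far apart. Different pairs
    are penalised only while they are closer than the margin.
    """
    if is_similar:
        return distance ** 2
    return max(margin - distance, 0.0) ** 2
```

During training this value would be averaged over all pairs in a batch.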

36
00:02:16,140 --> 00:02:23,160
And then you're basically ready to train, once we have our image dataset sorted into pairs and we

37
00:02:23,160 --> 00:02:25,080
have our model built and compiled.

38
00:02:25,590 --> 00:02:31,310
We can then use a typical optimizer, like RMSprop, a common gradient descent algorithm,

39
00:02:32,010 --> 00:02:34,600
and we can start training. Now, the CNNs

40
00:02:35,070 --> 00:02:40,170
are typically quite simple, as we only need to create relatively simple embeddings,

41
00:02:40,620 --> 00:02:43,620
meaning that, let's say, we're encoding these images here.

42
00:02:43,770 --> 00:02:51,300
This is a 28 by 28 image, and you can use maybe a 64-element one-dimensional vector.

43
00:02:51,310 --> 00:02:57,220
So one by 64, to store those embeddings, or something even smaller if you want; it will work.

44
00:02:57,240 --> 00:03:01,050
However, if you go too small, you lose some information in the network.

45
00:03:01,290 --> 00:03:07,050
So I would advise you to maybe stick to something like 64 or 128 for images this size.

46
00:03:07,590 --> 00:03:08,730
So that's it.

47
00:03:09,000 --> 00:03:13,350
You're basically ready to train a Siamese network right now.

48
00:03:13,830 --> 00:03:16,260
But let's take a look at how we do that in Keras.

49
00:03:16,740 --> 00:03:18,240
So stay tuned for that lesson.

50
00:03:18,510 --> 00:03:18,930
Thank you.
