1
00:00:11,800 --> 00:00:16,900
In this lecture, we are going to summarize everything we learned in this section of the course.

2
00:00:16,900 --> 00:00:21,330
This section was all about facial recognition with Siamese networks.

3
00:00:21,370 --> 00:00:26,620
One thing you saw in this section was that while the theory was relatively simple, the implementation

4
00:00:26,620 --> 00:00:33,640
was complex, not because of the model itself but because of all the data processing. Siamese networks

5
00:00:33,640 --> 00:00:35,750
by themselves are quite simple.

6
00:00:35,830 --> 00:00:41,710
We pass two images through the same CNN, and this gives us a face embedding for each image.

7
00:00:41,710 --> 00:00:46,630
Then we would like to know the distance between these two embeddings, just like we do with word

8
00:00:46,630 --> 00:00:54,640
embeddings. That tells us how similar the two faces are. If the two faces are so similar that the distance

9
00:00:54,640 --> 00:01:00,220
between the embeddings is less than some threshold, then we predict that they are the same person.

10
00:01:00,220 --> 00:01:04,750
If the faces are from different people, the distance between the embeddings should be greater than the

11
00:01:04,750 --> 00:01:09,680
threshold, so that we can predict that they are different people.

12
00:01:09,700 --> 00:01:15,490
In reality, suppose you are running some kind of facial recognition database at your company. Instead of always passing

13
00:01:15,490 --> 00:01:22,960
in two faces at a time, something simpler you could do is just store the embeddings themselves in a database.

14
00:01:23,020 --> 00:01:28,750
So for all the employees of the company, or all of the students in a school, don't store their images,

15
00:01:28,810 --> 00:01:35,050
just store their embeddings. Then, when it's prediction time, you only have to calculate the embedding

16
00:01:35,350 --> 00:01:41,140
for the incoming image and compare that to all the existing embeddings in your database to determine

17
00:01:41,200 --> 00:01:47,610
which person it is.
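
As a rough sketch of the lookup just described, assuming Euclidean distance and a plain dictionary standing in for the database: the names `identify`, `database`, and `threshold`, and the tiny 3-dimensional vectors, are made up for illustration. Real embeddings would come from the trained CNN.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(query_embedding, database, threshold):
    """Return the name whose stored embedding is closest to the query,
    or None if even the closest one is not within `threshold`."""
    best_name, best_dist = None, float("inf")
    for name, stored in database.items():
        dist = euclidean_distance(query_embedding, stored)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist < threshold else None

# Hypothetical embeddings (assumption: real ones are produced by the CNN).
database = {"alice": [0.1, 0.9, 0.3], "bob": [0.8, 0.2, 0.5]}

match = identify([0.12, 0.88, 0.31], database, threshold=0.5)  # close to "alice"
unknown = identify([5.0, 5.0, 5.0], database, threshold=0.5)   # far from everyone
```

Only the incoming image goes through the network at prediction time. Everything else is a distance computation against stored vectors.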

18
00:01:47,760 --> 00:01:52,590
Next, we looked at all the data preprocessing steps, of which there were many.

19
00:01:52,590 --> 00:01:57,420
First, we loaded in the data and split the images up into train and test.

20
00:01:57,420 --> 00:02:03,210
It was important to realize that the data being fed into the neural network itself should not be split up, but

21
00:02:03,210 --> 00:02:06,960
rather we should split up the images to start with.

22
00:02:06,960 --> 00:02:13,050
This is so that none of the images in the train set will appear in the test set. This is because

23
00:02:13,050 --> 00:02:18,930
the actual data set we train the neural network on simply consists of the same images being paired up

24
00:02:18,930 --> 00:02:21,950
amongst themselves multiple times.

25
00:02:22,020 --> 00:02:28,260
We don't start with a lot of images, but we end up with a lot of training samples, which grow quite dramatically

26
00:02:28,540 --> 00:02:31,980
with the number of images we have.
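
To make that growth concrete, here is a minimal sketch that pairs every image with every other image inside one split. This is only one of many possible pairing schemes, not necessarily the one used in this section, and `make_pairs` and `train_labels` are hypothetical names. With N images there are N * (N - 1) / 2 unordered pairs.

```python
from itertools import combinations

def make_pairs(labels):
    """Return (i, j, is_match) for every unordered pair of image indices.

    `labels[i]` is the person ID of image i; is_match is 1 when both
    images show the same person and 0 otherwise.
    """
    return [(i, j, int(labels[i] == labels[j]))
            for i, j in combinations(range(len(labels)), 2)]

# Hypothetical person IDs for 8 training images (split off BEFORE pairing,
# so no image can end up in both the train pairs and the test pairs).
train_labels = [0, 0, 0, 1, 1, 2, 2, 2]

pairs = make_pairs(train_labels)
n_pos = sum(is_match for _, _, is_match in pairs)
n_neg = len(pairs) - n_pos
print(len(pairs), n_pos, n_neg)  # 28 pairs: 7 matches, 21 non-matches
```

Even in this tiny example the non-matching pairs outnumber the matching ones three to one.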

27
00:02:32,180 --> 00:02:35,900
Next, we looked at how to convert the data into pairs.

28
00:02:35,930 --> 00:02:41,120
There are many ways of doing this and sometimes it can be difficult to try and understand what someone

29
00:02:41,120 --> 00:02:42,770
else's approach is.

30
00:02:42,770 --> 00:02:46,940
So, as always, the most important thing is to code it up yourself.

31
00:02:46,940 --> 00:02:51,640
This always leads to the best understanding.

32
00:02:51,660 --> 00:02:56,760
Next, we wrote some data generator functions so that we could iterate over these in our training loop to generate

33
00:02:56,760 --> 00:03:02,640
data on the fly and not have to store them in gigantic redundant arrays.
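
A generator along these lines might look like the following. This is only a sketch under assumptions: `pair_batch_generator` is a hypothetical name, pairs are stored as `(i, j, is_match)` index triples, and the images are looked up only when a batch is built, so the full paired arrays are never held in memory.

```python
import random

def pair_batch_generator(images, pairs, batch_size, seed=0):
    """Yield (x1, x2, y) batches forever, reshuffling the pairs each epoch.

    `pairs` holds (i, j, is_match) index triples, so each image is stored
    once in `images` no matter how many pairs it takes part in.
    """
    rng = random.Random(seed)
    while True:
        order = list(pairs)
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = order[start:start + batch_size]
            x1 = [images[i] for i, _, _ in batch]
            x2 = [images[j] for _, j, _ in batch]
            y = [is_match for _, _, is_match in batch]
            yield x1, x2, y

# Placeholder strings stand in for image arrays (assumption for illustration).
images = ["img0", "img1", "img2"]
pairs = [(0, 1, 1), (0, 2, 0), (1, 2, 0)]

gen = pair_batch_generator(images, pairs, batch_size=2)
x1, x2, y = next(gen)  # one batch of two pairs, built on demand
```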

34
00:03:02,640 --> 00:03:06,840
Finally, we wrote a function to help us evaluate the Siamese networks.

35
00:03:06,840 --> 00:03:13,530
In particular, it would draw a histogram of the match distances versus the non-match distances and print

36
00:03:13,530 --> 00:03:16,980
out the sensitivity and specificity.

37
00:03:17,010 --> 00:03:23,240
This is better than accuracy because we have imbalanced classes due to the nature of the data.

38
00:03:23,250 --> 00:03:27,420
We will always have more samples from the negative class than the positive class.
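
As an illustration of those two metrics, here is a small sketch, assuming a pair is predicted to be a match when its distance falls below the threshold. The function name and the distance values are invented for the example.

```python
def sensitivity_specificity(match_distances, nonmatch_distances, threshold):
    """Sensitivity: fraction of true matches predicted as matches.
    Specificity: fraction of true non-matches predicted as non-matches.
    A pair is predicted to match when its distance is below `threshold`.
    """
    true_pos = sum(d < threshold for d in match_distances)
    true_neg = sum(d >= threshold for d in nonmatch_distances)
    return (true_pos / len(match_distances),
            true_neg / len(nonmatch_distances))

# Made-up distances: 3 matching pairs and 6 non-matching pairs.
match_d = [0.2, 0.4, 0.9]
nonmatch_d = [0.3, 1.1, 1.5, 2.0, 1.8, 1.2]

sens, spec = sensitivity_specificity(match_d, nonmatch_d, threshold=0.5)
# sens is 2/3 (two of three matches found); spec is 5/6.
```

Each metric is computed within its own class, so the larger negative class cannot hide poor performance on the positive class the way a single accuracy number can.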