1
00:00:12,020 --> 00:00:17,450
In this lecture I'm going to introduce you to the next section of the course, which is on facial recognition

2
00:00:17,480 --> 00:00:19,780
using Siamese networks.

3
00:00:19,850 --> 00:00:26,300
Facial recognition has been a very controversial topic in recent years, as governments and basically

4
00:00:26,300 --> 00:00:30,070
everyone who is not the government have competing interests.

5
00:00:30,170 --> 00:00:35,930
So it's important for everyone to understand facial recognition, its strengths and weaknesses, and what

6
00:00:35,930 --> 00:00:41,720
it can and cannot do, so that you can be knowledgeable about how this technology will impact the way

7
00:00:41,720 --> 00:00:42,290
we live.

8
00:00:47,520 --> 00:00:51,990
Let's outline everything we will do in this section before moving on to the meat of the content.

9
00:00:53,140 --> 00:00:58,810
I think what you'll find is that the theory behind facial recognition, for the algorithm we are about to discuss,

10
00:00:59,070 --> 00:01:01,210
is quite intuitive and simple.

11
00:01:01,210 --> 00:01:05,920
You can probably arrive at a decent understanding of it in just five minutes.

12
00:01:05,920 --> 00:01:12,040
The real challenge is going to be the little details that go into the implementation. Specifically, a

13
00:01:12,040 --> 00:01:15,380
lot of these little details have to do with data processing.

14
00:01:15,610 --> 00:01:19,430
It's not as straightforward as "here's a dataset of inputs and targets,

15
00:01:19,480 --> 00:01:25,510
now throw them into a CNN." Usually that would require a very large set of images, which technically we

16
00:01:25,510 --> 00:01:31,170
have, thanks to surveillance and people giving up their data to companies like Facebook and Instagram.

17
00:01:31,900 --> 00:01:37,510
But as we will learn, we can train a facial recognition network without thousands or millions

18
00:01:37,510 --> 00:01:43,270
of images.

19
00:01:43,390 --> 00:01:45,190
So how can this be?

20
00:01:45,190 --> 00:01:48,540
Think of how a facial recognition system might work.

21
00:01:48,550 --> 00:01:54,280
Suppose I have a picture of you, like, say, your student I.D. card, and when you walk into a building, a

22
00:01:54,280 --> 00:01:58,660
camera will take a picture of you and compare that picture to your I.D. card.

23
00:01:58,690 --> 00:02:01,640
So this is like a binary classification problem.

24
00:02:01,810 --> 00:02:04,590
We want to know: is it a match or not?

25
00:02:04,750 --> 00:02:08,660
If we're given two pictures of the same person, it should return a match.

26
00:02:08,770 --> 00:02:12,490
If we're given two pictures of different people, it should say it's not a match.
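
To make the match / no-match idea concrete, here is a minimal Python sketch. It is not the network itself, only an illustration of how labeled pairs could be built for a binary classifier. The picture names, the person labels, and the `labeled_pairs` helper are all made up for this example and are not taken from the course code.

```python
from itertools import combinations

# Hypothetical toy data: (picture_name, person) tuples, invented for illustration.
pictures = [
    ("id_card_alice", "alice"),
    ("door_camera_alice", "alice"),
    ("id_card_bob", "bob"),
]

def labeled_pairs(pictures):
    """Return (name_a, name_b, label) for every unordered pair of pictures.

    label is 1 when both pictures show the same person (a match)
    and 0 when they show different people (not a match).
    """
    pairs = []
    for (name_a, person_a), (name_b, person_b) in combinations(pictures, 2):
        pairs.append((name_a, name_b, int(person_a == person_b)))
    return pairs

for name_a, name_b, label in labeled_pairs(pictures):
    print(name_a, name_b, "match" if label == 1 else "not a match")
```

Each pair becomes one training example with a binary target, which is the sense in which this is "like a binary classification problem."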

27
00:02:17,590 --> 00:02:17,960
All right.

28
00:02:17,990 --> 00:02:21,030
So why does this approach require less data?

29
00:02:21,050 --> 00:02:26,150
I don't want to get into too much detail right now, but it all has to do with how we always want to consider

30
00:02:26,150 --> 00:02:27,600
pairs.

31
00:02:27,620 --> 00:02:30,560
Imagine we have three pictures: A, B, and C.

32
00:02:30,770 --> 00:02:32,990
How many possible pairs are there?

33
00:02:32,990 --> 00:02:39,140
We have AB, AC, and BC. Now suppose we have four pictures: A, B, C, and D.

34
00:02:39,230 --> 00:02:41,300
How many possible pairs are there?

35
00:02:41,300 --> 00:02:47,700
Now we have AB, AC, AD, BC, BD, and CD. That's six.

36
00:02:47,720 --> 00:02:49,460
So what's the pattern?

37
00:02:49,550 --> 00:02:52,280
In fact this is what we call a counting problem.

38
00:02:52,430 --> 00:02:58,280
If we have N pictures, then there are "N choose 2" possible ways to pair up different pictures.

39
00:02:58,370 --> 00:03:01,810
This is equal to N times (N minus 1), divided by 2.
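
If you want to check the counting yourself, here is a small Python sketch that enumerates the pairs and compares the count against N(N - 1)/2. The `all_pairs` helper is a name invented for this illustration; it only uses the standard library's `itertools.combinations` and `math.comb`.

```python
from itertools import combinations
from math import comb

def all_pairs(items):
    """Return every unordered pair of distinct items as a list of tuples."""
    return list(combinations(items, 2))

pictures = ["A", "B", "C", "D"]
pairs = all_pairs(pictures)
print(pairs)

# The enumerated count agrees with "N choose 2" and with N * (N - 1) / 2.
n = len(pictures)
assert len(pairs) == comb(n, 2) == n * (n - 1) // 2
```

The number of pairs grows roughly with the square of the number of pictures, so even a modest set of images yields many pairs to train on.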

40
00:03:06,940 --> 00:03:13,180
So a lot of this section is just going to be figuring out how to load in the data, split it up appropriately,

41
00:03:13,420 --> 00:03:16,510
and feed it into our facial recognition network.

42
00:03:16,720 --> 00:03:21,550
Of course, we'll still need to discuss the theory behind the algorithm and the special loss function

43
00:03:21,550 --> 00:03:23,950
that we use in this scenario.

44
00:03:23,950 --> 00:03:29,470
You'll also learn why this approach leads to imbalanced classes and how to evaluate the model, which

45
00:03:29,470 --> 00:03:31,780
is not as straightforward as usual.

46
00:03:31,990 --> 00:03:34,060
Thanks for listening and I'll see you in the next lecture.