1
00:00:00,060 --> 00:00:01,170
Hi and welcome back.

2
00:00:01,320 --> 00:00:07,770
So in this lesson, lesson 41, we'll be taking a look at using the Python library called DeepFace,

3
00:00:07,770 --> 00:00:14,430
which is a very cool and useful library for extracting things like age, gender, expression, as well

4
00:00:14,430 --> 00:00:18,330
as performing facial recognition with a number of different models they have built in.

5
00:00:18,840 --> 00:00:19,970
So let's get started.

6
00:00:20,000 --> 00:00:20,700
Let's take a look.

7
00:00:20,730 --> 00:00:23,550
So open notebook 41 and we'll begin.

8
00:00:24,390 --> 00:00:29,400
So firstly, we have to install DeepFace here, and it also uses dlib as well.

9
00:00:29,820 --> 00:00:31,380
So let's install both of them.

10
00:00:31,560 --> 00:00:32,660
Actually, it doesn't need dlib,

11
00:00:32,930 --> 00:00:39,210
but I use dlib to test and demonstrate a couple of the facial landmark activities in this lesson,

12
00:00:39,810 --> 00:00:42,930
because in this lesson, what we're doing, we're first doing facial landmarks.

13
00:00:43,350 --> 00:00:49,170
Then age, gender, emotional expression, and ethnicity using DeepFace, and then we're going to

14
00:00:49,170 --> 00:00:52,710
perform facial similarity as well as facial recognition.

15
00:00:53,310 --> 00:00:59,040
Also, we do have to change the RAM to high-RAM, which I forgot to mention. It's high-RAM by default.

16
00:00:59,040 --> 00:01:00,630
So that's good in this notebook.

17
00:01:01,200 --> 00:01:05,670
So DeepFace has been installed, so we can continue with the lesson.

18
00:01:06,570 --> 00:01:10,950
So this is just our imshow function, and we load all our libraries here.

19
00:01:12,570 --> 00:01:18,570
Next, we have to download our facial landmark library because, in case you missed it before or you skipped

20
00:01:18,570 --> 00:01:23,910
that part of the course, which is the dlib facial landmarks section, I'm just going to go over it quickly

21
00:01:23,910 --> 00:01:24,870
in this section here.

22
00:01:25,260 --> 00:01:29,160
So what we'll do next is download all our test images that we'll be using.

23
00:01:29,820 --> 00:01:34,800
So let's do that, and now we can demonstrate the facial landmark project.

24
00:01:34,860 --> 00:01:36,450
So this is quite simple here.

25
00:01:38,220 --> 00:01:42,720
So to demonstrate the facial landmark project, what we need to do, we need to have a detector and

26
00:01:42,720 --> 00:01:43,330
a predictor.

27
00:01:43,770 --> 00:01:49,290
The detector basically detects a face from an image, just a bounding box around a face, and the

28
00:01:49,290 --> 00:01:51,980
predictor assigns landmarks onto that face.

29
00:01:52,000 --> 00:01:53,640
So we have 68 landmarks here.

30
00:01:54,150 --> 00:01:59,580
So here we just specify a path to this dlib predictor, the facial landmark model.

31
00:02:00,180 --> 00:02:04,320
And we have our detector and our predictor. We load our image and grayscale it.

32
00:02:04,770 --> 00:02:06,510
Get the faces out of it here.

33
00:02:06,930 --> 00:02:12,810
And then for the faces in that image, we just pass that face, that cropped face, to the predictor and

34
00:02:12,810 --> 00:02:14,820
then we draw the bounding box around it.

35
00:02:15,180 --> 00:02:21,450
And then we just draw circles, tiny circles around to represent each facial landmark.

36
00:02:21,570 --> 00:02:24,480
So let's run that and we'll take a look at the output.

37
00:02:24,610 --> 00:02:27,040
Now there we go.

38
00:02:27,090 --> 00:02:33,060
So you can see, here's my face here and you can see we have all the landmarks around my eyes, my nose,

39
00:02:33,450 --> 00:02:35,700
my lips. The lips are a little bit off.

40
00:02:35,940 --> 00:02:37,500
That's because it's a side-on view.

41
00:02:38,460 --> 00:02:43,140
Also, I'm not sure, this is supposed to be all around the face, but it cuts into my chin a bit.

42
00:02:43,560 --> 00:02:44,550
But it's pretty good.

43
00:02:44,730 --> 00:02:48,420
I mean, it works best with a full frontal face, but this is pretty good too.

44
00:02:48,420 --> 00:02:51,090
It captured my eyes and my eyebrows quite well.

45
00:02:51,810 --> 00:02:55,920
So now what we're going to do, we're going to use DeepFace to do something very cool.

46
00:02:56,340 --> 00:03:01,200
We're going to get age, gender, emotional expression and ethnicity out of it.

47
00:03:01,470 --> 00:03:07,730
So the reason we have to download all the models is that, at least the last time I tried it,

48
00:03:07,730 --> 00:03:10,920
DeepFace was supposed to download the models automatically,

49
00:03:10,920 --> 00:03:13,110
but the links were dead for some reason.

50
00:03:13,110 --> 00:03:19,320
So I had to capture them and then download them and put them on my Google Drive so we can use them here.

51
00:03:20,280 --> 00:03:21,960
So let's download all those models.

52
00:03:22,440 --> 00:03:26,970
It doesn't take very long because Google's backend has a very fast connection, as we could see here.

53
00:03:28,350 --> 00:03:33,570
Even though those models are quite big, they're about 500 megs each, which is quite large.

54
00:03:33,570 --> 00:03:38,550
Actually, you can see why these models can't work on embedded devices like cell phones; they're

55
00:03:38,550 --> 00:03:39,420
just way too big.

56
00:03:41,920 --> 00:03:47,830
And this is our race model, gender model, age model, and the tiny one is the facial expression model.

57
00:03:52,330 --> 00:03:54,860
OK, well, maybe it might work. Let's try it.

58
00:03:54,950 --> 00:03:56,720
I'll fix this error for you guys afterward.

59
00:03:57,620 --> 00:04:00,170
Let's see if the DeepFace models are there anyway.

60
00:04:00,350 --> 00:04:03,320
And I believe they are, because this looks like it's running.

61
00:04:05,520 --> 00:04:10,710
Yeah, it's downloading the models anyway, so we didn't need to do that. So they fixed the bug where

62
00:04:10,710 --> 00:04:12,720
it wasn't downloading these models before.

63
00:04:13,650 --> 00:04:15,330
So apologies for that.

64
00:04:22,860 --> 00:04:27,480
OK, so it took roughly just over a minute to download all these models from the DeepFace GitHub

65
00:04:27,480 --> 00:04:27,930
repo.

66
00:04:28,440 --> 00:04:31,080
So now we can use this, so what we're going to do.

67
00:04:31,620 --> 00:04:33,570
We're going to get the emotions.

68
00:04:34,110 --> 00:04:39,530
We're going to run this and get age, gender, race and emotions out of the DeepFace module.

69
00:04:39,540 --> 00:04:43,910
So to do that, we just load an image using OpenCV here.

70
00:04:43,920 --> 00:04:48,660
So we have the image, and then, actually, we don't need an image for DeepFace.

71
00:04:48,930 --> 00:04:51,220
We just need an image path, which we declare here.

72
00:04:51,690 --> 00:04:56,910
And then we just specify what we want to analyze in this array here, this list, and then we can get

73
00:04:56,910 --> 00:05:03,330
our object. It returns a dictionary, a JSON type of file, and then we can just use pretty

74
00:05:03,330 --> 00:05:04,820
print to print the results.

75
00:05:04,830 --> 00:05:08,400
So let's run this and take a look at the output.

76
00:05:09,750 --> 00:05:10,080
OK.

77
00:05:10,230 --> 00:05:11,280
So this is pretty cool.

78
00:05:11,340 --> 00:05:13,650
You can see how much information it gives us from this picture.

79
00:05:14,370 --> 00:05:15,990
Yes, I'm still here, Colab.

80
00:05:16,620 --> 00:05:18,540
So you can see the dominant emotion is sad,

81
00:05:18,540 --> 00:05:22,200
even though she's not sad here. The dominant race is

82
00:05:22,200 --> 00:05:26,760
Asian. She's Arab Indian. Emotion:

83
00:05:26,790 --> 00:05:32,130
You can see it attributes different scores to each emotion. You can see it attributed the most to sad,

84
00:05:32,460 --> 00:05:33,200
then neutral.

85
00:05:33,210 --> 00:05:35,950
The gender, woman, is quite right.

86
00:05:35,970 --> 00:05:37,260
You can see the different races here.

87
00:05:37,280 --> 00:05:42,930
So Indian did get a high score as well, as well as Latino Hispanic, which she does have

88
00:05:42,930 --> 00:05:44,100
some in her as well.

89
00:05:44,820 --> 00:05:49,710
And then you can see this is the region; this is the bounding box for the face as well.

90
00:05:50,310 --> 00:05:56,250
So let's create a simple function that can display these results on the image, so we wouldn't have

91
00:05:56,250 --> 00:05:58,740
to keep looking at this JSON file here.

92
00:05:59,250 --> 00:06:06,720
So let's take another image and we'll input it into our DeepFace analyze, and we can get a nice little

93
00:06:06,720 --> 00:06:15,300
descriptor like this sitting on the face: the age, happy, Latino Hispanic. It's not too correct, but

94
00:06:15,300 --> 00:06:16,290
close enough anyway.

95
00:06:17,760 --> 00:06:20,790
So you can see this is quite good, though it's quite useful, isn't it?

96
00:06:21,330 --> 00:06:24,420
You can do a lot of analytics with DeepFace.

97
00:06:25,290 --> 00:06:26,780
So now, what we can do:

98
00:06:26,790 --> 00:06:32,310
we can change backends, and you can see they have several different backends here for DeepFace.

99
00:06:32,310 --> 00:06:38,190
And they may have more, because they're constantly updating this library, so they might

100
00:06:38,190 --> 00:06:39,570
have more backends available.

101
00:06:40,320 --> 00:06:45,840
So what we can do, we can switch backends here, so we can use SSD, we can use

102
00:06:45,840 --> 00:06:52,260
OpenCV, dlib, MTCNN, which you've seen before with FaceNet, as well as a new one

103
00:06:52,260 --> 00:06:54,090
you're unfamiliar with, that's RetinaFace.

104
00:06:54,630 --> 00:06:57,090
So we'll demonstrate the results here with SSD.

105
00:06:57,360 --> 00:06:59,130
So let's take a look at that.

106
00:07:01,390 --> 00:07:03,460
So it has downloaded the model, as you can see here.

107
00:07:04,150 --> 00:07:07,620
And then we get the result here, so we get 33 and sad.

108
00:07:08,140 --> 00:07:11,410
I'm actually a bit older, so this is quite complimentary.

109
00:07:12,220 --> 00:07:13,660
Let's try a different model now.

110
00:07:13,780 --> 00:07:16,330
Let's try MTCNN.

111
00:07:23,020 --> 00:07:23,540
There we go.

112
00:07:23,560 --> 00:07:28,000
So, again, 33, neutral, Indian, so you can see this is looking quite good.

113
00:07:28,360 --> 00:07:32,890
And you can experiment and mess with different models and see which gives you the best results, although

114
00:07:33,520 --> 00:07:36,940
they generally all have strengths and weaknesses, so it depends on your data sets.

115
00:07:37,630 --> 00:07:41,860
So now what we're going to do, we're going to perform facial similarity.

116
00:07:42,400 --> 00:07:47,920
So DeepFace has a verify function that compares images here and returns a result.

117
00:07:47,920 --> 00:07:50,770
So the result is the verified result here.

118
00:07:51,220 --> 00:07:52,860
So verified being

119
00:07:52,860 --> 00:07:55,600
true means that they're the same person; verified

120
00:07:55,810 --> 00:07:57,490
being false means that they are different people.

121
00:07:57,640 --> 00:08:00,010
So let's take a look at the results.

122
00:08:05,080 --> 00:08:09,190
So again, we have to download the weights, so again, apologies for the wait.

123
00:08:09,210 --> 00:08:11,820
This might just be about 30 seconds or so.

124
00:08:17,630 --> 00:08:23,270
OK, so we've got the results here, and you can see verified is true, meaning that the two images

125
00:08:23,270 --> 00:08:25,610
I compared, of my wife, are the same person.

126
00:08:26,060 --> 00:08:32,270
And you can see some of the metrics they use: the threshold value was 0.4, VGG-Face is

127
00:08:32,270 --> 00:08:38,000
the comparison model, which we've used before, the similarity metric was cosine, and verified being true

128
00:08:38,050 --> 00:08:39,080
is the final result here.

129
00:08:39,740 --> 00:08:40,760
So that's pretty cool.

130
00:08:40,790 --> 00:08:45,110
So what if we use different similarity metrics? You can use cosine or Euclidean.

131
00:08:45,560 --> 00:08:52,850
So let's do Euclidean comparison on the same images as well, and you can see verified as true as well,

132
00:08:53,360 --> 00:08:55,550
but you can see the threshold has changed.

133
00:08:55,550 --> 00:08:57,740
Now it's 0.6 and still within the threshold.

134
00:08:58,160 --> 00:09:03,320
However, the values do change according to which metric you use, and you can see these are the different

135
00:09:03,320 --> 00:09:04,640
metrics that are available here.

136
00:09:04,640 --> 00:09:08,690
We have cosine, Euclidean, and Euclidean L2 norm.

137
00:09:09,410 --> 00:09:12,860
So let's try that as well and get the results.

138
00:09:12,860 --> 00:09:15,410
And they're all at least for these images here.

139
00:09:15,830 --> 00:09:17,540
They're all giving us the correct results.

140
00:09:17,660 --> 00:09:18,530
So that's pretty cool.

141
00:09:19,100 --> 00:09:21,020
So I don't think I need to do this.

142
00:09:21,170 --> 00:09:23,900
The DeepFace models seem to be downloading well.

143
00:09:24,740 --> 00:09:27,680
So let's perform facial recognition.

144
00:09:28,580 --> 00:09:33,870
So what we're going to do, we're going to keep using DeepFace, DeepFace.verify.

145
00:09:33,890 --> 00:09:34,250
Sorry,

146
00:09:34,610 --> 00:09:35,360
that's find.

147
00:09:35,780 --> 00:09:40,640
We have an input image here and then we have a directory of images here called training faces.

148
00:09:41,150 --> 00:09:44,780
That's what we downloaded initially and we've downloaded some set of files.

149
00:09:45,260 --> 00:09:46,670
So we have a bunch of faces here.

150
00:09:46,670 --> 00:09:52,640
We have what six pictures of my wife Maria, one of J.Lo, one of Jennifer Aniston, one of Lady Gaga,

151
00:09:53,330 --> 00:09:55,220
and what we're going to do:

152
00:09:55,250 --> 00:10:01,010
We're going to get a nice little output data frame where we have the cosine distances for each one.

153
00:10:01,460 --> 00:10:04,400
And we're going to use the SSD backend for this experiment.

154
00:10:04,670 --> 00:10:09,200
And you can feel free to try different experiments. And you can see, obviously, this is the same

155
00:10:09,560 --> 00:10:10,790
as this one,

156
00:10:10,790 --> 00:10:16,400
so the cosine distance is minus 10 to the minus 16. Next, you can

157
00:10:16,400 --> 00:10:23,090
see all the images of Maria have gotten very low scores, and the one with J.Lo got the lowest.

158
00:10:23,480 --> 00:10:25,350
So this is pretty cool.

159
00:10:26,360 --> 00:10:26,510
So.

160
00:10:27,020 --> 00:10:33,470
So let's now try it with a few different models here, so we can actually now loop through all of the

161
00:10:33,470 --> 00:10:34,060
different models.

162
00:10:34,060 --> 00:10:36,140
So we have VGG-Face, Facenet,

163
00:10:36,140 --> 00:10:38,720
OpenFace, DeepFace, DeepID,

164
00:10:38,930 --> 00:10:40,160
ArcFace, and Dlib.

165
00:10:40,160 --> 00:10:44,990
So we have a ton of different models we can use in the DeepFace find module.

166
00:10:45,530 --> 00:10:48,680
So what we do, we just specify the model in here.

167
00:10:49,130 --> 00:10:55,610
The detector backend is what we use to detect the face, so we can use SSD or we can try dlib for a change,

168
00:10:56,570 --> 00:11:00,800
and it's going to basically give us similarity scores now for each one.

169
00:11:00,890 --> 00:11:02,780
So let's take a look at this.

170
00:11:09,030 --> 00:11:13,170
So it may have to download some models here because we didn't download all of the models that we'd be

171
00:11:13,170 --> 00:11:13,560
using.

172
00:11:13,980 --> 00:11:21,570
So the first one it downloaded was the dlib predictor for landmarks, and then it downloaded the Facenet

173
00:11:21,570 --> 00:11:22,320
weights here.

174
00:11:42,000 --> 00:11:46,470
OK, so there we have all these vast results, so let's take a look at what's happening here.

175
00:11:47,010 --> 00:11:52,230
So it took a while to run, took about three minutes because we had to download all these models here,

176
00:11:52,740 --> 00:11:56,430
but once we downloaded the models, actually, the inference was quite fast.

177
00:11:57,030 --> 00:11:58,470
So you can see what we're doing.

178
00:11:58,470 --> 00:12:05,250
We're comparing the input image here, which is the underscore-one image, and we're comparing it now

179
00:12:05,250 --> 00:12:08,730
to all of the images in the directory.

180
00:12:08,760 --> 00:12:13,170
So we have images four, three, five, six, two, and so on.

181
00:12:13,740 --> 00:12:17,580
And you can see the reason why there are none of the other J.Lo images:

182
00:12:18,000 --> 00:12:22,080
unfortunately, it is because our face detector did not detect them.

183
00:12:22,590 --> 00:12:29,010
So let's try SSD, and hopefully that gives us some more detections.

184
00:12:29,790 --> 00:12:33,820
And you can see, yes, so it got the J.Lo one there, but didn't get it for the others.

185
00:12:34,260 --> 00:12:36,720
But regardless, let's see how we interpret this.

186
00:12:37,200 --> 00:12:39,420
So you can see this is the model being used here.

187
00:12:39,660 --> 00:12:41,970
This is the cosine distance we've seen before.

188
00:12:42,570 --> 00:12:45,240
Again, this is now the Facenet cosine.

189
00:12:45,690 --> 00:12:52,170
You can see the distances here for each one. For OpenFace, we didn't get any results for some reason.

190
00:12:52,710 --> 00:12:54,750
Similarly for DeepFace and DeepID.

191
00:12:55,230 --> 00:12:57,870
But for ArcFace, we got the results as well.

192
00:12:58,050 --> 00:13:00,260
So you can see it says image

193
00:13:00,260 --> 00:13:05,010
one is the closest, which makes sense because that's the original image, and you can see the rest in order

194
00:13:05,040 --> 00:13:05,970
of similarity.

195
00:13:06,510 --> 00:13:07,200
So it's it here.

196
00:13:07,320 --> 00:13:10,110
So you can see they all have slightly different ordering.

197
00:13:10,890 --> 00:13:14,040
One, five, four, three; one, five, six.

198
00:13:14,730 --> 00:13:18,870
This is one, four, five, so one and five do appear quite similar.

199
00:13:19,560 --> 00:13:25,920
So let's take a look at one and five, or six, which do seem to be the most similar images.

200
00:13:29,650 --> 00:13:35,500
OK, so I guess those do look a bit similar to me, although, I mean, they are all of the same person

201
00:13:35,500 --> 00:13:35,890
anyway.

202
00:13:36,850 --> 00:13:44,680
So that's it for this lesson. There is a very cool tutorial on how you can build your very own MongoDB facial

203
00:13:44,680 --> 00:13:48,220
recognition engine using DeepFace as well, so you can check it out.

204
00:13:48,730 --> 00:13:55,210
And that's it for this lesson. What we'll do next is start the object detection lessons.

205
00:13:55,210 --> 00:14:00,760
So we'll go through some slides of the material for the different types of networks, types of object

206
00:14:00,760 --> 00:14:06,200
detection, and then we'll begin the many lessons we have on object detection.

207
00:14:06,340 --> 00:14:07,180
All very cool.

208
00:14:07,750 --> 00:14:09,100
So I'll see you then.

209
00:14:09,310 --> 00:14:09,730
Thank you.