1
00:00:00,120 --> 00:00:06,180
Hi, and welcome back to the course. In this section, we're going to load a few pre-trained networks,

2
00:00:06,180 --> 00:00:12,120
such as VGG, ResNet, Inception, MobileNet and a couple of others using Keras, just like we did

3
00:00:12,120 --> 00:00:12,780
with PyTorch.

4
00:00:12,780 --> 00:00:14,850
We're going to do the same now with Keras.

5
00:00:15,300 --> 00:00:17,790
So let's open this notebook and begin the lesson.

6
00:00:17,940 --> 00:00:20,430
So this notebook should load quickly.

7
00:00:24,190 --> 00:00:24,640
There we go.

8
00:00:25,600 --> 00:00:30,490
So these are the networks we're going to load now in this lesson, so you can take a look at this link

9
00:00:30,490 --> 00:00:30,760
here.

10
00:00:31,180 --> 00:00:36,520
This link will show you all the networks that are available in Keras, and there are quite a few, including

11
00:00:36,520 --> 00:00:40,000
the EfficientNet family of networks.

12
00:00:40,660 --> 00:00:43,600
And, like I explained in the PyTorch lesson, here's what we're doing.

13
00:00:43,900 --> 00:00:49,840
We're loading networks that have been trained on the ImageNet dataset, and Keras gives us a nice

14
00:00:49,840 --> 00:00:55,750
breakdown on the website of the accuracy, so you can see what the accuracy is for all of these networks,

15
00:00:55,750 --> 00:00:58,510
except, for some reason, for the EfficientNet family.

16
00:00:59,140 --> 00:01:06,220
But you can see NASNetLarge, which is 343 megs, which is a big network of 88 million parameters.

17
00:01:06,790 --> 00:01:10,030
You can see that's the best-performing network, but it's quite big.

18
00:01:10,200 --> 00:01:13,360
The biggest one available actually out of all of them.

19
00:01:13,990 --> 00:01:18,940
And you can see the inference time on CPU and GPU, which is quite nice to see.

20
00:01:19,810 --> 00:01:21,060
So you can see the EfficientNet

21
00:01:21,060 --> 00:01:23,620
model is actually the slowest at inference on GPUs.

22
00:01:24,130 --> 00:01:29,410
It takes over a second for one image on the EfficientNetB7 backbone.

23
00:01:30,010 --> 00:01:32,770
So, I mean, it's interesting to see, and it's good that they have it.

24
00:01:33,310 --> 00:01:35,740
So let's see how we load these networks.

25
00:01:35,860 --> 00:01:36,250
OK.

26
00:01:36,370 --> 00:01:40,540
So to load a network in Keras is actually quite simple.

27
00:01:40,960 --> 00:01:42,260
First, we have to import it.

28
00:01:42,280 --> 00:01:47,740
So we just do this import from tensorflow.keras.applications, and this is where the networks are.

29
00:01:48,250 --> 00:01:50,680
So we import from vgg16.

30
00:01:50,890 --> 00:01:54,250
And we import it as VGG16 in caps.

31
00:01:54,520 --> 00:01:56,080
We can specify the same thing.

32
00:01:56,080 --> 00:01:59,560
You could have given it the same lowercase name, vgg16, if you wanted.

33
00:02:00,190 --> 00:02:06,730
And we're going to import the pre-processing image library, as well as this one here, which is specific

34
00:02:06,730 --> 00:02:11,740
to VGG, and decode_predictions, which is also specific to VGG.

35
00:02:12,190 --> 00:02:17,040
So in PyTorch, we didn't have these functions available,

36
00:02:17,050 --> 00:02:23,650
unfortunately. However, Keras gives us these nice preprocessing functions so that we can quickly use

37
00:02:23,650 --> 00:02:27,460
them and run inference on our test images.

38
00:02:27,850 --> 00:02:33,670
Using a loaded network like this. And to load the network, all you have to do is just specify the

39
00:02:33,670 --> 00:02:34,000
weights.

40
00:02:34,210 --> 00:02:40,570
So weights equals imagenet, which I think is the only available weights anyway, and set it equal to model,

41
00:02:40,570 --> 00:02:42,280
and that loads the model, effectively.

42
00:02:42,760 --> 00:02:48,400
So let's run that block of code, which will take about 10 seconds because we have to connect to a virtual

43
00:02:48,400 --> 00:02:49,370
machine instance.

44
00:02:49,390 --> 00:02:49,990
There we go.

45
00:02:50,140 --> 00:02:53,560
And it should take a little bit longer, maybe a handful of seconds.

46
00:02:55,540 --> 00:02:56,620
It's taking a bit longer.

47
00:02:56,620 --> 00:02:59,890
But yeah, it's taking about five seconds to download the model.

48
00:03:00,670 --> 00:03:01,120
There we go.

49
00:03:01,540 --> 00:03:07,270
So we've loaded the model now and you can see 138 million parameters, which is quite big.

50
00:03:08,020 --> 00:03:09,970
And now, let's get our test images here.

51
00:03:10,390 --> 00:03:16,320
So we've downloaded our test images, and because I zipped them on a Mac, I'm deleting the .DS_Store file

52
00:03:16,780 --> 00:03:22,240
so that it doesn't crash when we go into the loop. There are other ways we could have done that.

53
00:03:22,250 --> 00:03:27,370
That's just an easy way, but it's probably better to handle it programmatically in the

54
00:03:27,370 --> 00:03:29,770
code, so check to see if a file is an image.

55
00:03:30,490 --> 00:03:37,520
Nevertheless, let's begin this lesson. So our images are now stored at this path here,

56
00:03:37,540 --> 00:03:44,530
images/class1, and we have our test images, the same images we used in our PyTorch pre-trained

57
00:03:44,530 --> 00:03:49,180
models lesson. So what we do is just gather the file names here.

58
00:03:49,780 --> 00:03:54,550
So let's get all the file names from that directory, and you can see them here. Next,

59
00:03:54,640 --> 00:03:55,360
What do we do?

60
00:03:56,200 --> 00:04:03,010
This is our function here; this is the area where we actually run inference on the random images,

61
00:04:03,850 --> 00:04:07,000
running them in inference mode through our pre-trained model.

62
00:04:07,720 --> 00:04:10,240
So it's actually quite simple, and we plot the results as well.

63
00:04:10,450 --> 00:04:11,650
So it's quite simple, actually.

64
00:04:11,650 --> 00:04:18,520
So I'm ignoring these matplotlib lines; we'll get back to these after. What we do, for the file names we loaded

65
00:04:18,520 --> 00:04:18,850
here.

66
00:04:19,450 --> 00:04:25,870
We just load the image here using the Keras image function, image.load_img.

67
00:04:26,500 --> 00:04:28,060
The full path is my path.

68
00:04:28,060 --> 00:04:34,270
My path was this images/class1 directory, plus the file name, which the list has here.

69
00:04:34,990 --> 00:04:37,900
So the for loop goes through each image, one at a time.

70
00:04:38,500 --> 00:04:42,870
We specify the size we want to load the images at, so it's 224 for VGG.

71
00:04:43,510 --> 00:04:51,310
Then we use img_to_array, which is another Keras preprocessing image function, and we just convert

72
00:04:51,310 --> 00:04:53,380
that image and call it x.

73
00:04:53,860 --> 00:04:59,350
Then we expand the dimensions, which adds a batch dimension to the image, and then we can just run it through

74
00:04:59,350 --> 00:05:00,370
the preprocess_input function.

75
00:05:00,610 --> 00:05:04,990
That's another Keras preprocessing function that's specific to VGG.

76
00:05:05,020 --> 00:05:08,170
Actually, VGG16, I should say.

77
00:05:09,040 --> 00:05:13,750
And then, this is just where we display the image here.

78
00:05:14,140 --> 00:05:16,810
So we load the image using OpenCV here.

79
00:05:17,440 --> 00:05:20,890
Image two, and that's what we plot.

80
00:05:21,010 --> 00:05:23,170
I believe that.

81
00:05:23,260 --> 00:05:24,510
Actually, no, I'm not seeing it.

82
00:05:24,630 --> 00:05:25,120
Yeah, it is.

83
00:05:25,130 --> 00:05:28,860
Yeah, this is what we plot in the code right here at the end.

84
00:05:29,460 --> 00:05:34,710
In the meantime, we just run the x here that we've prepared from the image.

85
00:05:35,370 --> 00:05:40,350
So we just run the prediction, model.predict, on that x. When this is done,

86
00:05:40,350 --> 00:05:47,880
we then use another Keras function that's specific to VGG, called decode_predictions, and

87
00:05:47,880 --> 00:05:49,320
we just take the predictions here.

88
00:05:49,830 --> 00:05:51,010
We get predictions.

89
00:05:51,030 --> 00:05:54,840
The predictions out of it. Now, decode_predictions is similar to the class

90
00:05:54,840 --> 00:05:59,550
we built in PyTorch, where we just looked up the name of the actual class.

91
00:06:00,090 --> 00:06:06,480
So that's what decode_predictions does: it returns the actual string name for the class that preds represents,

92
00:06:07,020 --> 00:06:08,720
based on the ImageNet classes.

93
00:06:09,540 --> 00:06:16,050
And then we just plot it. These are the same subplot calls, the matplotlib figure that we set up

94
00:06:16,650 --> 00:06:17,360
above here.

95
00:06:17,370 --> 00:06:18,680
This is just a figure size.

96
00:06:18,690 --> 00:06:20,220
We make it nice and big.

97
00:06:20,760 --> 00:06:22,650
And there we go. So let's run this.

98
00:06:22,650 --> 00:06:27,720
Let's run our images now through VGG16, and let's see how it does.

99
00:06:34,060 --> 00:06:34,520
There we go.

100
00:06:34,570 --> 00:06:35,800
So let's take a look at this.

101
00:06:36,280 --> 00:06:41,980
So for each output of the loop, we're actually printing this above here that's done in this line here.

102
00:06:42,010 --> 00:06:43,780
Print predictions, that's what the output is.

103
00:06:44,290 --> 00:06:47,710
So it just gives us the class and the probabilities as well.

104
00:06:48,160 --> 00:06:53,640
So you can see it's 99 percent sure it's a limousine, and the other classes are actually quite small.

105
00:06:53,650 --> 00:06:56,980
So they're probably not even close: convertible and beach wagon.

106
00:06:57,550 --> 00:06:58,270
So you can see that.

107
00:06:58,330 --> 00:06:59,170
Let's see what it got.

108
00:06:59,560 --> 00:07:05,950
Limousine, basketball, collie, German shepherd, Christmas stocking, donut, burrito, spider web and picket

109
00:07:05,950 --> 00:07:06,190
fence.

110
00:07:06,190 --> 00:07:08,800
So it got them all right, except the beer.

111
00:07:09,790 --> 00:07:14,020
This does seem a bit different to the PyTorch results because, if I remember correctly, the PyTorch

112
00:07:14,020 --> 00:07:18,090
results for VGG16 didn't get the donut correct.

113
00:07:18,190 --> 00:07:20,590
So that's an interesting find.

114
00:07:21,220 --> 00:07:25,870
You know, theoretically, you would think it would be the same as the PyTorch model.

115
00:07:26,470 --> 00:07:31,960
However, it could be a case where we don't actually know how long this model was trained for on

116
00:07:31,960 --> 00:07:34,000
ImageNet. We don't know the specifics.

117
00:07:34,150 --> 00:07:34,600
We don't know.

118
00:07:34,840 --> 00:07:38,740
They're not exactly the same weights, which these results clearly show.

119
00:07:38,890 --> 00:07:44,230
So that's an interesting result so far: PyTorch and Keras have different pre-trained models.

120
00:07:45,250 --> 00:07:47,050
So now let's load ResNet.

121
00:07:47,350 --> 00:07:54,370
ResNet50, in fact. Previously, in PyTorch, we loaded ResNet18, which is

122
00:07:54,370 --> 00:07:55,510
a much smaller ResNet.

123
00:07:55,990 --> 00:08:01,660
This ResNet has 25 million parameters, which is moderately sized and, I would say, still small.

124
00:08:01,810 --> 00:08:07,930
And now we can just run our same code that just runs the inference of those images through the ResNet

125
00:08:07,930 --> 00:08:08,530
model.

126
00:08:09,130 --> 00:08:11,590
And let's see how it performs, and it performs quite well.

127
00:08:11,620 --> 00:08:16,780
Let's see, it actually got pretty much everything correct, although I would say this is a bad

128
00:08:16,780 --> 00:08:20,020
model, but maybe the category is more appropriate for this.

129
00:08:20,020 --> 00:08:22,870
But it got all of these images correct, which is remarkable.

130
00:08:23,290 --> 00:08:29,200
So you can see ResNet50 is quite a good network, and that should give you some intuition as to why

131
00:08:29,200 --> 00:08:31,030
I say ResNet is my favorite network.

132
00:08:32,050 --> 00:08:37,270
I mean, it's not good to have favorites, but I do generally get the best results, and a lot of researchers

133
00:08:37,270 --> 00:08:37,660
as well

134
00:08:38,080 --> 00:08:40,780
get the best results with ResNet networks.

135
00:08:40,930 --> 00:08:49,420
So let's try InceptionV3. Get our weights, get our network: 23 million parameters, not too big

136
00:08:49,540 --> 00:08:50,320
or too small.

137
00:08:51,100 --> 00:08:57,520
Let's run this loop here that gives us the predictions for each image, and then we can see limousine, basketball.

138
00:08:57,850 --> 00:08:58,720
It got this wrong.

139
00:08:58,720 --> 00:08:59,890
It's not a Shetland sheepdog.

140
00:08:59,890 --> 00:09:01,480
It actually got this right.

141
00:09:01,480 --> 00:09:02,830
But this really got this right.

142
00:09:03,580 --> 00:09:04,060
Pretty big.

143
00:09:04,060 --> 00:09:06,040
Beer glass is probably the category.

144
00:09:06,040 --> 00:09:06,460
I would say.

145
00:09:06,460 --> 00:09:08,800
This is the correct category for this.

146
00:09:09,370 --> 00:09:13,150
So that's pretty good, although it's not perfect because it got this one wrong.

147
00:09:13,630 --> 00:09:16,930
It's quite good, though, so let's try MobileNet now.

148
00:09:18,790 --> 00:09:20,600
So let's load the MobileNet network.

149
00:09:21,040 --> 00:09:24,940
Actually, there's one thing I wanted to show you for this one.

150
00:09:25,060 --> 00:09:27,900
Remember, the image size for inception is different.

151
00:09:27,910 --> 00:09:29,140
It's a little bit larger.

152
00:09:29,140 --> 00:09:31,110
It's 299 as opposed to 224.

153
00:09:31,600 --> 00:09:37,420
So you would just have to change the target size in this load_img function to 228... 299.

154
00:09:37,420 --> 00:09:37,720
Sorry.

155
00:09:38,380 --> 00:09:44,230
So let's load MobileNet, which I've already done, and with MobileNet the parameters are 2.5

156
00:09:44,230 --> 00:09:45,520
million, quite small.

157
00:09:46,180 --> 00:09:54,850
And now we can run our inference on the images here using MobileNet, and we can see limousine, basketball,

158
00:09:54,850 --> 00:09:58,870
collie, German shepherd, which is quite good, Christmas stocking.

159
00:09:59,080 --> 00:10:00,100
It's not a wallet.

160
00:10:00,610 --> 00:10:02,050
It is a burrito, spider web.

161
00:10:02,120 --> 00:10:04,290
It's definitely not parallel, but so MobileNet

162
00:10:04,300 --> 00:10:07,540
did get the worst performance so far, but it's actually still quite good.

163
00:10:08,650 --> 00:10:11,500
So let's try DenseNet as well.

164
00:10:12,520 --> 00:10:13,870
That's a very big network.

165
00:10:14,500 --> 00:10:15,370
That's DenseNet201.

166
00:10:15,400 --> 00:10:17,180
So let's take a look and see how that performs.

167
00:10:18,210 --> 00:10:22,450
This should give us the best results, but I actually don't remember if it did.

168
00:10:23,440 --> 00:10:25,180
Oh, it's actually not as big as I thought.

169
00:10:25,180 --> 00:10:28,000
It's a very deep network, which doesn't have that many parameters.

170
00:10:29,140 --> 00:10:32,410
So let's run this and see what the results are.

171
00:10:37,450 --> 00:10:41,260
So again, it's pretty decent, except for ice lolly.

172
00:10:41,770 --> 00:10:45,670
It got everything else correct, so I'm also impressed with that.

173
00:10:45,790 --> 00:10:46,330
It's pretty good.

174
00:10:47,200 --> 00:10:47,920
Let's try

175
00:10:48,070 --> 00:10:48,850
NASNet now.

176
00:10:48,850 --> 00:10:53,830
And NASNet is basically a predecessor to EfficientNet.

177
00:10:54,400 --> 00:10:56,260
So let's see how it performs.

178
00:10:58,150 --> 00:11:02,620
Actually, this is NASNetMobile, so it's a bit smaller.

179
00:11:02,800 --> 00:11:07,500
So it may not give us the best results, but let's see what it does.

180
00:11:11,790 --> 00:11:19,320
So we can see it got everything right, except this, the beer glass here, which is giving a lot of these

181
00:11:19,320 --> 00:11:23,160
networks some problems, but remarkably, everything else is correct.

182
00:11:23,160 --> 00:11:24,060
So that's pretty good.

183
00:11:24,870 --> 00:11:30,390
Now let's try EfficientNet. So EfficientNetB7 is actually the biggest EfficientNet model that's

184
00:11:30,390 --> 00:11:32,520
available in Keras.

185
00:11:33,000 --> 00:11:34,380
So let's see how it performs.

186
00:11:37,440 --> 00:11:40,440
And you can see this is immediately the biggest model we've loaded so far.

187
00:11:41,370 --> 00:11:44,880
Actually, it might have been bigger, but this is 66 million parameters.

188
00:11:45,960 --> 00:11:49,560
So let's now run this and let's see if we get perfect results with this one.

189
00:11:50,580 --> 00:11:51,720
I'm excited to find out.

190
00:11:52,500 --> 00:11:53,490
Hopefully, you are, too.

191
00:11:55,440 --> 00:12:01,950
And yes, finally, a model that gets every class right.

192
00:12:01,980 --> 00:12:07,620
EfficientNetB7 got the beer glass right, spider web, burrito, donut, Christmas stocking, German shepherd,

193
00:12:07,620 --> 00:12:08,810
collie, basketball, limousine.

194
00:12:08,820 --> 00:12:10,110
So that's quite impressive.

195
00:12:10,710 --> 00:12:15,630
So one thing I would have noticed, and you would have noticed too, is that the Keras pre-trained models

196
00:12:15,630 --> 00:12:21,540
do perform better than the PyTorch pre-trained models, and it may be a case where they

197
00:12:21,540 --> 00:12:26,210
just trained them for a longer time or maybe did some manipulations to it.

198
00:12:26,220 --> 00:12:27,130
I don't know.

199
00:12:27,150 --> 00:12:32,010
I'll actually dig into this and maybe put the answer up somewhere below or put it in the notebooks for

200
00:12:32,010 --> 00:12:33,810
you guys if I find out.

201
00:12:34,440 --> 00:12:36,830
So that basically concludes this lesson.

202
00:12:36,840 --> 00:12:40,080
on Keras pre-trained models, and loading them and using them.

203
00:12:40,650 --> 00:12:41,730
Hopefully you enjoyed it.

204
00:12:42,240 --> 00:12:48,150
And what we'll do now, we'll take a look at how we can actually get rank-1, or top-1, and top-5

205
00:12:48,150 --> 00:12:54,390
accuracy, which is a simple metric, but it's not implemented easily in Keras and PyTorch.

206
00:12:54,390 --> 00:12:57,390
So I've written a function that gives us the accuracy we want.

207
00:12:57,900 --> 00:13:01,500
It's quite simple, actually, and I'll just show you guys in the next section.

208
00:13:02,850 --> 00:13:04,050
So stay tuned for it.

209
00:13:04,170 --> 00:13:04,620
Thank you.
