1
00:00:00,060 --> 00:00:01,110
Hi, guys, welcome back.

2
00:00:01,680 --> 00:00:06,960
What we're going to do in this section is we're going to implement a rank one and rank five algorithm

3
00:00:07,320 --> 00:00:11,190
to get our accuracy metrics based on rank one and rank five.

4
00:00:15,260 --> 00:00:19,790
Hi, guys, welcome back. In this section, we'll take a look at using PyTorch to implement the rank

5
00:00:19,790 --> 00:00:21,650
one or rank five accuracy.

6
00:00:22,100 --> 00:00:26,090
Technically, we're making an algorithm that can give us a rank-N accuracy.

7
00:00:26,630 --> 00:00:27,440
So let's get to it.

8
00:00:27,450 --> 00:00:32,360
So open this notebook here, which I've already done, and here's what we're going to do.

9
00:00:32,420 --> 00:00:40,490
We're going to load the VGG16 network and we're going to use the same images that we did before, but now

10
00:00:40,520 --> 00:00:46,910
use them to calculate the rank one, rank five, or rank 10 accuracy based on the ground truth.

11
00:00:47,420 --> 00:00:51,200
So it's like we're testing it: those images become a test dataset now.

12
00:00:53,550 --> 00:01:01,230
So there we go, the model is loaded, and again, we'll have to do the normalization transforms like

13
00:01:01,230 --> 00:01:02,220
you've seen before.

14
00:01:02,520 --> 00:01:04,470
Set the model to evaluation mode.

15
00:01:05,400 --> 00:01:09,810
Download our test images here and the image classes,

16
00:01:09,880 --> 00:01:14,950
the ImageNet classes file that gives us the class names.

17
00:01:14,970 --> 00:01:21,890
Then we'll import our PyTorch modules, as well as get the class names variable from the ImageNet

18
00:01:21,900 --> 00:01:23,280
classes JSON file.

19
00:01:24,360 --> 00:01:30,120
And now we'll just run this on a single image and get the output.

20
00:01:30,240 --> 00:01:34,170
So this is a burrito and it predicted a burrito, which is quite cool.

21
00:01:34,410 --> 00:01:38,360
And you can see these are the steps we took to convert this image here.

22
00:01:38,370 --> 00:01:38,930
So this is the end.

23
00:01:38,950 --> 00:01:42,570
We point it to the image, the path for the image.

24
00:01:43,260 --> 00:01:49,260
Then we just convert it to a tensor by doing all of these operations so that we can then finally feed it

25
00:01:49,260 --> 00:01:51,270
into the model to get the output.

26
00:01:51,690 --> 00:01:58,830
Then we get the index of the maximum-probability class and then use that index to look

27
00:01:58,830 --> 00:02:03,880
up the class name and then we print the class name on the title of the image.

28
00:02:03,900 --> 00:02:05,340
So it predicted a burrito.

29
00:02:05,700 --> 00:02:06,690
Quite cool, isn't it?

30
00:02:07,440 --> 00:02:14,310
So let's take a look at something: we need to get the class probabilities, because remember, for rank one or

31
00:02:14,310 --> 00:02:22,260
rank N, basically we need to get the top N most probable categories

32
00:02:22,860 --> 00:02:23,650
from the network.

33
00:02:23,670 --> 00:02:24,690
So how do we do that?

34
00:02:24,840 --> 00:02:26,070
So let's take a look at this.

35
00:02:26,220 --> 00:02:31,950
So what we're going to do, we're going to import this PyTorch module called functional and

36
00:02:31,950 --> 00:02:33,240
call it nnf here.

37
00:02:33,750 --> 00:02:35,010
And we're going to apply that.

38
00:02:35,010 --> 00:02:38,910
We're going to use that to get the softmax probabilities from the output.

39
00:02:39,480 --> 00:02:45,630
That's basically all we're going to do, because previously we just used argmax to get the top probability

40
00:02:45,630 --> 00:02:46,200
index.

41
00:02:46,650 --> 00:02:48,990
Now we're just going to use it to get the probabilities here.

42
00:02:49,590 --> 00:02:56,190
So then what we do, we use topk on the probabilities here and we just specify how many probabilities we want,

43
00:02:56,760 --> 00:02:58,050
set the dimension to one,

44
00:02:58,050 --> 00:03:04,260
since we get it in a nice array, and this object that we get from the softmax

45
00:03:04,260 --> 00:03:10,320
function returns the top probability and top class, but it's not exactly one, it's many of them.

46
00:03:10,740 --> 00:03:12,390
So let's see what we get here.
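
Editor's note: the softmax and topk step described above can be sketched as follows. This is a minimal sketch, not the notebook's code. The `logits` tensor is a made-up stand-in for the model output, and the names `prob`, `top_p`, `top_class` and `top_indexes` are assumptions chosen for illustration.

```python
import torch
import torch.nn.functional as nnf

# Hypothetical stand-in for the model output: a batch of one image, six classes.
logits = torch.tensor([[2.0, 0.5, -1.0, 4.0, 0.0, 1.0]])

# Softmax over the class dimension (dim=1) turns raw scores into probabilities.
prob = nnf.softmax(logits, dim=1)

# topk returns the k largest probabilities and the class indexes they belong to.
top_p, top_class = prob.topk(3, dim=1)

# The result is two-dimensional (batch, k), so convert to NumPy and take row 0.
top_indexes = top_class.cpu().numpy()[0]

print(top_p)
print(top_indexes)
```

With a real network, `logits` would be the output of the model for one preprocessed image, and `k` would be the rank you want (1, 5, 10).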

47
00:03:12,870 --> 00:03:14,430
So you can see this is the output.

48
00:03:14,910 --> 00:03:22,950
What this means here is that 0.87 is the highest probability that the class returned, that the

49
00:03:23,190 --> 00:03:24,600
model returned, sorry.

50
00:03:25,320 --> 00:03:26,490
And this is the class.

51
00:03:26,800 --> 00:03:29,280
This one, nine six five, corresponds to burrito.

52
00:03:29,310 --> 00:03:30,810
Let's just double check and make sure.

53
00:03:31,410 --> 00:03:38,040
So let's open this up and I'll scroll all the way down to nine six five.

54
00:03:38,080 --> 00:03:39,150
I believe it was.

55
00:03:40,200 --> 00:03:42,600
And yes, nine six five is burrito.

56
00:03:42,990 --> 00:03:44,790
So let's take a look at that again.

57
00:03:45,000 --> 00:03:48,480
So you can see this is the class it predicted for burrito, 965.

58
00:03:48,960 --> 00:03:51,450
And this is the probability, 0.87.

59
00:03:51,840 --> 00:03:57,990
Let's see what it gave this relatively low probability, 0.0177, or about 1.7

60
00:03:57,990 --> 00:03:58,380
percent.

61
00:03:58,830 --> 00:04:00,420
That is nine two four.

62
00:04:00,450 --> 00:04:02,790
Let's take a look at it as well.

63
00:04:03,180 --> 00:04:03,870
Guacamole.

64
00:04:04,380 --> 00:04:05,760
That's that's kind of cool.

65
00:04:06,030 --> 00:04:07,440
It actually detected guacamole.

66
00:04:07,440 --> 00:04:11,430
I don't see guacamole inside of this, so it's kind of weird, but the category is quite similar.

67
00:04:11,820 --> 00:04:12,960
It is possible, though.

68
00:04:13,110 --> 00:04:15,810
This is because this is actually a good example of overtraining.

69
00:04:16,230 --> 00:04:21,330
The pictures of guacamole that the network was trained on may have actually had burritos in them,

70
00:04:21,330 --> 00:04:27,300
too, so it's understandable that the network could think of a burrito as guacamole, but it doesn't

71
00:04:27,300 --> 00:04:30,390
give it that much probability confidence here.

72
00:04:31,020 --> 00:04:32,940
So let's look at 931: bagel.

73
00:04:33,390 --> 00:04:34,170
Kind of makes sense.

74
00:04:34,170 --> 00:04:36,570
It looks more like a wrap than a bagel, but similar.

75
00:04:37,260 --> 00:04:45,390
And nine three three, cheeseburger, I guess because there's ground beef here, and nine two three, which

76
00:04:45,480 --> 00:04:47,250
is plate, which is OK.

77
00:04:47,250 --> 00:04:52,440
I mean, it isn't actually even a platter, but maybe because it thinks it's food and the association

78
00:04:52,440 --> 00:04:56,820
of these colors and these textures does kind of go hand-in-hand with plates.

79
00:04:57,330 --> 00:05:00,260
So those are the top five classes

80
00:05:00,270 --> 00:05:06,320
this model returns for this image, with the highest probability being awarded to burrito, which is 87 percent.

81
00:05:07,320 --> 00:05:08,400
So that's quite good.

82
00:05:08,880 --> 00:05:10,320
So now what do we do?

83
00:05:10,590 --> 00:05:12,690
We can put those top classes here.

84
00:05:12,900 --> 00:05:14,070
That's the top class.

85
00:05:14,590 --> 00:05:18,580
The index is here to get the indexes right.

86
00:05:18,600 --> 00:05:23,220
Actually, I just did that to get the indexes right there, so we can convert it to a NumPy array and then index

87
00:05:23,220 --> 00:05:25,950
it because this is a multi-dimensional array here.

88
00:05:26,160 --> 00:05:31,500
So we just use index zero to get the first element in that array.

89
00:05:32,190 --> 00:05:33,330
And that's this here.

90
00:05:34,350 --> 00:05:35,850
So now what we do:

91
00:05:36,390 --> 00:05:38,950
Let's create a function that gives us our class names.

92
00:05:38,970 --> 00:05:39,970
It's quite simple to do.

93
00:05:40,350 --> 00:05:42,930
So we just input something that gives us.

94
00:05:43,080 --> 00:05:44,250
We just take this input.

95
00:05:44,610 --> 00:05:49,320
top classes here, and we want to return the names as strings here.

96
00:05:49,860 --> 00:05:51,510
So it's quite sorry.

97
00:05:52,110 --> 00:05:53,100
Quite simple to do that.

98
00:05:53,400 --> 00:05:59,580
So what we do, we just take top classes, get the indexes here, which is what we did right here, and

99
00:05:59,580 --> 00:06:04,320
then we just create an array called all classes, and then, for top class in top classes,

100
00:06:04,890 --> 00:06:09,740
we just get the class name out of our dictionary here and then append it to this array.

101
00:06:10,170 --> 00:06:12,240
So now we have an array of the names.
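
Editor's note: a minimal sketch of the class-name lookup described above. The function name `get_class_names` and the dictionary layout (integer index to name) are assumptions for illustration. The real ImageNet classes JSON file may use string keys or hold several names per class. The five index-to-name pairs are the ones read out in this lesson.

```python
def get_class_names(top_indexes, class_names):
    """Map a sequence of class indexes to their human-readable names."""
    all_classes = []
    for top_class in top_indexes:
        # int() also accepts NumPy integers, so a NumPy array of indexes works too.
        all_classes.append(class_names[int(top_class)])
    return all_classes


# Hypothetical excerpt of the class dictionary, using the indexes from the lesson.
class_names = {
    965: "burrito",
    924: "guacamole",
    931: "bagel",
    933: "cheeseburger",
    923: "plate",
}

print(get_class_names([965, 924, 931, 933, 923], class_names))
```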

102
00:06:13,800 --> 00:06:15,180
As you can see, it works quite well.

103
00:06:15,600 --> 00:06:18,120
Burrito, guacamole, bagel, cheeseburger, plate.

104
00:06:18,520 --> 00:06:25,620
So now that we have that, let's close this since we don't need it anymore, and let's construct a function to

105
00:06:25,620 --> 00:06:27,630
give us the rank-N accuracy now.

106
00:06:28,200 --> 00:06:30,490
So this function is actually quite simple.

107
00:06:30,510 --> 00:06:33,110
So it's defined here, and we call it rank N.

108
00:06:33,810 --> 00:06:39,840
It takes a model as the input, the directory of the images, the ground truth, which is the correct class labels,

109
00:06:39,840 --> 00:06:45,000
which is something you need to set beforehand, N, which is the rank you want,

110
00:06:45,510 --> 00:06:49,560
and show images, yes or no, which is a true or false.

111
00:06:50,190 --> 00:06:52,740
So you would have seen some of this code before.

112
00:06:52,980 --> 00:06:57,630
What we do, we just dive into the directory, get all the file names in that directory there.

113
00:06:58,260 --> 00:07:00,630
Then we just loop through those images.

114
00:07:00,630 --> 00:07:01,320
We open them.

115
00:07:01,740 --> 00:07:07,140
We pass them to the network, get the outputs, get the probabilities, then use the get class

116
00:07:07,140 --> 00:07:11,400
names function to get the class names as well and store them in this array.

117
00:07:12,060 --> 00:07:15,390
And if we want to show images, we just plot them here.

118
00:07:16,290 --> 00:07:21,150
This here being the same subplot plotting function that we have seen in previous lessons.

119
00:07:21,840 --> 00:07:28,260
So what happens now is that we use the get score function, sorry for that, the get score function

120
00:07:28,260 --> 00:07:30,420
that we created below here.

121
00:07:30,930 --> 00:07:36,630
All this function does is calculate an accuracy score based on all the top classes it returns

122
00:07:36,630 --> 00:07:36,820
here.

123
00:07:36,840 --> 00:07:44,520
So this is an array of five results, or whatever N is, the number of results that we set in here, and

124
00:07:44,520 --> 00:07:48,980
we check to see if the ground truth is in any of those N results.

125
00:07:50,130 --> 00:07:51,690
That's basically how we do it here.

126
00:07:51,730 --> 00:07:56,460
Well, we did previously here if the ground truth already had the correct one.

127
00:07:56,850 --> 00:07:59,370
So let's take a look at what the ground truth is.

128
00:07:59,380 --> 00:08:02,940
It's basketball, German shepherd, limousine, blah blah blah.

129
00:08:03,540 --> 00:08:05,000
So what happens now?

130
00:08:05,010 --> 00:08:06,780
It basically looks at the first class.

131
00:08:06,960 --> 00:08:11,400
Basketball: is basketball in any of the top N classes here?

132
00:08:11,880 --> 00:08:15,420
If yes, then it just increases the label count by one.

133
00:08:15,960 --> 00:08:17,580
Otherwise, if not, it ignores it.

134
00:08:17,580 --> 00:08:20,670
And then we just divide the number of correct labels by the total here.

135
00:08:20,670 --> 00:08:22,470
This is a variable

136
00:08:22,470 --> 00:08:24,510
we use to keep track of the times the correct label

137
00:08:24,510 --> 00:08:27,780
is in the N classes it returns.

138
00:08:28,350 --> 00:08:30,630
So it's a simple function and it works quite well.
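
Editor's note: the scoring step described above can be sketched like this. It is a minimal sketch under stated assumptions, not the notebook's actual get score function. The name `get_score`, its argument names, and the sample predictions are hypothetical. The only assumption about the data is that predictions and ground truth are in the same order.

```python
def get_score(all_top_classes, ground_truth):
    """Rank-N accuracy: fraction of images whose true label is in its top-N list."""
    if len(all_top_classes) != len(ground_truth):
        raise ValueError("Need exactly one ground truth label per image")
    if not ground_truth:
        raise ValueError("Ground truth is empty")
    correct = 0
    for top_classes, truth in zip(all_top_classes, ground_truth):
        # Count a hit when the true label appears anywhere in the top-N names.
        if truth in top_classes:
            correct += 1
    return correct / len(ground_truth)


# Made-up top-2 predictions for three images, best guess first.
predictions = [
    ["basketball", "volleyball"],
    ["malinois", "German shepherd"],
    ["minivan", "cab"],
]
ground_truth = ["basketball", "German shepherd", "limousine"]

rank2 = get_score(predictions, ground_truth)
# Keeping only the first guess of each list gives the rank-1 accuracy.
rank1 = get_score([p[:1] for p in predictions], ground_truth)
print(rank1, rank2)
```

Because a top-1 list is always contained in the top-5 list, the rank-1 score can never be higher than the rank-5 score on the same images.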

139
00:08:31,410 --> 00:08:37,800
And the only bad thing about this code is that you need to manually set your ground truth labels,

140
00:08:38,040 --> 00:08:43,920
and they need to be in the same order that the files are loaded in this onlyfiles array.

141
00:08:44,370 --> 00:08:48,690
So what I would suggest you do is run this onlyfiles array line here.

142
00:08:48,870 --> 00:08:53,940
If you were going to use your own images, run this line of code with the directory of your images,

143
00:08:54,510 --> 00:08:59,100
get the order that the images are in, and create your ground truth array

144
00:08:59,130 --> 00:09:04,860
based on that. You can do it manually, or you can automate it with some CSV files, or maybe put

145
00:09:04,860 --> 00:09:09,210
the class names in the file names and extract them automatically so you don't have to do it manually.

146
00:09:09,790 --> 00:09:12,060
I'm just giving you some tips on what you can do.
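
Editor's note: one way to automate the ground truth from file names, as suggested above. This is only a sketch. The naming convention (class name with underscores, optionally followed by a number, such as `german_shepherd_2.png`) and the function name `labels_from_filenames` are assumptions. The resulting labels must still be spelled exactly like the model's class names.

```python
from pathlib import Path


def labels_from_filenames(file_names):
    """Derive one label per file, keeping the order of the input list."""
    labels = []
    for name in file_names:
        # "german_shepherd_2.png" -> "german_shepherd_2"
        parts = Path(name).stem.split("_")
        # Drop a trailing counter such as "2" if the file name has one.
        if parts[-1].isdigit():
            parts = parts[:-1]
        labels.append(" ".join(parts))
    return labels


# Hypothetical file list, in the same order the images would be loaded.
files = ["basketball_1.jpg", "german_shepherd_2.png", "limousine.jpg"]
print(labels_from_filenames(files))
```

Since the labels are built from the same list of files that gets looped over, the ordering problem mentioned above goes away.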

147
00:09:12,570 --> 00:09:15,120
So anyway, we're ready to run this code.

148
00:09:15,150 --> 00:09:16,050
Let's get our

149
00:09:16,320 --> 00:09:20,430
rank five accuracy, the top five, and see what we get.

150
00:09:22,470 --> 00:09:23,820
So you can see it's predicting it here.

151
00:09:23,820 --> 00:09:30,210
And what this function is doing is that it's displaying, forget this warning here, but

152
00:09:30,210 --> 00:09:33,000
it's just displaying the top five classes.

153
00:09:33,360 --> 00:09:36,840
So we have basketball fears about a rugby ball, tennis ball, volleyball.

154
00:09:37,320 --> 00:09:43,280
Similarly, German Shepherd, German Shepherd, Hello, Eskimo dog, husky, Siberian blah, blah blah.

155
00:09:43,290 --> 00:09:51,390
So let's see what the final rank five accuracy of VGG16 is: 88 percent on our tiny little

156
00:09:51,420 --> 00:09:55,110
test dataset here of maybe eight or nine images.

157
00:09:55,950 --> 00:10:05,130
And now let's get the rank one accuracy, I should say, so we can see it's

158
00:10:05,130 --> 00:10:07,680
going to be no higher than 88 percent.

159
00:10:07,680 --> 00:10:08,550
But let's see what it did.

160
00:10:08,570 --> 00:10:09,450
77 percent.

161
00:10:09,450 --> 00:10:10,050
Not too bad.

162
00:10:11,010 --> 00:10:13,830
Now let's see what our rank 10 accuracy is.

163
00:10:14,100 --> 00:10:16,680
Hopefully this is a hundred percent, so let's check it out.

164
00:10:22,680 --> 00:10:29,970
Eighty-eight percent. So whatever class it's getting wrong here, it's getting it very wrong. I believe it is this

165
00:10:29,970 --> 00:10:30,800
one here.

166
00:10:30,810 --> 00:10:32,040
No local matters, correct?

167
00:10:32,760 --> 00:10:33,980
It is this one here.

168
00:10:34,020 --> 00:10:35,490
It's definitely causing some problems.

169
00:10:35,490 --> 00:10:42,030
So for this one, it's not even getting the correct class in the top 10 predictions for this image.

170
00:10:42,060 --> 00:10:43,770
So that's not great news.

171
00:10:44,220 --> 00:10:49,710
But as you've seen previously, other networks do tend to get it right sometimes, especially the deeper,

172
00:10:50,130 --> 00:10:51,060
bigger networks.

173
00:10:51,180 --> 00:10:57,930
So do experiment with those. We'll stop there, and what I'm going to do, I'm going to use the same

174
00:10:57,930 --> 00:11:00,540
algorithm with Keras pre-trained models.

175
00:11:00,540 --> 00:11:02,460
So let's take a look and see how we do that.