1
00:00:00,390 --> 00:00:01,590
Hi, and welcome back.

2
00:00:01,770 --> 00:00:04,500
So in this section, we'll take a look at Grad-CAM.

3
00:00:04,950 --> 00:00:10,830
Remember, Grad-CAM tries to help you understand where your CNN is looking when it makes its decision.

4
00:00:11,010 --> 00:00:15,030
So let's open this notebook, notebook 11. I already have it up.

5
00:00:15,030 --> 00:00:17,130
So let's get started.

6
00:00:17,340 --> 00:00:23,220
So what we're going to do is look at a few of these class activation map visualization

7
00:00:23,730 --> 00:00:24,450
algorithms.

8
00:00:25,080 --> 00:00:30,250
The first one we're going to do is basic Grad-CAM, then Grad-CAM++, then Score-CAM, and then

9
00:00:30,250 --> 00:00:31,230
Faster Score-CAM.

10
00:00:31,890 --> 00:00:34,920
All of these allow us to see where the CNN is effectively looking.

11
00:00:35,400 --> 00:00:38,550
So, it looks like I have too many sessions open.

12
00:00:38,580 --> 00:00:39,960
Let me just terminate

13
00:00:40,390 --> 00:00:40,950
these.

14
00:00:44,910 --> 00:00:45,390
All right.

15
00:00:46,050 --> 00:00:53,430
So let's install this. It's the same visualization package, Keras Vis, that we've used in the previous

16
00:00:53,430 --> 00:00:53,850
lesson.

17
00:00:54,090 --> 00:00:56,460
It's a very, very good visualization package.

18
00:00:57,870 --> 00:00:59,370
I don't believe any exist.

19
00:00:59,580 --> 00:01:02,940
Well, they may have some for PyTorch, but I don't think they're as good as this one.

20
00:01:05,190 --> 00:01:09,450
So let's load all our libraries and print whether we're using GPUs.

21
00:01:10,620 --> 00:01:15,760
And in the meantime, we're just going to load our Keras pre-trained model.

22
00:01:15,780 --> 00:01:17,310
The VGG pre-trained model.

23
00:01:17,850 --> 00:01:19,260
It's a Keras model, so to say.

24
00:01:21,420 --> 00:01:28,860
OK, so let's load our VGG16 model. Just run that block of code and it downloads the model. It's

25
00:01:28,860 --> 00:01:33,990
quite quick because everything is stored on Google's backend and the internet connection is quite

26
00:01:33,990 --> 00:01:34,530
quick.

27
00:01:34,590 --> 00:01:36,720
So it's 138 million parameters.

28
00:01:36,720 --> 00:01:37,380
It's quite big.

29
00:01:38,130 --> 00:01:40,000
Now, let's download our test images.

30
00:01:40,020 --> 00:01:45,330
These are the images that we'll be using to demonstrate the Grad-CAM algorithms.

31
00:01:45,540 --> 00:01:49,850
So I've already run this, so you can just check; the images would be here.

32
00:01:50,490 --> 00:01:54,720
So you download these here. If I do it again, it creates more copies.

33
00:01:54,720 --> 00:01:59,010
So I've done this now three times, so there are extra copies.

34
00:02:00,450 --> 00:02:04,070
So let's now visualize these images.

35
00:02:04,080 --> 00:02:11,310
So we'll just use the regular subplots that we've done before, and we plot the class title from

36
00:02:12,420 --> 00:02:20,220
up here, which we have stored, and so we can see the inputs: the goldfish, the bear, and the assault rifle.

37
00:02:20,640 --> 00:02:29,910
So what we're going to do now is input these images into our Grad-CAM algorithm,

38
00:02:30,480 --> 00:02:37,200
specifying that we want the algorithm to look at that specific class, because, basically,

39
00:02:37,200 --> 00:02:42,360
if we're identifying a goldfish, we need to know where the CNN is looking when it's identifying

40
00:02:42,360 --> 00:02:42,960
a goldfish.

41
00:02:43,170 --> 00:02:44,580
What features it's looking at.

42
00:02:45,390 --> 00:02:46,290
So that's quite cool.

43
00:02:46,380 --> 00:02:51,810
So to do that, we just have to create a loss function that basically gives the outputs of

44
00:02:51,810 --> 00:02:56,190
the softmax layer for that specific class. Goldfish is class 1.

45
00:02:56,880 --> 00:03:05,310
The bear is class 294, and the assault rifle is 413.

46
00:03:05,880 --> 00:03:06,750
That's the class number.
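[Editor's note] The loss function described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the notebook's code: the name `class_scores` and the toy `outputs` batch are assumptions made for the example, and only the class indices (1, 294, 413) come from the lesson.

```python
# Class indices quoted in the lesson: goldfish = 1, bear = 294, assault rifle = 413.
CLASS_INDICES = [1, 294, 413]


def class_scores(outputs, class_indices=CLASS_INDICES):
    """For each image in the batch, pick the model output at its target class.

    `outputs` is any batch of per-class score rows (one row per image);
    `class_scores` is a hypothetical name used only for this sketch.
    """
    return [row[idx] for row, idx in zip(outputs, class_indices)]


# Toy batch standing in for the model output: three 1000-way score rows.
outputs = [[0.0] * 1000 for _ in CLASS_INDICES]
outputs[0][1] = 0.9    # goldfish score for image 0
outputs[1][294] = 0.8  # bear score for image 1
outputs[2][413] = 0.7  # assault rifle score for image 2

print(class_scores(outputs))
```

Each image is paired with exactly one class index, so the visualization for the goldfish image is driven only by the goldfish score, and likewise for the other two.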

47
00:03:07,170 --> 00:03:11,520
So let's create this function, and now create a model

48
00:03:11,520 --> 00:03:13,290
modifier function, which you've seen before.

49
00:03:13,320 --> 00:03:14,740
This is where we

50
00:03:14,760 --> 00:03:21,360
set the activations to linear activations on the second-to-last layer, and now we can implement

51
00:03:21,360 --> 00:03:21,660
Grad-CAM.

52
00:03:21,690 --> 00:03:23,130
Let's take a look at how we do that.

53
00:03:23,610 --> 00:03:27,030
These are the functions we need to load in order to use Grad-CAM.

54
00:03:27,630 --> 00:03:34,110
And this is important to note here: from the tf-keras-vis package, we get Gradcam, and we use it

55
00:03:34,110 --> 00:03:35,310
to create a Grad-CAM function here.

56
00:03:35,700 --> 00:03:39,750
And this takes the model as the input, and the model modifier.

57
00:03:39,990 --> 00:03:40,830
We created the model

58
00:03:40,830 --> 00:03:44,970
modifier above here, and we just don't clone the model in that case.

59
00:03:45,660 --> 00:03:50,580
Next, we take this Grad-CAM object that we've created and we pass the loss to it.

60
00:03:50,670 --> 00:03:56,220
We pass X, which is basically just a backend of Keras; you don't have to think too much about that.

61
00:03:56,220 --> 00:03:57,780
It's just used internally here.

62
00:03:58,560 --> 00:04:04,290
And we pass in the penultimate layer, which is basically the layer before softmax.

63
00:04:04,920 --> 00:04:06,180
And that's good.

64
00:04:06,180 --> 00:04:11,400
That's because we want to get the maximization of that specific class.

65
00:04:12,930 --> 00:04:19,320
And next, we can use normalize, which is a Keras Vis utility as well, which basically enables us

66
00:04:19,320 --> 00:04:20,490
to create the plots.

67
00:04:20,490 --> 00:04:22,200
And that's about it.
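[Editor's note] To make the steps above more concrete, here is a rough NumPy sketch of the core Grad-CAM computation followed by a min-max normalization. It is not the library's implementation: the names `grad_cam` and `normalize` and the random toy arrays are assumptions for illustration, and a real run would take the activations and gradients from the penultimate convolutional layer of the model.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Grad-CAM for one image.

    activations: (H, W, K) feature maps from the chosen convolutional layer.
    gradients:   (H, W, K) gradient of the class score w.r.t. those maps.
    """
    # One weight per channel: the spatial average of its gradient.
    weights = gradients.mean(axis=(0, 1))
    # Weighted sum of the feature maps, then ReLU to keep positive evidence only.
    return np.maximum((activations * weights).sum(axis=-1), 0.0)


def normalize(cam, eps=1e-8):
    """Min-max scale a map into [0, 1] so it can be colour-mapped for plotting."""
    cam = cam - cam.min()
    return cam / (cam.max() + eps)


# Toy data standing in for real layer outputs and gradients (assumed shapes).
rng = np.random.default_rng(0)
acts = rng.random((7, 7, 4))
grads = rng.standard_normal((7, 7, 4))

heatmap = normalize(grad_cam(acts, grads))
print(heatmap.shape)
```

The normalized map is what gets passed to a colormap to produce the red-to-blue overlay discussed next.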

68
00:04:22,200 --> 00:04:24,720
So now we can actually just create a heatmap here.

69
00:04:25,290 --> 00:04:26,130
This is from Cam.

70
00:04:26,580 --> 00:04:32,220
This is how we get this cool heatmap here, which generates... I'll discuss the output shortly,

71
00:04:32,700 --> 00:04:34,620
which I've been showing you, and this is what this is:

72
00:04:34,620 --> 00:04:38,850
the code here, this line, allows us to create those fancy heatmaps.

73
00:04:38,970 --> 00:04:43,920
So let's run this and we should get that same output again.

74
00:04:44,160 --> 00:04:45,600
Hopefully, we've done everything right.

75
00:04:48,250 --> 00:04:48,730
There we go.

76
00:04:48,760 --> 00:04:50,080
So it took about 10 seconds.

77
00:04:50,590 --> 00:04:58,150
And that's as expected: when the CNN is identifying a goldfish, we expect it to look at the goldfish,

78
00:04:58,150 --> 00:05:02,080
and we can see it's looking mainly at the goldfish's head, because the densest part of

79
00:05:02,080 --> 00:05:02,620
this heatmap

80
00:05:02,620 --> 00:05:05,380
is this red zone here, right above the head.

81
00:05:06,160 --> 00:05:07,270
Similarly for the bear.

82
00:05:07,300 --> 00:05:12,430
What's interesting for the bear, to me, is that it didn't look at the bear's face when making the decision

83
00:05:12,730 --> 00:05:13,570
that this was a bear.

84
00:05:13,960 --> 00:05:19,930
It looked at the texture, which could be misleading because I'm pretty sure that texture would come

85
00:05:19,930 --> 00:05:22,120
up in other animals like dogs as well.

86
00:05:22,240 --> 00:05:24,700
So the CNN could get quite confused

87
00:05:24,700 --> 00:05:27,400
sometimes when this texture arises in an image.

88
00:05:28,390 --> 00:05:31,240
The assault rifle: this one makes a bit more sense.

89
00:05:31,240 --> 00:05:35,230
You can see it's looking more at the handle of the gun for the decision.

90
00:05:35,260 --> 00:05:36,970
So this is quite cool.

91
00:05:37,930 --> 00:05:39,630
So this is just Grad-CAM.

92
00:05:39,670 --> 00:05:41,080
What about Grad-CAM++?

93
00:05:41,560 --> 00:05:43,610
Well, Grad-CAM++ is a little bit better.

94
00:05:43,630 --> 00:05:49,540
The researchers who made Grad-CAM++ made some improvements to the algorithm, and to use

95
00:05:49,540 --> 00:05:51,060
it, it's pretty much the same thing.

96
00:05:51,070 --> 00:05:56,140
You just import Grad-CAM++ from gradcam in this package here.

97
00:05:56,920 --> 00:06:00,580
You just pass the model and the model modifier again.

98
00:06:00,820 --> 00:06:05,560
This creates the Grad-CAM++ object that we'll be using, or the function; functions are objects, by the way,

99
00:06:05,560 --> 00:06:07,540
so sometimes the terms are used interchangeably.

100
00:06:08,770 --> 00:06:14,770
So then we have the loss, the backend, and the penultimate layer, and then we normalize, and then we create a

101
00:06:14,770 --> 00:06:15,400
heatmap.

102
00:06:16,030 --> 00:06:20,900
We use this, in case I didn't mention it before, and I don't think I did: the cm object.

103
00:06:20,920 --> 00:06:26,170
That's basically the function in Matplotlib that allows us to get this color scaling here.

104
00:06:26,500 --> 00:06:29,590
So then we use that to get the image.

105
00:06:29,590 --> 00:06:32,470
This is where we get the image out of it, and this is the color scaling thing.

106
00:06:33,220 --> 00:06:35,320
So let's run this.

107
00:06:35,410 --> 00:06:42,970
Let's take a look at Grad-CAM++. Grad-CAM++ gives us a bit more information, in a way, as you can

108
00:06:42,970 --> 00:06:43,270
see.

109
00:06:43,270 --> 00:06:50,590
Now it tells us where the CNN is looking, maybe similar to the last one. The bear:

110
00:06:50,620 --> 00:06:54,600
again, it's more of the bear's body, but overall it is looking at the bear,

111
00:06:54,600 --> 00:07:01,180
but with emphasis on this area. Assault rifle: similar area, but more of the gun this time.

112
00:07:02,500 --> 00:07:03,580
So that's quite interesting.

113
00:07:04,060 --> 00:07:09,550
Now we can take a look at Score-CAM. Score-CAM is another method that generates the class activation

114
00:07:09,550 --> 00:07:09,850
map.

115
00:07:10,330 --> 00:07:14,080
It's a gradient-free method, unlike the previous two we just looked at.
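[Editor's note] A rough NumPy sketch of the gradient-free idea: each activation map is used as a mask on the input, the masked input is scored by the model, and those scores become the channel weights. Everything here is a simplification for illustration. `score_cam` and `score_fn` are hypothetical names, `score_fn` stands in for a full forward pass of the network, and the activation maps are assumed to be already resized to the input size.

```python
import numpy as np


def score_cam(image, activations, score_fn):
    """Gradient-free class activation map for one image.

    image:       (H, W) input.
    activations: (H, W, K) feature maps, assumed already resized to (H, W).
    score_fn:    hypothetical callable returning the target-class score
                 for a (masked) image; stands in for a model forward pass.
    """
    scores = []
    for k in range(activations.shape[-1]):
        a = activations[..., k]
        span = a.max() - a.min()
        # Scale the map into [0, 1] so it can act as a soft mask.
        mask = (a - a.min()) / span if span > 0 else np.zeros_like(a)
        # One forward pass per channel: this loop is what makes the method slow.
        scores.append(score_fn(image * mask))
    scores = np.array(scores, dtype=float)
    # Softmax over channels turns the scores into weights.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return np.maximum((activations * weights).sum(axis=-1), 0.0)


# Toy setup: the "model" only cares about the top-left quadrant.
image = np.ones((4, 4))
acts = np.zeros((4, 4, 2))
acts[:2, :2, 0] = 1.0  # channel 0 highlights the top-left quadrant
acts[2:, 2:, 1] = 1.0  # channel 1 highlights the bottom-right quadrant
cam = score_cam(image, acts, lambda img: img[:2, :2].sum())
print(cam.round(3))
```

Because the number of forward passes grows with the number of channels, running this on a GPU matters much more than for the gradient-based methods.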

116
00:07:14,950 --> 00:07:19,120
So this method could take a while, so it's wise not to run this on your CPU.

117
00:07:19,510 --> 00:07:21,550
So I'll come back to this shortly.

118
00:07:21,580 --> 00:07:22,600
Actually, let's run it now.

119
00:07:22,720 --> 00:07:23,670
Let's see it.

120
00:07:29,850 --> 00:07:30,960
OK, there we go.

121
00:07:30,990 --> 00:07:33,000
So it took a bit longer, actually.

122
00:07:33,630 --> 00:07:38,340
And let's see, it's looking roughly at the same areas here, so we get the same answer.

123
00:07:38,760 --> 00:07:41,340
So again, we can see this is quite useful for us.

124
00:07:42,150 --> 00:07:48,450
Next, let's take a look at Faster Score-CAM, which, as the name suggests, should run much faster than

125
00:07:48,450 --> 00:07:51,510
Score-CAM, because we can see Score-CAM actually took quite long.

126
00:07:52,080 --> 00:07:56,890
So to implement Faster Score-CAM, basically we still use the ScoreCAM model here.

127
00:07:57,600 --> 00:07:59,760
Let me see if that's what we looked at before.

128
00:07:59,760 --> 00:08:00,330
Yep.

129
00:08:03,970 --> 00:08:10,300
But to use Faster Score-CAM, all you have to do now is just set this parameter to a value that's

130
00:08:10,300 --> 00:08:13,580
not minus one, because minus one means we use the original Score-CAM.
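[Editor's note] The speed-up comes from masking with only a subset of channels instead of all of them. The sketch below assumes the selection criterion is spatial variance (keep the channels whose maps vary the most); the name `select_channels` and the parameter name `max_n` are hypothetical and chosen only to mirror the "minus one means original" convention from the lesson.

```python
import numpy as np


def select_channels(activations, max_n=-1):
    """Choose which channels of an (H, W, K) activation volume to use as masks.

    max_n == -1 keeps every channel (the original, slow behaviour).
    Otherwise only the max_n channels with the largest spatial variance are kept.
    """
    k = activations.shape[-1]
    if max_n == -1 or max_n >= k:
        return np.arange(k)
    variances = activations.var(axis=(0, 1))
    # Indices of the max_n largest variances, returned in ascending index order.
    return np.sort(np.argsort(variances)[-max_n:])


# Toy volume: channel 0 is flat, channel 1 varies slightly, channel 2 varies a lot.
acts = np.zeros((4, 4, 3))
acts[..., 0] = 1.0
acts[0, 0, 1] = 0.5
acts[:2, :, 2] = 5.0

print(select_channels(acts, max_n=-1))
print(select_channels(acts, max_n=1))
```

With a 512-channel layer, keeping, say, 10 channels would cut the number of masked forward passes from 512 to 10, at the cost of ignoring the flatter maps.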

131
00:08:14,440 --> 00:08:21,940
And this allows us to visualize the class activation maps, which should run much faster.

132
00:08:21,970 --> 00:08:23,410
Yes, it's significantly faster.

133
00:08:24,220 --> 00:08:28,730
It is about one tenth one.

134
00:08:28,730 --> 00:08:29,260
Oh, sorry, one.

135
00:08:29,260 --> 00:08:30,340
It's the speed, roughly.

136
00:08:31,210 --> 00:08:33,310
So then you can see it gives us roughly the same areas again.

137
00:08:33,310 --> 00:08:34,330
So this is quite good.

138
00:08:35,020 --> 00:08:42,100
So that concludes this lesson on class activation maps using Grad-CAM, Score-CAM, Faster

139
00:08:42,100 --> 00:08:44,050
Score-CAM, and Grad-CAM++.

140
00:08:44,770 --> 00:08:50,200
So we'll stop there for now, and we'll get back into the course where we'll start taking a look at

141
00:08:50,200 --> 00:08:53,560
more complicated CNNs as we continue this course.

142
00:08:53,770 --> 00:08:55,210
Thank you, and I'll see you soon.

143
00:08:55,630 --> 00:08:55,900
Bye.
