1
00:00:00,270 --> 00:00:06,720
Hi, welcome back. In this lesson, we'll use a pre-trained network to extract features from our dataset

2
00:00:07,110 --> 00:00:11,190
and then train a linear model like a logistic regression on those features.

3
00:00:11,670 --> 00:00:12,960
So let's get started.

4
00:00:13,170 --> 00:00:19,860
Open notebook 24, called PyTorch Feature Extraction CNN, which I have already done here,

5
00:00:19,860 --> 00:00:20,970
and let's get started.

6
00:00:21,000 --> 00:00:24,030
So we're going to go back to our Cats vs. Dogs dataset.

7
00:00:25,170 --> 00:00:30,960
No particular reason, just because it's easy for us to work with. And make sure you're using a GPU as usual,

8
00:00:31,590 --> 00:00:34,040
and high RAM, for this lesson as well.

9
00:00:34,050 --> 00:00:37,260
So let's make sure we're using a GPU and high RAM.

10
00:00:38,280 --> 00:00:41,010
So now let's download our pre-trained model.

11
00:00:41,640 --> 00:00:45,630
So to download a pre-trained model, you basically just call

12
00:00:45,630 --> 00:00:46,610
models.vgg16.

13
00:00:47,050 --> 00:00:52,830
Here, models is from torchvision. We set pretrained equal to True, which means we're loading the pre-trained

14
00:00:52,830 --> 00:00:53,860
PyTorch weights.

15
00:00:54,480 --> 00:01:00,540
And then we send that model to the GPU if a GPU is available, and then we just print out a summary

16
00:01:00,540 --> 00:01:01,380
here below.

17
00:01:01,410 --> 00:01:02,850
So let's do this.

18
00:01:04,020 --> 00:01:07,770
We may have to wait a bit to download our model, but that shouldn't take too long.

19
00:01:08,940 --> 00:01:13,290
OK, so that took about 23 seconds, plus maybe the five to ten seconds

20
00:01:13,290 --> 00:01:15,360
it took to connect to an instance.

21
00:01:16,020 --> 00:01:17,700
So now let's examine this network.

22
00:01:17,700 --> 00:01:20,450
You can see there's a lot of convolutional layers.

23
00:01:20,460 --> 00:01:24,180
There's one, two, three, many, many. It's actually 16 weight layers in total, to be fair.

24
00:01:24,780 --> 00:01:29,760
And then you can see at the end there's a max pool and an average pooling layer, and then there are these

25
00:01:29,760 --> 00:01:32,850
three linear, fully connected layers.

26
00:01:33,450 --> 00:01:37,140
So what we're going to do, we're going to remove these seven layers here.

27
00:01:37,530 --> 00:01:44,490
This is really three linear, fully connected layers, with the ReLU and the dropout layers in between.

28
00:01:44,490 --> 00:01:46,410
So we're going to drop all of those down.

29
00:01:46,860 --> 00:01:48,780
So to do that, it's actually quite simple.

30
00:01:49,290 --> 00:01:55,650
We just use nn.Sequential here, and we just convert this model's classifier to a list.

31
00:01:55,650 --> 00:02:04,290
And we use the list's indexing and slicing abilities here to just cut the last seven layers out of it.

32
00:02:04,320 --> 00:02:04,980
Whoops.

33
00:02:05,790 --> 00:02:10,350
And let's run that, and then we create a new classifier out of that.

34
00:02:10,350 --> 00:02:13,710
So we just set model.classifier to be equal to that new classifier.
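
The slicing step just described can be sketched roughly as below. To keep the sketch small and free of downloads, it uses a hypothetical stand-in class, TinyVGGLike, that only mimics the features / avgpool / classifier layout of torchvision's VGG16; the layer sizes are made up for illustration. With the real model, the same two lines would be applied to model.classifier.

```python
import torch
from torch import nn


class TinyVGGLike(nn.Module):
    """Hypothetical stand-in with the same three parts as torchvision's VGG16."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        self.avgpool = nn.AdaptiveAvgPool2d((7, 7))
        # Same seven-module layout as the VGG16 head:
        # Linear, ReLU, Dropout, Linear, ReLU, Dropout, Linear.
        self.classifier = nn.Sequential(
            nn.Linear(8 * 7 * 7, 32), nn.ReLU(inplace=True), nn.Dropout(),
            nn.Linear(32, 32), nn.ReLU(inplace=True), nn.Dropout(),
            nn.Linear(32, 10),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)


model = TinyVGGLike()

# Convert the classifier to a list of layers and slice off the last seven.
kept_layers = list(model.classifier.children())[:-7]
# An empty nn.Sequential simply passes its input through unchanged.
model.classifier = nn.Sequential(*kept_layers)

model.eval()
with torch.no_grad():
    out = model(torch.randn(2, 3, 32, 32))

print(out.shape)  # flattened pooled features: channels * 7 * 7 per image
```

Because all seven classifier modules are removed, the forward pass now ends at the flattened output of the adaptive average pooling layer.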

35
00:02:14,130 --> 00:02:22,080
So now the model has become just the convolutional layers of the VGG16 network, along with those weights.

36
00:02:22,230 --> 00:02:28,620
So let's take a look at that, and you can see it stops right at the adaptive average pooling layer, the last layer

37
00:02:28,620 --> 00:02:36,810
there, and you can see now we have about 14 million trainable parameters here, whereas before we had 138

38
00:02:36,810 --> 00:02:44,700
million trainable parameters, because we had a lot of nodes here in these fully connected layers. So we

39
00:02:44,700 --> 00:02:45,440
can continue now.
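
As a quick check on those two numbers, the parameter counts can be worked out by hand. This is a small sketch in plain Python, assuming the standard VGG16 layout (3x3 convolutions with biases, then fully connected layers of 25,088 to 4,096 to 4,096 to 1,000); the helper names are made up for illustration.

```python
# VGG16 convolutional configuration: numbers are output channels, "M" is max pooling.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def conv_params(cfg, in_channels=3, kernel=3):
    """Weights plus biases of every convolution in the configuration."""
    total = 0
    for out_channels in cfg:
        if out_channels == "M":  # pooling layers have no parameters
            continue
        total += kernel * kernel * in_channels * out_channels + out_channels
        in_channels = out_channels
    return total


def linear_params(sizes):
    """Weights plus biases of a chain of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))


conv_total = conv_params(VGG16_CFG)
fc_total = linear_params([512 * 7 * 7, 4096, 4096, 1000])

print(f"convolutional layers:   {conv_total:,}")
print(f"fully connected layers: {fc_total:,}")
print(f"full network:           {conv_total + fc_total:,}")
```

The convolutional part comes to roughly 14.7 million parameters and the full network to roughly 138 million, so almost all of the parameters sit in the three fully connected layers that were removed.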

40
00:02:45,450 --> 00:02:49,440
So let's download our Cats vs. Dogs dataset, which we've done before.

41
00:02:49,770 --> 00:02:55,350
This takes a little while, and what I'll do is go through this again with you guys.

42
00:02:55,920 --> 00:02:57,810
We just set up our training paths here.

43
00:02:57,810 --> 00:03:03,870
We get our files from those directories here, and we just display how many images are in our training

44
00:03:03,870 --> 00:03:06,240
dataset and our test dataset as well.

45
00:03:06,690 --> 00:03:09,480
Then we create our transforms here for VGG.

46
00:03:10,050 --> 00:03:11,470
We don't need anything complicated.

47
00:03:11,490 --> 00:03:16,320
We just need to resize to 224 by 224 and then convert the image to a tensor.

48
00:03:17,160 --> 00:03:23,040
And then we have a Dataset class here, where we just get our labels and images out of it.

49
00:03:23,670 --> 00:03:30,450
And then from that, we just create our dataset objects here, where we just specify the files, the directory,

50
00:03:30,450 --> 00:03:31,530
and the transforms.

51
00:03:31,530 --> 00:03:36,660
And we're using the Dataset class to do that, actually.

52
00:03:37,200 --> 00:03:40,290
And then finally, we have our data loaders at the bottom there.

53
00:03:40,320 --> 00:03:41,910
So I believe this is finished.

54
00:03:41,920 --> 00:03:43,770
Yep, 30 seconds have passed.

55
00:03:44,430 --> 00:03:51,420
So now we can see our display here: 25,000 images in the training set and 12,500 in the test set.

56
00:03:52,230 --> 00:03:56,340
So now we're ready to extract our features using VGG16.

57
00:03:56,910 --> 00:03:58,350
So let's take a look at how we do that.

58
00:03:58,350 --> 00:04:04,110
So let's just set up image names here to get the image file names, as well as

59
00:04:04,110 --> 00:04:06,210
image paths, which are the full paths.

60
00:04:07,320 --> 00:04:13,080
And then what we do, we set our model to evaluation mode because remember, we're not training this

61
00:04:13,080 --> 00:04:16,230
model, we're using this model as a feature detector.

62
00:04:16,830 --> 00:04:23,910
Remember, CNN filters are basically feature detectors. They detect edges, complex

63
00:04:23,910 --> 00:04:31,710
patterns, and all of those things. So we can encode our image, or any image, in terms of those

64
00:04:31,710 --> 00:04:36,510
features and then use those features as inputs into another model.

65
00:04:36,810 --> 00:04:38,220
That's effectively what we're doing.

66
00:04:39,180 --> 00:04:40,800
So let's take a look at how we do that.

67
00:04:40,800 --> 00:04:46,800
So we just set the model to evaluation mode, with the model on the GPU. And then, with torch.no_grad,

68
00:04:46,800 --> 00:04:48,330
here's what we do.

69
00:04:48,330 --> 00:04:53,610
We just create some blank features and labels, because we're going to keep appending to those in the

70
00:04:53,610 --> 00:04:59,550
loop. So we go through this loop over our training dataset here.

71
00:05:00,200 --> 00:05:03,920
This returns a batch, which here is data and labels.

72
00:05:04,490 --> 00:05:06,540
We send that to the GPU.

73
00:05:06,680 --> 00:05:10,850
This is actually probably bad coding, so make sure you have a GPU enabled as the device, or

74
00:05:10,850 --> 00:05:11,660
this will break.

75
00:05:12,440 --> 00:05:19,340
Then we just concatenate our features using torch.cat here.

76
00:05:19,640 --> 00:05:25,100
And so we just keep accumulating features and accumulating the labels, so we know what's there.

77
00:05:25,880 --> 00:05:28,600
And then we just reshape our tensor.

78
00:05:28,730 --> 00:05:34,010
The output size is going to be 25,000 by 25,088.

79
00:05:34,490 --> 00:05:39,110
Because remember, let's take a look at the network, as in our Keras lesson.

80
00:05:39,290 --> 00:05:42,250
If you remember correctly, this is the output size here:

81
00:05:42,260 --> 00:05:43,100
512.

82
00:05:43,490 --> 00:05:50,960
Yes, 512 by 7 by 7, which gives us 25,088.
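
The extraction loop described above can be sketched like this. It is not the notebook's exact code: the extractor here is a tiny hypothetical stand-in for the truncated VGG16, the images are random tensors, and the batches are collected in lists and concatenated once at the end rather than concatenated inside the loop. It also falls back to the CPU when no GPU is present. With the real network, each image would flatten to 512 * 7 * 7 = 25,088 values.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Use the GPU when there is one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the truncated VGG16 (convolutions + adaptive pooling).
extractor = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d((7, 7)),
).to(device)

# Made-up data: 10 random "images" with binary labels.
images = torch.randn(10, 3, 32, 32)
targets = torch.randint(0, 2, (10,))
loader = DataLoader(TensorDataset(images, targets), batch_size=4)

extractor.eval()  # evaluation mode: we are not training this model
feature_batches, label_batches = [], []

with torch.no_grad():  # no gradients are needed for feature extraction
    for data, labels in loader:
        data = data.to(device)
        feats = extractor(data)
        # Flatten each image's feature maps to one row and move it back to the CPU.
        feature_batches.append(feats.flatten(start_dim=1).cpu())
        label_batches.append(labels)

features = torch.cat(feature_batches)
all_labels = torch.cat(label_batches)

print(features.shape, all_labels.shape)
```

Calling features.numpy() and all_labels.numpy() afterwards gives arrays that a scikit-learn model can consume.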

83
00:05:52,580 --> 00:05:52,940
All right.

84
00:05:52,940 --> 00:05:54,320
So that's the final size here.

85
00:05:54,710 --> 00:05:55,850
So let's run this.

86
00:05:55,850 --> 00:05:57,050
This may take a while to run.

87
00:05:57,140 --> 00:05:59,600
So let's take a look and see how long it takes.

88
00:05:59,630 --> 00:05:59,870
Yep.

89
00:06:00,500 --> 00:06:02,570
You can see it's incrementing through this.

90
00:06:02,870 --> 00:06:06,800
We use tqdm here to just get a nice little progress bar.

91
00:06:07,250 --> 00:06:09,290
And you can see this might take about a minute.

92
00:06:09,320 --> 00:06:11,540
So let's all right.

93
00:06:11,550 --> 00:06:11,930
There we go.

94
00:06:11,930 --> 00:06:16,700
It took about two minutes, and we've extracted all of our features from our training dataset.

95
00:06:16,710 --> 00:06:20,750
So we now have a very big array here.

96
00:06:20,870 --> 00:06:26,150
We have an array that's 25,000 by 25,088 that's storing all the

97
00:06:26,150 --> 00:06:29,300
encoded features that our CNN extracted from our images.

98
00:06:29,810 --> 00:06:35,570
So let's take a look at the sizes of the features and labels, just to make sure everything is OK, and check

99
00:06:35,570 --> 00:06:38,750
that the final shape is as expected.

100
00:06:39,290 --> 00:06:45,710
Now we can use this, along with the labels here, to train our logistic regression.

101
00:06:46,160 --> 00:06:47,930
So let's convert them to NumPy arrays first.

102
00:06:47,930 --> 00:06:51,430
So it's features.cpu().numpy(),

103
00:06:51,440 --> 00:06:55,050
and the same for the labels: labels.cpu().numpy().

104
00:06:55,070 --> 00:06:56,720
I didn't forget the brackets.

105
00:06:57,320 --> 00:06:58,520
So let's convert those.

106
00:06:58,520 --> 00:07:05,930
We have the features and the image labels, and then we import LogisticRegression from scikit-learn's

107
00:07:05,930 --> 00:07:12,200
linear_model module, as well as train_test_split, because we're only going to look at the training data here.

108
00:07:12,560 --> 00:07:16,990
So we just split the training data with an 80/20 split.

109
00:07:16,990 --> 00:07:22,520
So that's 20,000 training images and 5,000 test images.

110
00:07:23,210 --> 00:07:25,880
And then we can just quickly just run this.

111
00:07:26,150 --> 00:07:27,500
So we take train_test_split here

112
00:07:27,590 --> 00:07:28,250
and pass in the data:

113
00:07:28,760 --> 00:07:36,590
the features and labels. We set our test size and set a random state, just to make it replicable

114
00:07:36,590 --> 00:07:42,170
for the other times you run this. I set the random state here to seven to keep the random state constant.

115
00:07:42,920 --> 00:07:44,380
You can put in any number here.

116
00:07:44,390 --> 00:07:46,520
Actually, if you want, put your favorite number.

117
00:07:47,420 --> 00:07:48,760
And then, down here,

118
00:07:48,770 --> 00:07:53,070
we just create a LogisticRegression object and we just fit the model here.

119
00:07:53,090 --> 00:07:55,850
So we have X_train and y_train, y_train being the labels.
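
The split-and-fit step can be sketched as follows. The features here are synthetic stand-ins (200 rows of 20 made-up values) rather than the 25,000 by 25,088 array from the lesson, and the max_iter value is an assumption chosen only so the solver converges comfortably; the 80/20 split and random_state of seven follow what was described above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the extracted CNN features: class 1 rows are shifted.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 20
y = rng.integers(0, 2, size=n_samples)
X = rng.normal(size=(n_samples, n_features)) + y[:, None] * 2.0

# 80/20 split; a fixed random_state makes the split repeatable between runs.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=7
)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.4f}")
```

The accuracy printed here says nothing about cats and dogs, since the data is made up; it only shows the shape of the workflow.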

120
00:07:56,480 --> 00:07:58,520
So let's start training.

121
00:08:00,710 --> 00:08:01,550
All right, there we go.

122
00:08:01,580 --> 00:08:04,760
So we've completed training in just under a minute, 58 seconds.

123
00:08:04,760 --> 00:08:06,170
In fact, it's quite fast.

124
00:08:06,830 --> 00:08:09,260
And let's take a look at our accuracy.

125
00:08:09,680 --> 00:08:12,850
Ninety seven point one two percent.

126
00:08:13,400 --> 00:08:14,300
That's actually pretty good.

127
00:08:14,720 --> 00:08:18,650
So now let's run some inference on our test data.

128
00:08:18,950 --> 00:08:23,540
So now, the test data is the 12,500 images that we didn't touch.

129
00:08:23,570 --> 00:08:29,700
Let's take a look at running some inference on those test samples here.

130
00:08:29,720 --> 00:08:31,370
Sorry to leave that.

131
00:08:32,390 --> 00:08:37,190
So let's create our transforms here for the test images.

132
00:08:37,880 --> 00:08:42,770
And then we're going to use random to just load a random sample from our test images here.

133
00:08:43,250 --> 00:08:49,190
We're going to look at a random sample of 12 images and then just run those images here with this function

134
00:08:49,190 --> 00:08:50,690
called test image here.

135
00:08:51,230 --> 00:08:57,080
So we take the test sample here, and then we just run it through the prediction

136
00:08:57,080 --> 00:08:57,300
here.

137
00:08:57,300 --> 00:09:00,740
And if it's a dog, we just set the result as dog.

138
00:09:00,740 --> 00:09:02,000
Else, it's a cat.

139
00:09:02,420 --> 00:09:07,150
Remember, logistic regression returns a probability between zero and one.

140
00:09:07,160 --> 00:09:11,550
So if it's greater than 0.5, it's going to be a dog.

141
00:09:11,570 --> 00:09:12,980
If not, it's going to be a cat.
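
That thresholding rule is small enough to write out. This is a minimal sketch assuming, as in the lesson, that label 1 means dog and label 0 means cat; the function name and the example probabilities are made up for illustration.

```python
def label_from_probability(p_dog, threshold=0.5):
    """Map the predicted probability of the 'dog' class to a text label."""
    return "dog" if p_dog > threshold else "cat"


# Made-up probabilities, standing in for a classifier's output for class 1.
example_probabilities = [0.93, 0.08, 0.5]
predictions = [label_from_probability(p) for p in example_probabilities]

print(predictions)  # ['dog', 'cat', 'cat']
```

Note that a probability of exactly 0.5 falls on the cat side here, because the rule is strictly greater than 0.5.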

142
00:09:13,490 --> 00:09:17,780
So let's create this here and now let's get our results.

143
00:09:17,930 --> 00:09:20,480
So we're just getting the results for these 12 images,

144
00:09:20,480 --> 00:09:25,130
the 12 random images here. And now let's actually plot them.

145
00:09:25,220 --> 00:09:28,920
So we're going to plot those in this loop, which I won't

146
00:09:28,950 --> 00:09:35,630
explain in full here. Just to show you, we're just loading

147
00:09:35,630 --> 00:09:43,280
the image path here, resizing it to 224, then changing it from BGR to RGB.

148
00:09:43,280 --> 00:09:49,220
And then putting on the result that we got previously, which was the

149
00:09:49,220 --> 00:09:53,540
prediction result, and we can visualize it now on those images.

150
00:09:54,830 --> 00:09:56,750
So let's take a look at them.

151
00:09:57,170 --> 00:09:59,590
So these were the test samples

152
00:10:00,820 --> 00:10:03,070
that we got. So you can see this is a dog.

153
00:10:03,310 --> 00:10:04,420
But some people down here.

154
00:10:05,170 --> 00:10:06,480
This is also a dog.

155
00:10:06,490 --> 00:10:09,850
This is a dog too. Dog. Cat.

156
00:10:09,940 --> 00:10:11,020
Yep, got it right.

157
00:10:12,130 --> 00:10:15,250
Dog, dog, dog, dog, cat.

158
00:10:15,370 --> 00:10:17,380
So it got all 12 right, actually.

159
00:10:17,890 --> 00:10:23,770
So our 97 percent accurate model is doing fairly well on the test dataset.

160
00:10:24,220 --> 00:10:27,400
So you can load more random test images.

161
00:10:27,400 --> 00:10:29,290
Actually, I'll show you how we can do that.

162
00:10:29,860 --> 00:10:36,610
What we do is just run this block of code from here, then run this one and then run this again and

163
00:10:36,610 --> 00:10:39,190
you're going to get 12 new random images here.

164
00:10:41,580 --> 00:10:41,840
See?

165
00:10:42,510 --> 00:10:47,520
So, yeah, so let's just make sure everything is correct, everything does seem to be correct.

166
00:10:48,240 --> 00:10:48,600
Yep.

167
00:10:48,930 --> 00:10:52,650
We've made a very good cats vs. dogs classifier now, as you can see.

168
00:10:53,340 --> 00:10:54,930
So I'll stop there for now.

169
00:10:55,110 --> 00:11:01,140
And in the next section, we'll take a look at Google's DeepDream, which is a very cool algorithm.

170
00:11:01,680 --> 00:11:02,820
So stay tuned for that.

171
00:11:02,970 --> 00:11:03,420
Thank you.