1
00:00:00,210 --> 00:00:05,790
Hi, welcome back. In this lesson, we'll take a look at using PyTorch Lightning to implement transfer

2
00:00:05,790 --> 00:00:11,280
learning, which you should have a deeper understanding of now, having watched those introductory slides

3
00:00:11,280 --> 00:00:13,080
I presented previously.

4
00:00:13,680 --> 00:00:15,840
So open this notebook, notebook 20.

5
00:00:16,350 --> 00:00:17,850
And let's begin this lesson.

6
00:00:18,000 --> 00:00:21,270
So I already have some stuff running, so you won't have to wait.

7
00:00:21,810 --> 00:00:26,280
So let's install and set up the PyTorch Lightning module.

8
00:00:26,580 --> 00:00:27,630
Sorry, library.

9
00:00:28,020 --> 00:00:32,340
So I've done it already here. You can run that code. I've already imported these packages that we'll

10
00:00:32,350 --> 00:00:35,220
be using in this lesson, and I've already downloaded and

11
00:00:35,220 --> 00:00:40,410
unzipped the dataset we'll be using, so we can skip all of that because we've done this before.

12
00:00:40,770 --> 00:00:47,250
This is our class, our dataset class, that returns the images and labels, and we just set it up here

13
00:00:47,250 --> 00:00:54,000
and create our train and validation dataset objects and then the data loaders here as well.

14
00:00:54,450 --> 00:00:56,370
So let's run all of this.

15
00:00:56,760 --> 00:01:01,050
And now, let's see how we set up our transfer learning.

16
00:01:01,440 --> 00:01:03,990
So we're doing transfer learning by feature extraction.

17
00:01:03,990 --> 00:01:09,270
And what that means, in case you forgot, is that we're loading a pre-trained ImageNet network.

18
00:01:09,270 --> 00:01:11,220
In this case, we're using a ResNet-50.

19
00:01:11,760 --> 00:01:18,630
And what we do is keep all the convolutional layers, all of the layers in the network, frozen, and we just

20
00:01:18,630 --> 00:01:22,050
replace the head with a different layer and retrain that.

21
00:01:22,530 --> 00:01:24,450
So let's take a look at how we do that here.

22
00:01:24,960 --> 00:01:29,670
So right in here is the def __init__ part of the class.

23
00:01:30,270 --> 00:01:33,780
We just load the model here, so we load ResNet here.

24
00:01:33,780 --> 00:01:37,590
We load it with pretrained set to True, and we call it the backbone here.

25
00:01:38,010 --> 00:01:44,940
So now we can access the fully connected layer here, and we just get the number of filters

26
00:01:45,060 --> 00:01:46,950
at the input of that fully connected layer.

27
00:01:47,460 --> 00:01:53,580
Next, we just get all of the layers here in the network except the last one, because the

28
00:01:53,580 --> 00:02:00,540
last one has the thousand outputs, and we create this feature extractor here using nn.Sequential

29
00:02:00,540 --> 00:02:00,840
here.

30
00:02:01,320 --> 00:02:05,520
This just combines the layers and basically rebuilds the model here.

31
00:02:06,090 --> 00:02:11,940
And because our model has two classes now, because it's the cats vs. dogs classifier, we just set

32
00:02:11,940 --> 00:02:16,440
the last layer in the network to output the number of classes here.

33
00:02:16,440 --> 00:02:23,070
So it takes num_filters from the previous layer here, the fully connected layer, and then outputs to the

34
00:02:23,070 --> 00:02:25,290
final layer, the last two output nodes that we have.

35
00:02:25,920 --> 00:02:32,340
And then secondly, notice that in the forward method, we don't have the network's layers individually

36
00:02:32,340 --> 00:02:32,940
defined.

37
00:02:33,390 --> 00:02:34,380
We have this.

38
00:02:34,380 --> 00:02:40,020
We have a feature extractor, which is basically a model now, that we have as

39
00:02:40,760 --> 00:02:40,990
self.feature_extractor.

40
00:02:41,430 --> 00:02:48,060
This self prefix allows us to use this in different parts of the class, so we can use this in the

41
00:02:48,060 --> 00:02:48,750
forward method here.

42
00:02:49,500 --> 00:02:51,990
So we just set this now to evaluation mode.

43
00:02:52,080 --> 00:02:57,030
That's basically how we freeze the weights there, and then we use torch.no_grad.

44
00:02:57,510 --> 00:03:03,840
And what this does, basically, is allow us to use this frozen model here, where we just flatten

45
00:03:03,840 --> 00:03:07,080
the output and then feed it now to the final layer here.

46
00:03:07,890 --> 00:03:11,670
And then we have the output passed to the softmax layer here.

47
00:03:12,120 --> 00:03:17,430
So it seems a bit complicated, but this is actually boilerplate code that you don't have to change, and you

48
00:03:17,430 --> 00:03:19,800
can interchange different ResNet networks here.

49
00:03:20,160 --> 00:03:25,890
So this is basically the methodology for doing feature extraction transfer learning.

50
00:03:26,070 --> 00:03:28,140
Yeah, so that's it.

51
00:03:28,470 --> 00:03:29,670
So everything else is the same.

52
00:03:29,670 --> 00:03:32,760
You would have seen these things before in the previous lesson.

53
00:03:33,090 --> 00:03:38,550
So let's run that block of code and we're now ready to implement our transfer learning.

54
00:03:39,060 --> 00:03:40,350
So let's start training.

55
00:03:41,970 --> 00:03:44,300
So this is going to take a while to train.

56
00:03:44,310 --> 00:03:48,570
It does need to download the weights here, but let's take a look and see how long it takes anyway.

57
00:03:49,470 --> 00:03:52,290
You can see here we have 23 million parameters here.

58
00:03:52,800 --> 00:03:56,970
However, we're only going to be training the 4.1K parameters in the model.
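As a rough check on that 4.1K figure, here is the arithmetic for the new head. It assumes the ResNet-50 features entering the head are 2048 wide, which is the standard torchvision value but is not stated in the lesson.

```python
# Parameter count of a single fully connected head.
in_features = 2048  # assumed: width of the ResNet-50 features entering the head
num_classes = 2     # two output classes, as in the lesson

weights = in_features * num_classes  # one weight per (input, output) pair
biases = num_classes                 # one bias per output node
trainable = weights + biases

print(trainable)  # 4098, i.e. roughly 4.1K
```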

59
00:03:57,510 --> 00:03:59,130
So let's wait for that.

60
00:03:59,250 --> 00:04:00,910
And you can see we're training right now.

61
00:04:00,930 --> 00:04:04,920
So while this trains, we can take a look at the TensorBoard output

62
00:04:04,920 --> 00:04:11,580
down here. However, this doesn't have it saved from the previous training that I did, unfortunately.

63
00:04:11,580 --> 00:04:14,670
So we'll have to wait for this to finish training.

64
00:04:15,270 --> 00:04:17,610
However, that's it for this lesson.

65
00:04:17,910 --> 00:04:23,220
That's basically all there is to implementing transfer learning with PyTorch Lightning.

66
00:04:23,730 --> 00:04:26,220
What we can do is stop this now.

67
00:04:28,320 --> 00:04:29,430
Hopefully, it stops.

68
00:04:31,490 --> 00:04:33,230
And it did stop there.

69
00:04:33,380 --> 00:04:41,000
And we can now run our TensorBoard and look at the first two epochs that we've trained, so give it

70
00:04:41,000 --> 00:04:43,040
a while for TensorBoard to load.

71
00:04:44,030 --> 00:04:45,120
OK, so there it goes.

72
00:04:45,140 --> 00:04:46,470
It takes a little while to start up.

73
00:04:46,490 --> 00:04:50,310
So when you see this blank screen here, it doesn't mean it failed to load.

74
00:04:50,360 --> 00:04:51,770
It's just taking its time.

75
00:04:51,800 --> 00:04:52,280
There we go.

76
00:04:52,310 --> 00:04:53,660
You can see it's coming up now.

77
00:04:55,400 --> 00:04:57,960
And now we can take a look at our training.

78
00:04:57,980 --> 00:05:00,560
Let's take a look at our training and validation accuracy.

79
00:05:01,190 --> 00:05:03,710
So let's take a look at this one first.

80
00:05:05,060 --> 00:05:10,580
And you can see it's a little spiky. However, you can see immediately we're at 84 percent accuracy,

81
00:05:11,240 --> 00:05:13,510
actually 87 percent accuracy.

82
00:05:13,520 --> 00:05:14,630
So that's quite good.

83
00:05:15,320 --> 00:05:18,470
And let's see how our validation one performed.

84
00:05:19,940 --> 00:05:21,410
Actually, this is the wrong one.

85
00:05:21,500 --> 00:05:24,710
Let's look at the epoch.

86
00:05:25,280 --> 00:05:27,110
Actually, it's the same one I looked at.

87
00:05:28,100 --> 00:05:28,370
OK.

88
00:05:28,670 --> 00:05:36,250
We have one value only, and we're at 78 percent accuracy right there.

89
00:05:36,260 --> 00:05:37,070
So that's pretty good.

90
00:05:37,080 --> 00:05:39,560
So apparently we're not logging

91
00:05:39,740 --> 00:05:43,040
the validation accuracy at every step.

92
00:05:43,040 --> 00:05:46,040
Actually, that makes sense because it's not in the training loop.

93
00:05:46,250 --> 00:05:48,110
You can take a look and see.

94
00:05:48,920 --> 00:05:55,280
So that's it for this lesson, and I'll join you in the next lesson, where we'll use Keras, and

95
00:05:55,280 --> 00:05:59,540
Keras actually makes it a lot easier to implement transfer learning, in my opinion.

96
00:06:00,110 --> 00:06:04,910
We will also use a CNN as a feature extractor in Keras, which I mentioned in the slides.

97
00:06:05,360 --> 00:06:12,830
We will use a logistic regression at the end of the CNN, using the features,

98
00:06:12,830 --> 00:06:14,810
which we extract from the convolutional layers.

99
00:06:15,230 --> 00:06:18,010
So stay tuned for that, and we'll also do those

100
00:06:18,020 --> 00:06:22,340
same two things in PyTorch without using PyTorch Lightning.

101
00:06:22,340 --> 00:06:27,260
So if you want to use transfer learning and fine-tuning in PyTorch, you can take a look and see how

102
00:06:27,260 --> 00:06:27,920
we do that there.

103
00:06:28,160 --> 00:06:28,640
Thank you.