1
00:00:11,690 --> 00:00:16,670
In this lecture, we are going to look at a Colab notebook that does transfer learning without data

2
00:00:16,700 --> 00:00:18,410
augmentation.

3
00:00:18,410 --> 00:00:21,470
This lecture is going to walk you through a prepared Colab notebook.

4
00:00:21,530 --> 00:00:27,230
Although, a very good exercise, which I always recommend, is, once you know how this is done, to try and

5
00:00:27,230 --> 00:00:33,380
recreate it yourself with as few references as possible. As usual, you can look at the title of the notebook

6
00:00:33,620 --> 00:00:36,010
to determine what notebook we are currently looking at.

7
00:00:37,250 --> 00:00:44,580
And this notebook is called PyTorch Transfer Learning Fast. So our main goal in this script is to see,

8
00:00:44,970 --> 00:00:52,140
if we precompute the features, (a) will training be faster, and (b) will we get an unacceptable reduction

9
00:00:52,170 --> 00:00:57,960
in accuracy? I'll assume that in this lecture you've already seen the code that does transfer learning

10
00:00:58,380 --> 00:00:59,610
with data augmentation.

11
00:00:59,640 --> 00:01:05,700
So I'm not going to explain any of the lines that I've already explained. So at the top, everything is

12
00:01:05,700 --> 00:01:07,310
the same as before.

13
00:01:07,560 --> 00:01:14,340
Same imports, same data, same process to move the data into the folders, which are structured the right

14
00:01:14,340 --> 00:01:17,130
way for the ImageFolder dataset object.

15
00:01:23,520 --> 00:01:24,160
OK.

16
00:01:24,200 --> 00:01:28,260
So when we get to the transforms, you'll notice that we only have one.

17
00:01:28,310 --> 00:01:30,410
Why do we only need one?

18
00:01:30,410 --> 00:01:35,300
Well, since we're not going to do any data augmentation, we can apply the same transformations to both

19
00:01:35,300 --> 00:01:41,460
the train and test sets, and these are all non-data-augmentation transformations.

20
00:01:44,380 --> 00:01:48,290
So next we create our ImageFolder objects.

21
00:01:48,350 --> 00:01:52,390
Next we create our DataLoader objects.

22
00:01:52,500 --> 00:01:55,560
Next I grab the pretrained model with pretrained equals True.

23
00:01:58,840 --> 00:02:00,970
So the next part is also new.

24
00:02:05,210 --> 00:02:08,980
Since the VGG operation is separate from the rest of the network,

25
00:02:09,080 --> 00:02:16,360
we can define a separate model for it. So I'll define a class called VGGFeatures. In the constructor,

26
00:02:16,370 --> 00:02:20,890
the only thing I need to do is save the VGG model. In the forward function,

27
00:02:20,900 --> 00:02:27,020
I take X and I pass it through the different parts of the VGG one at a time, until right before the

28
00:02:27,020 --> 00:02:28,530
final classifier.

29
00:02:28,730 --> 00:02:30,860
At that point, I just flatten and then return.

30
00:02:33,600 --> 00:02:39,370
Next I instantiate this class and I call it vggf.

31
00:02:39,370 --> 00:02:45,830
As a sanity check, just to make sure this works, we pass in a random input. The output has shape one by twenty

32
00:02:45,830 --> 00:02:46,960
five thousand eighty eight.

33
00:02:47,090 --> 00:02:50,780
So it works.

34
00:02:50,800 --> 00:02:55,530
Next we're going to transform our data using the feature transformer that we just instantiated.

35
00:02:56,140 --> 00:02:58,490
So first we need to find the length of each dataset.

36
00:02:59,560 --> 00:03:02,830
Luckily, we can use the len function on the dataset objects.

37
00:03:07,440 --> 00:03:11,740
So we have Ntrain three thousand and Ntest one thousand.

38
00:03:11,790 --> 00:03:15,330
Next we're going to grab D, the feature dimensionality.

39
00:03:15,330 --> 00:03:20,370
This is the size of the output of the vggf transformer, which is equal to twenty five thousand eighty

40
00:03:20,370 --> 00:03:26,050
eight.

41
00:03:26,050 --> 00:03:31,750
Next we're going to work on creating our tabular dataset consisting of the transformed features from

42
00:03:31,750 --> 00:03:33,320
the original data.

43
00:03:33,430 --> 00:03:37,510
Once we have this we no longer have to work with the original dataset.

44
00:03:37,630 --> 00:03:45,990
So we're going to start by instantiating empty arrays to store X_train, Y_train, X_test, and Y_test.

45
00:03:46,120 --> 00:03:51,850
Next we're going to set the device to the GPU and move the feature transformer to the GPU.

46
00:03:56,860 --> 00:03:59,670
Next we're going to populate X_train and Y_train.

47
00:04:00,070 --> 00:04:02,470
We can do this by looping over the train dataset.

48
00:04:03,370 --> 00:04:09,340
So on each iteration, we start by moving the inputs to the GPU. It's not necessary to move the targets

49
00:04:09,340 --> 00:04:13,540
to the GPU because we only want to make a prediction with the model.

50
00:04:13,840 --> 00:04:20,840
Next we get the output by passing the inputs through vggf.

51
00:04:20,870 --> 00:04:23,340
Next we get the size of the output.

52
00:04:23,360 --> 00:04:28,700
Note that this is not necessarily equal to the batch size, since the number of samples may not be evenly divisible by

53
00:04:28,700 --> 00:04:30,070
the batch size.

54
00:04:30,110 --> 00:04:37,510
So the final batch is less than or equal to the batch size. Next we assign the features and the targets

55
00:04:37,540 --> 00:04:41,920
to X_train and Y_train at the current index i.

56
00:04:42,160 --> 00:04:49,700
Then we increment i by the current batch size and go to the next iteration of the loop. So since we print

57
00:04:49,730 --> 00:04:55,730
i on each iteration, we can confirm that the number of samples in our new dataset is equal to the original

58
00:04:55,730 --> 00:04:56,800
number of images.

59
00:04:58,970 --> 00:05:03,570
All right, so we can see that the final value of i is three thousand, which is the correct number of

60
00:05:03,570 --> 00:05:08,800
samples.
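
The loop just described can be sketched roughly as follows. To keep it small and runnable, this uses hypothetical stand-ins: ten tiny random "images" instead of the real dataset, and a small convolutional module instead of vggf. The `torch.no_grad()` context is my own addition to avoid tracking gradients; only the indexing pattern is the point here.

```python
import numpy as np
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical stand-in data: 10 tiny images with alternating labels.
images = torch.randn(10, 3, 8, 8)
labels = torch.arange(10) % 2
train_dataset = TensorDataset(images, labels)
train_loader = DataLoader(train_dataset, batch_size=4)  # batches of 4, 4, 2

# Hypothetical stand-in for the VGG feature extractor.
feature_extractor = nn.Sequential(
    nn.Conv2d(3, 2, kernel_size=3, padding=1), nn.ReLU(), nn.Flatten())

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
feature_extractor.to(device).eval()

Ntrain = len(train_dataset)
D = feature_extractor(torch.rand(1, 3, 8, 8).to(device)).shape[1]

# Preallocate the tabular dataset.
X_train = np.zeros((Ntrain, D), dtype=np.float32)
Y_train = np.zeros(Ntrain, dtype=np.int64)

i = 0
with torch.no_grad():
    for inputs, targets in train_loader:
        inputs = inputs.to(device)        # targets can stay on the CPU
        output = feature_extractor(inputs)
        sz = len(output)                  # may be smaller for the final batch
        X_train[i:i + sz] = output.cpu().numpy()
        Y_train[i:i + sz] = targets.numpy()
        i += sz
        print(i)
```

Using the actual output size rather than the nominal batch size is what keeps the final, smaller batch from writing past the end of the arrays.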

61
00:05:08,810 --> 00:05:16,520
Next we do the same thing for X_test and Y_test.

62
00:05:16,730 --> 00:05:22,460
So one thing that's interesting to check is the min and the max of the feature values.

63
00:05:22,460 --> 00:05:28,580
As you can see, the max in X_train is about sixty-two, which should seem weird to you because you should

64
00:05:28,580 --> 00:05:32,390
expect the values in a neural network to be somewhat normalized.

65
00:05:32,390 --> 00:05:35,870
Since we're using a ReLU activation, the minimum value is zero.

66
00:05:39,330 --> 00:05:44,580
So anyway, since the data is at this weird scale, we're going to use a StandardScaler object to transform

67
00:05:44,580 --> 00:05:49,150
it. So we'll call the scaled data X_train2 and X_test2.
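
Here is a minimal sketch of that scaling step using scikit-learn. The feature arrays are hypothetical random stand-ins that mimic non-negative, ReLU-style outputs; the real arrays come from the feature-extraction loop.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the extracted features: non-negative values
# on a wide scale, similar in spirit to ReLU outputs.
X_train = np.maximum(rng.normal(scale=20.0, size=(100, 5)), 0.0)
X_test = np.maximum(rng.normal(scale=20.0, size=(40, 5)), 0.0)

print(X_train.min(), X_train.max())

scaler = StandardScaler()
X_train2 = scaler.fit_transform(X_train)  # learn mean and std on train only
X_test2 = scaler.transform(X_test)        # reuse the train statistics

print(X_train2.mean(axis=0).round(6), X_train2.std(axis=0).round(6))
```

Fitting the scaler on the train set only, and merely applying it to the test set, keeps test-set statistics from leaking into the model.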

68
00:05:53,600 --> 00:05:56,670
Now, because the next step is to do logistic regression,

69
00:05:56,720 --> 00:05:59,690
there's no actual reason we need to use PyTorch.

70
00:05:59,740 --> 00:06:05,250
In fact, we can just use the logistic regression that's built into scikit-learn, so let's do that.

71
00:06:06,200 --> 00:06:12,500
As usual, it's just fit and predict, which I've taught you for free in the past, so everybody knows this.

72
00:06:12,500 --> 00:06:18,950
And as you can see, we do very well, just as well as when we did use data augmentation, in fact.

73
00:06:18,950 --> 00:06:30,470
So maybe data augmentation wasn't so useful in this instance.
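
For reference, the fit-and-score pattern looks roughly like this. The data here is a hypothetical, well-separated synthetic set standing in for the scaled VGG features, so the accuracies it prints say nothing about the results in the notebook.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the scaled features: two well-separated
# clusters in four dimensions.
X, Y = make_blobs(n_samples=400,
                  centers=[[-3.0] * 4, [3.0] * 4],
                  cluster_std=1.0,
                  random_state=0)
X_train2, X_test2, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, random_state=0)

logr = LogisticRegression()
logr.fit(X_train2, Y_train)

train_acc = logr.score(X_train2, Y_train)  # mean accuracy on the train set
test_acc = logr.score(X_test2, Y_test)     # mean accuracy on the test set
print(train_acc, test_acc)
```

Since the features are already a plain NumPy table at this point, any classifier with a fit and predict interface could be swapped in here.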

74
00:06:30,480 --> 00:06:36,000
Next we're going to complete this script by doing transfer learning the usual way, which is to use the

75
00:06:36,000 --> 00:06:38,130
same library for the classifier.

76
00:06:38,400 --> 00:06:44,700
So let's start by creating a linear model. The rest of these steps you've seen before. So we move

77
00:06:44,700 --> 00:06:53,750
the model to the GPU, create the loss and optimizer, make some TensorDataset objects, create the data

78
00:06:53,750 --> 00:07:05,890
loaders, define the training function, call the training function, plot the loss per iteration, and calculate

79
00:07:05,890 --> 00:07:06,580
the accuracy.
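
Those steps can be sketched compactly as below. This assumes a binary problem with a single output unit and a binary cross-entropy loss, which is my assumption rather than something stated in the lecture; the features are hypothetical synthetic stand-ins, the learning rate and epoch count are arbitrary, and plotting is left out (the per-iteration losses are just collected in a list).

```python
import numpy as np
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
rng = np.random.default_rng(0)

# Hypothetical stand-in for the scaled, precomputed features.
N, D = 200, 10
Y_train = rng.integers(0, 2, size=N)
X_train2 = rng.normal(size=(N, D)) + 2.0 * (2 * Y_train[:, None] - 1)
X_train2 = X_train2.astype(np.float32)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Linear model on top of the features (assumed binary: one logit).
model = nn.Linear(D, 1).to(device)
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

train_dataset = TensorDataset(
    torch.from_numpy(X_train2),
    torch.from_numpy(Y_train.astype(np.float32).reshape(-1, 1)))
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

losses = []  # loss per iteration, ready to be plotted
for epoch in range(20):
    for inputs, targets in train_loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())

# Accuracy on the training features: a positive logit means class 1.
with torch.no_grad():
    logits = model(torch.from_numpy(X_train2).to(device))
preds = (logits > 0).cpu().numpy().flatten()
train_acc = np.mean(preds == Y_train)
print(train_acc)
```

Each epoch here only touches a small linear layer and a table of numbers, with no images and no pass through the large pretrained network.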

80
00:07:10,410 --> 00:07:12,450
And so we end up with pretty much the same result.

81
00:07:15,440 --> 00:07:16,740
Also important:

82
00:07:16,820 --> 00:07:20,390
notice how much faster this is. With data augmentation,

83
00:07:20,420 --> 00:07:27,050
each epoch takes about 40 seconds. Without data augmentation, each epoch is nearly instant, about zero

84
00:07:27,050 --> 00:07:28,460
point two seconds.

85
00:07:28,580 --> 00:07:33,950
So that's a two hundred x speedup from not using data augmentation and not having to go through the

86
00:07:33,950 --> 00:07:35,630
pretrained network on each pass.