1
00:00:00,210 --> 00:00:02,010
Hi, and welcome back to the lesson.

2
00:00:02,640 --> 00:00:08,670
So now we're going to look at the dataset, inspect it a bit, and visualize it a bit.

3
00:00:09,270 --> 00:00:10,660
So let's take a look at this.

4
00:00:10,740 --> 00:00:17,400
So what we're using for this is the actual torchvision library, which we discussed initially in the

5
00:00:17,400 --> 00:00:23,010
first video in the section. You can look at this library here to see what datasets it provides.

6
00:00:23,400 --> 00:00:25,260
There are actually many different data sets.

7
00:00:25,260 --> 00:00:26,340
They're all listed here.

8
00:00:26,820 --> 00:00:32,860
A lot of these are quite famous, and the ones that we're going to use a lot are CIFAR, Fashion-MNIST,

9
00:00:33,360 --> 00:00:37,830
ImageNet, well, we don't use ImageNet in this class because it's such a huge dataset, but it's

10
00:00:37,830 --> 00:00:39,780
available here, MNIST.

11
00:00:39,780 --> 00:00:42,870
And that's about it for now, maybe a couple of others afterward.

12
00:00:44,250 --> 00:00:46,230
So let's take a look at something here.

13
00:00:46,890 --> 00:00:48,690
How do we import data?

14
00:00:49,020 --> 00:00:51,450
How do we import datasets using torchvision?

15
00:00:52,020 --> 00:00:53,280
Well, it's actually quite simple.

16
00:00:53,430 --> 00:00:57,760
You just use torchvision, because we imported it up here as torchvision.

17
00:00:58,530 --> 00:00:59,400
Let's go back down.

18
00:00:59,730 --> 00:01:01,210
Apologies for the scrolling.

19
00:01:02,010 --> 00:01:09,180
So we have torchvision.datasets.MNIST, and we just specify mnist again here.

20
00:01:09,180 --> 00:01:13,920
But we set train equal to True, which means that we want to load the training dataset here.

21
00:01:14,940 --> 00:01:17,670
We have download equal to True, which means that we want to download it.

22
00:01:18,060 --> 00:01:23,310
Alternatively, if it's False, it would just pull it from your hard disk if you downloaded it previously.

23
00:01:24,000 --> 00:01:26,970
And we set transform equal to transform.

24
00:01:27,330 --> 00:01:29,760
Remember, we defined this transform right here.

25
00:01:30,240 --> 00:01:36,720
So what this means here is that this training dataset, this object that contains your dataset, also

26
00:01:36,720 --> 00:01:40,440
contains information about a transform pipeline, which is really good.

27
00:01:40,890 --> 00:01:41,250
It does.

28
00:01:41,370 --> 00:01:43,950
It doesn't mean it's actually going to be transformed automatically.

29
00:01:44,340 --> 00:01:48,120
It just means that it's going to be transformed when it passes through the PyTorch network.

30
00:01:48,510 --> 00:01:49,080
It knows

31
00:01:49,080 --> 00:01:50,520
that it has to use this transform.

32
00:01:51,330 --> 00:01:54,300
So now similarly, you can see that we basically did

33
00:01:54,300 --> 00:01:58,940
the training dataset. Now let's do the test dataset, and the only difference here is that we set

34
00:01:58,950 --> 00:02:00,240
train equal to False.

35
00:02:00,690 --> 00:02:03,510
This means that it's now going to load the test dataset here.

36
00:02:03,900 --> 00:02:06,030
So this is all fairly straightforward.

37
00:02:06,450 --> 00:02:11,100
When you run this cell, you'll see these download links pop up, although it shouldn't take that long

38
00:02:11,700 --> 00:02:13,530
in Colab because it's quite fast.

39
00:02:14,460 --> 00:02:16,430
The internet connection, that is, on Colab.

40
00:02:17,220 --> 00:02:18,360
So let's talk.

41
00:02:18,360 --> 00:02:20,910
Let's take a look at and talk about this data.

42
00:02:21,660 --> 00:02:25,530
So you see, we have two sets of data: the training and the test.

43
00:02:25,530 --> 00:02:32,430
Now, test data is sometimes called the validation data, and sometimes they can be two different data

44
00:02:32,430 --> 00:02:33,120
sets as well.

45
00:02:33,930 --> 00:02:41,040
However, in principle, this is how we train neural nets or CNNs: you split usually 70 percent of your

46
00:02:41,040 --> 00:02:47,250
dataset into a training dataset, and you leave the other 30 percent as a test, or

47
00:02:47,250 --> 00:02:52,140
you can split test into two groups, a test and a validation set, as I said here.

48
00:02:52,710 --> 00:02:55,260
But for now, let's just think of it as a 70/30 split.
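
To make the 70/30 idea concrete, here is a minimal sketch using only the standard library. The helper name split_indices, the fixed seed and the sample count of 100 are assumptions for illustration; the lesson itself relies on MNIST's ready-made split instead.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and cut them into train and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(100)
print(len(train_idx), len(test_idx))  # 70 30
```

The two index lists never overlap, which is the point: the test samples stay unseen during training.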

49
00:02:56,070 --> 00:03:03,120
So we set our training data to 70 percent and, well, in this case, the MNIST dataset actually already

50
00:03:03,120 --> 00:03:03,690
came split.

51
00:03:03,690 --> 00:03:12,930
Separately, we had 60,000 training images and 10,000 test images, and we're training

52
00:03:12,930 --> 00:03:13,500
on that.

53
00:03:14,310 --> 00:03:20,460
So it's a good practice to always have the test datasets not be part of the training procedure.

54
00:03:20,760 --> 00:03:27,510
The reason you have a test dataset is because you want to train your model on one set of data and

55
00:03:27,510 --> 00:03:30,080
then test it on some totally unseen data.

56
00:03:30,090 --> 00:03:31,350
And that's your test dataset.

57
00:03:32,400 --> 00:03:34,530
So let's inspect our data here.

58
00:03:34,800 --> 00:03:39,990
OK, so moving on to section four in this video, because it's a short section.

59
00:03:39,990 --> 00:03:40,740
This was quite short.

60
00:03:41,580 --> 00:03:43,470
So let's take a look at this.

61
00:03:44,130 --> 00:03:45,030
Let's inspect the data.

62
00:03:45,210 --> 00:03:46,530
So to inspect our data,

63
00:03:46,980 --> 00:03:47,640
What do we do?

64
00:03:47,760 --> 00:03:50,450
We just use .data.shape.

65
00:03:50,510 --> 00:03:51,630
And what does that give us?

66
00:03:52,110 --> 00:03:58,020
We use the training set, the MNIST set that we had before, and this gives us the actual torch output

67
00:03:58,020 --> 00:03:58,860
size shape.

68
00:03:59,610 --> 00:04:07,500
So by specifying trainset.data.shape, we can see that we have 60,000 entries here, 28 by 28

69
00:04:07,640 --> 00:04:07,980
each.

70
00:04:08,160 --> 00:04:09,510
Those are the pixel images.

71
00:04:10,530 --> 00:04:15,870
If this was color, you would have a comma three here to indicate that this

72
00:04:15,870 --> 00:04:17,340
has an extra dimension of three.

73
00:04:18,120 --> 00:04:24,510
Similarly, for the test dataset, you can see it's 10,000 images, also the same width and height, twenty

74
00:04:24,510 --> 00:04:25,320
eight by twenty eight.
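
The shapes mentioned here can be reproduced without downloading anything. This sketch uses all-zero NumPy stand-ins with the same dimensions as the MNIST image stacks; the array names and the 10-image color batch are made up for illustration.

```python
import numpy as np

# Stand-in arrays shaped like the raw MNIST image stacks (zeros, not real data).
train_images = np.zeros((60000, 28, 28), dtype=np.uint8)
test_images = np.zeros((10000, 28, 28), dtype=np.uint8)

print(train_images.shape)  # (60000, 28, 28)
print(test_images.shape)   # (10000, 28, 28)

# A hypothetical color batch would carry one more axis of size 3.
color_images = np.zeros((10, 28, 28, 3), dtype=np.uint8)
print(color_images.shape)  # (10, 28, 28, 3)
```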

75
00:04:26,370 --> 00:04:31,670
Now, let's take a look at one sample of data. You can see, by using the indexing

76
00:04:31,680 --> 00:04:34,380
here, instead of .data.shape,

77
00:04:34,980 --> 00:04:40,530
we have .data[0], which means we're going to look at the first image in the training

78
00:04:40,530 --> 00:04:42,720
dataset, and we can print the shape of it.

79
00:04:42,960 --> 00:04:48,560
And we can see it's 28 by 28, which is what we expect. If we take a look at the first data,

80
00:04:48,570 --> 00:04:54,180
if you just remove the .shape and print out what the array is, you can see we get these values here.

81
00:04:54,780 --> 00:04:57,870
This is a tensor of a 28 by 28 matrix.

82
00:04:57,880 --> 00:04:59,430
You can see the first row here.

83
00:05:00,310 --> 00:05:05,760
is all zeros, and you can see the middle part is where the number is.

84
00:05:06,130 --> 00:05:10,210
Now, you can't actually visualize it well because the lines are stacked on top of each other.

85
00:05:10,600 --> 00:05:14,830
If this was a 14 by 14 image, you would actually be able to see the number written here.

86
00:05:15,070 --> 00:05:16,210
That would be just values.

87
00:05:17,200 --> 00:05:19,090
Anyway, we will actually visualize it now.

88
00:05:19,750 --> 00:05:22,810
So let's use OpenCV to plot this.

89
00:05:23,290 --> 00:05:25,960
Well, we're not technically using OpenCV to plot it here.

90
00:05:26,800 --> 00:05:32,060
We're just using the matplotlib function to print this out.

91
00:05:32,080 --> 00:05:40,330
We're just using OpenCV to convert it from BGR to RGB, because OpenCV's color scheme

92
00:05:40,330 --> 00:05:41,950
is a bit different to RGB, it's

93
00:05:41,980 --> 00:05:50,140
BGR. It's a minor thing, so we just convert the image to RGB and use plt.show to show this image

94
00:05:50,140 --> 00:05:50,360
here.

95
00:05:50,440 --> 00:05:56,110
So this is the function we often used in the previous OpenCV lessons to display our images

96
00:05:56,110 --> 00:05:56,620
in color.
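
To see what the BGR to RGB conversion actually does, here is a small NumPy-only sketch. It reverses the channel axis, which for a 3-channel image gives the same result as cv2.cvtColor(image, cv2.COLOR_BGR2RGB); the two-pixel test image is invented for the example.

```python
import numpy as np

# A 1x2 image stored in BGR order: one pure-blue pixel, one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels.
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())  # [0, 0, 255] -> blue, now in RGB order
print(rgb[0, 1].tolist())  # [255, 0, 0] -> red, now in RGB order
```

Matplotlib expects RGB order, which is why an image read with OpenCV looks color-swapped if you skip this step.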

97
00:05:57,190 --> 00:06:03,430
And you can see by converting this image here to NumPy, we were able to pass it into this function

98
00:06:03,430 --> 00:06:04,900
here and display it.

99
00:06:05,200 --> 00:06:06,760
So it's a number five, as you can see.

100
00:06:07,420 --> 00:06:08,380
So that's pretty cool.

101
00:06:08,860 --> 00:06:13,180
However, there are other ways we can display images from our dataset.

102
00:06:14,110 --> 00:06:14,640
Yes, there are.

103
00:06:15,190 --> 00:06:20,920
However, before I move on to that, I just want to tell you that whenever you need to do some of the

104
00:06:20,920 --> 00:06:26,340
operations on these tensors, you can always convert them to a NumPy array by using .numpy().
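
A quick sketch of that conversion, using a tiny made-up tensor rather than the MNIST data. Both Tensor.numpy() and torch.from_numpy are real PyTorch calls; the variable names are just for illustration.

```python
import torch

# A small stand-in tensor; in the lesson this would be an image from the dataset.
tensor = torch.arange(6, dtype=torch.uint8).reshape(2, 3)

array = tensor.numpy()          # NumPy view that shares memory with the CPU tensor
back = torch.from_numpy(array)  # and back to a tensor again

print(type(array).__name__, array.shape)
```

Because the array shares memory with the tensor, editing one in place changes the other; copy the array first if you need an independent version.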

105
00:06:26,360 --> 00:06:29,750
There is also another function in PyTorch to convert

106
00:06:29,800 --> 00:06:32,020
to NumPy arrays, but this is a common one that I use.

107
00:06:32,770 --> 00:06:40,420
So let's take a look at another way we can plot a grid of a stack of digits like this grid, so you

108
00:06:40,420 --> 00:06:45,150
can see that this is where we import matplotlib.pyplot as plt.

109
00:06:45,910 --> 00:06:47,570
We create the figure object.

110
00:06:47,590 --> 00:06:49,840
We say how many images we want to display here.

111
00:06:50,230 --> 00:06:57,100
So 50, and what we do is we just use the subplot function here, because right now we're going to be looping

112
00:06:57,100 --> 00:07:03,130
from one to 51, which is 50 images here, and we just set the subplot here.

113
00:07:03,130 --> 00:07:04,660
So this is the number of rows.

114
00:07:04,690 --> 00:07:05,890
This is a number of columns.

115
00:07:06,340 --> 00:07:12,100
This is the index, so it starts at one. The indexing doesn't start at zero in

116
00:07:12,100 --> 00:07:12,670
subplots.

117
00:07:12,670 --> 00:07:19,240
That's why we start at one and we have to go to number of images plus one. You could easily set this to fifty

118
00:07:19,240 --> 00:07:21,730
one if you wanted to and remove this plus one.

119
00:07:22,630 --> 00:07:27,940
And we just remove the axes here to make it nice and clean, and we just plot the dataset here.

120
00:07:28,330 --> 00:07:34,810
So by doing trainset.data with the index being in the loop here, it's iterated.

121
00:07:35,020 --> 00:07:41,520
And this is going to plot the first 50 images in the dataset, and we set the color

122
00:07:41,550 --> 00:07:47,290
map to gray_r, which puts a white background and black text instead of the other one we

123
00:07:47,290 --> 00:07:47,830
saw here.
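
The grid loop described above could look roughly like this. Random stand-in images replace the real training images so the sketch runs on its own, and the 5 by 10 layout and the Agg backend are assumptions; in a notebook you would drop the backend line and end with plt.show().

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Random stand-in images; in the lesson these would be the first training images.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(50, 28, 28), dtype=np.uint8)

figure = plt.figure()
num_of_images = 50

for index in range(1, num_of_images + 1):  # subplot indices start at 1
    plt.subplot(5, 10, index)              # 5 rows, 10 columns
    plt.axis("off")                        # hide ticks and frame
    plt.imshow(images[index - 1], cmap="gray_r")  # white background, dark ink
```

Here index - 1 is used so the first image is not skipped; whether the lesson's own loop does the same is not clear from the recording.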

124
00:07:48,610 --> 00:07:55,150
And this is how we get the images displayed nicely. It's always good practice, in my opinion,

125
00:07:55,150 --> 00:08:01,370
to visualize your image datasets, because you're going to find a lot of issues.

126
00:08:01,370 --> 00:08:06,760
You can actually learn a lot of things and understand what types of CNN you have to use by just inspecting

127
00:08:06,760 --> 00:08:07,780
the images themselves.

128
00:08:08,050 --> 00:08:09,430
So it's always a good practice.

129
00:08:10,360 --> 00:08:12,880
So let's stop there for now.

130
00:08:13,330 --> 00:08:19,960
And in the next section, we'll take a look at creating our data loader, which is required to load data and to

131
00:08:19,960 --> 00:08:23,020
batch it into the training mechanism of PyTorch.

132
00:08:23,710 --> 00:08:25,330
So I'll see you in the next section.
