﻿1
00:00:00,690 --> 00:00:03,720
Let's start by importing our dataset.

2
00:00:05,040 --> 00:00:12,480
As we've discussed earlier, we are going to use the Fashion MNIST dataset to classify images of fashion

3
00:00:12,480 --> 00:00:22,760
objects such as coats, boots, etc. Fashion MNIST is a very popular dataset.

4
00:00:23,310 --> 00:00:31,860
It is relatively small and is used to quickly verify whether an algorithm works as expected or not.

5
00:00:31,890 --> 00:00:41,440
Fashion MNIST consists of a training set of 60,000 examples and a test set of 10,000 examples.

6
00:00:41,700 --> 00:00:50,640
Each example is a 28 by 28 grayscale image; that is, it has 28 pixels by

7
00:00:50,640 --> 00:00:55,500
28 pixels as its dimensions, and it is a black-and-white image.

8
00:00:58,960 --> 00:01:06,000
With each image there is an associated label from one of ten classes.

9
00:01:06,310 --> 00:01:08,970
‫I'll show you the images after we import the dataset.

10
00:01:10,290 --> 00:01:13,520
Let's run this line of code to import the dataset.

11
00:01:20,310 --> 00:01:23,280
You know that there are multiple ways to import data.

12
00:01:23,370 --> 00:01:28,990
Here we are using the built-in dataset that comes with the keras library.

13
00:01:29,090 --> 00:01:36,750
So if the keras package is not installed, this line won't work. If it is installed, you will get the dataset

14
00:01:36,780 --> 00:01:46,570
imported into this variable, which is fashion_mnist.
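
As a rough sketch of this import step: the snippet below assumes the keras R package is what provides the dataset, and the helper name load_fashion_mnist is hypothetical, added only so that a missing package gives a clear message.

```r
# Hypothetical helper: wraps the import so a missing package gives a clear error.
load_fashion_mnist <- function() {
  if (!requireNamespace("keras", quietly = TRUE)) {
    stop("The keras package is not installed; install it before running this line.")
  }
  # dataset_fashion_mnist() downloads the data on first use and returns it as a list.
  keras::dataset_fashion_mnist()
}

# Usage (needs keras and an internet connection on the first call):
# fashion_mnist <- load_fashion_mnist()
```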

15
00:01:46,790 --> 00:01:55,460
You can see it on the right, in the environment window; here is that dataset. Let us view this dataset

16
00:01:55,510 --> 00:01:59,980
‫by clicking on it.

17
00:02:00,650 --> 00:02:06,560
Here you can see that this dataset has two parts: train and test.

18
00:02:06,560 --> 00:02:12,680
This means that it is already divided into two parts for training and testing.

19
00:02:12,710 --> 00:02:15,220
‫We do not need to do this separately.

20
00:02:15,370 --> 00:02:22,760
However, if you want to learn how to separate any dataset that is not in this format into train and test,

21
00:02:23,720 --> 00:02:26,220
please check the opening section of this course.

22
00:02:26,660 --> 00:02:30,620
There you will find a lecture on the test-train split.

23
00:02:30,620 --> 00:02:35,840
With that, you will be able to split any dataset into train and test sets.

24
00:02:35,840 --> 00:02:39,100
Here, our dataset is already split.

25
00:02:39,440 --> 00:02:46,690
Let's go to the train set first. It has two parts, x and y.

26
00:02:46,910 --> 00:02:52,730
x is the set of predictor variables and y is the list of output values.

27
00:02:52,730 --> 00:02:55,100
That is, the class of the fashion object.

28
00:02:58,940 --> 00:03:06,860
You can see the structure of x and y here as well. x is a set of 60,000 images.

29
00:03:06,860 --> 00:03:09,430
Each is 28 pixels by 28 pixels.

30
00:03:10,790 --> 00:03:16,820
So for each pixel of an image we have a value between 0 and 255.

31
00:03:18,080 --> 00:03:22,460
If the value is 0, that pixel is black.

32
00:03:22,460 --> 00:03:25,850
If it is 255, that pixel is white.

33
00:03:27,440 --> 00:03:34,440
So each individual pixel's data for all the 60,000 images is stored in this x variable.

34
00:03:37,580 --> 00:03:38,700
Similarly,

35
00:03:38,750 --> 00:03:43,420
y has the class values of the 60,000 images.

36
00:03:43,970 --> 00:03:48,680
For example, the first image has the class nine.

37
00:03:48,830 --> 00:03:55,520
What this nine represents, we will look at in some time. Similar to the train data, we have the test data.

38
00:03:56,090 --> 00:03:57,930
The only difference is that, in the training set,

39
00:03:57,950 --> 00:04:01,050
we have data for 60,000 images, and in the test set

40
00:04:01,130 --> 00:04:03,820
we have data for 10,000 images.

41
00:04:04,250 --> 00:04:07,390
We will use this train data to train the model.

42
00:04:07,790 --> 00:04:15,230
And later on, we will predict the y values for this test set using the x values of the test data.

43
00:04:15,890 --> 00:04:22,550
Then we will compare the actual y values in this test set with the predicted y values from our model

44
00:04:23,120 --> 00:04:26,930
‫to find out the accuracy of our model.
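
To make the comparison step concrete, here is a minimal sketch. The two label vectors are hypothetical values made up for illustration, not output from any real model.

```r
# Hypothetical labels for six test images (class codes 0 to 9).
actual_labels    <- c(9, 2, 1, 1, 6, 1)
predicted_labels <- c(9, 2, 1, 1, 4, 1)

# Accuracy is the share of predictions that match the actual labels.
accuracy <- mean(predicted_labels == actual_labels)
print(accuracy)
```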

45
00:04:26,930 --> 00:04:36,860
Now let's go back to our code. We will be assigning the x and y train values to separate variables. To do

46
00:04:36,860 --> 00:04:37,490
that,

47
00:04:37,490 --> 00:04:43,440
this line of code is the standard way in which we assign a value to a variable.

48
00:04:43,820 --> 00:04:46,900
‫You can run this line also and it will give you the same result.

49
00:04:47,000 --> 00:04:54,780
It will assign the x value of the training set of the fashion_mnist variable into train_images.

50
00:04:54,790 --> 00:05:02,930
However, keras allows us to do that in a different way, in this format.

51
00:05:02,930 --> 00:05:09,230
You can assign the two variables, train_images and train_labels, at the same time.

52
00:05:09,860 --> 00:05:17,070
If you run this line of code, it will assign the x values of train to train_images and the y values

53
00:05:17,070 --> 00:05:20,570
of train to train_labels.
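
The two ways of assigning can be sketched like this. So that it runs without downloading anything, the snippet uses a small synthetic stand-in for fashion_mnist (the make_part helper and the sizes are assumptions for illustration). The multi-assignment operator %<-% comes from the zeallot package, which keras re-exports, so the snippet falls back to the standard way if zeallot is not available.

```r
# Small synthetic stand-in with the same layout as the real dataset
# (6 training and 2 test images instead of 60,000 and 10,000).
set.seed(1)
make_part <- function(n) {
  list(
    x = array(sample(0:255, n * 28 * 28, replace = TRUE), dim = c(n, 28, 28)),
    y = sample(0:9, n, replace = TRUE)
  )
}
fashion_mnist <- list(train = make_part(6), test = make_part(2))

# Standard way: one assignment per variable.
train_images <- fashion_mnist$train$x
train_labels <- fashion_mnist$train$y

# Multi-assignment way: both variables are filled in a single line.
if (requireNamespace("zeallot", quietly = TRUE)) {
  `%<-%` <- zeallot::`%<-%`
  c(test_images, test_labels) %<-% fashion_mnist$test
} else {
  # Fallback when zeallot is missing: same result, two lines.
  test_images <- fashion_mnist$test$x
  test_labels <- fashion_mnist$test$y
}
```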

54
00:05:20,570 --> 00:05:21,950
Let's run this line of code.

55
00:05:21,960 --> 00:05:31,760
Now you can see that we have a train_images variable and a train_labels variable. train_images has the x

56
00:05:31,760 --> 00:05:33,210
part, and train_labels

57
00:05:33,200 --> 00:05:35,960
has the y part.

58
00:05:35,990 --> 00:05:39,020
‫Same goes with the test images and test labels.

59
00:05:39,020 --> 00:05:39,410
Next, run

60
00:05:39,410 --> 00:05:44,070
this code also, and we have two more variables.

61
00:05:45,140 --> 00:05:52,190
Although we have seen the structure of the training data and the test data, if you still want to check out the

62
00:05:52,190 --> 00:05:57,700
structure of the new variables, you can run these two lines of code.

63
00:05:58,910 --> 00:06:05,990
dim, with the variable name within brackets, gives you the dimensions of this variable. So this variable has

64
00:06:06,920 --> 00:06:07,910
‫three dimensions.

65
00:06:07,910 --> 00:06:16,520
First is the 60,000 values for the different images, and then 28 by 28 for all the individual

66
00:06:16,520 --> 00:06:18,290
‫pixels.

67
00:06:18,410 --> 00:06:24,710
If you run the str command, which gives you the structure, there will be some additional information: that

68
00:06:24,950 --> 00:06:31,920
it has integer type values and the initial few values are zero, zero, zero, zero, zero, zero.

69
00:06:33,080 --> 00:06:40,250
So both of these are used for the same thing: to understand the structure of the variable that

70
00:06:40,250 --> 00:06:42,270
‫we have.
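
As a small sketch of the two commands side by side: the array below is a synthetic stand-in with only 6 all-zero images (an assumption, so that the snippet runs without the real data); on the real train_images the first dimension would be 60,000.

```r
# Synthetic stand-in: 6 all-zero images of 28 by 28 integer pixels.
train_images <- array(0L, dim = c(6, 28, 28))

# dim() returns only the size along each dimension.
print(dim(train_images))

# str() also reports the value type and the first few values.
str(train_images)
```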

71
00:06:42,270 --> 00:06:47,190
‫Now let me show you the images so that you get a feel of what kind of data we have here.

72
00:06:49,160 --> 00:06:56,240
We can store the information of one image in a variable called fobject.

73
00:06:56,240 --> 00:07:05,150
So when I run this line of code, it will assign the information of the fifth image, all the pixels,

74
00:07:05,750 --> 00:07:08,230
into this object, which is fobject.

75
00:07:09,560 --> 00:07:12,200
Let's run this.

76
00:07:12,200 --> 00:07:18,560
You can see that fobject is a 28 by 28 two-dimensional array containing all the pixel

77
00:07:18,560 --> 00:07:22,640
‫data of this fifth image.

78
00:07:22,660 --> 00:07:31,280
Now, if you want to plot this image, you can run this line of code, which uses the plot

79
00:07:31,280 --> 00:07:31,840
‫function.

80
00:07:31,880 --> 00:07:37,730
We are telling it to plot this variable as a raster image.

81
00:07:37,760 --> 00:07:41,510
A raster image is basically a pixelated image.

82
00:07:41,510 --> 00:07:43,040
So let's run this line of code.

83
00:07:43,520 --> 00:07:46,650
Here you can see the image on the right.

84
00:07:46,730 --> 00:07:54,410
It's a small 28 by 28 pixel image, so the image quality is not good, but you can

85
00:07:54,410 --> 00:07:56,210
‫make out the object.

86
00:07:56,210 --> 00:07:58,180
‫It probably looks like a top.
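
The extract-and-plot steps can be sketched as follows. The train_images array here is random stand-in data (an assumption so the snippet runs on its own), so the picture will be noise rather than a top; fobject_raster is a name introduced only for this illustration.

```r
# Random stand-in for the real images: 6 images of 28 by 28 pixels.
set.seed(42)
train_images <- array(sample(0:255, 6 * 28 * 28, replace = TRUE),
                      dim = c(6, 28, 28))

# Keep the first index fixed at 5 and take every row and column.
fobject <- train_images[5, , ]
print(dim(fobject))

# as.raster() turns the matrix into a grid of grey levels; max = 255 tells it
# the largest possible pixel value, so 0 is drawn black and 255 white.
fobject_raster <- as.raster(fobject, max = 255)
plot(fobject_raster)
```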

87
00:07:58,610 --> 00:08:05,480
If you want to check what it is, we need to see the image label, which is stored in the train_labels variable.

88
00:08:08,970 --> 00:08:10,530
In the train_labels variable,

89
00:08:10,530 --> 00:08:13,470
we saw that the values are in coded format.

90
00:08:13,470 --> 00:08:18,020
That is, they are written as numbers from 0 to 9.

91
00:08:18,300 --> 00:08:23,390
So to get the actual name of the class, we first create a class names

92
00:08:23,410 --> 00:08:32,790
array. This array contains the list of names in the order in which we have coded these names. So zero stands

93
00:08:32,790 --> 00:08:42,510
for T-shirt. If you see nine here, nine stands for ankle boot; two stands for pullover.

94
00:08:42,600 --> 00:08:44,070
It starts with zero.

95
00:08:44,100 --> 00:08:45,440
‫This is the second element.

96
00:08:45,660 --> 00:08:48,260
‫This is the ninth element.

97
00:08:48,570 --> 00:08:56,130
Once we have created this array, we can find out the name of the object in the

98
00:08:56,130 --> 00:09:07,340
fifth image from the train_labels variable: the label of the fifth image plus one, because the coding

99
00:09:07,350 --> 00:09:08,930
‫started with 0.

100
00:09:09,340 --> 00:09:16,090
So we just want the element at position label plus one from this array.

101
00:09:16,750 --> 00:09:25,530
So let's first create this array and now find out the name of this fifth image.

102
00:09:25,530 --> 00:09:29,100
You can see that the fifth image is a T-shirt/top.
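
The plus-one lookup can be sketched like this. The names follow the standard Fashion MNIST coding from 0 to 9; the variable name class_names and the short train_labels vector are assumptions for illustration.

```r
# Class names in coded order: position 1 holds code 0, position 10 holds code 9.
class_names <- c("T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
                 "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot")

# Hypothetical coded labels for the first five images.
train_labels <- c(9, 0, 0, 3, 0)

# R vectors start at position 1, so code 0 sits at position 1: add one.
fifth_name <- class_names[train_labels[5] + 1]
print(fifth_name)
```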

103
00:09:32,160 --> 00:09:35,880
You can check this again for another image.

104
00:09:36,360 --> 00:09:47,940
So let's try it out for the ninth image. Run this command and plot it. I mean, this looks like a sandal.

105
00:09:50,180 --> 00:09:51,450
Now, if we check the label,

106
00:09:55,480 --> 00:10:08,210
updating the index and checking the label, it comes out to be the same. So this is our data. We have created four variables.

107
00:10:09,690 --> 00:10:17,930
train_images contains all the predictor variables; train_labels contains the output variable. Using these

108
00:10:17,930 --> 00:10:26,690
two variables, we will be training our model. Then we will be using that model to predict on the test images,

109
00:10:27,230 --> 00:10:29,870
and we will compare the predictions with the test labels.

110
00:10:32,870 --> 00:10:41,710
The last thing I'm going to discuss in this video is normalization of data. When we have heterogeneous

111
00:10:41,710 --> 00:10:48,310
data, a learning model takes a lot of time to converge. To handle this problem, we do normalization of

112
00:10:48,310 --> 00:10:58,300
data. To normalize data, as a general formula, we subtract the mean of that variable from that variable

113
00:10:59,140 --> 00:11:02,420
and divide by the standard deviation.

114
00:11:02,920 --> 00:11:11,350
So this is the general formula. But our training data is not that heterogeneous: every value is

115
00:11:11,350 --> 00:11:15,370
that of a pixel, having a value between 0 and 255.

116
00:11:17,380 --> 00:11:23,760
‫So we can just divide all the values in the pixels by 255.

117
00:11:23,800 --> 00:11:31,030
This will result in values between 0 and 1, and we can input these values into our model.

118
00:11:32,230 --> 00:11:37,390
‫So normalization is required when we have different types of variables in a dataset.

119
00:11:37,600 --> 00:11:44,590
If that is the case, use this formula to normalize. Here, since our data is already very homogeneous, we

120
00:11:44,590 --> 00:11:53,040
can just divide the pixel values by the highest value to get simple normalized values.
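
Both options can be sketched on a small array. The data below is random stand-in pixel data (an assumption so the snippet runs on its own), and the names standardized and scaled are introduced only for this illustration.

```r
# Random stand-in for the real images: 6 images of 28 by 28 pixels, values 0 to 255.
set.seed(7)
train_images <- array(sample(0:255, 6 * 28 * 28, replace = TRUE),
                      dim = c(6, 28, 28))

# General formula: subtract the mean and divide by the standard deviation.
standardized <- (train_images - mean(train_images)) / sd(train_images)

# Simpler option for pixels: divide by the largest possible value, 255.
scaled <- train_images / 255
print(range(scaled))
```

The same division would also be applied to the test images, so that both sets are on the same 0 to 1 scale.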

121
00:11:53,530 --> 00:11:59,130
Now, using these train and test values, we'll be creating a model in the next video.

