1
00:00:02,000 --> 00:00:03,000
Hi guys.

2
00:00:03,000 --> 00:00:10,000
In this video tutorial we will see how we can use YOLOv8 for personal protective equipment detection.

3
00:00:10,000 --> 00:00:14,000
So the dataset I will be using in this project is publicly available on Roboflow.

4
00:00:14,000 --> 00:00:22,000
The dataset consists of 3235 images, and let me show you the dataset.

5
00:00:22,000 --> 00:00:27,000
So we have seven different classes, which are marked with numbers like 0, 11, 3, 4, 5.

6
00:00:28,000 --> 00:00:30,000
We will change the names of these classes.

7
00:00:30,000 --> 00:00:34,000
For example, class number 0 is the helmet.

8
00:00:34,000 --> 00:00:37,000
So we will change the name of class 0 to helmet.

9
00:00:37,000 --> 00:00:41,000
Then class 11, which is the shield.

10
00:00:41,000 --> 00:00:44,000
You can see here the person is wearing a shield.

11
00:00:44,000 --> 00:00:46,000
So it's detecting that the person is wearing the shield.

12
00:00:46,000 --> 00:00:51,000
So we will change the name of this class from 11 to shield.

13
00:00:51,000 --> 00:00:55,000
And in the third class, we can see that the person is wearing a jacket.

14
00:00:55,000 --> 00:01:03,000
So we will change the name of this class from 3 to jacket. Further, we can see that in

15
00:01:03,000 --> 00:01:07,000
class 4 the model is just detecting the mask.

16
00:01:07,000 --> 00:01:12,000
So we will change the name of this class from 4 to mask.

17
00:01:13,000 --> 00:01:16,000
In the same way we will change the names of the other classes as well.

18
00:01:16,000 --> 00:01:24,000
So in this case we can see that for class 9 the model is detecting the class of

19
00:01:24,000 --> 00:01:24,000
boots.

20
00:01:24,000 --> 00:01:30,000
So we will change the name of this class from 9 to boots.

21
00:01:30,000 --> 00:01:30,000
Okay.

22
00:01:30,000 --> 00:01:38,000
So in this way we will rename all the classes, so that when the model detects a helmet, jacket or boots,

23
00:01:38,000 --> 00:01:44,000
we know that the model has detected a jacket, helmet or boots, instead of it appearing

24
00:01:44,000 --> 00:01:45,000
in the detected object

25
00:01:45,000 --> 00:01:47,000
as 0, 1, 2 or 9.

26
00:01:47,000 --> 00:01:48,000
Okay.

27
00:01:48,000 --> 00:01:49,000
So we don't want a number.

28
00:01:49,000 --> 00:01:51,000
We want the name of the class.

29
00:01:51,000 --> 00:01:52,000
Okay.
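
As a rough sketch of the renaming idea described here, the snippet below maps the numeric labels to readable names. Only the five IDs named in the narration are included; the dictionary and the helper `readable_label` are hypothetical names used for illustration, not part of the dataset or of any library.

```python
# Hypothetical mapping from the numeric labels mentioned in the narration
# to readable class names. IDs that were not named are left unmapped.
ID_TO_NAME = {
    "0": "helmet",
    "11": "shield",
    "3": "jacket",
    "4": "mask",
    "9": "boots",
}


def readable_label(raw_label: str) -> str:
    """Return the class name for a numeric label, or the label itself if unknown."""
    return ID_TO_NAME.get(raw_label, raw_label)


# Example: labels as they might appear on detections before renaming.
detections = ["0", "3", "9", "6"]
print([readable_label(label) for label in detections])
```

Unknown IDs are passed through unchanged, so nothing is silently relabelled.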

30
00:01:52,000 --> 00:01:55,000
So let's start towards the implementation.

31
00:01:55,000 --> 00:01:59,000
But before going towards the implementation, let us see the dataset health check.

32
00:02:01,000 --> 00:02:10,000
Well, we can see that we have a total of 3235 images, and each image contains on average around

33
00:02:10,000 --> 00:02:11,000
4 to 5 bounding boxes.

34
00:02:11,000 --> 00:02:15,000
So in each image we are detecting 4 to 5 different classes or objects.

35
00:02:15,000 --> 00:02:27,000
So we have around 14,341 annotations in 3235 images, and each image contains around 4 to 5 bounding boxes.

36
00:02:27,000 --> 00:02:29,000
And the average image size

37
00:02:29,000 --> 00:02:33,000
is 416 x 416.

38
00:02:33,000 --> 00:02:41,000
But we will resize the images to 640 x 640 because our YOLOv8 model is trained on 640 x 640 images,

39
00:02:41,000 --> 00:02:43,000
so it's always better to keep that size.

40
00:02:43,000 --> 00:02:50,000
And here we can see the class balance. We can see that these last three classes are underrepresented,

41
00:02:50,000 --> 00:02:53,000
so the dataset is not very well balanced.

42
00:02:53,000 --> 00:02:58,000
You can see that some classes are imbalanced or you can say that some classes are underrepresented,

43
00:02:58,000 --> 00:03:01,000
which include class four, six and 11.

44
00:03:01,000 --> 00:03:02,000
Okay.

45
00:03:02,000 --> 00:03:09,000
So in the dataset we can see that we have 2300 images for the training set, 647 images for the validation

46
00:03:09,000 --> 00:03:13,000
set and 324 images in the testing set.

47
00:03:13,000 --> 00:03:16,000
So our split ratio is 70/20/10.

48
00:03:16,000 --> 00:03:22,000
70% is for the training, 20% is for the validation set and 10% for the testing set.

49
00:03:22,000 --> 00:03:23,000
Okay.
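
The figures quoted in this health check can be verified with a few lines of arithmetic. This is only a sketch using the numbers as stated in the narration; note that the three split counts as stated add up to 3271, slightly more than the 3235 images quoted earlier, so the percentages below are computed against the sum of the splits.

```python
# Numbers as stated in the narration.
total_images = 3235
total_annotations = 14341

# Average number of bounding boxes per image.
boxes_per_image = total_annotations / total_images
print(round(boxes_per_image, 2))

# Split sizes as stated; percentages are relative to their own sum.
split = {"train": 2300, "valid": 647, "test": 324}
split_total = sum(split.values())
percentages = {
    name: round(100 * count / split_total, 1) for name, count in split.items()
}
print(split_total, percentages)
```

The result is consistent with the "4 to 5 boxes per image" and "70/20/10" descriptions.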

50
00:03:23,000 --> 00:03:25,000
So to import this dataset

51
00:03:26,000 --> 00:03:28,000
into our Google Colab notebook,

52
00:03:28,000 --> 00:03:32,000
we will click on download and just select the format.

53
00:03:32,000 --> 00:03:39,000
So basically we will select YOLOv5 PyTorch, because Ultralytics also developed YOLOv5 PyTorch.

54
00:03:39,000 --> 00:03:43,000
So whether we choose YOLOv5 PyTorch or YOLOv8, it's the same.

55
00:03:43,000 --> 00:03:44,000
So.

56
00:03:45,000 --> 00:03:51,000
So just click on show download code, and just copy this from here; just click on copy.

57
00:03:51,000 --> 00:03:56,000
So now we will just paste this code into our Google CoLab notebook.

58
00:03:56,000 --> 00:04:01,000
And in this way we will export this dataset from here into our Google CoLab notebook.

59
00:04:01,000 --> 00:04:06,000
So before running the script, please make sure that you have selected the runtime as GPU.

60
00:04:06,000 --> 00:04:13,000
Okay, now we will import os. Basically, we import os because we are using os to create a helper

61
00:04:13,000 --> 00:04:19,000
variable over here so that we can navigate to different files and the dataset easily.

62
00:04:20,000 --> 00:04:25,000
We're just running this cell first. Then we are using glob, which is used to return all file paths.

63
00:04:25,000 --> 00:04:30,000
Like if you want the input images file path, we can easily return using glob library.

64
00:04:31,000 --> 00:04:34,000
Then we are importing Image and display.

65
00:04:34,000 --> 00:04:39,000
Basically, we are using these two libraries to display any output image like predicted output image

66
00:04:39,000 --> 00:04:43,000
or any input image into our Google CoLab notebook.

67
00:04:43,000 --> 00:04:47,000
So it will display any output or input image into our Google CoLab notebook.

68
00:04:47,000 --> 00:04:55,000
We require the Image and display libraries. Then there is the display.clear_output function; basically this

69
00:04:55,000 --> 00:04:58,000
function is used to clear output in the notebook.

70
00:04:58,000 --> 00:05:03,000
So if you want to clear output in the notebook, we use the clear_output function.

71
00:05:06,000 --> 00:05:12,000
Okay, so we can leave it for now as well,

72
00:05:12,000 --> 00:05:15,000
because currently we don't have any output, so we don't need this.

73
00:05:15,000 --> 00:05:19,000
So in the first step we need to check whether we have access to GPU or not.

74
00:05:19,000 --> 00:05:21,000
So we'll just run this cell.

75
00:05:22,000 --> 00:05:27,000
And it will show us whether we are using the GPU or not, along with the GPU memory usage.

76
00:05:27,000 --> 00:05:28,000
So that's fine.
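
A minimal sketch of such a GPU check, assuming the notebook cell simply runs `nvidia-smi` (the video does not show the command in the transcript). The helper `gpu_report` is a hypothetical name; it falls back to a short message when no NVIDIA tooling is present.

```python
import shutil
import subprocess


def gpu_report() -> str:
    """Return the output of nvidia-smi, or a short note if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return "nvidia-smi not found: no NVIDIA GPU runtime is visible"
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    # nvidia-smi prints the GPU model and memory usage on stdout.
    return result.stdout or result.stderr


print(gpu_report())
```

In Colab, the fallback message usually means the runtime type is still set to CPU.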

77
00:05:28,000 --> 00:05:33,000
So we are basically defining our home directory here.

78
00:05:33,000 --> 00:05:36,000
So this is our home directory which is over here.

79
00:05:36,000 --> 00:05:38,000
All of this is our home directory.
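
The helper variable and the glob usage described above might look roughly like this. It is a sketch: `HOME` follows the narration, while `list_images` and the throwaway demo folder are assumptions added for illustration.

```python
import glob
import os
import tempfile

# Helper variable pointing at the notebook's working directory.
HOME = os.getcwd()


def list_images(folder: str, pattern: str = "*.jpg") -> list:
    """Return the sorted file paths in `folder` that match `pattern`."""
    return sorted(glob.glob(os.path.join(folder, pattern)))


# Demo on a temporary folder so the snippet runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.jpg", "a.jpg", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    print([os.path.basename(path) for path in list_images(tmp)])
```

Only the two `.jpg` files are returned; the text file is filtered out by the pattern.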

80
00:05:38,000 --> 00:05:42,000
So now we will install Ultralytics using pip install.

81
00:05:42,000 --> 00:05:47,000
Basically it can be installed in two ways: from the source or via pip.

82
00:05:47,000 --> 00:05:55,000
So YOLOv8 is the first version of YOLO, the first iteration of YOLO, which has its own official package.

83
00:05:55,000 --> 00:05:55,000
Versions such as YOLO

84
00:05:55,000 --> 00:05:56,000
v7

85
00:05:56,000 --> 00:06:01,000
don't have their own official package, while YOLOv8 has its own official package.

86
00:06:01,000 --> 00:06:03,000
So by installing ultralytics

87
00:06:03,000 --> 00:06:10,000
we can install YOLO, and if you use pip install ultralytics it will install the YOLOv8 version.

88
00:06:10,000 --> 00:06:14,000
So if you do pip install ultralytics, it will install the YOLOv8 version.

89
00:06:14,000 --> 00:06:18,000
So there is another way you can install or implement YOLO

90
00:06:18,000 --> 00:06:23,000
v8, which is to clone the Ultralytics YOLOv8 GitHub repo.

91
00:06:23,000 --> 00:06:29,000
So if you want to implement it by cloning the GitHub repo, you can also do this.

92
00:06:29,000 --> 00:06:32,000
Here is the code written to clone the GitHub repo.

93
00:06:32,000 --> 00:06:37,000
So we will follow the easy way, which is to use pip install ultralytics.

94
00:06:37,000 --> 00:06:43,000
We clone the GitHub repo in those cases where we need to make some changes in the code.

95
00:06:43,000 --> 00:06:50,000
For example, if I want to add speed estimation to my predict.py script, then in that

96
00:06:50,000 --> 00:06:56,000
case I will clone the GitHub repo, where I need to make some changes in predict.py or train.py.

97
00:06:56,000 --> 00:07:02,000
In this case I don't need to make changes in predict.py, train.py or the validation script.

98
00:07:02,000 --> 00:07:08,000
So in the current project we are not making any changes in the prediction,

99
00:07:08,000 --> 00:07:09,000
training or validation scripts.

100
00:07:10,000 --> 00:07:12,000
So we are not cloning the GitHub repo.

101
00:07:12,000 --> 00:07:18,000
If we need to add some code in the training, validation or prediction step, then we will clone the GitHub repo.

102
00:07:18,000 --> 00:07:23,000
So I'm using pip install ultralytics to install the latest version of YOLO, which is YOLOv8.

103
00:07:25,000 --> 00:07:29,000
So this will install the latest version of YOLO, which is YOLO V8.

104
00:07:29,000 --> 00:07:31,000
So it might take some seconds.

105
00:07:32,000 --> 00:07:33,000
Okay.

106
00:07:35,000 --> 00:07:42,000
So now we will import ultralytics to check whether YOLO is installed and working fine or

107
00:07:42,000 --> 00:07:43,000
not.

108
00:07:43,000 --> 00:07:44,000
If it's not working fine,

109
00:07:45,000 --> 00:07:46,000
we will see what the issue is.

110
00:07:46,000 --> 00:07:48,000
But it's working fine over here.
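
A hedged sketch of the install-and-check step described above. In a notebook the install is usually a single `pip install ultralytics` cell; here the command is only built, not executed, and the helper names `pip_install_command` and `is_installed` are hypothetical. `ultralytics.checks()` is called only when the package is actually present.

```python
import importlib.util
import sys


def pip_install_command(package: str = "ultralytics") -> list:
    """Build (but do not run) the pip command that installs the package."""
    return [sys.executable, "-m", "pip", "install", package]


def is_installed(module: str = "ultralytics") -> bool:
    """Return True if the module can be imported in this environment."""
    return importlib.util.find_spec(module) is not None


if is_installed():
    import ultralytics

    # Prints environment details (Python, torch, GPU) as a sanity check.
    ultralytics.checks()
else:
    print("Not installed; would run:", " ".join(pip_install_command()))
```

Building the command separately keeps the snippet safe to run on machines where installing packages is not desired.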

111
00:07:48,000 --> 00:07:48,000
So.

112
00:07:48,000 --> 00:07:49,000
Okay, now.

113
00:07:49,000 --> 00:07:52,000
Now we will import the detection dataset,

114
00:07:52,000 --> 00:07:53,000
this dataset from Roboflow.

115
00:07:53,000 --> 00:07:56,000
So first we will create a folder over here.

116
00:07:56,000 --> 00:07:59,000
So one choice is to click on new folder

117
00:07:59,000 --> 00:08:06,000
and create a folder here, but we will create a folder using mkdir, and the name of the folder is datasets.

118
00:08:06,000 --> 00:08:09,000
Okay, so you can change the name of the folder as well.

119
00:08:09,000 --> 00:08:14,000
So now you can see over here we have a folder by the name of datasets.

120
00:08:14,000 --> 00:08:18,000
Okay, this is an empty folder, but we will download the dataset into this folder.

121
00:08:18,000 --> 00:08:19,000
Okay.

122
00:08:19,000 --> 00:08:26,000
So we are just seeing what our present working directory is, and now setting the current directory to this datasets

123
00:08:26,000 --> 00:08:31,000
folder so that we can download the dataset directly into this datasets folder.

124
00:08:31,000 --> 00:08:37,000
So just copy this from here and just remove this and just paste this over here.

125
00:08:37,000 --> 00:08:44,000
So now you will be able to download the personal protective equipment dataset from Roboflow to your

126
00:08:44,000 --> 00:08:45,000
Google CoLab notebook.

127
00:08:47,000 --> 00:08:52,000
So the dataset is being downloaded from here, so it might take some time.

128
00:08:52,000 --> 00:08:53,000
Okay, so.

129
00:08:54,000 --> 00:09:03,000
It's downloading and it's around 6%, 7%, because the dataset consists of around 3235 images.

130
00:09:03,000 --> 00:09:07,000
So it will take some time to download, so please bear with me.

131
00:09:07,000 --> 00:09:09,000
And the dataset is getting downloaded.

132
00:09:09,000 --> 00:09:16,000
So it's at 42% currently, then 54%, 55%, 60%.

133
00:09:17,000 --> 00:09:22,000
So in this way we will download our dataset completely, but it will take some time.

134
00:09:23,000 --> 00:09:26,000
Okay, it's 90 and it's 99%.

135
00:09:26,000 --> 00:09:27,000
Okay.

136
00:09:27,000 --> 00:09:33,000
The dataset is downloaded over here and we can see it. Okay, so if the file list gets slow, you then just

137
00:09:33,000 --> 00:09:39,000
need to reload it, because if I click over here a lot of files appear, so we just need to reload it

138
00:09:40,000 --> 00:09:41,000
and it will work fine.

139
00:09:41,000 --> 00:09:43,000
So you don't need to worry in any case.

140
00:09:44,000 --> 00:09:44,000
Okay.

141
00:09:44,000 --> 00:09:51,000
So just wait for a few seconds, and here is our datasets folder and here is our detection dataset, which

142
00:09:51,000 --> 00:09:55,000
consists of training, test and validation sets.

143
00:09:55,000 --> 00:09:55,000
Okay?

144
00:09:55,000 --> 00:10:01,000
And this is our data.yaml file, which consists of seven different classes.

145
00:10:01,000 --> 00:10:02,000
Okay.

146
00:10:02,000 --> 00:10:05,000
One, two, three, four, five, six, seven different classes here.

147
00:10:05,000 --> 00:10:06,000
Okay.

148
00:10:06,000 --> 00:10:09,000
So first we just need to do one thing.

149
00:10:09,000 --> 00:10:15,000
I have told you that we will rename these classes with the names of the objects, and I have done this.

150
00:10:15,000 --> 00:10:20,000
So let me upload that updated data.yaml file over here.

151
00:10:20,000 --> 00:10:22,000
So just give me a minute.

152
00:10:22,000 --> 00:10:23,000
I will upload this.

153
00:10:28,000 --> 00:10:33,000
So, guys, you can see over here, this is my updated data.yaml file.

154
00:10:33,000 --> 00:10:39,000
So you can see that we have a class name against each class, like protective helmet, shield, jacket,

155
00:10:39,000 --> 00:10:42,000
dust mask, eyewear, glove, protective boots.

156
00:10:42,000 --> 00:10:44,000
So it's not a number.

157
00:10:44,000 --> 00:10:45,000
We have a class name against each class.

158
00:10:45,000 --> 00:10:53,000
Okay, so we have downloaded the dataset, and we are just checking the dataset location over here.

159
00:10:54,000 --> 00:10:54,000
Okay.

160
00:10:54,000 --> 00:10:55,000
So.

161
00:10:57,000 --> 00:11:03,000
So basically, if you want to train, validate and run inference on a model and you don't need to do any

162
00:11:03,000 --> 00:11:08,000
modification in the code, for example, you don't need to add any speed estimation or tracking or any other

163
00:11:08,000 --> 00:11:13,000
step, then using the command line interface is the easiest way to do it.

164
00:11:13,000 --> 00:11:19,000
So we are currently using the command line interface to implement the training, validation and testing

165
00:11:19,000 --> 00:11:20,000
of our model.

166
00:11:20,000 --> 00:11:21,000
Okay.

167
00:11:21,000 --> 00:11:24,000
So if you want to do detection, you can select task=detect.

168
00:11:25,000 --> 00:11:29,000
And if you want to do classification, you can select task=classify, or for segmentation

169
00:11:29,000 --> 00:11:31,000
you can select task=segment.

170
00:11:31,000 --> 00:11:33,000
But we are doing detection over here.

171
00:11:33,000 --> 00:11:38,000
So we have selected task=detect and mode=train, because we are first training

172
00:11:38,000 --> 00:11:43,000
our model, and here we have chosen the YOLOv8 medium model and so on.

173
00:11:44,000 --> 00:11:45,000
Okay, so.

174
00:11:46,000 --> 00:11:48,000
Before we start the training.

175
00:11:48,000 --> 00:11:49,000
We just need to make a few changes over here.

176
00:11:49,000 --> 00:11:56,000
We just need to rename this folder, just press Enter over here, and just open the data.yaml

177
00:11:56,000 --> 00:11:56,000
file.

178
00:11:58,000 --> 00:11:59,000
Okay.

179
00:11:59,000 --> 00:12:01,000
And just open the data.yaml file.

180
00:12:02,000 --> 00:12:03,000
And just.

181
00:12:05,000 --> 00:12:08,000
Just go to train and just copy the path.

182
00:12:09,000 --> 00:12:14,000
And just paste it over here, then just go to valid and copy this path.

183
00:12:15,000 --> 00:12:17,000
And just paste it over here.

184
00:12:18,000 --> 00:12:19,000
Okay.

185
00:12:19,000 --> 00:12:20,000
And just save it.

186
00:12:20,000 --> 00:12:22,000
Okay, so just press Ctrl+S.
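
For reference, the edited file might end up looking like the text generated below. This is a sketch only: the folder paths are hypothetical, the `train`, `val`, `nc` and `names` keys follow the usual Roboflow YOLO export layout, and the order of the names is taken from the narration, so in a real file it must match the class indices used in the label files.

```python
# Class names in the order they are read out in the narration (assumed order).
CLASS_NAMES = [
    "protective helmet",
    "shield",
    "jacket",
    "dust mask",
    "eyewear",
    "glove",
    "protective boots",
]


def build_data_yaml(train_path: str, val_path: str, names: list) -> str:
    """Return the text of a minimal YOLO-style data.yaml file."""
    quoted = ", ".join(f"'{name}'" for name in names)
    lines = [
        f"train: {train_path}",
        f"val: {val_path}",
        f"nc: {len(names)}",
        f"names: [{quoted}]",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical absolute paths copied from the Colab file browser.
yaml_text = build_data_yaml(
    "/content/datasets/ppe/train/images",
    "/content/datasets/ppe/valid/images",
    CLASS_NAMES,
)
print(yaml_text)
```

Pasting absolute paths for `train` and `val`, as done in the video, avoids problems with relative paths being resolved from a different working directory.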

187
00:12:24,000 --> 00:12:29,000
And now we are training our model for 90 epochs and we are taking our image size as 640.

188
00:12:29,000 --> 00:12:35,000
And here is our data.yaml file path over here, and let's run the training.
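
The training command assembled from the settings mentioned so far would look roughly like the string built below. The helper `yolo_command` is a hypothetical convenience, and the `data` path is a placeholder; the `task`, `mode`, `model`, `data`, `epochs` and `imgsz` arguments are the ones the Ultralytics command line interface accepts.

```python
def yolo_command(task: str, mode: str, **options) -> str:
    """Build an Ultralytics-style CLI string: yolo task=... mode=... key=value ..."""
    parts = ["yolo", f"task={task}", f"mode={mode}"]
    parts += [f"{key}={value}" for key, value in options.items()]
    return " ".join(parts)


train_cmd = yolo_command(
    "detect",
    "train",
    model="yolov8m.pt",         # medium model, as chosen in the video
    data="datasets/data.yaml",  # placeholder path to the edited data.yaml
    epochs=90,
    imgsz=640,
)
print(train_cmd)
```

In a Colab cell the resulting string would be run with a leading `!`.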

189
00:12:42,000 --> 00:12:43,000
So first we are downloading our model.

190
00:12:47,000 --> 00:12:53,000
So the training will take around 2 to 3 hours, so we will stop the recording and we will be back when

191
00:12:53,000 --> 00:12:54,000
the training completes.

192
00:12:55,000 --> 00:12:56,000
Okay.

193
00:12:56,000 --> 00:12:57,000
Well.

194
00:12:58,000 --> 00:13:00,000
The training has started.

195
00:13:00,000 --> 00:13:01,000
You can see over here.

196
00:13:01,000 --> 00:13:04,000
The training has started, but it will take quite some time.

197
00:13:04,000 --> 00:13:09,000
So we will pause the video now that the training has started; you can see that the training has started.

198
00:13:09,000 --> 00:13:13,000
So when the training completes, I will be back and explain the rest of the code to you.

199
00:13:13,000 --> 00:13:16,000
Till then, see you when the training completes.

200
00:13:19,000 --> 00:13:21,000
Guys, the training of the model has completed.

201
00:13:21,000 --> 00:13:25,000
We have trained our model on 90 epochs with image size 640.

202
00:13:26,000 --> 00:13:29,000
And here are the training results we get.

203
00:13:32,000 --> 00:13:36,000
So in total, we have trained our model on 90 epochs.

204
00:13:36,000 --> 00:13:41,000
So it took around three hours for the training to complete.

205
00:13:41,000 --> 00:13:48,000
And we have got the best weights file and the last weights file from the 90th epoch as well.

206
00:13:48,000 --> 00:13:55,000
Okay, so we have seven different classes: protective helmet, shield, jacket, dust mask, eyewear, gloves

207
00:13:55,000 --> 00:13:56,000
and protective boots.

208
00:13:56,000 --> 00:14:02,000
So here we got the mean average precision at an IoU of 50% and the mean average precision

209
00:14:02,000 --> 00:14:06,000
when IoU varies from 50% to 95%, for each of the classes.
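
To make the two metrics concrete: IoU (intersection over union) measures how well a predicted box overlaps the ground-truth box. mAP at 50 counts a detection as correct when IoU is at least 0.5, while mAP at 50 to 95 averages over the thresholds 0.50, 0.55, ..., 0.95. The sketch below computes IoU for two made-up boxes and lists which thresholds the prediction would pass; the box coordinates are illustrative only.

```python
def iou(box_a, box_b) -> float:
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    intersection = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# IoU thresholds used by the 50-to-95 metric: 0.50, 0.55, ..., 0.95.
THRESHOLDS = [t / 100 for t in range(50, 100, 5)]

ground_truth = (0, 0, 10, 10)  # made-up ground-truth box
prediction = (0, 0, 10, 8)     # made-up predicted box

overlap = iou(ground_truth, prediction)
passed = [t for t in THRESHOLDS if overlap >= t]
print(overlap, passed)
```

A box that passes the 0.5 threshold but fails the stricter ones still helps mAP at 50 while contributing less to mAP at 50 to 95, which is why the second number is always the lower of the two.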

210
00:14:06,000 --> 00:14:17,000
So we can see that we have very good results for the classes; it's around 97.8%, 66.1%, and it's 92.8%

211
00:14:17,000 --> 00:14:24,000
for the jacket class and 96.8% for the dust mask. And for the

212
00:14:25,000 --> 00:14:28,000
mean average precision when IoU varies from 50 to 95,

213
00:14:28,000 --> 00:14:31,000
we can see that here

214
00:14:31,000 --> 00:14:33,000
we have also got very good results.

215
00:14:33,000 --> 00:14:41,000
So we have stored our results in runs/detect/train2.

216
00:14:45,000 --> 00:14:48,000
And here we can see the weights file.

217
00:14:48,000 --> 00:14:54,000
And here we have the F1 curve. To check what different files or results we have in this folder,

218
00:14:54,000 --> 00:14:55,000
just run this cell.

219
00:14:57,000 --> 00:14:59,000
So you can see that we have the confusion matrix.

220
00:15:00,000 --> 00:15:04,000
So the confusion matrix basically tells us how our model handles different classes.

221
00:15:04,000 --> 00:15:10,000
We have the F1 curve, precision curve, precision recall curves, and then we also have the recall

222
00:15:10,000 --> 00:15:13,000
curve, and then we have the results in the form of a CSV.

223
00:15:13,000 --> 00:15:19,000
So the results.csv file shows the performance of the model on each epoch. In results.png,

224
00:15:20,000 --> 00:15:25,000
We have the training and validation losses and then we have the model predictions on the validation

225
00:15:25,000 --> 00:15:26,000
batches as well.

226
00:15:27,000 --> 00:15:27,000
Okay.

227
00:15:27,000 --> 00:15:31,000
So let's first see the confusion matrix we are getting.

228
00:15:31,000 --> 00:15:35,000
So confusion matrix is the chart that shows how our model handles different classes.

229
00:15:35,000 --> 00:15:43,000
So if we consider the jacket, for example, we can see that 92% of the time our model detected correctly

230
00:15:43,000 --> 00:15:48,000
that a person is wearing a jacket, while 1% of the time we get the bounding box,

231
00:15:48,000 --> 00:15:57,000
but the jacket is incorrectly classified as another class, while 7% of the time our trained,

232
00:15:57,000 --> 00:15:58,000
fine-tuned

233
00:15:58,000 --> 00:16:05,000
YOLO model for detection is unable to classify that the person is wearing a jacket, or the model is unable

234
00:16:05,000 --> 00:16:08,000
to detect that the person is wearing a jacket.

235
00:16:08,000 --> 00:16:13,000
So the model does not detect anything, although the person is wearing a jacket.

236
00:16:13,000 --> 00:16:13,000
Okay.

237
00:16:13,000 --> 00:16:19,000
So you can see that 7% of the time, when the person is wearing the jacket, the model is unable to

238
00:16:19,000 --> 00:16:19,000
detect it.

239
00:16:19,000 --> 00:16:24,000
So in this way we can get information from the confusion matrix.

240
00:16:24,000 --> 00:16:30,000
Here we can see that 96% of the time our model has detected correctly that the person is wearing a helmet,

241
00:16:30,000 --> 00:16:36,000
while 4% of the time when the person is wearing a helmet, our model is unable to detect the helmet.

242
00:16:38,000 --> 00:16:41,000
So here we have the graph with the training and validation loss.

243
00:16:41,000 --> 00:16:45,000
So we have the graphs of training and validation loss in the results dot PNG.

244
00:16:46,000 --> 00:16:49,000
So we can see that the loss value is continuously decreasing.

245
00:16:49,000 --> 00:16:54,000
So if we train this model for 200 to 250 epochs, we can get better results.

246
00:16:54,000 --> 00:17:00,000
While you can see that mean average precision is continuously increasing and recall is also getting

247
00:17:00,000 --> 00:17:01,000
better.

248
00:17:01,000 --> 00:17:08,000
So here you can see that the mean average precision curve is continuously increasing, both at IoU 50 and at IoU 50 to 95.

249
00:17:08,000 --> 00:17:10,000
So our results are quite good.

250
00:17:11,000 --> 00:17:14,000
So here are the model predictions on the validation batch.

251
00:17:14,000 --> 00:17:16,000
So these images are not used for training.

252
00:17:16,000 --> 00:17:22,000
So it's always better to have a look. Like here, the model is detecting that the person is wearing the dust

253
00:17:22,000 --> 00:17:23,000
mask.

254
00:17:23,000 --> 00:17:28,000
Here the model is detecting that the person is wearing gloves on both hands, and the person is also

255
00:17:28,000 --> 00:17:30,000
wearing protective boots as well.

256
00:17:30,000 --> 00:17:33,000
So the model is working quite fine.

257
00:17:33,000 --> 00:17:35,000
Like here the person is not wearing a glove,

258
00:17:35,000 --> 00:17:39,000
so here the model has not detected that the person is wearing a glove, while the person

259
00:17:39,000 --> 00:17:41,000
here is wearing a glove,

260
00:17:41,000 --> 00:17:44,000
so the model is able to detect that the person is wearing a glove.

261
00:17:45,000 --> 00:17:50,000
Okay, so let's validate the model, or validate our custom model.

262
00:17:50,000 --> 00:17:58,000
So we are taking the best weights which we got over here, best.pt, and we will validate our custom

263
00:17:58,000 --> 00:17:58,000
model.

264
00:17:58,000 --> 00:18:04,000
So similarly, as before we were using the command line interface to do the training,

265
00:18:04,000 --> 00:18:07,000
we will be using the command line interface to validate the model.

266
00:18:07,000 --> 00:18:10,000
So previously we wrote mode=train.

267
00:18:10,000 --> 00:18:14,000
So here we will write mode=val, and we are performing detection,

268
00:18:14,000 --> 00:18:16,000
so task=detect.

269
00:18:16,000 --> 00:18:23,000
And here we are just passing the data.yaml file path; basically, the data.yaml file contains our training

270
00:18:23,000 --> 00:18:25,000
set, testing set and validation set image paths.

271
00:18:26,000 --> 00:18:33,000
Okay, so here we are validating our custom model, and we can see that we also got very good results in

272
00:18:33,000 --> 00:18:37,000
terms of mean average precision at IoU 50 and mean average precision

273
00:18:37,000 --> 00:18:39,000
when IoU varies from 50 to 95.

274
00:18:39,000 --> 00:18:41,000
So all the results are quite good.

275
00:18:42,000 --> 00:18:45,000
So here we are doing inference with the custom model.

276
00:18:45,000 --> 00:18:52,000
So inference means a prediction that we can run on an image to detect a label, whether it is classification

277
00:18:52,000 --> 00:18:55,000
or a bounding box or a segmentation.

278
00:18:55,000 --> 00:18:59,000
So here we are testing the model on the test dataset images.

279
00:18:59,000 --> 00:19:04,000
So here I pass the path of the test dataset images. For example,

280
00:19:05,000 --> 00:19:05,000
Let me show you.

281
00:19:05,000 --> 00:19:09,000
Here we have the data set and here we have the test images.

282
00:19:09,000 --> 00:19:13,000
So we have just copied this path and just pasted it over here.

283
00:19:14,000 --> 00:19:15,000
Let me do it again.

284
00:19:17,000 --> 00:19:17,000
There, I have just...

285
00:19:19,000 --> 00:19:19,000
Wait.

286
00:19:19,000 --> 00:19:25,000
I have just added the path of my test dataset images, and this is the path of my best weights file, which

287
00:19:25,000 --> 00:19:26,000
I have added over here.

288
00:19:26,000 --> 00:19:32,000
You can see I am now doing prediction, so I have just set the mode to predict.

289
00:19:32,000 --> 00:19:34,000
Previously I was doing validation,

290
00:19:34,000 --> 00:19:36,000
so I had written val; when I was doing training,

291
00:19:36,000 --> 00:19:39,000
I had written train. And the task is detect.

292
00:19:39,000 --> 00:19:43,000
So now we are doing object detection,

293
00:19:43,000 --> 00:19:51,000
so I have written the task as detect. Okay, so just run this cell and it will test our model on

294
00:19:51,000 --> 00:19:53,000
the test dataset images.
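
The three modes recapped here differ only in the `mode` value and in the arguments passed alongside it. As a sketch, the validation and prediction commands could be assembled as below. The helper `yolo_command` is hypothetical, the weights location assumes the usual `weights/best.pt` file inside the training run folder, and the dataset paths are placeholders.

```python
def yolo_command(task: str, mode: str, **options) -> str:
    """Build an Ultralytics-style CLI string: yolo task=... mode=... key=value ..."""
    parts = ["yolo", f"task={task}", f"mode={mode}"]
    parts += [f"{key}={value}" for key, value in options.items()]
    return " ".join(parts)


# Assumed location of the best weights inside the training run folder.
BEST_WEIGHTS = "runs/detect/train2/weights/best.pt"

# Validation needs the weights and the data.yaml file (placeholder path).
val_cmd = yolo_command(
    "detect", "val", model=BEST_WEIGHTS, data="datasets/data.yaml"
)

# Prediction needs the weights and a source folder of images (placeholder path).
predict_cmd = yolo_command(
    "detect", "predict", model=BEST_WEIGHTS, source="datasets/test/images"
)

print(val_cmd)
print(predict_cmd)
```

Note that validation takes `data=` while prediction takes `source=`; mixing the two up is a common reason for the command to fail.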

295
00:19:53,000 --> 00:19:56,000
So it might take a few seconds to run.

296
00:19:56,000 --> 00:19:58,000
Then we will see what results we get.

297
00:20:00,000 --> 00:20:05,000
So please wait for a few seconds until it runs completely.

298
00:20:05,000 --> 00:20:07,000
Then we will see what results we are getting.

299
00:20:07,000 --> 00:20:10,000
So the model has run on the test dataset images.

300
00:20:10,000 --> 00:20:15,000
So there are 324 images, and the results are saved in the

301
00:20:15,000 --> 00:20:17,000
predict8 folder.

302
00:20:21,000 --> 00:20:24,000
The results are saved in runs/detect/predict8.

303
00:20:24,000 --> 00:20:27,000
So just copy this path and just paste it over here.

304
00:20:27,000 --> 00:20:33,000
So as we have 324 images in total, we will not display

305
00:20:33,000 --> 00:20:36,000
the results we get on all the 324 images.

306
00:20:36,000 --> 00:20:41,000
Instead, we will only check the results on the first five images, just setting that over here.

307
00:20:41,000 --> 00:20:48,000
So here, to display output in the Google Colab notebook, I have used from IPython.display import

308
00:20:48,000 --> 00:20:49,000
Image, display.
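
A minimal sketch of showing only the first five prediction images. The folder name is taken from the narration and may differ on another run; `first_n_images` is a hypothetical helper, and the IPython import is wrapped so the snippet also runs outside a notebook, where it just prints the paths.

```python
import glob
import os


def first_n_images(folder: str, n: int = 5, pattern: str = "*.jpg") -> list:
    """Return the first n image paths in `folder`, sorted by name."""
    return sorted(glob.glob(os.path.join(folder, pattern)))[:n]


try:
    from IPython.display import Image, display
except ImportError:  # not running inside a notebook environment
    Image = display = None

# Folder name as mentioned in the video; it is an assumption for other runs.
PREDICTION_DIR = "runs/detect/predict8"

for path in first_n_images(PREDICTION_DIR):
    if display is not None:
        display(Image(filename=path, width=600))
    else:
        print(path)
```

If the folder does not exist, the loop simply does nothing, which makes a wrong path easy to miss; printing the list length first can help.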

309
00:20:49,000 --> 00:20:54,000
So just run this cell now and see what results we get.

310
00:20:54,000 --> 00:20:57,000
So it might take a few seconds, so please wait.

311
00:20:57,000 --> 00:21:03,000
So here you can see that our model is able to correctly detect the protective boots, the jacket, the

312
00:21:03,000 --> 00:21:06,000
gloves, dust masks, the protective helmet.

313
00:21:06,000 --> 00:21:08,000
So the results are wonderful. Here,

314
00:21:08,000 --> 00:21:13,000
the model is also able to detect the jacket and the protective helmet.

315
00:21:13,000 --> 00:21:19,000
And here the model is also able to detect the protective helmets, the gloves, everything.

316
00:21:19,000 --> 00:21:26,000
And here the model is also able to detect the dust mask and the protective helmet as well.

317
00:21:26,000 --> 00:21:29,000
So the results of the model are quite impressive.

318
00:21:31,000 --> 00:21:38,000
Guys, let's test our model on some demo videos and see how our model performs on the demo videos.

319
00:21:38,000 --> 00:21:42,000
So here I am downloading a demo video directly from my Google Drive.

320
00:21:42,000 --> 00:21:43,000
So just run this cell.

321
00:21:44,000 --> 00:21:49,000
The name of that video is demo.mp4, so it might take a few seconds to download.

322
00:21:49,000 --> 00:21:51,000
The video is downloaded.

323
00:21:51,000 --> 00:21:54,000
So now let's test our model on this demo video.

324
00:21:54,000 --> 00:21:57,000
I have passed my best model weights over here.

325
00:21:57,000 --> 00:22:03,000
I have set the confidence to 0.25; we are performing object detection, and we are doing prediction.

326
00:22:04,000 --> 00:22:06,000
And here is the demo video path.

327
00:22:06,000 --> 00:22:10,000
So just run this cell and see what results we get.

328
00:22:10,000 --> 00:22:13,000
So it might take a few seconds to run on this demo video,

329
00:22:14,000 --> 00:22:20,000
because it will process the video frame by frame. We can see that the model is detecting the protective

330
00:22:20,000 --> 00:22:26,000
helmets and jackets, and the inference time per frame is 27.2 milliseconds.
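
The 27.2 millisecond figure is a time per frame rather than a frame rate. Converting it gives the approximate number of frames the model can process per second, as this small sketch shows; it counts inference time only, so reading and writing video frames would lower the real throughput.

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Convert a per-frame processing time in milliseconds to frames per second."""
    return 1000.0 / latency_ms


# Per-frame inference time quoted in the video.
print(round(fps_from_latency_ms(27.2), 1))
```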

331
00:22:26,000 --> 00:22:27,000
Okay.

332
00:22:27,000 --> 00:22:30,000
So it's 70, 80, 90.

333
00:22:30,000 --> 00:22:32,000
This is a progressive thing.

334
00:22:32,000 --> 00:22:36,000
We are using a GPU, so it's a bit faster than a CPU.

335
00:22:36,000 --> 00:22:40,000
On a CPU, it takes quite some time to do training and prediction.

336
00:22:40,000 --> 00:22:42,000
So it's always better to test

337
00:22:42,000 --> 00:22:46,000
our model on a GPU, and Google Colab offers a free GPU as well.

338
00:22:46,000 --> 00:22:49,000
So I'm using the Google Colab free GPU.

339
00:22:50,000 --> 00:22:51,000
Okay guys.

340
00:22:51,000 --> 00:22:52,000
So.

341
00:22:54,000 --> 00:22:59,000
It might take some more time, although we can see that the model is detecting protective helmets as

342
00:22:59,000 --> 00:23:00,000
well as jackets.

343
00:23:00,000 --> 00:23:01,000
So.

344
00:23:02,000 --> 00:23:05,000
Our results will be saved in runs,

345
00:23:07,000 --> 00:23:09,000
detect,

346
00:23:09,000 --> 00:23:10,000
prediction two.

347
00:23:10,000 --> 00:23:13,000
And this is our output demo video.

348
00:23:13,000 --> 00:23:14,000
Okay.

349
00:23:14,000 --> 00:23:15,000
So.

350
00:23:16,000 --> 00:23:17,000
It has completed.

351
00:23:17,000 --> 00:23:19,000
Okay, so our results are saved in runs,

352
00:23:19,000 --> 00:23:19,000
detect, predict

353
00:23:19,000 --> 00:23:20,000
two.

354
00:23:20,000 --> 00:23:23,000
So this output file is a bit large.

355
00:23:23,000 --> 00:23:25,000
So I think we will not be able to display it here.

356
00:23:25,000 --> 00:23:31,000
So it's better to download this demo video from here and see what results we actually get.

357
00:23:32,000 --> 00:23:32,000
Okay.

358
00:23:34,000 --> 00:23:36,000
So it might take some time to download.

359
00:23:36,000 --> 00:23:39,000
So I will pause this video while it downloads.

360
00:23:39,000 --> 00:23:43,000
I will be back, and then we will check what output we get.

361
00:23:46,000 --> 00:23:50,000
I was able to download the demo video, so let me play the output.

362
00:23:50,000 --> 00:23:56,000
So our model is able to detect the protective jacket and the protective helmet.

363
00:23:56,000 --> 00:24:01,000
You can see that both persons' protective helmets and jackets are detected. Although the model is

364
00:24:01,000 --> 00:24:08,000
not detecting the gloves, it is able to detect the jacket as well as the protective helmet.

365
00:24:08,000 --> 00:24:15,000
So let's test our model on some other demo videos and see what results we get there.

366
00:24:15,000 --> 00:24:21,000
So here we will download demo video two and see what results we get on that test demo video.

367
00:24:21,000 --> 00:24:23,000
So let me test

368
00:24:23,000 --> 00:24:25,000
our model on this demo video two.

369
00:24:26,000 --> 00:24:28,000
Actually, the name of the video is demo3.

370
00:24:28,000 --> 00:24:30,000
So let's run this.

371
00:24:30,000 --> 00:24:36,000
So it might take a few seconds to execute, but we can see that the model is detecting two protective

372
00:24:36,000 --> 00:24:37,000
helmets and one jacket.

373
00:24:38,000 --> 00:24:39,000
Okay?

374
00:24:39,000 --> 00:24:43,000
Two protective helmets, one jacket, two protective helmets, one jacket.

375
00:24:43,000 --> 00:24:46,000
It might take a few minutes to run.

376
00:24:46,000 --> 00:24:53,000
And then we will download the output and then we will see what results we get.

377
00:24:53,000 --> 00:24:59,000
Our output will be saved in prediction three, and this is the name of our output video,

378
00:24:59,000 --> 00:25:00,000
demo3.mp4.

379
00:25:00,000 --> 00:25:03,000
Okay, so I will just download from here.

380
00:25:03,000 --> 00:25:06,000
So it might take some time to download. When the download completes,

381
00:25:06,000 --> 00:25:08,000
I will be back with the output video.

382
00:25:11,000 --> 00:25:17,000
Well, guys, I was able to download the output demo video, and let me play it so

383
00:25:17,000 --> 00:25:21,000
you can see that the model is detecting the protective helmet and the jacket.

384
00:25:21,000 --> 00:25:25,000
And here also the model is detecting the jacket and the protective helmet.

385
00:25:25,000 --> 00:25:30,000
The persons are not wearing gloves, so the model is not detecting them.

386
00:25:30,000 --> 00:25:37,000
So let's test our model on demo video three and see what results we get there.

387
00:25:37,000 --> 00:25:38,000
Okay.

388
00:25:38,000 --> 00:25:42,000
So let's test our model on demo video three.

389
00:25:42,000 --> 00:25:46,000
So I'm actually not displaying the results over here because these file sizes are large.

390
00:25:46,000 --> 00:25:50,000
So we will not be able to show these output

391
00:25:50,000 --> 00:25:51,000
demo videos here.

392
00:25:51,000 --> 00:25:57,000
I have already tested it, so I am not wasting time trying to show the results here.

393
00:25:57,000 --> 00:26:02,000
So I am downloading demo video three from my Drive directly into the Google Colab notebook.

394
00:26:02,000 --> 00:26:10,000
So it might take some time, and then I will run the model on demo video three and see what actual

395
00:26:10,000 --> 00:26:11,000
results we get.

396
00:26:12,000 --> 00:26:14,000
So it might take some time.

397
00:26:14,000 --> 00:26:15,000
So.

398
00:26:16,000 --> 00:26:23,000
Okay, so the model is detecting two protective helmets, one jacket, two protective helmets, one jacket,

399
00:26:24,000 --> 00:26:26,000
two protective helmets, one jacket.

400
00:26:26,000 --> 00:26:27,000
Fine.

401
00:26:27,000 --> 00:26:27,000
That's good.

402
00:26:28,000 --> 00:26:30,000
The model is detecting two protective helmets,

403
00:26:30,000 --> 00:26:31,000
one jacket.

404
00:26:32,000 --> 00:26:32,000
Okay.

405
00:26:32,000 --> 00:26:35,000
So we have 337 frames in total.

406
00:26:35,000 --> 00:26:36,000
And currently,

407
00:26:36,000 --> 00:26:39,000
the model has processed 177 frames.

408
00:26:39,000 --> 00:26:40,000
Okay.

409
00:26:40,000 --> 00:26:53,000
So 206 frames, 220, 224, 245, 256, 262, 273, 295, and 299.

410
00:26:53,000 --> 00:26:55,000
And the run is over.

411
00:26:55,000 --> 00:27:00,000
So let me download this output demo video and show you what results we get.

412
00:27:00,000 --> 00:27:03,000
The output is saved in prediction four, demo four.

413
00:27:03,000 --> 00:27:07,000
Let me download this output demo video, and then I will be back with the results.

414
00:27:10,000 --> 00:27:16,000
Well, guys, these are the results from my output demo video three. We can see that the model is

415
00:27:16,000 --> 00:27:20,000
able to detect the protective helmets, and jackets as well.

416
00:27:20,000 --> 00:27:22,000
So this person is not wearing the jacket.

417
00:27:22,000 --> 00:27:27,000
So the model is not detecting a jacket, while these two persons are wearing protective helmets.

418
00:27:27,000 --> 00:27:30,000
So the model is able to detect the protective helmets and this person's jacket.

419
00:27:30,000 --> 00:27:34,000
Now, let's test our model on the live webcam.

420
00:27:34,000 --> 00:27:39,000
Although I'm not wearing any personal protective equipment, I want to see, if I wear a

421
00:27:39,000 --> 00:27:40,000
mask,

422
00:27:40,000 --> 00:27:44,000
whether the model is able to detect it or not. On the videos and images,

423
00:27:44,000 --> 00:27:46,000
our results are very good.

424
00:27:46,000 --> 00:27:50,000
So let's test our model on the live webcam in the next part of the video.

425
00:27:50,000 --> 00:27:51,000
Thanks for watching.