1
00:00:02,000 --> 00:00:03,000
Hello everyone.

2
00:00:03,000 --> 00:00:04,000
In this video tutorial,

3
00:00:04,000 --> 00:00:09,000
we will see how we can use YOLO V8 for segmentation on images and videos.

4
00:00:09,000 --> 00:00:14,000
So this is a complete step-by-step guide, so don't skip any part of the video and do watch the

5
00:00:14,000 --> 00:00:15,000
complete video.

6
00:00:15,000 --> 00:00:18,000
So YOLO V8 comes with five different models.

7
00:00:18,000 --> 00:00:26,000
Starting from YOLOv8n segmentation and ending at YOLOv8x segmentation. YOLOv8n segmentation

8
00:00:26,000 --> 00:00:32,000
is less accurate, but it is faster than the other YOLO V8 models, while YOLOv8x is the most accurate

9
00:00:32,000 --> 00:00:39,000
model in the YOLO V8 series, but it is not as fast as the other YOLO V8 models.

10
00:00:39,000 --> 00:00:46,000
So we can say that YOLOv8n is the fastest but the least accurate, while YOLOv8x is the most

11
00:00:46,000 --> 00:00:51,000
accurate, but it is slower than the other YOLO V8 models.
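As a rough sketch of the five segmentation checkpoints mentioned here, the snippet below lists them from fastest to most accurate. The file names follow the Ultralytics naming pattern (yolov8n-seg.pt and so on); the helper name seg_weights is made up for illustration.
```python
# Size suffixes of the five YOLOv8 segmentation models, smallest and fastest first.
SIZES = ["n", "s", "m", "l", "x"]
def seg_weights(size: str) -> str:
    """Return the pre-trained segmentation weight file name for a size suffix."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}, expected one of {SIZES}")
    return f"yolov8{size}-seg.pt"
if __name__ == "__main__":
    for s in SIZES:
        print(seg_weights(s))
```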

12
00:00:51,000 --> 00:00:57,000
So here we will see how we can implement YOLO V8 segmentation in Google Colab.

13
00:00:58,000 --> 00:01:04,000
Okay, so first of all I am just importing the required library: from IPython.display import Image.

14
00:01:05,000 --> 00:01:06,000
So I am importing this library.

15
00:01:07,000 --> 00:01:13,000
Basically we use this library if we want to display any input or output image in our Google Colab notebook.

16
00:01:13,000 --> 00:01:16,000
So then I am installing ultralytics.

17
00:01:16,000 --> 00:01:16,000
So YOLO V8

18
00:01:16,000 --> 00:01:22,000
can be implemented in two ways. One is by cloning the GitHub repo of YOLO V8, which is over here.

19
00:01:22,000 --> 00:01:28,000
So you can simply go over here and clone this GitHub repo, or you can install YOLO

20
00:01:28,000 --> 00:01:30,000
V8 through the ultralytics package.

21
00:01:30,000 --> 00:01:31,000
So.

22
00:01:32,000 --> 00:01:36,000
So we can simply implement YOLO V8 by installing the

23
00:01:36,000 --> 00:01:37,000
ultralytics package.

24
00:01:37,000 --> 00:01:44,000
So if you simply want to train, test, or validate your model, it's always better to

25
00:01:44,000 --> 00:01:46,000
implement YOLO V8 by installing the

26
00:01:46,000 --> 00:01:47,000
ultralytics package.

27
00:01:47,000 --> 00:01:52,000
But if you want to make some changes in the predict.py or in the train.py file,

28
00:01:52,000 --> 00:01:55,000
then you need to clone the GitHub repo.

29
00:01:55,000 --> 00:02:01,000
For example, if I want to add speed estimation or distance calculation code in the predict.py file,

30
00:02:01,000 --> 00:02:07,000
then I need to clone the GitHub repo, while for simple training, testing, or validation you just need to

31
00:02:07,000 --> 00:02:11,000
install the ultralytics package to implement YOLO V8.

32
00:02:11,000 --> 00:02:12,000
So here I'm just doing it.

33
00:02:12,000 --> 00:02:17,000
So before running the script please make sure that you are selecting the runtime as GPU.

34
00:02:17,000 --> 00:02:20,000
Then just run these two cells.

35
00:02:20,000 --> 00:02:21,000
So.

36
00:02:23,000 --> 00:02:26,000
Okay, so this might take a few seconds more.

37
00:02:27,000 --> 00:02:31,000
Okay, so we have checked whether the GPU is available or not.

38
00:02:31,000 --> 00:02:33,000
So I think this should be at the start.

39
00:02:34,000 --> 00:02:40,000
Okay, so just set it at the start by going over here to change runtime.

40
00:02:40,000 --> 00:02:42,000
Please make sure that you have GPU selected.

41
00:02:42,000 --> 00:02:44,000
Then I will import torch.

42
00:02:44,000 --> 00:02:48,000
Basically YOLO V8 is built using PyTorch, so we need to, uh,

43
00:02:49,000 --> 00:02:53,000
so we need to have PyTorch set up in our Google Colab notebook.

44
00:02:53,000 --> 00:02:59,000
To do this, we import torch, and here we can see that we have the GPU available and the torch version

45
00:02:59,000 --> 00:03:03,000
is 1.13.1+cu116.
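A minimal sketch of this GPU and version check, assuming PyTorch may or may not be installed in the environment. The function name describe_torch is made up for illustration; torch.cuda.is_available() and torch.__version__ are the standard PyTorch attributes.
```python
def describe_torch() -> str:
    """Return a one-line summary of the torch version and GPU availability."""
    try:
        import torch  # imported lazily so the function also works without PyTorch
    except ImportError:
        return "torch not installed"
    device = "GPU available" if torch.cuda.is_available() else "CPU only"
    return f"torch {torch.__version__}, {device}"
if __name__ == "__main__":
    print(describe_torch())
```
If it reports CPU only in Colab, the runtime type has probably not been switched to GPU yet.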

46
00:03:03,000 --> 00:03:11,000
So basically to test the segmentation model on images or videos, let's download some images and video

47
00:03:11,000 --> 00:03:15,000
from Google Drive into our Google Colab notebook.

48
00:03:15,000 --> 00:03:21,000
So I'm just downloading a sample image and video from Google Drive into my Google Colab notebook and

49
00:03:21,000 --> 00:03:27,000
then I will run segmentation on this sample image which we have downloaded.

50
00:03:27,000 --> 00:03:32,000
So just running the segmentation on this sample image and let me show you the output image which we

51
00:03:32,000 --> 00:03:32,000
get.

52
00:03:33,000 --> 00:03:37,000
So you can see that as I'm performing segmentation.

53
00:03:37,000 --> 00:03:40,000
So I've written task is equal to segment, mode is equal to predict.

54
00:03:40,000 --> 00:03:44,000
And here we have the YOLO V8 pre-trained model path.

55
00:03:44,000 --> 00:03:46,000
So you just need to write the model name.

56
00:03:46,000 --> 00:03:51,000
It will automatically download it from the GitHub repo into the Google Colab notebook.
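The command described here can be sketched roughly as below. This assumes the Ultralytics `yolo` command-line entry point with key=value arguments; the function name build_predict_command and the file name image1.jpg are placeholders, not taken from the notebook.
```python
import shlex
def build_predict_command(source: str, model: str = "yolov8n-seg.pt") -> str:
    """Build the yolo CLI string for segmentation prediction on an image or video."""
    args = ["yolo", "task=segment", "mode=predict", f"model={model}", f"source={source}"]
    # Quote each argument so that paths with spaces survive the shell.
    return " ".join(shlex.quote(a) for a in args)
if __name__ == "__main__":
    print(build_predict_command("image1.jpg"))
```
In a Colab cell the printed string would be run with a leading exclamation mark.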

57
00:03:51,000 --> 00:03:55,000
So now my results are saved in runs/segment/predict.

58
00:03:55,000 --> 00:03:58,000
So this is the place where my results are saved.

59
00:03:58,000 --> 00:04:03,000
Let me show you: runs, segment, predict.

60
00:04:03,000 --> 00:04:06,000
And here is my output image which I have displayed over here.

61
00:04:06,000 --> 00:04:09,000
So you just need to copy path and just paste it over here.

62
00:04:09,000 --> 00:04:15,000
And if you run this you will get this output image so you can see that we are able to detect the person

63
00:04:15,000 --> 00:04:17,000
and we have a mask around it as well.

64
00:04:17,000 --> 00:04:23,000
Like you can see this uh, pink color mask and you can see this green color mask on the bus.

65
00:04:23,000 --> 00:04:30,000
So, uh, you can see that we are able to do the detection as well as the segmentation.

66
00:04:30,000 --> 00:04:30,000
Okay?

67
00:04:30,000 --> 00:04:36,000
So if you want to hide these labels and this confidence value: basically 0.91 represents the confidence

68
00:04:36,000 --> 00:04:37,000
value and person is the label.

69
00:04:37,000 --> 00:04:41,000
So the confidence value means how sure the model is that this is a person.

70
00:04:41,000 --> 00:04:47,000
So if you want to hide the labels and confidence values, you can simply write hide_labels and hide_conf.

71
00:04:47,000 --> 00:04:53,000
hide_labels will hide the person label, which is over here, or the backpack label over here, and hide_conf

72
00:04:53,000 --> 00:04:59,000
will hide the confidence value, which is 0.91 over here, 0.53 over here, and 0.54 over here.

73
00:04:59,000 --> 00:05:01,000
So let me show you.

74
00:05:01,000 --> 00:05:05,000
If you set hide_labels, it will hide the label person, and hide_conf

75
00:05:05,000 --> 00:05:07,000
will hide the confidence value over here.
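A small sketch of how these two options could be added to the prediction arguments. It assumes the flag spellings hide_labels and hide_conf from the early YOLO V8 releases (later Ultralytics releases use show_labels and show_conf instead); the function name predict_args and the file name image1.jpg are placeholders.
```python
def predict_args(source, model="yolov8n-seg.pt", hide_labels=False, hide_conf=False):
    """Return the key=value arguments for a segmentation prediction run."""
    args = ["task=segment", "mode=predict", f"model={model}", f"source={source}"]
    if hide_labels:
        args.append("hide_labels=True")  # assumed flag: hides class names such as person
    if hide_conf:
        args.append("hide_conf=True")  # assumed flag: hides scores such as 0.91
    return args
if __name__ == "__main__":
    print("yolo " + " ".join(predict_args("image1.jpg", hide_labels=True, hide_conf=True)))
```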

76
00:05:09,000 --> 00:05:10,000
So this might take a few seconds.

77
00:05:10,000 --> 00:05:14,000
And here is our output image you can see over here.

78
00:05:16,000 --> 00:05:19,000
So now you can see that we don't have any label or the confidence value.

79
00:05:19,000 --> 00:05:23,000
So similarly we can run segmentation on any video as well.

80
00:05:23,000 --> 00:05:27,000
So let's run the segmentation on a video.

81
00:05:27,000 --> 00:05:32,000
So this might take a few seconds for the execution, but let's wait and see

82
00:05:32,000 --> 00:05:33,000
what results we get.

83
00:05:33,000 --> 00:05:36,000
So you can see that the model is able to detect the cars.

84
00:05:36,000 --> 00:05:44,000
So it has divided the complete video into 1314 frames and it is doing the detections frame by frame.

85
00:05:44,000 --> 00:05:50,000
So we can see eight cars and one truck detected in this frame; three cars, five cars, three trucks

86
00:05:50,000 --> 00:05:54,000
are detected in this frame, and five cars are also detected in this frame.

87
00:05:54,000 --> 00:05:57,000
Similarly, four cars and one truck are detected in this frame.

88
00:05:57,000 --> 00:06:00,000
And here we have detected six cars.

89
00:06:00,000 --> 00:06:03,000
So this might take some time for the execution.

90
00:06:03,000 --> 00:06:08,000
So let's wait until the execution completes and then we proceed further.

91
00:06:08,000 --> 00:06:09,000
So.

92
00:06:09,000 --> 00:06:10,000
Okay, so.

93
00:06:11,000 --> 00:06:15,000
The processing of the video is done and our results are saved in

94
00:06:15,000 --> 00:06:19,000
runs/segment/predict3. And here is our output video.

95
00:06:19,000 --> 00:06:21,000
So I have already displayed the output.

96
00:06:21,000 --> 00:06:26,000
I have already previously run the script, so if you display the output video it will be like this.

97
00:06:26,000 --> 00:06:32,000
So let me just download this output video and show you what our results look like.

98
00:06:32,000 --> 00:06:36,000
So let me just navigate my screen towards this output video.

99
00:06:37,000 --> 00:06:39,000
So just give me a minute.

100
00:06:40,000 --> 00:06:40,000
Okay.

101
00:06:40,000 --> 00:06:42,000
I'm just navigating my screen.

102
00:06:42,000 --> 00:06:48,000
Okay, so now you can see the output video over here in front of you on your screen.

103
00:06:48,000 --> 00:06:52,000
So I think you can see the output video over here.

104
00:06:52,000 --> 00:06:53,000
Okay?

105
00:06:53,000 --> 00:06:57,000
So you can see that we are able to do the segmentation as well.

106
00:06:57,000 --> 00:06:59,000
Like you can see the mask as well.

107
00:06:59,000 --> 00:07:04,000
So you can see that here the masks are colored per class, much like semantic segmentation: when there is a car over here,

108
00:07:04,000 --> 00:07:09,000
you can see that we have the same color for the bounding box and the same color for the mask as well,

109
00:07:09,000 --> 00:07:15,000
while in the case of a truck or any other object, uh, objects of the same class have the same color for the

110
00:07:15,000 --> 00:07:17,000
bounding box and for the mask.

111
00:07:17,000 --> 00:07:22,000
Like you can see for the car and for the truck as well, the same color for the bounding box and the same

112
00:07:22,000 --> 00:07:23,000
color for the mask as well.

113
00:07:23,000 --> 00:07:28,000
Like you can see here, the trucks over here have the same color for the bounding box, which

114
00:07:28,000 --> 00:07:31,000
is green, and the same color for the mask as well.

115
00:07:31,000 --> 00:07:36,000
While for the car they have the same color of the bounding box and the same color for the mask as well.

116
00:07:36,000 --> 00:07:38,000
So that's great.

117
00:07:39,000 --> 00:07:46,000
So if we go back to the Colab file again over here, now we can export the YOLO V8 segmentation

118
00:07:46,000 --> 00:07:47,000
model to the ONNX format.

119
00:07:47,000 --> 00:07:51,000
So if you have some pre-trained model, you can export it to the ONNX format.

120
00:07:51,000 --> 00:07:58,000
And if you have trained a custom model for segmentation in YOLO,

121
00:07:59,000 --> 00:08:02,000
you can export it to the ONNX format as well.

122
00:08:02,000 --> 00:08:08,000
Okay, so here I'm just exporting the model to the ONNX format.

123
00:08:10,000 --> 00:08:14,000
So this might take a few seconds for execution.

124
00:08:14,000 --> 00:08:16,000
So let's see what results we actually get over here.

125
00:08:20,000 --> 00:08:20,000
Okay.

126
00:08:22,000 --> 00:08:27,000
Now you can see here we have the ONNX file, so you can see that our YOLO V8 segmentation

127
00:08:27,000 --> 00:08:30,000
model has been converted into the ONNX format.

128
00:08:30,000 --> 00:08:33,000
And here we have the ONNX file.

129
00:08:33,000 --> 00:08:39,000
Okay, so in this way you can convert the model into the ONNX format, or you can simply implement the YOLO

130
00:08:39,000 --> 00:08:42,000
V8 model in a .py file as well.

131
00:08:42,000 --> 00:08:46,000
So to run the YOLO V8 model in a .py file, simply copy

132
00:08:46,000 --> 00:08:55,000
all this and just paste all this code in any .py file, and you can execute the YOLO V8 model on any custom

133
00:08:55,000 --> 00:08:56,000
video or image.

134
00:08:56,000 --> 00:09:05,000
Or you can also train the YOLO V8 model on a custom dataset in a

135
00:09:06,000 --> 00:09:08,000
.py file as well.

136
00:09:08,000 --> 00:09:13,000
Okay, so first of all, from ultralytics you will import YOLO, so it will automatically import

137
00:09:13,000 --> 00:09:15,000
YOLO V8 as well.

138
00:09:15,000 --> 00:09:19,000
And then you can do the prediction on any sample image.

139
00:09:19,000 --> 00:09:19,000
Okay.

140
00:09:19,000 --> 00:09:22,000
So you can simply copy this code in a .py file and execute it.

141
00:09:25,000 --> 00:09:26,000
So this might take a few seconds.

142
00:09:31,000 --> 00:09:32,000
Okay.

143
00:09:33,000 --> 00:09:37,000
31.2, 42.7 and 46.

144
00:09:37,000 --> 00:09:37,000
And.

145
00:09:38,000 --> 00:09:41,000
Basically it's downloading the segmentation model

146
00:09:43,000 --> 00:09:44,000
from the,

147
00:09:45,000 --> 00:09:45,000
uh, GitHub repo.

148
00:09:47,000 --> 00:09:54,000
And save is equal to True means we will save the output prediction,

149
00:09:54,000 --> 00:09:57,000
or you can say the output image.

150
00:09:57,000 --> 00:10:00,000
So you can see that we have saved the output

151
00:10:00,000 --> 00:10:03,000
image, image one over here, which is the result.

152
00:10:03,000 --> 00:10:08,000
And here we have the labels files, which contain the coordinates of the bounding box.

153
00:10:09,000 --> 00:10:15,000
Okay, so here we have the label .txt files, which contain the coordinates of the bounding box

154
00:10:15,000 --> 00:10:16,000
over here.
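A minimal sketch of the Python route described here, assuming the ultralytics package is installed. YOLO and predict with save=True come from the tutorial; save_txt=True is an assumption about how the label .txt files were produced, and the function name segment and the file name image1.jpg are placeholders.
```python
def segment(source: str, weights: str = "yolov8n-seg.pt"):
    """Run segmentation on an image or video and save the annotated output."""
    from ultralytics import YOLO  # lazy import; needs the ultralytics package
    model = YOLO(weights)  # the weights are downloaded on first use if missing
    # save=True writes the annotated image or video under a runs folder;
    # save_txt=True (assumed here) also writes one label .txt file per image.
    return model.predict(source=source, save=True, save_txt=True)
if __name__ == "__main__":
    results = segment("image1.jpg")
    print(len(results))
```
The same function can be pointed at a video file, in which case the prediction runs frame by frame.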

155
00:10:16,000 --> 00:10:17,000
You can see here.

156
00:10:17,000 --> 00:10:18,000
Okay.

157
00:10:20,000 --> 00:10:23,000
Okay, So you can display this output image over here as well.

158
00:10:26,000 --> 00:10:31,000
So in a similar way, you can also run segmentation on a demo video as well, which you can see over

159
00:10:31,000 --> 00:10:32,000
here.

160
00:10:32,000 --> 00:10:34,000
So here we have run the segmentation on the demo video.

161
00:10:34,000 --> 00:10:38,000
I have already run this script, so I will not run it again.

162
00:10:38,000 --> 00:10:38,000
Okay.

163
00:10:38,000 --> 00:10:42,000
And you can display this output video over here as well.

164
00:10:42,000 --> 00:10:47,000
Like you can see that we have the bounding box around the detected object.

165
00:10:47,000 --> 00:10:50,000
Plus we also have a mask on the car.

166
00:10:50,000 --> 00:10:51,000
Okay.

167
00:10:51,000 --> 00:10:55,000
So similarly you can also export the model to the ONNX format.

168
00:10:55,000 --> 00:10:57,000
Just by writing model is equal to YOLO.

169
00:10:57,000 --> 00:11:01,000
So here you can pass any custom trained YOLO V8 model as well,

170
00:11:01,000 --> 00:11:04,000
a YOLO V8 model trained with a custom dataset.

171
00:11:04,000 --> 00:11:07,000
And here I've just written the format as ONNX.
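The export step described here might look roughly like this in a .py file, assuming the ultralytics package is installed. YOLO and export with format set to onnx come from the tutorial; the helper names onnx_path and export_to_onnx are made up, and the expected output location (next to the weights, with an .onnx suffix) is an assumption about the library's default behavior.
```python
from pathlib import Path
def onnx_path(weights: str) -> Path:
    """Expected output file: same name as the weights, with an .onnx suffix."""
    return Path(weights).with_suffix(".onnx")
def export_to_onnx(weights: str = "yolov8n-seg.pt"):
    """Load a pre-trained or custom-trained model and export it to ONNX."""
    from ultralytics import YOLO  # lazy import; needs the ultralytics package
    model = YOLO(weights)  # a path to custom-trained weights also works here
    return model.export(format="onnx")
if __name__ == "__main__":
    export_to_onnx()
    print(onnx_path("yolov8n-seg.pt"))
```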

172
00:11:07,000 --> 00:11:12,000
Okay, so we will have the onnx file over here.

173
00:11:13,000 --> 00:11:13,000
Okay.

174
00:11:13,000 --> 00:11:18,000
So this is for the previous model and here we have the YOLO V8.

175
00:11:18,000 --> 00:11:24,000
Previously we had YOLOv8n, so we have the YOLOv8n segmentation .onnx file over here.

176
00:11:25,000 --> 00:11:25,000
Okay.

177
00:11:25,000 --> 00:11:29,000
So in this way you can implement YOLO V8 segmentation in Google Colab.

178
00:11:29,000 --> 00:11:31,000
So that's all from this video tutorial.

179
00:11:31,000 --> 00:11:32,000
See you all in the next video tutorial.

180
00:11:32,000 --> 00:11:33,000
Till then, bye bye.