1
00:00:00,150 --> 00:00:05,560
Hi, and welcome back to the course. In this section, we'll take a look at another point cloud project.

2
00:00:05,580 --> 00:00:09,570
This time we're doing segmentation as opposed to classification.

3
00:00:09,660 --> 00:00:11,040
So let's get started.

4
00:00:11,310 --> 00:00:14,250
So we're doing point cloud segmentation.

5
00:00:14,250 --> 00:00:17,430
So what exactly is point cloud segmentation?

6
00:00:17,790 --> 00:00:21,270
Well, imagine you have a point cloud, let's say.

7
00:00:21,270 --> 00:00:24,210
It's a point cloud of a table and chair together.

8
00:00:24,840 --> 00:00:27,100
Now we can individually classify them.

9
00:00:27,150 --> 00:00:33,930
But what if we wanted to categorize the legs of the table or chair separately from the seat, the table

10
00:00:33,930 --> 00:00:35,790
top, or the backrest?

11
00:00:36,240 --> 00:00:39,150
Those types of things would involve segmentation.

12
00:00:39,690 --> 00:00:40,920
So that's what we'll be going over,

13
00:00:41,250 --> 00:00:43,410
what we'll be doing in this lesson.

14
00:00:43,440 --> 00:00:47,250
So let's import our libraries and download our dataset.

15
00:00:47,260 --> 00:00:50,800
So the dataset we'll be looking at is the ShapeNet dataset.

16
00:00:50,820 --> 00:00:52,260
Let's take a look at that.

17
00:00:52,650 --> 00:00:59,010
We can see it covers 55 common object categories with about 51,000 unique models.

18
00:00:59,730 --> 00:01:02,310
So you can see, let's take a look at models.

19
00:01:03,460 --> 00:01:04,830
Well, maybe not sitting.

20
00:01:07,050 --> 00:01:09,570
Let's look at browsing categories here.

21
00:01:11,400 --> 00:01:13,340
So you can see this is what the dataset consists of.

22
00:01:13,380 --> 00:01:18,830
These are nice 3D models of different airplanes, different birdhouses.

23
00:01:18,840 --> 00:01:20,010
There's a lot to wait for;

24
00:01:20,080 --> 00:01:21,480
they're going to be loaded now.

25
00:01:22,170 --> 00:01:29,610
So you can see this is a very cool, exhaustive dataset, basically for shapes instead of images. Like

26
00:01:29,730 --> 00:01:30,030
ImageNet,

27
00:01:30,030 --> 00:01:37,140
which was a huge dataset consisting of so many different categories and their labels, ShapeNet

28
00:01:37,140 --> 00:01:38,520
is similar as well.

29
00:01:39,180 --> 00:01:41,430
And basically the one we're going to be using

30
00:01:41,430 --> 00:01:47,610
here is one of the 12 object categories of PASCAL 3D+, included as part of the ShapeNetCore dataset.

31
00:01:48,150 --> 00:01:49,850
So let's download the dataset here.

32
00:01:49,860 --> 00:01:54,570
That takes about two minutes, and we load the dataset here as well.

33
00:01:55,830 --> 00:02:02,760
And in this example, we'll be training PointNet, the previous model we used, to segment the parts of an

34
00:02:02,760 --> 00:02:03,660
airplane body.

35
00:02:03,840 --> 00:02:05,250
It's a pretty cool project, isn't it?

36
00:02:05,880 --> 00:02:11,810
So now let's take a look at how the dataset is structured. You can see there's point_clouds here.

37
00:02:11,820 --> 00:02:17,700
This is a list of NumPy array objects that contain the x, y and z coordinates of each point.

38
00:02:18,150 --> 00:02:22,260
Then there's test_point_clouds, which is in the same format but doesn't have labels, so that's our

39
00:02:22,260 --> 00:02:22,890
test data.

40
00:02:23,460 --> 00:02:30,120
And then there's all_labels, the labels for the whole dataset, as well as point_cloud_labels, which again

41
00:02:30,600 --> 00:02:36,390
are the labels for each coordinate in one-hot encoded form, corresponding to the point_clouds list.
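To make that layout a bit more concrete, here's a minimal sketch with made-up data. The sizes, the label names and the values are all assumptions for illustration, not the real dataset.
```python
import numpy as np
# Hypothetical label set for the airplane category (illustrative only).
LABELS = ["wing", "body", "tail", "engine"]
rng = np.random.default_rng(0)
# One point cloud: an (n_points, 3) array of x, y, z coordinates.
points = rng.uniform(-1.0, 1.0, size=(5, 3))
# One string label per point, the kind of thing used for visualization.
string_labels = ["body", "body", "wing", "tail", "engine"]
# The same labels in one-hot form: an (n_points, n_classes) array.
label_ids = np.array([LABELS.index(name) for name in string_labels])
one_hot = np.eye(len(LABELS), dtype=np.float32)[label_ids]
print(points.shape, one_hot.shape)
```
Each row of the one-hot array lines up with the point at the same index in the coordinates array.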

42
00:02:37,350 --> 00:02:41,940
So let's load our dataset here and get some more point cloud information.

43
00:02:42,420 --> 00:02:45,780
This takes a while to run, maybe just under five minutes.

44
00:02:46,950 --> 00:02:53,040
So now we can take a look at some samples from the in-memory arrays we just created.

45
00:02:53,760 --> 00:02:55,060
So let's take a look at this here.

46
00:02:55,080 --> 00:02:57,480
You can see body and wing in the point cloud labels.

47
00:02:58,080 --> 00:02:58,840
Body, wing.

48
00:02:58,890 --> 00:03:04,070
So those are the separate parts of the airplane, like the tail, as well as the point clouds associated with

49
00:03:04,080 --> 00:03:04,380
them.

50
00:03:05,010 --> 00:03:07,230
So now, let's actually visualize this.

51
00:03:07,620 --> 00:03:08,850
So let's take a look at that.

52
00:03:09,810 --> 00:03:10,830
Well, it's pretty cool,

53
00:03:10,830 --> 00:03:13,530
isn't it? An actual airplane in 3D space.

54
00:03:13,570 --> 00:03:14,800
You can see this is the body.

55
00:03:15,270 --> 00:03:18,700
This is the wing, and I believe this is part of the tail.

56
00:03:18,720 --> 00:03:19,440
Yes, it is.

57
00:03:19,950 --> 00:03:21,870
These are the engines; they're labeled separately.

58
00:03:22,680 --> 00:03:24,340
Here's another plane as well.

59
00:03:24,450 --> 00:03:28,260
You can see it's been segmented and labeled nicely.

60
00:03:29,040 --> 00:03:32,490
So now that we have our data visualized and we've loaded it,

61
00:03:32,970 --> 00:03:35,350
we can run some preprocessing on it.

62
00:03:35,400 --> 00:03:41,280
So this preprocessing involves some sampling, so we have a fixed number of points to represent each of those

63
00:03:41,280 --> 00:03:48,990
point clouds, as well as some normalization to make the

64
00:03:48,990 --> 00:03:50,520
data scale invariant.
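As a rough sketch of those two steps, here's one way they could look in NumPy. The function name, the 1024-point target and the random test cloud are assumptions for illustration, not the notebook's own code.
```python
import numpy as np
def sample_and_normalize(points, labels, num_points, rng):
    """Pick a fixed number of points, then center and scale them."""
    # Sample without replacement when possible, with replacement otherwise.
    replace = points.shape[0] < num_points
    idx = rng.choice(points.shape[0], size=num_points, replace=replace)
    sampled, sampled_labels = points[idx], labels[idx]
    # Center on the centroid, then scale so the farthest point has norm 1.
    # (Assumes the points are not all identical, so the scale is nonzero.)
    centered = sampled - sampled.mean(axis=0)
    scale = np.linalg.norm(centered, axis=1).max()
    return centered / scale, sampled_labels
rng = np.random.default_rng(0)
cloud = rng.normal(size=(3000, 3)) * 50.0 + 10.0  # arbitrary scale and offset
labels = rng.integers(0, 4, size=3000)
norm_cloud, norm_labels = sample_and_normalize(cloud, labels, 1024, rng)
print(norm_cloud.shape, np.linalg.norm(norm_cloud, axis=1).max())
```
After this, every cloud has the same number of points and fits inside a unit sphere, whatever its original size or position.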

65
00:03:51,150 --> 00:03:56,310
So let's take a look at what the normalized data looks like, and you can see this again here.

66
00:03:57,120 --> 00:03:59,690
And here's another normalized version of that dataset.

67
00:04:00,540 --> 00:04:07,590
So now we can create a TensorFlow dataset. We just create this load_data function here to load

68
00:04:07,590 --> 00:04:08,550
a batch of data.

69
00:04:09,840 --> 00:04:11,850
And we have an augmentation function here.

70
00:04:12,330 --> 00:04:14,860
So we're just going to add some random jitter.

71
00:04:14,880 --> 00:04:17,440
I believe this adds some random noise.
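A jitter augmentation along those lines might look like this minimal NumPy sketch. The function name and the 0.005 noise magnitude are assumptions for illustration.
```python
import numpy as np
def jitter(point_cloud_batch, rng, magnitude=0.005):
    """Add small uniform noise to every coordinate; labels stay untouched."""
    noise = rng.uniform(-magnitude, magnitude, size=point_cloud_batch.shape)
    return point_cloud_batch + noise
rng = np.random.default_rng(0)
batch = np.zeros((2, 1024, 3))  # (batch, points, xyz)
augmented = jitter(batch, rng)
print(augmented.shape, np.abs(augmented).max())
```
The noise is kept small so each point moves only slightly and its part label still applies.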

72
00:04:18,000 --> 00:04:23,910
Then we just generate the datasets here with this function and we just create everything here and we

73
00:04:23,910 --> 00:04:27,360
can take a look at what our training and validation data sets look like.

74
00:04:27,810 --> 00:04:32,790
It's about 2,900 for training and 739 for validation.

75
00:04:33,330 --> 00:04:36,330
And here's the PointNet model.

76
00:04:36,720 --> 00:04:44,130
So you can see we have our input points going in here, the n-by-3 input, then the input transform and this MLP right there.

77
00:04:44,640 --> 00:04:47,340
Then there's a feature transform and the MLP here.

78
00:04:47,340 --> 00:04:49,680
And then this gives us the global features here.

79
00:04:50,190 --> 00:04:53,250
And then there's another MLP that outputs the scores as well.

80
00:04:53,460 --> 00:04:55,620
And then there's this new segmentation part here.

81
00:04:56,470 --> 00:04:59,820
Basically, this takes the point features and predicts the scores

82
00:05:00,000 --> 00:05:02,660
for each point, effectively.
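To see the shapes involved, here's a rough NumPy sketch of a segmentation head in the PointNet style, where the global feature is repeated for every point and joined with the per-point features. The feature sizes and the single random linear layer are assumptions standing in for the learned MLPs.
```python
import numpy as np
rng = np.random.default_rng(0)
n_points, n_classes = 1024, 4
# Stand-ins for learned per-point features (sizes are assumptions).
point_features = rng.normal(size=(n_points, 64))
deep_features = rng.normal(size=(n_points, 1024))
# Global feature: max over the points axis.
global_feature = deep_features.max(axis=0)
# Repeat the global feature for every point and concatenate.
tiled = np.tile(global_feature, (n_points, 1))
combined = np.concatenate([point_features, tiled], axis=1)
# A single random linear layer stands in for the segmentation MLP.
weights = rng.normal(size=(combined.shape[1], n_classes)) * 0.01
logits = combined @ weights
# Softmax per point gives one score per class for each point.
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
scores = exp / exp.sum(axis=1, keepdims=True)
print(combined.shape, scores.shape)
```
So each point ends up with its own row of class scores, which is what makes this segmentation rather than classification.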

83
00:05:03,860 --> 00:05:06,860
So there's something called permutation invariance.

84
00:05:08,030 --> 00:05:13,070
So given that point cloud data is quite unstructured, with any number of points, you have a number of

85
00:05:13,070 --> 00:05:14,030
permutations.

86
00:05:14,540 --> 00:05:21,680
We do need to find a way to make the representation invariant to input permutations in the feature space.

87
00:05:21,980 --> 00:05:23,390
So what do we do here?

88
00:05:24,200 --> 00:05:29,360
We have to ensure that it's invariant in a way so that when it's encoded into the global feature vector,

89
00:05:29,960 --> 00:05:34,820
it basically can handle some sort of jitter and permutation variance.

90
00:05:35,390 --> 00:05:42,800
So that's what this part of the network does here: it provides some global, sorry, some invariance there

91
00:05:43,040 --> 00:05:45,470
to different permutations of the data.
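A quick way to see why a symmetric function like max pooling gives you that: shuffle the points and the pooled vector doesn't change. This is a minimal sketch with random stand-in features; the sizes are assumptions.
```python
import numpy as np
rng = np.random.default_rng(0)
# Stand-in per-point features: 1024 points, 8 features each.
features = rng.normal(size=(1024, 8))
shuffled = features[rng.permutation(features.shape[0])]
# Max pooling over the points axis ignores the order of the points.
global_a = features.max(axis=0)
global_b = shuffled.max(axis=0)
print(np.array_equal(global_a, global_b))
```
The per-point layers before the pooling use the same weights for every point, so reordering the input only reorders the rows, and the max over rows stays the same.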

92
00:05:46,250 --> 00:05:49,040
And notice there's also transformation invariance here.

93
00:05:50,030 --> 00:05:55,520
So we need the transformation invariance to handle issues like scaling and translation as well.

94
00:05:56,480 --> 00:05:59,780
And then also, we have to talk about the point interactions.

95
00:06:00,590 --> 00:06:06,800
So in segmentation, we have to be able to leverage local point features because they often

96
00:06:06,800 --> 00:06:09,740
interact and correlate or have some sort of relationship.

97
00:06:10,340 --> 00:06:15,440
So that's what we do here; PointNet actually takes advantage of that,

98
00:06:17,690 --> 00:06:23,660
I guess you could say correlation or interaction, so that we can actually use that information for

99
00:06:23,660 --> 00:06:25,100
our segmentation network.

100
00:06:25,760 --> 00:06:29,180
So now we're going to build the PointNet network.

101
00:06:29,180 --> 00:06:30,620
So we have our building blocks here.

102
00:06:30,620 --> 00:06:32,540
We have the conv and MLP blocks.

103
00:06:33,320 --> 00:06:38,720
Then we have the orthogonal regularizer, which you would have seen in the previous chapter. As well

104
00:06:38,720 --> 00:06:44,510
as that, now we have the transformation net, so we create that layer, and this is the transformation block

105
00:06:44,510 --> 00:06:45,170
as well.
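The idea behind an orthogonal regularizer can be sketched in a few lines of NumPy: penalize how far the learned transform times its own transpose is from the identity. The function name and the 0.001 weight are assumptions for illustration, not the notebook's class.
```python
import numpy as np
def orthogonal_penalty(matrix, weight=0.001):
    """Penalty that is zero when matrix @ matrix.T equals the identity."""
    identity = np.eye(matrix.shape[0])
    return weight * np.sum((matrix @ matrix.T - identity) ** 2)
theta = np.pi / 6
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
stretch = np.array([[2.0, 0.0], [0.0, 1.0]])
print(orthogonal_penalty(rotation), orthogonal_penalty(stretch))
```
A pure rotation gets essentially no penalty, while a matrix that stretches the data does, which nudges the learned transform toward a rigid one.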

106
00:06:46,310 --> 00:06:53,110
And now let's get the shape segmentation model, finally, so we can create our final model.

107
00:06:53,690 --> 00:06:59,930
So now this function creates that final model from all those building blocks we created previously.

108
00:07:00,530 --> 00:07:05,810
And so now we instantiate the model, just get the number of points and classes from a batch of the

109
00:07:05,810 --> 00:07:07,190
x, y data here.

110
00:07:07,850 --> 00:07:10,280
And then we just create the segmentation model

111
00:07:10,280 --> 00:07:11,630
summary right there.

112
00:07:11,650 --> 00:07:18,590
So we have seven million parameters in this network, which is fairly moderately sized, and now we can start training

113
00:07:18,590 --> 00:07:18,920
on that

114
00:07:18,920 --> 00:07:19,420
right here.

115
00:07:19,430 --> 00:07:24,360
So let's begin training it. Well, this is done in the training loop here.

116
00:07:24,380 --> 00:07:28,280
So this is the run_experiment loop here that actually executes the training.

117
00:07:28,850 --> 00:07:34,760
And finally, we can begin to train. We train for 60 epochs, where it takes about 20 seconds

118
00:07:34,760 --> 00:07:38,030
per epoch, so roughly this would have taken me...

119
00:07:38,420 --> 00:07:44,690
I don't have the time here, but you can just multiply 20 by 60, and that gives the time it would take:

120
00:07:44,690 --> 00:07:45,530
about 20 minutes.

121
00:07:46,160 --> 00:07:51,380
And now you can visualize how the training went, so we can take a look at the training loss

122
00:07:51,860 --> 00:07:55,700
as well as accuracy, and see accuracy is going up over time.

123
00:07:55,700 --> 00:08:01,670
There have been some weird spikes in the validation accuracy, as you can see, and in the validation loss.

124
00:08:01,670 --> 00:08:08,300
Not sure what happened in this batch, but anyway, you can see it was getting better, close

125
00:08:08,300 --> 00:08:15,290
to the high eighties for training and around the mid-eighties for the validation accuracy.

126
00:08:15,300 --> 00:08:22,400
And now we can finally run inference on some random input data from the validation batch.

127
00:08:23,480 --> 00:08:29,960
So you can see this is an airplane here, and we're predicting fairly accurately the body, the wings,

128
00:08:29,960 --> 00:08:31,640
we can see that there, and the tail.

129
00:08:32,360 --> 00:08:33,200
Let's take a look again.

130
00:08:34,070 --> 00:08:35,480
So yeah, it's pretty good.

131
00:08:35,870 --> 00:08:37,580
So that's it for this lesson.

132
00:08:38,360 --> 00:08:45,590
That was a pretty in-depth lesson where we took a deeper look at PointNet for point cloud

133
00:08:45,590 --> 00:08:52,130
segmentation. I do encourage you to check out this link here to learn more about this repository,

134
00:08:52,130 --> 00:09:00,020
and you can use it in your own Colab notebooks, on your local machine, or on some EC2 instance out

135
00:09:00,020 --> 00:09:03,620
there, or whichever virtual machine cloud instance you use. It doesn't have to be an Amazon one;

136
00:09:03,620 --> 00:09:04,220
it can be any.

137
00:09:05,300 --> 00:09:07,910
And yeah, that's it for this lesson.

138
00:09:07,910 --> 00:09:08,720
I hope you enjoyed it.

139
00:09:08,870 --> 00:09:09,890
Thank you for watching.