1
00:00:00,420 --> 00:00:07,590
And welcome back. In this lesson, we'll take a look at loading a pre-trained Mask R-CNN

2
00:00:08,100 --> 00:00:09,180
using TensorFlow.

3
00:00:09,330 --> 00:00:13,620
So let's take a look at Notebook 55 and we can get started.

4
00:00:14,490 --> 00:00:21,330
So this notebook, well, this model, I should say, has been developed by a company called

5
00:00:21,330 --> 00:00:24,570
Matterport, and Matterport is actually a really cool company.

6
00:00:25,170 --> 00:00:30,980
You can see they basically make a lot of 360-degree cameras, but they're also able to generate these

7
00:00:30,990 --> 00:00:33,680
sort of like point cloud 3D representations.

8
00:00:33,680 --> 00:00:35,850
This is all in 3D, as you can see here.

9
00:00:36,390 --> 00:00:44,220
So you can walk around with a camera, essentially, and it basically creates a virtual map using RGB-D.

10
00:00:44,760 --> 00:00:52,980
That's depth sensors and cameras, and it recreates these using point clouds into these very cool three-dimensional

11
00:00:52,980 --> 00:00:53,700
structures.

12
00:00:54,330 --> 00:00:56,220
So I would encourage you to check them out.

13
00:00:56,430 --> 00:01:02,130
The cameras are quite expensive, though, but they do offer a lot of really cool stuff, and they have

14
00:01:02,130 --> 00:01:08,080
a GitHub repo with a lot of open-source computer vision algorithms and models.

15
00:01:08,100 --> 00:01:09,930
And that's one of them we'll be using now.

16
00:01:10,890 --> 00:01:18,330
So firstly, we need to uninstall h5py because this model is compatible with the older version,

17
00:01:18,330 --> 00:01:19,440
which is 2.1.

18
00:01:19,440 --> 00:01:22,470
So we install that there. It takes about a minute to run.

19
00:01:23,190 --> 00:01:29,490
Next, we have to tell the notebook that we are going to use TensorFlow version 1.x. So we do

20
00:01:29,490 --> 00:01:32,760
that before you import TensorFlow. You can't make

21
00:01:33,150 --> 00:01:36,360
TensorFlow version changes after you import.

22
00:01:36,360 --> 00:01:38,190
You can't. Just remember that in a notebook.

23
00:01:38,790 --> 00:01:45,840
So next, we'll clone the Matterport Mask R-CNN repo, and then we just cd here into the samples

24
00:01:45,840 --> 00:01:49,430
and then now what we do is import the model.

25
00:01:49,470 --> 00:01:57,450
We set some paths, we get the samples here, and then we just create more paths where we'll be saving

26
00:01:57,450 --> 00:01:59,580
the images after we run the detection.

27
00:02:00,150 --> 00:02:07,020
And this is basically a class for the inference config as well, and then it's displaying, basically,

28
00:02:07,020 --> 00:02:08,220
the model parameters.

29
00:02:08,380 --> 00:02:08,640
Yep.

30
00:02:09,730 --> 00:02:16,170
Next, what we can do is create a model here, so we create it and set the mode to inference, and we point it

31
00:02:16,170 --> 00:02:19,260
to the model directory and to the config that we want to use.

32
00:02:19,800 --> 00:02:23,190
We load the weights, so we're loading it for COCO. COCO,

33
00:02:23,820 --> 00:02:27,510
remember, stands for the Common Objects in Context dataset.

34
00:02:28,380 --> 00:02:31,830
And now, these are the classes in that dataset there.

35
00:02:32,910 --> 00:02:39,510
And then finally, we can just look at a random image here and then run the detection model using the

36
00:02:39,510 --> 00:02:43,530
model.detect here and visualize the output.

37
00:02:43,680 --> 00:02:46,830
So let's take a look at how this pre-trained Mask R-CNN

38
00:02:46,830 --> 00:02:52,770
in TensorFlow works, and you can see it works fairly well.

39
00:02:53,190 --> 00:02:55,290
It gets most of the cars right.

40
00:02:55,770 --> 00:02:59,000
Some cars that are quite small in this, it doesn't get.

41
00:02:59,010 --> 00:03:02,250
But that's understandable because the scale is quite small.

42
00:03:02,970 --> 00:03:03,820
And by the way,

43
00:03:03,840 --> 00:03:10,380
these are some of the research areas that computer vision will have to solve in the next few years.

44
00:03:10,380 --> 00:03:15,300
Because, as you can see, we have a lot of basic models that do work quite well.

45
00:03:15,330 --> 00:03:17,880
It gets most of the pedestrians that are visible here.

46
00:03:18,390 --> 00:03:25,770
However, as a human, you contextually know that cars are parked along the street here, this

47
00:03:25,770 --> 00:03:31,860
entire street here, even though you can't tell. Even if I were to crop an image here and

48
00:03:31,860 --> 00:03:34,530
display it to you, you probably would have no idea what it is.

49
00:03:35,010 --> 00:03:38,190
And that's kind of effectively how a computer vision algorithm sees it.

50
00:03:38,610 --> 00:03:39,720
It just sees it kind of like this.

51
00:03:40,380 --> 00:03:45,240
However, we know because of context that cars are going to be parked all along the street here.

52
00:03:45,720 --> 00:03:51,480
So something like this here, which I mean, to be fair, it looks like a car, but that's because I

53
00:03:51,480 --> 00:03:52,200
know it's a car.

54
00:03:53,130 --> 00:03:58,010
This will pretty much be labelled as a car by a human, but a computer

55
00:03:58,020 --> 00:04:02,790
vision algorithm will have a very tough time visually distinguishing that that's a car.

56
00:04:03,510 --> 00:04:06,750
So that's just a bit of computer vision information.

57
00:04:08,010 --> 00:04:15,120
So it can help, maybe. We need a lot more people to be researching these areas.

58
00:04:15,120 --> 00:04:17,340
So I hope I encourage some of you to do that.

59
00:04:17,760 --> 00:04:19,350
Make my life easier as well.

60
00:04:19,950 --> 00:04:22,470
So thank you for that lesson.

61
00:04:22,890 --> 00:04:26,250
And in the next lesson, we'll take a look at Detectron2.

62
00:04:26,430 --> 00:04:30,270
That's Facebook's Detectron2 API library.

63
00:04:31,050 --> 00:04:37,740
I guess you can call it a library that offers a bunch of different models, Mask R-CNN being one

64
00:04:37,740 --> 00:04:38,100
of them.

65
00:04:38,670 --> 00:04:41,730
So stay tuned for that and I hope to see you in that lesson.

66
00:04:41,880 --> 00:04:42,180
Bye.
