1
00:00:01,530 --> 00:00:09,570
Welcome to our lecture on Detectron2, which is an entire model platform and library for not just object

2
00:00:09,570 --> 00:00:11,880
detection, but a number of other capabilities.

3
00:00:12,360 --> 00:00:13,620
So let's get started.

4
00:00:13,770 --> 00:00:17,250
So firstly, Detectron2 is a research platform.

5
00:00:17,250 --> 00:00:23,400
It's not a model, it's an entire platform and production library for object detection and a number

6
00:00:23,400 --> 00:00:25,470
of other things that you'll see shortly.

7
00:00:26,400 --> 00:00:30,780
It was developed and open-sourced by Facebook's AI research team, called FAIR.

8
00:00:31,260 --> 00:00:34,590
And it takes over from the original Detectron.

9
00:00:35,010 --> 00:00:41,100
That version was actually made in Caffe2. The new Detectron2 is developed in PyTorch, which makes it much

10
00:00:41,100 --> 00:00:47,580
faster, much more modular and more accurate as well, because it has improvements all across the board.

11
00:00:48,270 --> 00:00:51,090
Detectron2 features a number of capabilities.

12
00:00:51,090 --> 00:00:59,430
As I mentioned, things such as Cascade R-CNN, Mask R-CNN, PointRend, DensePose, DeepLab

13
00:00:59,760 --> 00:01:01,570
for segmentation and many more.

14
00:01:01,590 --> 00:01:06,330
And you can see an example of Detectron2 object detection right here.

15
00:01:07,380 --> 00:01:14,100
So let's take a look at a demo video that Facebook's AI team uses to promote Detectron2.

16
00:01:14,130 --> 00:01:21,300
And you can see its vast capabilities here. So you can see, firstly, it does object

17
00:01:21,300 --> 00:01:22,530
detection quite well.

18
00:01:24,570 --> 00:01:29,520
Next, you can do body pose estimation as well as DensePose here.

19
00:01:31,200 --> 00:01:34,610
Then you can do some semantic segmentation, which you're seeing there.

20
00:01:37,530 --> 00:01:39,870
And this is panoptic segmentation now.

21
00:01:43,770 --> 00:01:45,630
So you can see it's quite amazing, isn't it?

22
00:01:47,040 --> 00:01:52,360
So this is the timeline for object detectors. However, it doesn't include many of them, including YOLO.

23
00:01:53,040 --> 00:01:59,250
I believe a lot of these research teams leave out YOLO because they don't see YOLO as being an academic

24
00:01:59,250 --> 00:02:02,660
type of network because, yes, the original papers were published, but

25
00:02:02,670 --> 00:02:07,650
I don't think they were published in peer-reviewed journals, unlike these others here.

26
00:02:07,680 --> 00:02:12,900
So for some reason, a lot of them leave it out of their comparisons, but some don't.

27
00:02:18,500 --> 00:02:19,790
But don't let that get you down

28
00:02:19,940 --> 00:02:24,980
about YOLO, because YOLO is actually quite good and quite popular in the industry. So you can

29
00:02:24,980 --> 00:02:26,630
see here, firstly, there were Fast

30
00:02:27,170 --> 00:02:30,260
R-CNN and Faster R-CNN, all done in Caffe.

31
00:02:30,800 --> 00:02:36,570
Then there was Detectron here in 2018, then Matterport, the company that made a popular Mask R-CNN

32
00:02:36,570 --> 00:02:41,120
implementation, and they used TensorFlow, as did Tensorpack for the R-CNNs.

33
00:02:41,750 --> 00:02:45,340
Then there was Detectron in PyTorch in 2018.

34
00:02:45,350 --> 00:02:48,040
Later on in March, then they had Mask

35
00:02:48,050 --> 00:02:53,750
R-CNN in PyTorch as well, then different networks, such as SimpleDet, which I've

36
00:02:53,750 --> 00:02:54,710
never used, actually.

37
00:02:55,340 --> 00:02:58,910
And then in 2019, late 2019, there was Detectron2.

38
00:02:59,330 --> 00:03:02,320
And I mean, to be honest,

39
00:03:02,330 --> 00:03:06,560
I don't find it actually became as popular as it should have because it's quite good.

40
00:03:07,190 --> 00:03:11,570
However, I think YOLOv5 and YOLO were actually much more popular at the time.

41
00:03:11,570 --> 00:03:16,040
So a lot of people in the computer vision world actually do know about Detectron2, but they

42
00:03:16,040 --> 00:03:17,180
haven't actually used it.

43
00:03:18,260 --> 00:03:21,020
And these are some improvements they made with training time.

44
00:03:21,020 --> 00:03:25,160
We can see it's actually quite fast to train and in my experience, it is quite fast to train.

45
00:03:26,120 --> 00:03:29,720
And now let's talk about these segmentation models they offer.

46
00:03:30,230 --> 00:03:36,830
So this is a normal image here, and this is how it looks with semantic segmentation, where all

47
00:03:36,830 --> 00:03:40,550
the cars blend into one car color here, and people as well.

48
00:03:41,120 --> 00:03:46,640
Then there's instance segmentation, where each car is now individually separated into different colors,

49
00:03:47,030 --> 00:03:53,060
and panoptic segmentation, which combines both semantic and instance segmentation to give you basically

50
00:03:53,060 --> 00:03:58,520
the best of both worlds, and it's probably the most challenging type of segmentation to

51
00:03:58,520 --> 00:03:59,150
do right now.
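
To make the difference between the three segmentation types concrete, here is a minimal sketch on a toy six-pixel "image". The class ids and the `class_id * 1000 + instance_id` encoding are assumptions chosen for illustration, not something taken from the slides or from Detectron2 itself.

```python
# Toy 1-D "image" of 6 pixels.
# Hypothetical class ids: 0 = road, 1 = car, 2 = person.
semantic = [0, 1, 1, 1, 2, 0]   # semantic segmentation: one label per class

# Instance ids per pixel; 0 means "no individual object" (background).
instance = [0, 1, 1, 2, 1, 0]   # instance segmentation: car #1, car #2, person #1

LABEL_DIVISOR = 1000  # assumed encoding, for illustration only


def to_panoptic(semantic_map, instance_map, divisor=LABEL_DIVISOR):
    """Combine class and instance ids into one id per pixel."""
    return [c * divisor + i for c, i in zip(semantic_map, instance_map)]


panoptic = to_panoptic(semantic, instance)
print(panoptic)
```

In the semantic map both cars share the label 1, while in the panoptic map each pixel carries both its class and which object it belongs to.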

52
00:04:00,080 --> 00:04:06,140
So Detectron2, as I mentioned initially in the first slide, also made quite a bit of improvements

53
00:04:06,140 --> 00:04:07,460
in the deployment sphere.

54
00:04:07,880 --> 00:04:14,360
There's Detectron2Go, which is an additional software layer that the engineers introduced

55
00:04:14,720 --> 00:04:17,810
that allowed it to do model deployments quite easily.

56
00:04:18,290 --> 00:04:25,850
It offered a number of things, such as network quantization, which is shrinking

57
00:04:25,850 --> 00:04:27,530
the network size. Well,

58
00:04:27,980 --> 00:04:35,030
technically, it's decreasing the floating-point precision of the stored weight values so that the models

59
00:04:35,030 --> 00:04:37,970
take up less space and are also much faster to execute.

60
00:04:38,870 --> 00:04:42,620
And you do lose some accuracy, but actually it's not that much in the real world.
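
As a rough illustration of the quantization idea, here is a minimal sketch that maps floating-point weights to 8-bit integers and back. It is plain Python with made-up weight values and hypothetical helper names (`quantize`, `dequantize`); it shows the general technique only and is not how Detectron2Go does it internally.

```python
def quantize(weights, num_bits=8):
    """Affine-quantize floats to unsigned integers in [0, 2**num_bits - 1]."""
    qmax = 2 ** num_bits - 1
    w_min, w_max = min(weights), max(weights)
    # Fall back to 1.0 so we never divide by zero when all weights are equal.
    scale = (w_max - w_min) / qmax or 1.0
    quantized = [round((w - w_min) / scale) for w in weights]
    return quantized, scale, w_min


def dequantize(quantized, scale, w_min):
    """Map the integers back to approximate float weights."""
    return [q * scale + w_min for q in quantized]


weights = [-0.51, -0.12, 0.0, 0.27, 0.49]   # made-up example weights
q, scale, w_min = quantize(weights)
restored = dequantize(q, scale, w_min)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error)
```

Each weight now fits in one byte instead of the four bytes of a 32-bit float, and the small `max_error` is the accuracy you give up in exchange.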

61
00:04:43,280 --> 00:04:48,980
There's also model conversion to different model formats, depending on the deployment platform. You can

62
00:04:49,130 --> 00:04:54,230
use some of those model formats and deploy them quite easily on mobile phones as well.

63
00:04:54,830 --> 00:05:01,550
So Detectron2 does offer a wide range of services that can be used in industry as well as in research.

64
00:05:02,240 --> 00:05:06,650
And Detectron2 talks a bit about some of their network designs in here.

65
00:05:07,310 --> 00:05:12,620
So you can see it's actually not that complicated a diagram here, but it just shows you that

66
00:05:12,620 --> 00:05:18,350
there's a backbone, which gives you proposals, crop and warp, and then you can get different things here,

67
00:05:18,350 --> 00:05:20,330
so you can get the semantic segmentation.

68
00:05:20,840 --> 00:05:22,730
And then from that, you can even get panoptic

69
00:05:22,730 --> 00:05:26,030
segmentation. You can get keypoints, as in body pose

70
00:05:26,030 --> 00:05:32,180
estimation, DensePose, regular Mask R-CNN, as well as normal bounding boxes, which we were

71
00:05:32,180 --> 00:05:33,800
talking about mainly in this section.

72
00:05:34,550 --> 00:05:36,980
So it does offer quite a bit, doesn't it?

73
00:05:37,910 --> 00:05:44,960
And it's also quite customizable because the framework allows for custom models, datasets,

74
00:05:45,290 --> 00:05:49,340
data loaders, augmentation, various metrics and tests.

75
00:05:49,350 --> 00:05:55,400
Remember, it's an entire framework, similar to how YOLOv5 was made by Ultralytics. It also allows custom training

76
00:05:55,400 --> 00:05:55,860
logic.

77
00:05:56,270 --> 00:06:01,610
So it does offer a lot of flexibility, and a lot of this is good for researchers, especially because

78
00:06:01,610 --> 00:06:06,380
they would want to tweak things, try different augmentation strategies, try different techniques in

79
00:06:06,380 --> 00:06:10,920
the data loading, and optimize for different metrics as well.

80
00:06:10,940 --> 00:06:11,720
So it's quite good.

81
00:06:12,170 --> 00:06:14,270
So that's it for Detectron2.

82
00:06:14,780 --> 00:06:15,830
We'll stop there now.

83
00:06:15,830 --> 00:06:22,010
You might be quite bored of these theory slides, so now we'll go into the code again on Colab, and

84
00:06:22,010 --> 00:06:28,160
we're implementing 10 different object detection projects, all on different datasets and using all

85
00:06:28,160 --> 00:06:33,860
different networks. We'll touch on a few different versions of YOLO, Detectron2 obviously, and

86
00:06:33,870 --> 00:06:34,430
EfficientDet.

87
00:06:34,610 --> 00:06:40,550
I believe there are two different models here, MobileNet SSD as well as Faster R-CNN. The only one

88
00:06:40,550 --> 00:06:43,070
we don't touch on is RetinaNet.

89
00:06:43,160 --> 00:06:48,350
But that's OK because these models are capable of much better performance and much better

90
00:06:48,350 --> 00:06:49,460
efficiency anyway.

91
00:06:49,760 --> 00:06:55,460
And these are by far the most popular models that are deployed in the real world and in academia

92
00:06:55,460 --> 00:06:55,910
as well.

93
00:06:56,510 --> 00:07:01,190
So that's it, and I hope you enjoyed it, and I'll see you now in the code lessons.

94
00:07:01,310 --> 00:07:01,760
Thank you.