1
00:00:00,940 --> 00:00:02,050
Hi and welcome back.

2
00:00:02,800 --> 00:00:08,110
We're about to take a look at DenseNets, which are basically, effectively, ResNets, but better.

3
00:00:08,590 --> 00:00:09,460
So let's get started.

4
00:00:09,550 --> 00:00:18,760
So DenseNet was introduced in 2016 by a joint effort by Cornell University, Tsinghua University

5
00:00:18,760 --> 00:00:20,500
and Facebook's AI Research team.

6
00:00:20,920 --> 00:00:28,090
And in fact, DenseNet won best paper at the 2017 CVPR conference, which is the premier computer vision

7
00:00:28,090 --> 00:00:29,440
conference for publishing.

8
00:00:29,440 --> 00:00:30,490
So that's a big deal.

9
00:00:31,090 --> 00:00:36,340
And it was able to obtain an even higher accuracy than ResNet with fewer parameters.

10
00:00:36,880 --> 00:00:40,060
This is just a preview of the architecture we'll be discussing shortly.

11
00:00:40,540 --> 00:00:41,770
So how did it do that?

12
00:00:41,860 --> 00:00:47,170
Well, basically, it solves the same problem that ResNets solve, which is the vanishing gradient problem.

13
00:00:47,650 --> 00:00:53,950
And as we know, just to do a recap, when networks get very, very big, having a very small gradient

14
00:00:54,250 --> 00:01:00,010
in that network can basically cause the gradients along that chain to become zero, which is not good.

15
00:01:00,160 --> 00:01:06,010
It basically means that you're kind of wasting parameters, wasting computation time at that point.

16
00:01:06,430 --> 00:01:12,700
So DenseNets were able to solve this problem by introducing something called collective knowledge,

17
00:01:13,060 --> 00:01:16,690
where each layer receives information from all previous layers.

18
00:01:16,690 --> 00:01:18,850
So let's take a look and see how that's implemented.

19
00:01:19,330 --> 00:01:23,230
So this here is our classical CNN architecture.

20
00:01:23,800 --> 00:01:24,520
You remember this?

21
00:01:24,520 --> 00:01:29,020
This was the feature maps, feature map outputs and blah blah blah.

22
00:01:29,350 --> 00:01:33,250
We have our conv layers, the ReLU and all of that stuff.

23
00:01:33,640 --> 00:01:35,560
So you should be familiar with this.

24
00:01:37,180 --> 00:01:43,090
Now, this is a ResNet architecture. ResNets basically introduced a short circuit here, so you

25
00:01:43,090 --> 00:01:44,770
don't always get vanishing gradients.

26
00:01:45,100 --> 00:01:47,950
So what did DenseNet do differently?

27
00:01:48,220 --> 00:01:49,240
Well, let's take a look.

28
00:01:49,990 --> 00:01:53,140
DenseNets have multiple connections, as you can see.

29
00:01:53,140 --> 00:01:54,760
Let's follow the path of this red one.

30
00:01:54,760 --> 00:01:57,880
You can see it connects here, just like a ResNet short circuit.

31
00:01:58,360 --> 00:02:03,250
Then there's a yellow path that goes all the way to here, and then there's a green one that goes all

32
00:02:03,250 --> 00:02:06,070
the way to here and a pink one that goes all the way to here.

33
00:02:06,490 --> 00:02:12,100
So you can see this input is received at all the intermediate CNN layers.

34
00:02:12,880 --> 00:02:15,580
So each layer receives information from the previous layers.

35
00:02:15,580 --> 00:02:20,410
And the growth rate, which is the additional number of channels for each layer, is controlled by a

36
00:02:20,410 --> 00:02:21,580
parameter called K.

37
00:02:23,050 --> 00:02:24,610
So that's not all there is to DenseNet.

38
00:02:24,670 --> 00:02:29,020
The basic DenseNet composition layer has this foundation here.

39
00:02:29,440 --> 00:02:31,600
They have a pre-activation batch norm.

40
00:02:33,070 --> 00:02:37,240
And then a conv layer with three by three filters, and that's basically it.

41
00:02:37,600 --> 00:02:40,580
However, they also introduced something called bottleneck layers.

42
00:02:40,600 --> 00:02:42,340
So what are these bottleneck layers?

43
00:02:42,940 --> 00:02:48,310
Well, remember we have batch norm and ReLU, and then it's followed by a one by one conv layer here.

44
00:02:48,790 --> 00:02:51,100
And then that's a bottleneck point right there.

45
00:02:51,100 --> 00:02:54,610
And then we can have a batch norm and a three by three conv here.

46
00:02:55,090 --> 00:03:00,640
So you can see this is sort of like the Fire module, similar to the Fire module in SqueezeNet.

47
00:03:00,650 --> 00:03:03,580
It's sort of a squeeze and expand layer.

48
00:03:04,060 --> 00:03:05,200
It's actually quite similar.

49
00:03:05,740 --> 00:03:12,220
So this was the architecture diagram used by the DenseNet researchers in the paper, and you

50
00:03:12,220 --> 00:03:14,230
could kind of get a feeling of what it is here.

51
00:03:14,230 --> 00:03:18,820
It's not that descriptive, but basically, here's a note I put in.

52
00:03:19,180 --> 00:03:23,080
There's a one by one conv layer, followed by a two by two average pooling.

53
00:03:23,470 --> 00:03:29,170
And these are used as transition layers between two contiguous dense blocks.

54
00:03:29,650 --> 00:03:34,990
So you can see they have a block here and a block here, followed by a two by two average pooling.

55
00:03:35,500 --> 00:03:36,790
And those are the transition layers.

56
00:03:36,910 --> 00:03:45,370
So DenseNet introduced several networks of different depths: DenseNet-121, 169, 201

57
00:03:45,370 --> 00:03:46,500
and 264.

58
00:03:46,510 --> 00:03:47,950
You can see that it's quite deep.

59
00:03:48,700 --> 00:03:52,600
And the growth rate is controlled by k, which I mentioned to you guys.

60
00:03:53,500 --> 00:03:55,060
So let's take a look at performance now.

61
00:03:55,060 --> 00:04:01,060
You can see the error rate for DenseNet was quite good, and you can see it going down as network layers

62
00:04:01,060 --> 00:04:01,640
increase.

63
00:04:02,110 --> 00:04:06,370
Similarly for the top-5 error rate, you can see it went down to quite low values here.

64
00:04:06,940 --> 00:04:12,340
But let's compare this to ResNet now, given the number of parameters.

65
00:04:12,730 --> 00:04:18,070
You can see DenseNets performed quite similarly, basically better, with fewer parameters.

66
00:04:18,070 --> 00:04:24,190
Even, like, look at this one: DenseNet-121 is getting an error rate of twenty five point five, whereas

67
00:04:24,220 --> 00:04:27,160
ResNet-34 is getting an error rate of twenty six point five.

68
00:04:27,700 --> 00:04:30,910
It's accuracy reversed: instead of using accuracy, we use error rate.

69
00:04:31,360 --> 00:04:38,590
I do tend to prefer to use accuracy, but some researchers keep this standard going forward, so let's

70
00:04:38,590 --> 00:04:40,000
just stay with it anyway.

71
00:04:40,030 --> 00:04:44,560
So basically, you can see the lowest error rate here for DenseNet-264.

72
00:04:44,830 --> 00:04:50,530
It still performs better than a ResNet, which has twice the number of parameters.

73
00:04:51,070 --> 00:04:57,070
Similarly for FLOPs, which is the number of calculations that it does in

74
00:04:57,120 --> 00:04:57,850
inference.

75
00:04:58,390 --> 00:05:00,250
So you can see DenseNets do perform well.

76
00:05:00,550 --> 00:05:04,300
They're faster than ResNets, as well as more accurate.

77
00:05:04,780 --> 00:05:06,610
So that's quite an achievement.

78
00:05:07,420 --> 00:05:12,640
So that's it for a tour of all these CNN architectures.

79
00:05:13,180 --> 00:05:20,590
This should give you a very good grasp and understanding of complicated, state-of-the-art CNN

80
00:05:20,590 --> 00:05:23,290
architecture designs and why they work.

81
00:05:23,830 --> 00:05:29,840
So what we'll do now, we'll stop there, even though there's more to this, which was expanded for

82
00:05:29,920 --> 00:05:30,490
the lessons.

83
00:05:30,940 --> 00:05:34,090
I'll talk about things like... So we'll stop here for now.

84
00:05:34,120 --> 00:05:39,130
That basically ends the tour of the state of the art CNN architectures.

85
00:05:39,520 --> 00:05:41,290
Not all of these were state of the art.

86
00:05:41,320 --> 00:05:48,310
However, it is very important to understand the foundations, the designs of these CNNs, because

87
00:05:48,760 --> 00:05:54,400
many of them will be utilized in future networks as well, as we take bits of theory here and there that

88
00:05:54,400 --> 00:05:54,990
we've learned.

89
00:05:55,010 --> 00:05:56,770
Researchers have learned that this works.

90
00:05:57,250 --> 00:06:02,890
So it's very, very good that you understand this. We'll also take a tour of vision transformers.

91
00:06:02,890 --> 00:06:05,110
However, I'll do that later on in the course.

92
00:06:05,620 --> 00:06:11,140
For now, though, what we'll do, we'll dive into the code and actually load some of these pre-trained

93
00:06:11,140 --> 00:06:13,270
networks and explore them.

94
00:06:13,780 --> 00:06:17,080
However, these networks are all trained, well,

95
00:06:17,080 --> 00:06:22,000
the ones we can load with Keras or PyTorch are all trained on ImageNet.

96
00:06:22,360 --> 00:06:26,500
Which brings me to the next chapter: what is ImageNet? We may have mentioned it.

97
00:06:26,890 --> 00:06:33,040
It's the benchmark image dataset, the computer vision dataset that we use to benchmark these models.

98
00:06:33,640 --> 00:06:36,760
However, we haven't really explored it or understood it.

99
00:06:37,180 --> 00:06:39,400
So that's what we'll do in the next section.

100
00:06:39,400 --> 00:06:44,680
And then after that, we'll go back into the code and start working with pre-trained models.

101
00:06:45,040 --> 00:06:49,780
Many of these pre-trained models, like VGG, Inception, MobileNet, EfficientNet.

102
00:06:50,140 --> 00:06:55,780
So we'll take a tour of this and give you some hands-on experience with loading and unloading some of

103
00:06:55,780 --> 00:07:01,150
the best networks and implementing them yourself and testing them on some images.

104
00:07:01,570 --> 00:07:03,700
So I'll see you in the next section.

105
00:07:03,820 --> 00:07:04,270
Thank you.
