1
00:00:11,630 --> 00:00:17,050
Let's briefly discuss a few of the popular architectures that are used for transfer learning.

2
00:00:17,210 --> 00:00:22,820
The first and most popular is probably VGG, which stands for Visual Geometry Group.

3
00:00:22,850 --> 00:00:26,950
This is just the name of the research team that created it.

4
00:00:27,140 --> 00:00:32,300
When you look at the architecture for VGG, although it got state-of-the-art results at the time,

5
00:00:32,690 --> 00:00:35,420
it's not that different from what we already know.

6
00:00:35,540 --> 00:00:39,960
It's just the usual CNN with an unusually large number of layers.

7
00:00:39,980 --> 00:00:43,250
At least, it was a large number of layers compared to other neural networks

8
00:00:43,250 --> 00:00:44,510
at the time it was created.

9
00:00:45,230 --> 00:00:47,120
So here's how it looks.

10
00:00:47,120 --> 00:00:50,180
First, we have two convolutions followed by pooling.

11
00:00:50,180 --> 00:00:56,240
We repeat that once, then we have three convolutions followed by pooling, and we repeat that block three

12
00:00:56,240 --> 00:00:57,500
times.

13
00:00:57,500 --> 00:01:01,370
Finally, we have three fully connected, or dense, layers.

14
00:01:01,370 --> 00:01:05,570
As you can see, it's pretty much the same thing as LeNet, just bigger.

15
00:01:05,630 --> 00:01:10,600
We have a series of convolutions and pooling followed by a few dense layers.

16
00:01:10,640 --> 00:01:17,330
Note that there are a few variations of the VGG network, such as VGG16 and VGG19, which have 16

17
00:01:17,330 --> 00:01:19,410
layers and 19 layers respectively.

18
00:01:24,590 --> 00:01:30,040
Another common choice is ResNet. ResNet is an even larger network than VGG.

19
00:01:30,410 --> 00:01:35,390
We won't go into too much detail in this course, since our objective is not to understand ResNet but

20
00:01:35,390 --> 00:01:37,690
just to know what our options are.

21
00:01:37,700 --> 00:01:43,220
The idea behind ResNet is that it's a neural network with branches. Each branch is responsible for

22
00:01:43,220 --> 00:01:45,040
learning something different.

23
00:01:45,110 --> 00:01:48,680
It turns out that one of the branches is just the identity function.

24
00:01:48,710 --> 00:01:55,940
So the other branch is responsible for learning the residual, hence the name residual network. Like VGG,

25
00:01:56,360 --> 00:02:00,450
ResNet also has different variations with different numbers of layers.

26
00:02:00,590 --> 00:02:07,250
There's ResNet50 with 50 layers, ResNet101 with 101 layers, and ResNet152, which

27
00:02:07,250 --> 00:02:10,010
has 152 layers.

28
00:02:10,010 --> 00:02:15,200
There are also some ResNets with different architectures, such as ResNet V2 and ResNeXt.

29
00:02:20,350 --> 00:02:23,010
Yet another common choice is Inception.

30
00:02:23,140 --> 00:02:29,140
The layout of Inception is similar to ResNet in that you have multiple parallel branches, but with

31
00:02:29,140 --> 00:02:35,170
Inception the concept is a little different. With Inception, instead of just a single convolution going

32
00:02:35,170 --> 00:02:40,150
into another convolution, you'll do multiple convolutions in parallel.

33
00:02:40,240 --> 00:02:44,490
You'll recall that one of the hyperparameters we have to choose is the filter size.

34
00:02:44,620 --> 00:02:48,910
Well, Inception says: why not simply try them all and concatenate the results?

35
00:02:49,540 --> 00:02:55,200
So, for example, you'll have a one-by-one filter, a three-by-three filter, and a five-by-five filter, and

36
00:02:55,210 --> 00:02:58,330
convolve the same input image by all these filters.

37
00:02:58,360 --> 00:03:03,840
Once you've done that you'll take the resulting images and stack them all together.

38
00:03:03,910 --> 00:03:09,550
The idea behind the Inception network is that you'll have multiple layers of Inception blocks, just like

39
00:03:09,550 --> 00:03:14,060
how ResNet has multiple layers of residual learning blocks.

40
00:03:14,070 --> 00:03:20,700
Now, I know a lot of people are going to ask, "Well, which one should I choose: VGG, ResNet, or Inception?"

41
00:03:20,700 --> 00:03:26,400
And the answer is, as always, you have to try it on your specific dataset with your specific task.

42
00:03:26,400 --> 00:03:30,420
There is no way to predict which one will work better for each use case.

43
00:03:35,580 --> 00:03:40,980
Another popular pretrained model used for transfer learning is called MobileNet, which specializes

44
00:03:40,980 --> 00:03:42,540
in being lightweight.

45
00:03:42,660 --> 00:03:46,950
So this one actually does have a specific purpose because it's lightweight.

46
00:03:46,960 --> 00:03:50,170
There's a tradeoff made between speed and accuracy.

47
00:03:50,190 --> 00:03:55,410
Generally speaking, the ideal use case for MobileNet is if you want to use convolutional neural

48
00:03:55,410 --> 00:04:00,350
networks on less powerful machines, such as mobile devices and embedded devices.