WEBVTT

00:00.360 --> 00:03.120
-: Hello and welcome back to the course on Deep Learning.

00:03.120 --> 00:06.060
So, we've learned quite a lot in this section of the course,

00:06.060 --> 00:08.550
let's summarize what we've talked about.

00:08.550 --> 00:10.080
All right, so here we go.

00:10.080 --> 00:11.790
We started with an input image

00:11.790 --> 00:15.480
to which we applied multiple different feature detectors,

00:15.480 --> 00:19.080
also called filters, to create these feature maps,

00:19.080 --> 00:21.570
and this comprises our convolution layer.

00:21.570 --> 00:24.720
Then on top of that convolutional layer we applied the ReLU,

00:24.720 --> 00:29.010
or Rectified Linear Unit, to remove any linearity,

00:29.010 --> 00:32.010
or increase non-linearity, in our images.

00:32.010 --> 00:36.990
Then we applied a pooling layer to our convolution layer,

00:36.990 --> 00:40.260
so from every single feature map

00:40.260 --> 00:42.840
we created a pooled feature map.

00:42.840 --> 00:45.870
And basically the pooling layer has lots of advantages,

00:45.870 --> 00:47.790
the main purpose of the pooling layer

00:47.790 --> 00:50.940
is to make sure that we have

00:50.940 --> 00:54.720
spatial invariance in our images,

00:54.720 --> 00:56.760
so basically if something tilts or twists

00:56.760 --> 01:01.230
or is a bit different to the ideal scenario,

01:01.230 --> 01:03.030
then we can still pick up that feature.

01:03.030 --> 01:07.020
Plus, pooling significantly reduces the size of our images

01:07.020 --> 01:10.890
and also pooling helps with avoiding

01:10.890 --> 01:13.560
any kind of overfitting of our data

01:13.560 --> 01:15.150
or of our model to the data,

01:15.150 --> 01:18.420
because it just simply gets rid of a lot of that data.

01:18.420 --> 01:22.140
But at the same time pooling preserves the main features

01:22.140 --> 01:24.330
that we were after, just because of the way it's structured

01:24.330 --> 01:26.940
and the pooling we used was max pooling.
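
NOTE
Editor's sketch, not part of the spoken lecture. A minimal illustration of the
max pooling described above, assuming a 2x2 window with stride 2. The helper
name max_pool_2x2 and the toy feature maps are hypothetical. It shows that a
feature shifted slightly inside its window gives the same pooled map, while a
4x4 map shrinks to 2x2.
```python
# Hypothetical helper for illustration only; not code from the course.
def max_pool_2x2(feature_map):
    """Max pooling with a 2x2 window and stride 2 over a list-of-lists grid."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]
# A strong response (9) in the top-left window of a 4x4 feature map.
original = [[9, 0, 0, 0],
            [0, 0, 0, 1],
            [0, 2, 0, 0],
            [0, 0, 0, 0]]
# The same response moved one pixel right and down, still in that window.
shifted = [[0, 0, 0, 0],
           [0, 9, 0, 1],
           [0, 2, 0, 0],
           [0, 0, 0, 0]]
pooled_original = max_pool_2x2(original)
pooled_shifted = max_pool_2x2(shifted)
print(pooled_original)
print(pooled_shifted)
```
Both printed maps are identical, which is the small-shift tolerance the lecture
refers to. A shift that crosses a window boundary would change the result.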

01:26.940 --> 01:29.520
Then we flattened all of the pooled images

01:29.520 --> 01:34.520
into one long vector or column of all of these values

01:35.520 --> 01:38.310
and we input that into an artificial neural network,

01:38.310 --> 01:40.110
and that was step three, flattening,

01:40.110 --> 01:42.720
and step four is the fully connected

01:42.720 --> 01:44.040
artificial neural network

01:44.040 --> 01:46.920
where all of these features are processed

01:46.920 --> 01:47.753
through the network.

01:47.753 --> 01:50.220
And then we have this final layer,

01:50.220 --> 01:53.880
final fully connected layer, which performs the voting

01:53.880 --> 01:56.040
towards the classes that we're after.
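
NOTE
Editor's sketch, not part of the spoken lecture. The four steps summarized
above (convolution, ReLU, max pooling, flattening, then a fully connected
layer) can be traced in a few lines of plain Python. Every name, the 5x5 toy
image, the two filters, and the hand-picked dense weights are assumptions made
for illustration. In a real network the filters and weights are learned.
```python
import math
# All names and numbers below are illustrative assumptions, not course code.
def convolve(image, kernel):
    """Slide a square filter (feature detector) over the image, no padding."""
    k = len(kernel)
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(len(image[0]) - k + 1)]
            for r in range(len(image) - k + 1)]
def relu(feature_map):
    """Replace negative responses with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]
def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]
def flatten(pooled_maps):
    """Concatenate all pooled feature maps into one long vector."""
    return [v for fmap in pooled_maps for row in fmap for v in row]
def dense_softmax(vector, weights, biases):
    """One fully connected layer followed by softmax class probabilities."""
    scores = [sum(w * x for w, x in zip(ws, vector)) + b
              for ws, b in zip(weights, biases)]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
# Toy 5x5 image containing a vertical edge.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Two hand-written filters: a vertical and a horizontal edge detector.
filters = [[[-1, 1], [-1, 1]],
           [[-1, -1], [1, 1]]]
pooled = [max_pool(relu(convolve(image, f))) for f in filters]
flat = flatten(pooled)
# Hand-picked weights: class 0 reads the first map, class 1 the second.
weights = [[0.5] * 4 + [0.0] * 4,
           [0.0] * 4 + [0.5] * 4]
probs = dense_softmax(flat, weights, [0.0, 0.0])
print(flat)
print(probs)
```
Here the vertical-edge filter responds and the horizontal one does not, so the
first class receives the higher probability.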

01:56.040 --> 01:58.020
And then all of this is trained

01:58.020 --> 02:02.520
through a forward propagation and back propagation process,

02:02.520 --> 02:05.220
and lots and lots of iterations and (indistinct)

02:05.220 --> 02:09.720
and in the end we have a very well defined neural network.

02:09.720 --> 02:11.700
And another important thing is

02:11.700 --> 02:13.020
not only are the weights trained

02:13.020 --> 02:14.310
in the artificial neural network part

02:14.310 --> 02:17.460
but also the feature detectors are trained

02:17.460 --> 02:21.960
and adjusted in that same gradient descent process,

02:21.960 --> 02:23.940
and that allows us to come up with the best feature maps.
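
NOTE
Editor's sketch, not part of the spoken lecture. A heavily simplified picture
of the point above: one gradient descent loop updates both a feature detector
weight and a fully connected weight. The model is an assumption chosen for
brevity: a single scalar w_filter stands in for a filter, a single scalar
w_dense stands in for a dense layer, the activation is omitted, and the data
and learning rate are made up.
```python
# Toy model: prediction = w_dense * (w_filter * x). All values are assumptions.
inputs = [0.5, 1.0, 1.5]
targets = [2.0 * x for x in inputs]
w_filter, w_dense = 0.2, 0.8
learning_rate = 0.05
def mean_squared_error(wf, wd):
    return sum((wd * wf * x - t) ** 2
               for x, t in zip(inputs, targets)) / len(inputs)
initial_loss = mean_squared_error(w_filter, w_dense)
for _ in range(200):
    grad_filter = 0.0
    grad_dense = 0.0
    for x, t in zip(inputs, targets):
        hidden = w_filter * x              # forward pass: filter output
        prediction = w_dense * hidden      # forward pass: dense output
        d_prediction = 2.0 * (prediction - t)
        grad_dense += d_prediction * hidden          # backprop to dense weight
        grad_filter += d_prediction * w_dense * x    # backprop to filter weight
    # Both parameters are adjusted in the same descent step.
    w_filter -= learning_rate * grad_filter / len(inputs)
    w_dense -= learning_rate * grad_dense / len(inputs)
final_loss = mean_squared_error(w_filter, w_dense)
print(w_filter, w_dense, final_loss)
```
Both weights move away from their starting values and the loss drops, which is
the sense in which the feature detectors are trained together with the rest of
the network.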

02:23.940 --> 02:26.460
And in the end we get a fully trained

02:26.460 --> 02:27.690
convolutional neural network

02:27.690 --> 02:31.740
which can recognize images and classify them.

02:31.740 --> 02:32.573
So, there we go,

02:32.573 --> 02:35.700
that's how convolutional neural networks work.

02:35.700 --> 02:38.970
And now you should be totally comfortable with this concept

02:38.970 --> 02:42.300
and ready to proceed to the practical applications.

02:42.300 --> 02:44.610
If you'd like to do some additional reading,

02:44.610 --> 02:49.170
then there's a great blog by Adit Deshpande

02:49.170 --> 02:53.370
from 2016, you can see the link over there at the bottom.

02:53.370 --> 02:55.950
So, the blog is called "The Nine Deep Learning Papers

02:55.950 --> 02:59.310
You Need to Know About (Understanding CNNs Part Three)".

02:59.310 --> 03:01.680
And this blog actually gives you a short overview

03:01.680 --> 03:05.356
of nine different CNNs that have been created by people like

03:05.356 --> 03:07.920
Yann LeCun and others,

03:07.920 --> 03:10.590
which you can then go ahead and study further.

03:10.590 --> 03:13.650
So, there will be a lot of things

03:13.650 --> 03:15.630
that will be totally new to you

03:15.630 --> 03:18.540
and that you will have to get your head around,

03:18.540 --> 03:20.610
but just keep this blog in mind

03:20.610 --> 03:22.530
or these nine papers in mind,

03:22.530 --> 03:25.170
and even if you're not ready to go through them right now

03:25.170 --> 03:26.850
maybe after the practical tutorials,

03:26.850 --> 03:30.060
maybe after you do some additional training

03:30.060 --> 03:31.530
in the space of deep learning,

03:31.530 --> 03:33.900
slowly you can then reference these works

03:33.900 --> 03:37.680
and, ideally, I think you will get a lot of value

03:37.680 --> 03:39.930
from looking through other people's neural networks

03:39.930 --> 03:43.230
and how they structured their convolution nets;

03:43.230 --> 03:46.020
and that'll help you understand what the best practices are

03:46.020 --> 03:48.420
and why people did certain things in a certain way.

03:48.420 --> 03:49.470
And that will help you

03:49.470 --> 03:51.900
with your architecture of neural networks,

03:51.900 --> 03:55.060
because neural networks, and convolutional neural networks

03:56.820 --> 03:57.990
are not an exception,

03:57.990 --> 04:01.620
are like an architecture challenge.

04:01.620 --> 04:05.400
You have to come up with an idea and then structure it,

04:05.400 --> 04:06.750
and then adjust it and tweak it

04:06.750 --> 04:08.670
to get the best possible design

04:08.670 --> 04:11.760
and the best possible performance.

04:11.760 --> 04:13.440
So, there we go, that's us for today.

04:13.440 --> 04:16.140
I hope you enjoyed today's tutorial and this whole section,

04:16.140 --> 04:17.670
and I look forward to seeing you next time.

04:17.670 --> 04:19.623
Until then, enjoy Deep Learning.
