WEBVTT

00:00.300 --> 00:02.910
Hello, and welcome to this Python tutorial.

00:02.910 --> 00:05.160
Alright, so we just built the architecture

00:05.160 --> 00:06.180
of our neural network

00:06.180 --> 00:09.000
with the init function of our Network class.

00:09.000 --> 00:10.800
And now we're gonna make a second function

00:10.800 --> 00:13.050
which is going to be the Forward function,

00:13.050 --> 00:15.810
and that's the function that will activate the neurons.

00:15.810 --> 00:16.980
That is, it's the function

00:16.980 --> 00:19.830
that will perform Forward Propagation.

00:19.830 --> 00:20.910
So let's do this.

00:20.910 --> 00:22.500
Let's make this function.

00:22.500 --> 00:25.650
Let's call it Forward, as we just said.

00:25.650 --> 00:29.610
And this function is going to take two arguments.

00:29.610 --> 00:32.100
First is, as usual, self, you know,

00:32.100 --> 00:34.500
to be able to use the variables of the object,

00:34.500 --> 00:37.470
because we're gonna use fc1 and fc2.

00:37.470 --> 00:41.370
So, we need this Self to be able to use these variables.

00:41.370 --> 00:43.560
And then, we're gonna need a second argument,

00:43.560 --> 00:45.120
which is our input.

00:45.120 --> 00:47.100
And we're gonna call it State

00:47.100 --> 00:50.580
because the state is exactly the input of our neural network.

00:50.580 --> 00:51.960
You know, that's the State.

00:51.960 --> 00:54.810
These are the inputs entering the neural network.

00:54.810 --> 00:57.120
And then as output, we will have the Q-values

00:57.120 --> 00:58.680
of the three possible actions:

00:58.680 --> 01:00.690
go left, go straight or go right.

01:00.690 --> 01:03.150
But, we don't need to input it as an argument here

01:03.150 --> 01:05.880
because, that's exactly what we want to return.

01:05.880 --> 01:07.770
So this Forward function is not only going to

01:07.770 --> 01:09.630
activate the neurons, but also

01:09.630 --> 01:12.510
and mostly, it'll return the Q-values

01:12.510 --> 01:16.620
for each possible action, depending on the input state here.

01:16.620 --> 01:17.453
Alright.

01:17.453 --> 01:19.140
So those are the two arguments we need.

01:19.140 --> 01:21.300
And now let's go inside the function

01:21.300 --> 01:24.780
and let's specify what we want it to do.

01:24.780 --> 01:27.090
Okay, so the first thing we're gonna do is

01:27.090 --> 01:29.040
activate the hidden neurons

01:29.040 --> 01:32.190
and we're gonna call the hidden neurons by the variable X.

01:32.190 --> 01:34.770
So X represents the hidden neurons.

01:34.770 --> 01:36.900
And so how are we going to activate them?

01:36.900 --> 01:40.140
Well, of course, we're gonna take our input neurons.

01:40.140 --> 01:43.230
We're gonna use our first full connection fc1 to

01:43.230 --> 01:45.510
get the hidden neurons, and then we're going to

01:45.510 --> 01:48.270
apply an activation function on them, which will be

01:48.270 --> 01:49.830
the rectifier function.

01:49.830 --> 01:51.380
So how are we going to do that?

01:52.244 --> 01:57.000
Remember, we imported the torch.nn.functional module

01:57.000 --> 01:59.970
that contains all the functions in PyTorch to

01:59.970 --> 02:03.420
implement a neural network, and we gave it the shortcut F.

02:03.420 --> 02:05.700
So actually what we're gonna do now is we're going to

02:05.700 --> 02:08.970
use one of these functions from the functional module.

02:08.970 --> 02:11.400
And this function is the ReLU function.

02:11.400 --> 02:13.019
So what is ReLU?

02:13.019 --> 02:15.180
ReLU is the rectifier function that you saw

02:15.180 --> 02:16.560
in the intuition lectures.

02:16.560 --> 02:19.170
That's just the name given to the rectifier function.
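
NOTE
As a rough illustration of what the rectifier does to a single value, here is a plain-Python sketch. It is not PyTorch's own implementation, and the helper name relu_sketch and the sample values are made up for this note. Negative inputs become zero and positive inputs pass through unchanged:
```python
def relu_sketch(value: float) -> float:
    # Rectifier: keep positive values, clamp negative values to zero.
    return max(0.0, value)
hidden_pre_activation = [-1.5, 0.0, 2.0]  # made-up example values
activated = [relu_sketch(v) for v in hidden_pre_activation]
```
F.relu applies the same rule element-wise to a whole tensor.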

02:19.170 --> 02:23.192
But since this function is taken from nn.functional,

02:23.192 --> 02:26.550
which was given the shortcut F, we need to type here

02:26.550 --> 02:30.900
first F., and then that's where we can take

02:30.900 --> 02:31.980
this function.

02:31.980 --> 02:34.710
And actually, if I type Re, here we go,

02:34.710 --> 02:38.040
we have the ReLU function. So that's the rectifier function

02:38.040 --> 02:42.000
that will activate the hidden neurons, that is, X.

02:42.000 --> 02:45.180
So, in this ReLU function, now we understand perfectly

02:45.180 --> 02:46.230
what we have to input.

02:46.230 --> 02:48.930
Those are the neurons that we want to activate,

02:48.930 --> 02:50.760
that is, the hidden neurons.

02:50.760 --> 02:52.620
And so to get these hidden neurons, we're gonna

02:52.620 --> 02:55.860
take our first full connection fc1, which we

02:55.860 --> 02:59.010
will apply to our input neurons to go from the

02:59.010 --> 03:01.374
input neurons to the hidden neurons.

03:01.374 --> 03:04.620
So let's take our first full connection, fc1.

03:04.620 --> 03:08.100
But our first full connection is a variable of our object.

03:08.100 --> 03:12.870
Therefore, we need to type here first self.fc1.

03:12.870 --> 03:13.703
Here we go.

03:13.703 --> 03:16.950
That's the first full connection of our neural network.

03:16.950 --> 03:19.020
And, inside this first full connection

03:19.020 --> 03:21.780
we are gonna input our input states to go

03:21.780 --> 03:24.058
from the input neurons to the hidden neurons.

03:24.058 --> 03:26.640
And so since we gave it the name state

03:26.640 --> 03:29.790
well, here we have to input state.

03:29.790 --> 03:30.690
And there we go.

03:30.690 --> 03:34.500
We now get activated hidden neurons.

03:34.500 --> 03:37.050
Alright? And now that we have the hidden neurons

03:37.050 --> 03:40.620
we are going to return the output neurons.

03:40.620 --> 03:41.760
So, next line.

03:41.760 --> 03:45.240
And as you understood, the output neurons correspond

03:45.240 --> 03:48.240
to our actions, but these are not the actions directly.

03:48.240 --> 03:49.860
These are the Q-values

03:49.860 --> 03:52.860
because we're building a deep Q-learning model

03:52.860 --> 03:56.010
that combines a deep learning model with Q-learning.

03:56.010 --> 03:59.310
And therefore, we use Q-learning here to get the Q-values

03:59.310 --> 04:01.050
for each of our actions.

04:01.050 --> 04:04.050
And then later using a softmax or an argmax,

04:04.050 --> 04:05.823
we will get the final action.
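
NOTE
To make the softmax and argmax step concrete, here is a minimal plain-Python sketch. The Q-values are invented, and the ordering of the actions (index 0 = left, 1 = straight, 2 = right) is an assumption made only for this note. Argmax picks the action with the largest Q-value, while softmax turns the Q-values into probabilities to sample from:
```python
import math
q_values = [0.2, 1.5, -0.3]  # made-up Q-values; assumed order: left, straight, right
# Argmax: index of the largest Q-value.
greedy_action = max(range(len(q_values)), key=lambda i: q_values[i])
# Softmax: exponentiate (shifted by the max for numerical stability) and normalize.
largest = max(q_values)
exps = [math.exp(q - largest) for q in q_values]
total = sum(exps)
probabilities = [e / total for e in exps]
```
With argmax the agent always takes the best-looking action; with softmax the other actions keep some probability of being explored.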

04:06.720 --> 04:08.190
So here, the variable I'm

04:08.190 --> 04:12.030
about to introduce will correspond to the output neurons.

04:12.030 --> 04:14.070
And since the output neurons are the Q-values, well

04:14.070 --> 04:18.210
I'm gonna call this variable q_values.

04:18.210 --> 04:19.043
There we go.

04:19.043 --> 04:20.070
So Q-values.

04:20.070 --> 04:23.700
And now, we directly take our full connection,

04:23.700 --> 04:25.350
which is the variable fc2

04:25.350 --> 04:27.000
but the variable from our object.

04:27.000 --> 04:30.210
So we take here self.fc2.

04:30.210 --> 04:32.760
And of course, here we input the neurons

04:32.760 --> 04:34.890
of the left side of this full connection.

04:34.890 --> 04:38.250
That is what we got from the first line, which is X.

04:38.250 --> 04:40.770
So, X, there we go.

04:40.770 --> 04:42.660
We now get our Q-values.

04:42.660 --> 04:46.110
Those are the output neurons of our neural network.

04:46.110 --> 04:48.810
Okay? And then, the last line of code.

04:48.810 --> 04:49.643
Of course

04:49.643 --> 04:53.160
this forward function is used to return these Q-values.

04:53.160 --> 04:58.160
So we just have to add a return and simply Q-values.

04:59.820 --> 05:02.010
And that will return the Q-values

05:02.010 --> 05:03.570
for each possible action.

05:03.570 --> 05:06.022
Go left, go straight or go right.
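
NOTE
Putting the three lines together, the forward function described above might look like the sketch below. The __init__ signature, the parameter names input_size and nb_action, and the input size of 5 are assumptions made for this note, since the init function is not shown in this part. The 30 hidden neurons and the 3 actions come from the video:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
class Network(nn.Module):
    def __init__(self, input_size, nb_action):
        super().__init__()
        # Two full connections with 30 hidden neurons in between.
        self.fc1 = nn.Linear(input_size, 30)
        self.fc2 = nn.Linear(30, nb_action)
    def forward(self, state):
        # Hidden neurons: first full connection, then the rectifier.
        x = F.relu(self.fc1(state))
        # Output neurons: one Q-value per action, no activation applied.
        q_values = self.fc2(x)
        return q_values
# Hypothetical usage: a batch of one state with 5 inputs, 3 possible actions.
network = Network(input_size=5, nb_action=3)
q_values = network(torch.zeros(1, 5))
```
Calling the network object on a state runs forward and gives back one Q-value per action.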

05:06.022 --> 05:08.370
Alright, so congratulations.

05:08.370 --> 05:10.770
We are done with our first class, and actually,

05:10.770 --> 05:13.680
we are done making the architecture of the neural network.

05:13.680 --> 05:15.510
Remember, this is not a finished job.

05:15.510 --> 05:18.575
You can always improve the architecture

05:18.575 --> 05:21.005
of the neural network by trying different ones.

05:21.005 --> 05:22.800
So feel free to do that by adding more neurons here.

05:22.800 --> 05:25.680
For example, if you want to add 50 hidden neurons,

05:25.680 --> 05:28.806
you can just replace the 30 here and the 30 here with 50,

05:28.806 --> 05:32.850
50 and 50, and then you can add some more hidden layers

05:32.850 --> 05:34.980
by making some new full connections.
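
NOTE
One possible variation along those lines, sketched under assumptions: the class name DeeperNetwork, the hidden_size parameter, the extra full connection and the input size of 5 are all hypothetical choices for this note, not the architecture built in the video:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
class DeeperNetwork(nn.Module):
    def __init__(self, input_size, nb_action, hidden_size=50):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, hidden_size)  # extra hidden layer
        self.fc3 = nn.Linear(hidden_size, nb_action)
    def forward(self, state):
        # Both hidden layers go through the rectifier.
        x = F.relu(self.fc1(state))
        x = F.relu(self.fc2(x))
        # The output layer returns raw Q-values.
        return self.fc3(x)
deeper = DeeperNetwork(input_size=5, nb_action=3)
out = deeper(torch.zeros(2, 5))  # batch of two made-up states
```
Whether this does better than the two-layer version is something you have to find out by experimenting.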

05:34.980 --> 05:37.620
Well, that's really the job of an artist.

05:37.620 --> 05:38.970
There is no general rule

05:38.970 --> 05:42.030
of what would be the best architecture in each situation.

05:42.030 --> 05:43.770
So that's why we have to experiment.

05:43.770 --> 05:46.470
But, let's already try with that. You will see

05:46.470 --> 05:50.160
that we'll eventually get a pretty good self-driving car.

05:50.160 --> 05:52.740
Alright, and now we're going to make the next class

05:52.740 --> 05:54.783
which is about experience replay,

05:55.676 --> 05:57.600
and we will be making that in the next three tutorials.

05:57.600 --> 05:59.373
Until then, enjoy AI.
