WEBVTT

00:00.570 --> 00:03.000
Tutor: Hello and welcome to this tutorial.

00:03.000 --> 00:04.110
All right, so now we are going

00:04.110 --> 00:07.050
to implement our artificial intelligence from scratch.

00:07.050 --> 00:09.030
We're going to code it line by line,

00:09.030 --> 00:11.010
and in this first code section we're going

00:11.010 --> 00:12.112
to import the libraries.

00:12.112 --> 00:14.773
But before we start with this first code section,

00:14.773 --> 00:18.242
I would like to explain the connection between the AI

00:18.242 --> 00:20.550
and our map.py file.

00:20.550 --> 00:24.030
That is, why are we implementing this for the map?

00:24.030 --> 00:25.635
What is the purpose of our AI

00:25.635 --> 00:27.343
and where will we be using it?

00:27.343 --> 00:29.721
So it's actually very simple.

00:29.721 --> 00:31.620
We are only making our AI

00:31.620 --> 00:34.696
to select the right action at each time.

00:34.696 --> 00:39.120
So, okay, we import the DQN class from our AI file.

00:39.120 --> 00:42.046
So we will be making this DQN class in this file,

00:42.046 --> 00:47.046
but then we import it only to select the right action

00:47.820 --> 00:49.441
to play at each time.

00:49.441 --> 00:53.040
And we select this action exactly at this line.

00:53.040 --> 00:56.896
action = brain.update(last_reward, last_signal).

00:56.896 --> 01:00.269
Last signal will be the input of the neural network.

01:00.269 --> 01:02.610
You know, it's composed of the three signals

01:02.610 --> 01:04.620
of the sensors plus the orientation

01:04.620 --> 01:05.727
and minus orientation.

01:05.727 --> 01:07.320
So that's the input,

01:07.320 --> 01:09.669
but then the output is the action to play.

01:09.669 --> 01:13.530
And that's only what we'll be taking from our AI file

01:13.530 --> 01:14.886
that we're about to make.

01:14.886 --> 01:17.154
So keep that in mind, it's very simple.

01:17.154 --> 01:21.088
We first import the DQN class from the AI

01:21.088 --> 01:25.950
then we create the object brain from the DQN class

01:25.950 --> 01:28.890
which takes as input the encoded vectors

01:28.890 --> 01:31.500
for the states of five dimensions, the three signals,

01:31.500 --> 01:33.419
plus orientation, plus minus orientation,

01:33.419 --> 01:37.020
the three actions: go left, go straight, or go right,

01:37.020 --> 01:38.520
and then this gamma parameter,

01:38.520 --> 01:41.070
those are the only parameters of the DQN class

01:41.070 --> 01:42.101
that we will be making.

01:42.101 --> 01:44.522
And then once we create that object,

01:44.522 --> 01:47.520
we select in the game class,

01:47.520 --> 01:50.130
the action to play at each time.

01:50.130 --> 01:51.900
And that depends on the last reward

01:51.900 --> 01:53.601
and the last signal, which is the input.

01:53.601 --> 01:54.840
And that's all.
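
NOTE
A minimal sketch of the call pattern described above, not the tutorial's actual code. The placeholder class body, its parameter names, the gamma value 0.9, and the sample signal values are all assumptions for illustration; the real DQN class is built later in the course.
```python
import random
class DQN:
    """Placeholder with the call shape described in the tutorial (hypothetical body)."""
    def __init__(self, input_size, nb_action, gamma):
        self.input_size = input_size  # 5: three sensor signals, orientation, -orientation
        self.nb_action = nb_action    # 3: go left, go straight, go right
        self.gamma = gamma            # discount factor
    def update(self, reward, signal):
        # Check the state has the expected number of components.
        assert len(signal) == self.input_size
        # Stand-in policy: a random action, where the real class would consult its network.
        return random.randrange(self.nb_action)
brain = DQN(5, 3, 0.9)  # the gamma value here is illustrative only
last_reward = 0.0
last_signal = [0.0, 0.0, 0.0, 0.5, -0.5]  # made-up sensor and orientation values
action = brain.update(last_reward, last_signal)
```
The only thing the map code needs from the AI file is this one call: state and reward in, action index out.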

01:54.840 --> 01:57.317
That's the only purpose of making this AI.

01:57.317 --> 02:00.525
That's in order to have a real artificial intelligence

02:00.525 --> 02:03.554
playing the right actions at each time, the right move,

02:03.554 --> 02:06.720
instead of having random actions like we observed

02:06.720 --> 02:07.742
in the previous tutorial.

02:07.742 --> 02:09.750
All right, so let's do this.

02:09.750 --> 02:12.008
Let's implement our artificial intelligence.

02:12.008 --> 02:14.666
And as we said, we are gonna start by importing

02:14.666 --> 02:17.784
all the libraries that we'll be using to implement it.

02:17.784 --> 02:20.533
So that way we will have all the tools we need.

02:20.533 --> 02:22.800
All right, so let's start with the first one.

02:22.800 --> 02:27.600
The first one is the inevitable, the NumPy library.

02:27.600 --> 02:30.211
The NumPy library, I always recommend to import it.

02:30.211 --> 02:33.052
It's the library which allows us to play

02:33.052 --> 02:35.278
and work with the arrays.

02:35.278 --> 02:37.800
And this np here is just a shortcut,

02:37.800 --> 02:40.092
more convenient when we want to use NumPy.

02:40.092 --> 02:43.587
All right, then second library is random.

02:43.587 --> 02:45.864
So this is just because we will

02:45.864 --> 02:49.710
be taking some random samples from the different batches

02:49.710 --> 02:51.900
when implementing experience replay.

02:51.900 --> 02:53.826
So we have to import this random library as well.
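
NOTE
A minimal sketch of how the random module can draw a batch for experience replay. The memory layout (state, action, reward, next state), the made-up transitions, and the batch size are assumptions for illustration, not values from the tutorial.
```python
import random
# Hypothetical replay memory: each entry is (state, action, reward, next_state).
memory = [
    ((0.0, 0.0, 0.0, 0.1 * i, -0.1 * i), i % 3, -0.2, (0.0, 0.0, 0.0, 0.1 * (i + 1), -0.1 * (i + 1)))
    for i in range(100)
]
batch_size = 10  # illustrative value
# random.sample draws batch_size distinct transitions without replacement.
batch = random.sample(memory, batch_size)
# Regroup the batch by field: all states together, all actions together, and so on.
states, actions, rewards, next_states = zip(*batch)
```
Sampling at random breaks the correlation between consecutive transitions, which is the point of experience replay.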

02:53.826 --> 02:57.833
Then we will import OS,

02:57.833 --> 03:01.290
which will be useful when we want to load the model

03:01.290 --> 03:03.300
because you know, once the model is ready,

03:03.300 --> 03:05.561
we will implement some code to save the model

03:05.561 --> 03:08.700
and then another code to load the model.

03:08.700 --> 03:11.310
That's when we want to, you know, save the brain

03:11.310 --> 03:13.020
and load the brain whenever you want

03:13.020 --> 03:14.550
to shut down your computer

03:14.550 --> 03:17.024
and reuse the brain that was trained before

03:17.024 --> 03:18.678
for some new experiment.

03:18.678 --> 03:19.907
So that's important.
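
NOTE
A minimal sketch of the role os plays when loading a saved model: checking that the file exists before trying to read it. The helper name and the file name are hypothetical, and writing placeholder bytes stands in for the real save step.
```python
import os
import tempfile
def checkpoint_exists(path):
    """Return True when a saved model file is present at path."""
    return os.path.isfile(path)
# Use a temporary directory so the demonstration leaves nothing on disk.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "brain_checkpoint.pth")  # hypothetical file name
    before = checkpoint_exists(path)  # nothing saved yet
    with open(path, "wb") as handle:  # stand-in for the real save step
        handle.write(b"placeholder")
    after = checkpoint_exists(path)   # the file is now there
```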

03:19.907 --> 03:24.907
Then we are going to import the Torch library, essential.

03:26.910 --> 03:29.640
That's because we will be implementing our neural network

03:29.640 --> 03:32.190
with PyTorch, which I recommend much more

03:32.190 --> 03:34.410
than the other ones for artificial intelligence

03:34.410 --> 03:36.256
because it can handle dynamic graphs.

03:36.256 --> 03:37.623
So there we go with Torch.

03:37.623 --> 03:42.623
Then from Torch, we are going to import torch.nn.

03:44.733 --> 03:47.370
The nn module is the most essential one.

03:47.370 --> 03:49.345
That's the module that contains all the tools

03:49.345 --> 03:51.199
to implement some neural networks.

03:51.199 --> 03:53.717
And of course there will be a deep neural network

03:53.717 --> 03:57.090
that will take as inputs the three signals

03:57.090 --> 03:59.610
of the three sensors, plus orientation

03:59.610 --> 04:00.870
and minus orientation,

04:00.870 --> 04:04.080
and will return as output, the action to play.

04:04.080 --> 04:06.360
Well actually, it'll return the Q values

04:06.360 --> 04:07.623
of the different actions.

04:08.572 --> 04:11.580
And using a softmax we will return the action to play,

04:11.580 --> 04:15.300
only one, the most relevant one to accomplish the car's goal.
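
NOTE
A minimal pure-Python sketch of softmax action selection over Q-values, assuming the action order left, straight, right. The function names, the scale parameter, and the Q-values are hypothetical; the tutorial's own version will use PyTorch.
```python
import math
import random
def softmax(q_values, scale=1.0):
    """Turn Q-values into probabilities; a larger scale sharpens the distribution."""
    highest = max(q_values)  # subtracted before exponentiating, for numerical stability
    exps = [math.exp(scale * (q - highest)) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]
def select_action(q_values, scale=1.0):
    """Sample one action index according to the softmax probabilities."""
    probabilities = softmax(q_values, scale)
    return random.choices(range(len(q_values)), weights=probabilities, k=1)[0]
q_values = [0.2, 1.5, -0.3]  # made-up Q-values for left, straight, right
probabilities = softmax(q_values, scale=2.0)
action = select_action(q_values, scale=2.0)
```
Sampling from the probabilities, rather than always taking the largest Q-value, leaves some room for exploration.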

04:15.300 --> 04:17.464
So torch.nn, the most essential one,

04:17.464 --> 04:22.464
then we are gonna give a shortcut to the functional package.

04:26.042 --> 04:29.940
Here we go, the functional package from the nn module.

04:29.940 --> 04:31.839
So this functional package

04:31.839 --> 04:34.500
contains the different functions that we use

04:34.500 --> 04:36.300
when implementing a neural network.

04:36.300 --> 04:38.430
So typically the loss function:

04:38.430 --> 04:39.927
we will be using the Huber loss

04:39.927 --> 04:41.693
because that improves convergence.

04:41.693 --> 04:44.370
And the Huber loss is contained

04:44.370 --> 04:47.160
in this functional submodule of the nn module.
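
NOTE
A minimal pure-Python sketch of the Huber loss for a single prediction, to show why it behaves well: quadratic for small errors, linear for large ones. The function name and the sample values are hypothetical. In PyTorch, F.smooth_l1_loss with its default settings computes the same quantity as this sketch with delta=1.0.
```python
def huber_loss(prediction, target, delta=1.0):
    """Quadratic penalty up to delta, linear penalty beyond it."""
    error = abs(prediction - target)
    if error <= delta:
        return 0.5 * error ** 2
    return delta * (error - 0.5 * delta)
small = huber_loss(1.2, 1.0)  # error 0.2, inside the quadratic zone
large = huber_loss(4.0, 1.0)  # error 3.0, inside the linear zone
```
Because large errors are penalized linearly, a few bad Q-value estimates produce smaller gradients than they would with a squared error.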

04:47.160 --> 04:48.900
And since all this is pretty long,

04:48.900 --> 04:50.370
we're gonna give it a shortcut

04:50.370 --> 04:53.458
and we are gonna call it F, simply.

04:53.458 --> 04:56.931
Then only three modules to import left.

04:56.931 --> 05:01.931
So the next one is another essential one, which is optim.

05:03.150 --> 05:05.377
And we still take it from the Torch library

05:05.377 --> 05:07.896
and then Optim, there we go.

05:07.896 --> 05:12.648
And let's just call it optim instead of torch.optim.

05:12.648 --> 05:14.952
So that's of course for the optimizer.

05:14.952 --> 05:17.190
We will be importing some optimizers

05:17.190 --> 05:19.007
to perform stochastic gradient descent.

05:19.007 --> 05:21.131
So we will definitely need it.

05:21.131 --> 05:24.782
And then we need to import autograd,

05:24.782 --> 05:28.246
and that's only to take the Variable class from autograd.

05:28.246 --> 05:31.620
So the purpose of it is a little bit technical.

05:31.620 --> 05:34.503
Basically, we need to import the variable class

05:34.503 --> 05:38.100
to convert tensors,

05:38.100 --> 05:40.170
which are like more advanced arrays,

05:40.170 --> 05:42.712
into variables that also contain a gradient.

05:42.712 --> 05:46.139
So it's like we don't want to have only a tensor by itself.

05:46.139 --> 05:48.507
We want to put the tensor into a variable

05:48.507 --> 05:50.541
that will also contain a gradient.

05:50.541 --> 05:51.630
And to do this,

05:51.630 --> 05:54.960
we need to use the variable class to convert this tensor

05:54.960 --> 05:58.470
into a variable containing the tensor and the gradient.

05:58.470 --> 05:59.670
So that's a little bit technical

05:59.670 --> 06:02.640
but that's what we have to do when working with PyTorch.

06:02.640 --> 06:04.890
And we do this thanks to the variable class

06:04.890 --> 06:06.960
but before getting the variable class

06:06.960 --> 06:10.442
we need to import torch.autograd

06:10.442 --> 06:14.932
and let's give a shortcut as well, autograd.

06:14.932 --> 06:19.932
And then from torch.autograd, we import Variable.

06:25.830 --> 06:26.663
There we go.

06:26.663 --> 06:28.980
And now we have all the libraries

06:28.980 --> 06:31.590
that we'll be using to implement our AI.
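
NOTE
A sketch of the import section as described in this tutorial. The try/except guard and the HAVE_DEPS flag are additions for illustration, so the sketch still runs where NumPy or PyTorch is not installed; the tutorial itself presumably imports them directly. Variable has been deprecated since PyTorch 0.4, where tensors track gradients themselves, but the import still works.
```python
# Standard-library modules named in the tutorial.
import os      # checking for a saved model file before loading it
import random  # drawing random samples for experience replay
# Third-party modules, guarded (an assumption of this sketch, see the note above).
try:
    import numpy as np                   # array handling, with np as the shortcut
    import torch                         # PyTorch itself
    import torch.nn as nn                # building blocks for neural networks
    import torch.nn.functional as F      # functions such as the Huber loss
    import torch.optim as optim          # optimizers for stochastic gradient descent
    import torch.autograd as autograd    # automatic differentiation
    from torch.autograd import Variable  # legacy wrapper pairing a tensor with its gradient
    HAVE_DEPS = True   # hypothetical flag, not part of the tutorial
except ImportError:
    HAVE_DEPS = False
```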

06:31.590 --> 06:33.626
So we won't bother importing any other library.

06:33.626 --> 06:35.443
We have all the tools we need

06:35.443 --> 06:38.114
and now we are ready to create the architecture

06:38.114 --> 06:39.385
of the neural network.

06:39.385 --> 06:42.284
So that's exactly what we'll do in the next tutorial.

06:42.284 --> 06:44.673
And until then, enjoy AI.