WEBVTT

00:00.390 --> 00:02.670
Hello and welcome to this tutorial.

00:02.670 --> 00:05.970
All right, it's now time to build our very first AI.

00:05.970 --> 00:06.803
Because right now,

00:06.803 --> 00:08.820
we've only made the instruction manual

00:08.820 --> 00:12.330
with the AI class, but we haven't created any object yet,

00:12.330 --> 00:15.150
and so we don't have a real, actual AI yet.

00:15.150 --> 00:16.740
But we're about to get it right now

00:16.740 --> 00:20.400
because we're about to create one object of this AI class.

00:20.400 --> 00:23.010
And this object will be nothing other than an AI,

00:23.010 --> 00:25.800
which will have a brain and a body.

00:25.800 --> 00:27.090
All right, so let's do this.

00:27.090 --> 00:29.610
It's actually very simple to do it

00:29.610 --> 00:32.550
now that we have defined everything with the classes.

00:32.550 --> 00:34.680
So, basically, what we need to do

00:34.680 --> 00:37.830
is first, create a brain, because, as you can see,

00:37.830 --> 00:40.860
when we create an AI, we need to input a brain,

00:40.860 --> 00:42.140
but we also input a body,

00:42.140 --> 00:44.250
so we need to create a body as well.

00:44.250 --> 00:48.030
And then, once we have created a brain object and a body object,

00:48.030 --> 00:50.940
well, we will be able to create the AI.

00:50.940 --> 00:51.773
But no worries,

00:51.773 --> 00:54.150
we will build the brain and the body in a flash.

00:54.150 --> 00:55.830
And actually, let's do it right now.

00:55.830 --> 00:57.210
Let's start with the brain.

00:57.210 --> 00:59.460
We're gonna call the brain CNN

00:59.460 --> 01:02.160
because the brain is a convolutional neural network

01:02.160 --> 01:05.730
and it will be an object of the CNN class,

01:05.730 --> 01:07.800
so it makes sense to call it CNN.

01:07.800 --> 01:12.120
So CNN equals, and then we take our CNN class this time,

01:12.120 --> 01:15.120
and what do we input in the parentheses, according to you?

01:15.120 --> 01:16.650
Well, at this point right now

01:16.650 --> 01:19.050
when we create an object of a class,

01:19.050 --> 01:21.360
what we have to input is very simply

01:21.360 --> 01:24.330
the argument of the init function.

01:24.330 --> 01:25.890
And that's number actions.

01:25.890 --> 01:27.840
And thanks to what we did previously,

01:27.840 --> 01:29.520
when getting the Doom environment,

01:29.520 --> 01:31.680
we already have this number actions

01:31.680 --> 01:36.000
and therefore, we simply need to input number actions here

01:36.000 --> 01:38.130
into the CNN class.
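
NOTE
A minimal sketch of the step just described: creating an object means
calling the class with the arguments of its init function. The CNN class
below is a hypothetical stand-in with no real layers, and the value 7 for
number_actions is an assumption; in the tutorial it comes from the Doom
environment.
```python
# Hypothetical stand-in for the CNN class; the real one would define
# its convolutional and fully connected layers in the constructor.
class CNN:
    def __init__(self, number_actions):
        # One output (one Q-value) per action the agent can play.
        self.number_actions = number_actions
    def forward(self, image):
        # Placeholder: return a zero Q-value for each action.
        return [0.0] * self.number_actions
number_actions = 7  # assumed value, normally read from the environment
cnn = CNN(number_actions)  # the brain object
print(len(cnn.forward(image=None)))
```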

01:38.130 --> 01:38.970
Perfect.

01:38.970 --> 01:40.350
So now we have the brain.

01:40.350 --> 01:42.480
Now, let's make the body.

01:42.480 --> 01:45.510
We're gonna create an object of the softmax body class

01:45.510 --> 01:49.620
and we're gonna call this object softmax body.

01:49.620 --> 01:51.420
That will be the body of our AI.

01:51.420 --> 01:56.420
And this object is an object of the softmax body class

01:57.240 --> 02:00.090
to which we have to input the only argument

02:00.090 --> 02:02.790
of the init function of the softmax body class,

02:02.790 --> 02:04.410
which is the temperature, T.

02:04.410 --> 02:07.890
And therefore, here we input T, but we have to specify a value

02:07.890 --> 02:10.680
because so far, T is just an argument.

02:10.680 --> 02:14.490
So T equals, and we're gonna start with one.

02:14.490 --> 02:17.670
That's a small temperature, but this might work very well.

02:17.670 --> 02:21.660
And actually, I already know this will work very well.

02:21.660 --> 02:23.670
But you can try with other temperatures.

02:23.670 --> 02:24.750
You know how it works now.

02:24.750 --> 02:27.840
With a higher temperature, your actions will be more sure of themselves.

02:27.840 --> 02:29.940
That is, the action with the highest Q-value

02:29.940 --> 02:32.280
will have a higher probability to be selected,

02:32.280 --> 02:34.020
as opposed to the other actions,

02:34.020 --> 02:36.450
which will have lower probabilities to be selected

02:36.450 --> 02:38.490
and therefore, they will be less explored.

02:38.490 --> 02:40.800
But anyway, we can start with one.

02:40.800 --> 02:43.143
This will get us a good body.
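
NOTE
A rough sketch of how the temperature could shape action selection. It
assumes T multiplies the Q-values before the softmax, which matches the
behaviour described here (a higher T favours the action with the highest
Q-value). The function name and the Q-values are made up for illustration
and are not taken from the tutorial's code.
```python
import math
def softmax_probabilities(q_values, T):
    # Assumption: the temperature T scales the Q-values before the softmax.
    scaled = [q * T for q in q_values]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
q_values = [1.0, 2.0, 3.0]  # made-up Q-values for three actions
for T in (1.0, 10.0):
    probs = softmax_probabilities(q_values, T)
    print(T, [round(p, 3) for p in probs])
```
Under this assumption, the larger T puts almost all of the probability on
the last action, so the other actions are explored less.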

02:44.010 --> 02:47.310
All right, so now we have a brain, we have a body,

02:47.310 --> 02:50.850
so I guess it's finally time to make the AI.

02:50.850 --> 02:52.620
So now, you're gonna see

02:52.620 --> 02:54.780
how things are gonna become so simple.

02:54.780 --> 02:57.150
It's when the intuition reaches its peak.

02:57.150 --> 03:00.090
To make an AI, we simply need to create an object

03:00.090 --> 03:03.930
that we call, of course, AI from our AI class.

03:03.930 --> 03:07.020
And since an AI is composed of a brain and a body,

03:07.020 --> 03:08.640
we input the brain,

03:08.640 --> 03:12.630
which is our convolutional neural network, the CNN object,

03:12.630 --> 03:15.060
and a body, which is nothing other

03:15.060 --> 03:20.060
than the softmax body object from the softmax body class.

03:21.420 --> 03:24.840
And see, we built an AI in a flash

03:24.840 --> 03:27.600
by just inputting a brain and a body.

03:27.600 --> 03:30.570
And now, we have an AI ready to be trained.
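
NOTE
A small sketch of the composition described above: an AI object that
simply holds a brain and a body. The __call__ method, toy_brain and
greedy_body are hypothetical stand-ins so the example runs on its own;
the body here picks the best action greedily instead of sampling from
a softmax.
```python
class AI:
    def __init__(self, brain, body):
        self.brain = brain
        self.body = body
    def __call__(self, inputs):
        # The brain turns the input into Q-values; the body picks an action.
        q_values = self.brain(inputs)
        return self.body(q_values)
# Hypothetical stand-ins for the CNN and softmax body objects.
def toy_brain(inputs):
    return [0.1, 0.7, 0.2]
def greedy_body(q_values):
    return max(range(len(q_values)), key=lambda i: q_values[i])
ai = AI(brain=toy_brain, body=greedy_body)
print(ai(inputs=None))  # index of the chosen action
```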

03:30.570 --> 03:31.740
So now it's time to launch

03:31.740 --> 03:34.680
the whole deep convolutional Q-learning process

03:34.680 --> 03:35.910
with experience replay,

03:35.910 --> 03:38.940
plus that bonus of eligibility trace on 10 steps.

03:38.940 --> 03:41.130
And finally, once we have all this,

03:41.130 --> 03:43.980
we will train the AI to make it smart.

03:43.980 --> 03:46.020
So I can't wait to do this.

03:46.020 --> 03:47.790
The next section is gonna be about

03:47.790 --> 03:49.650
setting up experience replay.

03:49.650 --> 03:52.020
So we're not going to implement it all over again,

03:52.020 --> 03:54.840
like for the self-driving car, because the good news is

03:54.840 --> 03:57.000
that we already have it implemented.

03:57.000 --> 03:57.990
So that will be fast.

03:57.990 --> 04:01.200
We will just create an object of the replay memory class

04:01.200 --> 04:03.480
that is in this experience replay file.

04:03.480 --> 04:04.830
So that will help us a lot.

04:04.830 --> 04:06.660
And therefore, we will move on quickly

04:06.660 --> 04:10.020
to what's new and most important, that is the training.

04:10.020 --> 04:12.150
So let's tackle this in the next tutorials.

04:12.150 --> 04:13.983
And until then, enjoy AI.
