WEBVTT

00:00.330 --> 00:02.850
-: Hello, and welcome to the fun tutorial

00:02.850 --> 00:05.130
of this first module, the self-driving car.

00:05.130 --> 00:06.330
It's gonna be epic.

00:06.330 --> 00:09.030
We're gonna test our AI on the environment

00:09.030 --> 00:11.790
and we're gonna test it on four different levels.

00:11.790 --> 00:13.380
That is, we're gonna play a game.

00:13.380 --> 00:15.660
The game will have four levels of difficulty,

00:15.660 --> 00:18.390
and the AI will have to pass these four levels.

00:18.390 --> 00:20.610
So what are gonna be these four levels?

00:20.610 --> 00:23.852
First, level one: the first level is going to be

00:23.852 --> 00:27.180
to reach the airport and then do some round trips

00:27.180 --> 00:29.310
between the airport and the downtown.

00:29.310 --> 00:31.620
So as soon as we see the car do these round trips

00:31.620 --> 00:33.630
well, we pass level one.

00:33.630 --> 00:34.680
Then level two.

00:34.680 --> 00:37.560
Level two will be to still do these round trips

00:37.560 --> 00:41.130
but on a specific road that we draw ourselves.

00:41.130 --> 00:43.920
But it's gonna be an easy road because it's level two.

00:43.920 --> 00:47.022
And of course the car will have to self drive

00:47.022 --> 00:49.170
by staying on that road.

00:49.170 --> 00:50.340
So it'll be a road that goes

00:50.340 --> 00:53.400
from the airport to downtown and then the other way.

00:53.400 --> 00:55.830
And so the car will have to do these round trips

00:55.830 --> 00:57.180
by staying on that road.

00:57.180 --> 00:59.820
If it does, we will pass level two.

00:59.820 --> 01:03.060
Then level three, level three will be to draw some obstacles

01:03.060 --> 01:07.260
on the map to see if the car manages to avoid the obstacles

01:07.260 --> 01:08.880
and still reach its goal.

01:08.880 --> 01:11.340
So no worries, we'll draw some difficult obstacles

01:11.340 --> 01:13.620
that the car will have to avoid, and we'll see

01:13.620 --> 01:16.711
if it manages to reach the airport and the downtown.

01:16.711 --> 01:18.960
And finally, level four,

01:18.960 --> 01:21.630
the most challenging level for the car will be

01:21.630 --> 01:25.410
to draw a very difficult road to reach downtown.

01:25.410 --> 01:28.170
So I don't know, you know, it'll be a road like some zigzag.

01:28.170 --> 01:29.820
I'm not a brilliant architect

01:29.820 --> 01:31.920
but I'll try to make a challenging road.

01:31.920 --> 01:35.280
So let's hope we pass at least the first level.

01:35.280 --> 01:36.300
That would be great.

01:36.300 --> 01:39.360
Then let's hope we can also pass level two and three.

01:39.360 --> 01:42.150
And if we pass level four, that would be wonderful.

01:42.150 --> 01:43.200
So let's do this.

01:43.200 --> 01:44.580
Let's take the challenge.

01:44.580 --> 01:46.250
Well, actually the self-driving car

01:46.250 --> 01:47.850
is gonna take the challenge

01:47.850 --> 01:49.830
but we are the brain behind this

01:49.830 --> 01:52.080
so let's still hope that works.

01:52.080 --> 01:54.960
All right, so the first thing I'm gonna do is just

01:54.960 --> 01:57.300
to give you a quick reminder about the map.

01:57.300 --> 01:59.160
So that's the map.

01:59.160 --> 02:00.960
And first we're gonna look at the map.

02:00.960 --> 02:04.140
We're gonna look at the self-driving car without the AI.

02:04.140 --> 02:06.990
So it'll just be a car having those random actions

02:06.990 --> 02:09.570
that you saw at the beginning of this module.

02:09.570 --> 02:11.010
So how can we look at that?

02:11.010 --> 02:13.770
Well, we have to deactivate the AI.

02:13.770 --> 02:15.510
And to deactivate the AI

02:15.510 --> 02:19.620
we simply need to put a temperature equal to zero.

02:19.620 --> 02:22.500
Remember that parameter here is the temperature

02:22.500 --> 02:24.180
and right now it is equal to seven.

02:24.180 --> 02:25.800
So that's a low temperature.

02:25.800 --> 02:27.480
We will increase that afterwards.

02:27.480 --> 02:29.820
But if we don't want the car to have a brain,

02:29.820 --> 02:32.580
that is if we don't want to activate the AI, we simply

02:32.580 --> 02:36.180
need to set the temperature to zero, to equal zero,

02:36.180 --> 02:38.940
and same here, of course, that's the real temperature

02:38.940 --> 02:39.810
in the code.

02:39.810 --> 02:42.852
So there we go, and then we must not forget to save.

02:42.852 --> 02:45.780
Because otherwise that won't include the change.

02:45.780 --> 02:47.520
Okay, so now we don't have any AI.

02:47.520 --> 02:49.230
The AI is deactivated.
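
NOTE
Editor's sketch of why a temperature of zero switches the AI off. It assumes,
consistently with the narration, that the temperature multiplies the Q-values
before the softmax. The function name and the Q-values are hypothetical.
```python
import math
def action_probabilities(q_values, temperature):
    # Assumption: the temperature scales the Q-values before the softmax.
    scaled = [q * temperature for q in q_values]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
# Hypothetical Q-values for the three actions: left, straight, right.
q_values = [0.2, 1.0, 0.5]
# With temperature 0 every scaled value is 0, so all actions are equally likely.
print(action_probabilities(q_values, 0))
```
Under that assumption the learned Q-values no longer influence the choice,
which matches the random driving shown next.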

02:49.230 --> 02:51.840
So let's have a look at the map just to give us

02:51.840 --> 02:54.750
a quick refresher, a quick reminder about what it looks

02:54.750 --> 02:59.750
like, so I'm going to select everything and press enter.

03:00.507 --> 03:03.840
All right, and there's our map and there's our car.

03:03.840 --> 03:07.620
So as you can see, the car is taking totally random actions,

03:07.620 --> 03:10.110
you know, to go left, to go straight or to go right,

03:10.110 --> 03:13.140
and therefore it is not reaching the airport, which

03:13.140 --> 03:17.520
is, I remind you, at the upper left of the map, and not reaching,

03:17.520 --> 03:20.250
well, it just did, but that's totally random.

03:20.250 --> 03:22.800
You see, right now it is at the airport

03:22.800 --> 03:25.080
and it is not reaching the other goal

03:25.080 --> 03:28.440
which is downtown at the bottom right of the map.

03:28.440 --> 03:31.500
So we were just like here, but we can clearly see now

03:31.500 --> 03:33.990
that the actions are totally random.

03:33.990 --> 03:35.850
It is going nowhere

03:35.850 --> 03:39.510
and there is definitely no artificial intelligence.

03:39.510 --> 03:42.360
But no worries, we will activate it right now.

03:42.360 --> 03:45.210
I'm going to close the map,

03:45.210 --> 03:48.270
and then I'm going to restart the kernel.

03:48.270 --> 03:49.710
Restart the kernel.

03:49.710 --> 03:52.800
You click on this tool button here, and then yes.

03:52.800 --> 03:54.750
And now time for the show.

03:54.750 --> 03:58.230
We're finally going to put this brain we made

03:58.230 --> 04:01.680
in the car and activate the AI.

04:01.680 --> 04:04.110
I'm super excited to see what's gonna happen.

04:04.110 --> 04:06.360
We are gonna activate the AI right now,

04:06.360 --> 04:09.480
and to do this we need to raise the temperature.

04:09.480 --> 04:12.090
So to change the temperature, we just need to replace

04:12.090 --> 04:17.090
that zero by, well, let's start with seven as we had before.

04:17.430 --> 04:19.650
So let's specify seven here.

04:19.650 --> 04:21.570
All right, let's not forget to save.

04:21.570 --> 04:23.640
And now let's get back to our map.

04:23.640 --> 04:25.890
And now we can just re-execute this again

04:25.890 --> 04:27.870
because we restarted the kernel.

04:27.870 --> 04:29.970
So let's execute.

04:29.970 --> 04:31.770
And there we go, we have the car.

04:31.770 --> 04:33.870
And what is it doing?

04:33.870 --> 04:37.830
Well, it is trying to find its way, it's exploring,

04:37.830 --> 04:39.960
it's understanding what it has to do

04:39.960 --> 04:42.030
and it's about to reach the airport.

04:42.030 --> 04:45.660
And there we go, first goal reached, wonderful.

04:45.660 --> 04:48.120
And now the next goal is to reach downtown.

04:48.120 --> 04:51.690
And there it did just reach downtown, and now it's trying

04:51.690 --> 04:54.240
to find its way back to the airport.

04:54.240 --> 04:55.562
And there it did again.

04:55.562 --> 04:57.450
Wonderful, so that works.

04:57.450 --> 04:58.850
It didn't take much time, actually,

04:59.922 --> 05:01.260
to explore, learn from the mistake,

05:01.260 --> 05:04.500
you know, the mistake here is to get further from the goal.

05:04.500 --> 05:06.090
That's where we punish the car

05:06.090 --> 05:08.970
by giving it a slightly negative reward, you know

05:08.970 --> 05:10.500
it's minus 0.2.

05:10.500 --> 05:12.630
So it learned from that mistake.

05:12.630 --> 05:14.940
And by learning from that mistake, it managed

05:14.940 --> 05:19.260
to get the positive rewards by getting closer to the goal.
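
NOTE
Editor's sketch of the distance-based reward described here. The two values
(-0.2 and +0.1) come from the narration. The function name, the arguments and
the treatment of an unchanged distance are assumptions made for illustration.
```python
def step_reward(previous_distance, distance):
    # +0.1 when the car got closer to the goal, -0.2 otherwise.
    # Treating an unchanged distance as "not closer" is an assumption.
    if distance < previous_distance:
        return 0.1
    return -0.2
print(step_reward(120.0, 115.0))  # car moved closer to the goal
print(step_reward(115.0, 118.0))  # car moved away from the goal
```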

05:19.260 --> 05:21.930
And now it finally understood what it has to do.

05:21.930 --> 05:24.810
It's definitely reaching the airport and

05:24.810 --> 05:28.470
then reaching the downtown and then doing these round trips.

05:28.470 --> 05:29.940
So that's perfect.

05:29.940 --> 05:33.540
We have a self-driving car, but I can't help

05:33.540 --> 05:36.120
but notice it is looking like an insect.

05:36.120 --> 05:38.460
The car doesn't really seem sure of itself.

05:38.460 --> 05:40.590
You know, it doesn't have a very confident movement.

05:40.590 --> 05:42.720
It's like doing left and right.

05:42.720 --> 05:44.640
That's not looking like a car movement.

05:44.640 --> 05:46.470
It looks more like a bug.

05:46.470 --> 05:47.910
So we're gonna fix that.

05:47.910 --> 05:51.240
And as you might have guessed, the way to fix that is

05:51.240 --> 05:54.600
to increase the temperature, because remember, the temperature

05:54.600 --> 05:56.850
is the parameter in the softmax function

05:56.850 --> 05:57.960
that we can increase so

05:57.960 --> 06:00.810
that the action is returned with more certainty.

06:00.810 --> 06:03.480
So that makes sense that if we increase the temperature,

06:03.480 --> 06:05.642
well we might end up getting a car more sure of itself

06:05.642 --> 06:08.130
because the AI will be more sure

06:08.130 --> 06:10.200
of which action it should play.

06:10.200 --> 06:12.120
And that remember is because

06:12.120 --> 06:15.300
the action will be played with a higher probability.

06:15.300 --> 06:18.150
The only problem with this, increasing the temperature

06:18.150 --> 06:21.870
is that, remember, the AI explores the other actions less

06:21.870 --> 06:23.430
because by increasing the temperature,

06:23.430 --> 06:26.370
the other actions will have low probabilities.
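
NOTE
Editor's sketch of the trade-off just described, comparing temperatures 7 and
100. It assumes the temperature multiplies the Q-values before the softmax.
The helper names and the Q-values are hypothetical.
```python
import math
import random
def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]
def select_action(q_values, temperature, rng):
    # Assumption: a larger temperature sharpens the distribution.
    probs = softmax([q * temperature for q in q_values])
    action = rng.choices(range(len(q_values)), weights=probs, k=1)[0]
    return action, probs
rng = random.Random(0)
q_values = [0.40, 0.55, 0.45]  # hypothetical values for left, straight, right
for temperature in (7, 100):
    _, probs = select_action(q_values, temperature, rng)
    print(temperature, [round(p, 3) for p in probs])
```
At 7 the best action gets only a bit more than half of the probability mass,
so the car still tries the others. At 100 it is picked almost every time.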

06:26.370 --> 06:28.170
But right now, that doesn't seem to be a problem

06:28.170 --> 06:32.062
because the car seems to have no problem reaching its goals

06:32.062 --> 06:34.080
the airport and the downtown.

06:34.080 --> 06:37.050
So we can totally increase the temperature if we want

06:37.050 --> 06:41.370
this thing that so far looks like an insect to look like a car.

06:41.370 --> 06:44.820
So let's do this, I'm going to close this now.

06:44.820 --> 06:49.820
There we go, restart the kernel again and press yes.

06:50.790 --> 06:53.160
And now we're going to increase the temperature.

06:53.160 --> 06:56.580
So let's do this, I'm going back to my AI file

06:56.580 --> 07:00.003
then replace this T equals seven by 100.

07:01.380 --> 07:03.720
There we go, then we save,

07:03.720 --> 07:07.860
and now we have a self-driving car, sure of itself.

07:07.860 --> 07:09.180
So we might get better results

07:09.180 --> 07:12.180
and we might get something that looks more like a car.

07:12.180 --> 07:15.753
So let's click on map and then let's re-execute that again.

07:16.590 --> 07:18.930
All right, what happened?

07:18.930 --> 07:23.190
Okay, it did some kind of a burnout, not sure why, but anyway,

07:23.190 --> 07:26.010
now we have something that looks more like a car.

07:26.010 --> 07:28.230
You can see that it is going more straight.

07:28.230 --> 07:31.140
It is not doing these quick left and right movements.

07:31.140 --> 07:33.360
That's because now the car is more sure

07:33.360 --> 07:35.760
of which direction to take at each time.

07:35.760 --> 07:38.250
You know, it wants to take the best direction,

07:38.250 --> 07:40.980
going to the airport and then to downtown.

07:40.980 --> 07:44.460
So clearly we can now say that we passed level one.

07:44.460 --> 07:46.080
The car is doing these round trips

07:46.080 --> 07:47.970
between the airport and the downtown.

07:47.970 --> 07:49.800
So we're gonna save that.

07:49.800 --> 07:52.560
That's, I'm gonna show you how to save the brain.

07:52.560 --> 07:55.140
We just need to click on this save button.

07:55.140 --> 07:59.550
And if we look at what happens here,

07:59.550 --> 08:01.770
well we have the curve of the reward.

08:01.770 --> 08:04.740
At the beginning, we can observe some mistakes that it made.

08:04.740 --> 08:07.170
So that's where the reward is negative.

08:07.170 --> 08:09.240
But then it learned from its mistakes

08:09.240 --> 08:12.480
and the reward increased little by little

08:12.480 --> 08:16.680
until reaching a constant positive reward equal to 0.1.

08:16.680 --> 08:19.200
But that's the maximum reward we set.
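
NOTE
Editor's sketch of how a reward curve like this one can level off at 0.1. It
assumes the plot shows a sliding-window mean of recent rewards. The function
name, the window size and the reward sequence are made up for illustration.
```python
from collections import deque
def reward_curve(rewards, window=1000):
    # Mean of the most recent rewards after each step (window size is hypothetical).
    recent = deque(maxlen=window)
    curve = []
    for reward in rewards:
        recent.append(reward)
        curve.append(sum(recent) / len(recent))
    return curve
rewards = [-0.2] * 5 + [0.1] * 20  # made-up run: early mistakes, then steady progress
curve = reward_curve(rewards, window=10)
print(curve[0], curve[-1])
```
Once the window holds only +0.1 rewards, the mean cannot rise any further,
which is why the curve flattens at the maximum.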

08:19.200 --> 08:21.840
And that's because it ended up exploring.

08:21.840 --> 08:23.640
That's the exploration phase

08:23.640 --> 08:25.712
and then it just knew what it had to do

08:25.712 --> 08:28.770
and that's where it was doing these round trips

08:28.770 --> 08:32.610
between the airport and the downtown without any mistake.

08:32.610 --> 08:35.121
So there we go, we passed level one, congratulations,

08:35.121 --> 08:38.160
now let's make things more challenging.

08:38.160 --> 08:40.110
Let's take things to the next level.

08:40.110 --> 08:41.820
Let's try to pass level two

08:41.820 --> 08:44.640
which, I remind you, will be to do these round trips

08:44.640 --> 08:47.130
on a specific road we're gonna draw ourselves.

08:47.130 --> 08:49.170
So let's check that out in the next tutorial.

08:49.170 --> 08:51.123
And until then, enjoy AI.
