WEBVTT

00:00.570 --> 00:01.710
-: What Allison should know-

00:01.710 --> 00:03.257
-: What is Internet anyway?

00:04.470 --> 00:07.950
-: Internet is that massive computer network?

00:07.950 --> 00:10.980
-: The one that's becoming really big now

00:10.980 --> 00:12.102
-: What do you mean that's big?

00:12.102 --> 00:12.935
How does one-

00:12.935 --> 00:13.768
what do you write to it like mail?

00:13.768 --> 00:16.380
-: No, a lot of people use it and communicate.

00:16.380 --> 00:18.930
I guess they can communicate with NBC writers and producers.

00:18.930 --> 00:21.270
Allison, can you explain what internet is?

00:21.270 --> 00:25.603
(Beethoven's Symphony no. 5 starts)

00:29.130 --> 00:30.630
-: How amazing is that?

00:30.630 --> 00:31.980
Just over 20 years ago

00:31.980 --> 00:34.470
people didn't even know what the internet was.

00:34.470 --> 00:37.290
And today we can't even imagine our lives without it.

00:37.290 --> 00:39.540
Welcome to the Deep Learning A to Z course.

00:39.540 --> 00:40.710
My name is Kirill Eremenko

00:40.710 --> 00:43.230
and along with the co-instructor Hadelin de Ponteves,

00:43.230 --> 00:45.570
we're super excited to have you on board

00:45.570 --> 00:47.580
and today we're going to give you a quick overview

00:47.580 --> 00:49.470
of what deep learning is

00:49.470 --> 00:52.260
and why it's picking up right now.

00:52.260 --> 00:53.610
So let's get started.

00:53.610 --> 00:55.380
Why did we have a look at that clip

00:55.380 --> 00:57.660
and what is this photo over here?

00:57.660 --> 01:00.270
Well, that clip was from 1994.

01:00.270 --> 01:03.180
This is a photo of a computer from 1980

01:03.180 --> 01:05.580
and the reason why we're kind of delving

01:05.580 --> 01:06.600
into history a little bit

01:06.600 --> 01:10.050
is because neural networks along with deep learning

01:10.050 --> 01:12.150
have been around for quite some time

01:12.150 --> 01:14.910
and they've only started picking up now

01:14.910 --> 01:16.740
and impacting the world right now.

01:16.740 --> 01:19.020
But if you look back at the eighties you'll see

01:19.020 --> 01:20.370
that even though they were invented

01:20.370 --> 01:22.230
in the sixties and seventies,

01:22.230 --> 01:25.020
they really caught onto a trend

01:25.020 --> 01:27.690
or caught wind in the eighties.

01:27.690 --> 01:30.780
So people started talking about them a lot.

01:30.780 --> 01:32.880
There was a lot of research in that area

01:32.880 --> 01:34.830
and everybody thought that deep learning

01:34.830 --> 01:38.490
or neural networks were this new thing that

01:38.490 --> 01:39.960
is going to impact the world,

01:39.960 --> 01:41.340
that's going to change everything,

01:41.340 --> 01:43.140
is gonna solve all the world's problems.

01:43.140 --> 01:46.110
And then it kind of slowly died off over the next decade.

01:46.110 --> 01:47.108
And so what happened?

01:47.108 --> 01:49.890
Why did the neural networks not survive

01:49.890 --> 01:50.850
and not change the world?

01:50.850 --> 01:52.710
Was the reason for that

01:52.710 --> 01:54.030
that they were just not good enough,

01:54.030 --> 01:55.530
that they're, you know,

01:55.530 --> 01:57.240
not that good at predicting things

01:57.240 --> 01:58.710
and not that good at modeling

01:58.710 --> 02:02.340
and basically just not a good invention?

02:02.340 --> 02:03.420
Or is there another reason?

02:03.420 --> 02:05.160
Well, actually there is another reason

02:05.160 --> 02:06.750
and the reason is in front of us.

02:06.750 --> 02:08.880
It's the fact that technology back then

02:08.880 --> 02:11.640
was not up to the right standard

02:11.640 --> 02:13.740
to facilitate neural networks.

02:13.740 --> 02:16.410
In order for neural networks and deep learning

02:16.410 --> 02:17.880
to work properly you need two things.

02:17.880 --> 02:20.220
You need data and you need a lot of data

02:20.220 --> 02:21.630
and you need processing power.

02:21.630 --> 02:24.120
You need strong computers to process that data

02:24.120 --> 02:25.980
and facilitate the neural networks.

02:25.980 --> 02:29.850
So let's have a look at how data

02:29.850 --> 02:32.310
or storage of data has evolved over the years

02:32.310 --> 02:34.830
and then we'll look at how technology has evolved.

02:34.830 --> 02:38.853
So here we've got three years, 1956, 1980, 2017.

02:39.810 --> 02:43.260
How did storage look back in 1956?

02:43.260 --> 02:45.420
Well, there's a hard drive and

02:45.420 --> 02:48.480
that hard drive is only a five, wait for it,

02:48.480 --> 02:49.770
megabyte hard drive.

02:49.770 --> 02:53.670
That's 5 megabytes right there on the forklift.

02:53.670 --> 02:55.290
The size of a small room,

02:55.290 --> 02:58.080
that's a hard drive being transported

02:58.080 --> 03:01.350
to another location on a plane.

03:01.350 --> 03:04.220
And that is what storage looked like in the-

03:04.220 --> 03:05.580
in 1956.

03:05.580 --> 03:06.420
You had to pay-

03:06.420 --> 03:09.030
a company had to pay two and a half thousand dollars

03:09.030 --> 03:12.419
in those days' dollars to rent that hard drive.

03:12.419 --> 03:15.423
To rent it, not buy it, to rent it for one month.

03:16.380 --> 03:18.810
In 1980, the situation improved a little bit.

03:18.810 --> 03:20.460
So here we've got a 10 megabyte hard drive

03:20.460 --> 03:22.800
for three and a half thousand dollars.

03:22.800 --> 03:25.170
It's still very expensive and only 10 megabytes.

03:25.170 --> 03:27.240
So that's like one photo these days.

03:27.240 --> 03:32.240
And today in 2017, we've got a 256 gigabyte SSD card

03:32.850 --> 03:37.080
for $150, which can fit on your finger.

03:37.080 --> 03:40.980
And if you're watching this video a year later, or like

03:40.980 --> 03:43.830
in 2019 or 2025, you're probably laughing to yourself

03:43.830 --> 03:47.250
because by then you have even stronger storage capacity.

03:47.250 --> 03:49.110
But nevertheless, the point stands.

03:49.110 --> 03:51.270
So if we compare these across the board

03:51.270 --> 03:54.000
and without even taking price and size into consideration

03:54.000 --> 03:58.380
just the capacity of whatever was trending at the time.

03:58.380 --> 04:03.380
So from 1956 to 1980, capacity about doubled

04:04.230 --> 04:09.180
and then, from 1980 to 2017, it increased about 25,600 times.

04:09.180 --> 04:12.390
And the, you know, the length of the period

04:12.390 --> 04:13.260
is not that different.

04:13.260 --> 04:16.200
So from 1956 to 1980, 24 years

04:16.200 --> 04:18.810
from 1980 to 2017, 37 years

04:18.810 --> 04:21.780
so not that much of an increase in time

04:21.780 --> 04:24.900
but a huge jump in technological progress.

04:24.900 --> 04:28.230
And that stands to show that this is not a linear trend

04:28.230 --> 04:30.600
this is an exponential growth in technology.

04:30.600 --> 04:33.840
And if we take into account price and size,

04:33.840 --> 04:37.260
the increase will be in the millions.
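
NOTE
A rough check of the figures quoted in this part of the video, offered as a sketch rather than as part of the lecture. It assumes decimal units (256 GB = 256,000 MB) and treats the quoted prices as directly comparable, which they are not quite: the 1956 figure is a monthly rental. All names in the code are illustrative.
```python
# Capacities and prices as quoted in the video.
# Assumption: decimal units, so 256 GB = 256,000 MB.
DRIVES = {
    1956: {"capacity_mb": 5, "price_usd": 2_500},       # rental price per month
    1980: {"capacity_mb": 10, "price_usd": 3_500},
    2017: {"capacity_mb": 256_000, "price_usd": 150},
}
def growth_factor(start_year, end_year):
    """How many times capacity grew between two of the years above."""
    return DRIVES[end_year]["capacity_mb"] / DRIVES[start_year]["capacity_mb"]
def yearly_growth(start_year, end_year):
    """Average compound growth factor per year over the period."""
    years = end_year - start_year
    return growth_factor(start_year, end_year) ** (1 / years)
def cost_per_mb(year):
    """Quoted price divided by quoted capacity, in dollars per megabyte."""
    return DRIVES[year]["price_usd"] / DRIVES[year]["capacity_mb"]
print(growth_factor(1956, 1980))                # 2.0
print(growth_factor(1980, 2017))                # 25600.0
print(yearly_growth(1956, 1980))                # average growth per year, 1956 to 1980
print(yearly_growth(1980, 2017))                # average growth per year, 1980 to 2017
print(cost_per_mb(1980) / cost_per_mb(2017))    # drop in cost per MB, 1980 to 2017
```
Under these assumptions the average yearly growth works out to about 3% for 1956 to 1980 and about 32% for 1980 to 2017, and the cost per megabyte falls by a factor of roughly 600,000 between 1980 and 2017, before physical size is even considered.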

04:37.260 --> 04:40.620
And here we actually have a chart on a logarithmic scale.

04:40.620 --> 04:44.520
So if we plot the hard drive cost per gigabyte

04:44.520 --> 04:46.410
you'll see that it looks something like this.

04:46.410 --> 04:50.250
We're very quickly approaching zero right now.

04:50.250 --> 04:52.980
You can get storage on Dropbox and Google Drive

04:52.980 --> 04:55.620
which doesn't cost you anything, cloud storage.

04:55.620 --> 04:57.330
And that's going to continue.

04:57.330 --> 04:58.980
And in fact, over the years

04:58.980 --> 05:01.290
this is going to go even further.

05:01.290 --> 05:04.950
Right now, scientists are already looking into using DNA

05:04.950 --> 05:07.740
for storage and right now it's quite expensive.

05:07.740 --> 05:12.040
It costs $7,000 to synthesize two megabytes of data

05:13.020 --> 05:15.240
and then another $2,000 to read it.

05:15.240 --> 05:16.320
But that kind of reminds you

05:16.320 --> 05:18.960
of this whole situation of the hard drive and the plane.

05:18.960 --> 05:20.700
You know that this is gonna be mitigated

05:20.700 --> 05:22.270
very, very quickly with this exponential curve.

05:22.270 --> 05:25.290
10 years from now, 20 years from now

05:25.290 --> 05:27.060
everybody's gonna be using DNA storage

05:27.060 --> 05:28.650
if we go down this direction.

05:28.650 --> 05:30.000
And here's some stats around that,

05:30.000 --> 05:32.130
so you can explore this further.

05:32.130 --> 05:34.080
Maybe pause this, pause the video if you want to

05:34.080 --> 05:37.020
read a bit more about this. This is from nature.com.

05:37.020 --> 05:40.500
And basically you can store all of the world's data

05:40.500 --> 05:44.700
in just one kilo, one kilogram of DNA storage.

05:44.700 --> 05:47.370
Or you can store about 1 billion terabytes

05:47.370 --> 05:49.410
of data in one gram of DNA storage.

05:49.410 --> 05:51.390
So that's just something to

05:51.390 --> 05:53.880
show how quickly we're progressing and

05:53.880 --> 05:56.820
that this is why deep learning is picking up.

05:56.820 --> 05:58.980
Now that we are finally at the stage

05:58.980 --> 06:02.580
where we have enough data to train super cool

06:02.580 --> 06:04.260
super sophisticated models.

06:04.260 --> 06:05.280
Back then in the eighties

06:05.280 --> 06:06.497
when it was first initially invented

06:06.497 --> 06:08.700
it just wasn't the case.

06:08.700 --> 06:12.780
And the second thing we talked about is processing capacity.

06:12.780 --> 06:16.440
So here we've got an exponential curve, again

06:16.440 --> 06:20.250
on a log scale, it's not ideally portrayed here

06:20.250 --> 06:22.020
but on the right you can see it's a log scale.

06:22.020 --> 06:24.390
And this is how computers have been evolving.

06:24.390 --> 06:26.370
So again, feel free to pause on the slide.

06:26.370 --> 06:27.420
This is called Moore's Law.

06:27.420 --> 06:29.070
You've probably heard of it

06:29.070 --> 06:31.830
how quickly the processing capacity

06:31.830 --> 06:34.320
of computers has been evolving.

06:34.320 --> 06:35.910
Right now we're somewhere over here

06:35.910 --> 06:39.150
where an average computer you can buy for 1,000 bucks

06:39.150 --> 06:43.680
thinks at the speed of the brain of a rat.

06:43.680 --> 06:47.610
And by 2025 it will be at the speed of a human, or 2023.

06:47.610 --> 06:50.970
And then by 2050 or 2045

06:50.970 --> 06:54.750
it'll surpass all of the humans combined.

06:54.750 --> 06:58.350
So basically we are entering the era of computers

06:58.350 --> 07:00.330
that are extremely powerful

07:00.330 --> 07:05.330
that can process things way faster than we can imagine.

07:05.700 --> 07:08.580
And that is what is facilitating deep learning.

07:08.580 --> 07:10.800
So all of this brings us to the question

07:10.800 --> 07:11.940
what is deep learning?

07:11.940 --> 07:15.450
What is this whole neural network situation?

07:15.450 --> 07:16.830
What is going on?

07:16.830 --> 07:18.210
What are we even talking about here?

07:18.210 --> 07:20.550
And you've probably seen a picture of something like this,

07:20.550 --> 07:21.600
so let's dive into it.

07:21.600 --> 07:23.490
What is deep learning?

07:23.490 --> 07:25.590
This gentleman over here, Geoffrey Hinton,

07:25.590 --> 07:29.310
is known as the Godfather of Deep Learning

07:29.310 --> 07:33.480
and he did research on deep learning in the eighties.

07:33.480 --> 07:35.820
And he's done lots and lots of work.

07:35.820 --> 07:38.130
Lots of research papers

07:38.130 --> 07:41.370
he's published in deep learning. Right now

07:41.370 --> 07:42.990
he works at Google.

07:42.990 --> 07:44.790
So a lot of the things that we're gonna be talking

07:44.790 --> 07:47.190
about actually come from Geoffrey Hinton

07:47.190 --> 07:48.300
and you can see a lot-

07:48.300 --> 07:49.830
He's got quite a few YouTube videos

07:49.830 --> 07:51.360
and he explains things really well.

07:51.360 --> 07:54.090
So highly recommend checking them out.

07:54.090 --> 07:56.190
And so the idea behind deep learning is

07:56.190 --> 07:59.580
to look at the human brain and there's quite-

07:59.580 --> 08:02.088
there's gonna be quite a bit of neuroscience coming up

08:02.088 --> 08:03.330
in these tutorials,

08:03.330 --> 08:06.570
and what we are trying to do here is to mimic

08:06.570 --> 08:09.360
how the human brain operates.

08:09.360 --> 08:11.040
And you know, we don't know that much.

08:11.040 --> 08:12.330
We don't know everything about the human brain

08:12.330 --> 08:14.190
but that little amount that we know

08:14.190 --> 08:16.740
we want to mimic it and recreate it.

08:16.740 --> 08:17.573
And why is that?

08:17.573 --> 08:18.780
Well, because the human brain seems

08:18.780 --> 08:20.580
to be one of the most powerful tools

08:20.580 --> 08:22.290
on this planet for learning.

08:22.290 --> 08:25.080
For learning, adapting skills and then applying them.

08:25.080 --> 08:27.780
And if computers could copy that,

08:27.780 --> 08:29.247
then we could just leverage

08:29.247 --> 08:32.970
what natural selection has already decided for us.

08:32.970 --> 08:34.590
All of those kind of algorithms

08:34.590 --> 08:36.870
that it has decided are the best

08:36.870 --> 08:37.860
we're just gonna leverage that.

08:37.860 --> 08:39.870
Why reinvent the bicycle, right?

08:39.870 --> 08:41.730
So let's see how this works.

08:41.730 --> 08:44.610
Here we've got some neurons.

08:44.610 --> 08:48.663
So these are neurons which have been smeared onto glass

08:48.663 --> 08:51.300
and then have been looked at under a microscope

08:51.300 --> 08:52.230
with some coloring.

08:52.230 --> 08:54.300
And you can see what they look like.

08:54.300 --> 08:55.260
So they have like a body

08:55.260 --> 08:56.910
they have these branches

08:56.910 --> 08:58.560
and they have like tails and so on.

08:58.560 --> 08:59.610
So you can see then

08:59.610 --> 09:01.920
they have like a nucleus inside, in the middle.

09:01.920 --> 09:05.280
And that's basically what a neuron looks like.

09:05.280 --> 09:06.930
In the human brain,

09:06.930 --> 09:10.560
there's approximately 100 billion neurons altogether.

09:10.560 --> 09:11.790
So these are individual neurons

09:11.790 --> 09:13.740
these are actually motor neurons

09:13.740 --> 09:15.330
because they're bigger, they're easier to see.

09:15.330 --> 09:18.360
But nevertheless, there's a 100 billion neurons

09:18.360 --> 09:19.950
in the human brain.

09:19.950 --> 09:22.290
And each neuron is connected to as many as

09:22.290 --> 09:23.970
about a thousand of its neighbors.

09:23.970 --> 09:25.440
So to give you a picture,

09:25.440 --> 09:26.610
this is what it looks like.

09:26.610 --> 09:31.610
This is an actual dissection of the human brain

09:32.130 --> 09:34.800
and this is the cerebellum,

09:34.800 --> 09:38.910
which is this part of your brain at the back,

09:38.910 --> 09:43.380
it is responsible for, like, motor control

09:43.380 --> 09:45.810
and for, you know, keeping your balance

09:45.810 --> 09:47.730
and some language capabilities and stuff like that.

09:47.730 --> 09:52.323
So this is just to show how vast,

09:53.310 --> 09:54.960
how many neurons there are.

09:54.960 --> 09:57.480
There are like billions and billions and billions

09:57.480 --> 09:58.830
of neurons all connecting in your brain.

09:58.830 --> 09:59.670
It's not like we're talking

09:59.670 --> 10:02.580
about five or 500 or a thousand or a million.

10:02.580 --> 10:04.770
There's billions of neurons in there.

10:04.770 --> 10:06.540
And yeah, so that's

10:06.540 --> 10:08.340
what we're going to be trying to recreate.

10:08.340 --> 10:11.880
So how do we recreate this in a computer?

10:11.880 --> 10:14.820
Well, we create an artificial structure called

10:14.820 --> 10:19.820
an artificial neural net where we have nodes or neurons

10:20.610 --> 10:23.490
and we're gonna have some neurons for input values.

10:23.490 --> 10:25.440
So these are values that you

10:25.440 --> 10:27.330
know about a certain situation.

10:27.330 --> 10:29.580
So for instance, you're modeling something

10:29.580 --> 10:30.690
you want to predict something

10:30.690 --> 10:32.100
you're always gonna have some input

10:32.100 --> 10:35.070
something to start your predictions off.

10:35.070 --> 10:36.780
And then that's called the input layer.

10:36.780 --> 10:38.100
Then you have the output.

10:38.100 --> 10:40.110
So that's a value that you want to predict

10:40.110 --> 10:41.160
whether it's a price

10:41.160 --> 10:44.430
whether it's: is somebody going to leave the bank

10:44.430 --> 10:45.498
or stay in the bank?

10:45.498 --> 10:47.940
Is this a fraudulent transaction?

10:47.940 --> 10:49.650
Is this a real transaction?

10:49.650 --> 10:50.910
And so on.

10:50.910 --> 10:52.500
So that's gonna be your output layer.

10:52.500 --> 10:55.350
And in between we're going to have a hidden layer.

10:55.350 --> 10:58.500
So as you could see in your brain,

10:58.500 --> 10:59.790
you have so many neurons.

10:59.790 --> 11:01.410
So some information is coming in

11:01.410 --> 11:03.390
through your eyes, ears, nose.

11:03.390 --> 11:05.100
So basically your senses.

11:05.100 --> 11:08.760
And then it's not just going right away to the output

11:08.760 --> 11:09.720
where you have the result.

11:09.720 --> 11:11.700
It's going through all of these billions and billions

11:11.700 --> 11:14.550
and billions of neurons before it gets to the output.

11:14.550 --> 11:15.840
And this is the whole concept behind it.

11:15.840 --> 11:17.070
That we're going to model the brain,

11:17.070 --> 11:18.630
so we need these hidden layers

11:18.630 --> 11:20.610
that are there before the output.

11:20.610 --> 11:21.846
So the input layer

11:21.846 --> 11:24.420
neurons are connected to hidden layer neurons,

11:24.420 --> 11:26.880
and the hidden layer neurons are connected to the output value.

11:26.880 --> 11:29.310
And so this is pretty cool,

11:29.310 --> 11:30.600
but what is this all about?

11:30.600 --> 11:32.130
Where is the deep learning here?

11:32.130 --> 11:32.970
Why is it called deep learning?

11:32.970 --> 11:34.080
There's nothing deep in here.

11:34.080 --> 11:37.050
Well, this is kind of like an option

11:37.050 --> 11:39.480
which one might call shallow learning.

11:39.480 --> 11:41.880
There isn't much going on indeed,

11:41.880 --> 11:43.440
but why is it called deep learning?

11:43.440 --> 11:46.170
Well, because then we take this to the next level.

11:46.170 --> 11:48.270
We separate it even further

11:48.270 --> 11:51.030
and we have not just one hidden layer.

11:51.030 --> 11:55.560
We have lots and lots and lots of hidden layers

11:55.560 --> 11:59.190
and then we connect everything just like in the human brain.

11:59.190 --> 12:01.950
We connect everything, interconnect everything.

12:01.950 --> 12:05.580
And that's how the input values are

12:05.580 --> 12:07.470
processed through all these hidden layers

12:07.470 --> 12:08.760
just like in the human brain.

12:08.760 --> 12:10.230
Then we have an output value

12:10.230 --> 12:12.450
and now we're talking deep learning.

12:12.450 --> 12:14.400
So that's what deep learning is all about

12:14.400 --> 12:15.900
on a very abstract level.
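
NOTE
A minimal sketch of the structure just described: input values passed through one or more hidden layers to an output value. This is not the course's implementation. The sigmoid activation, the random untrained weights, the layer sizes and every function name are assumptions made for illustration only.
```python
import math
import random
def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))
def make_layer(n_inputs, n_neurons, rng):
    """One fully connected layer: random (untrained) weights and a bias per neuron."""
    return [
        {"weights": [rng.uniform(-1, 1) for _ in range(n_inputs)], "bias": rng.uniform(-1, 1)}
        for _ in range(n_neurons)
    ]
def build_network(sizes, seed=0):
    """sizes lists the neurons per layer: [inputs, hidden..., outputs]."""
    rng = random.Random(seed)
    return [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
def forward(layers, inputs):
    """Pass the input values through every layer in turn and return the output values."""
    values = inputs
    for layer in layers:
        values = [
            sigmoid(sum(w * v for w, v in zip(neuron["weights"], values)) + neuron["bias"])
            for neuron in layer
        ]
    return values
# "Shallow": a single hidden layer between input and output.
shallow = build_network([3, 4, 1])
# "Deep": several hidden layers, each fully connected to the next.
deep = build_network([3, 4, 4, 4, 1])
print(forward(shallow, [0.5, 0.1, 0.9]))
print(forward(deep, [0.5, 0.1, 0.9]))
```
The only difference between the two networks is the number of hidden layers in the list of sizes. Nothing here is trained, so the printed outputs are meaningless numbers between 0 and 1.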

12:15.900 --> 12:18.360
In the further tutorials we're going to dissect

12:18.360 --> 12:20.490
and dive deep into deep learning

12:20.490 --> 12:21.390
and by the end of it

12:21.390 --> 12:23.460
you will know what deep learning is all about

12:23.460 --> 12:26.430
and you'll know how to apply it in your projects.

12:26.430 --> 12:27.810
Super excited about this.

12:27.810 --> 12:29.760
Can't wait to get started

12:29.760 --> 12:31.890
and I look forward to seeing you in the next tutorial.

12:31.890 --> 12:33.903
Until then, enjoy deep learning.
