WEBVTT

00:00.000 --> 00:01.530
In the last section,

00:01.530 --> 00:03.930
we continued to talk a little bit about images

00:03.930 --> 00:05.550
but we're still surprisingly light

00:05.550 --> 00:08.370
on some of the details around exactly what a container is.

00:08.370 --> 00:09.203
So, in this section,

00:09.203 --> 00:11.040
I'm gonna give you a behind-the-scenes look

00:11.040 --> 00:12.330
at what a container is,

00:12.330 --> 00:15.180
and how it is created on your machine.

00:15.180 --> 00:16.470
Now, to understand the container,

00:16.470 --> 00:18.870
you first need to have a little bit of background

00:18.870 --> 00:22.380
on exactly how your operating system runs on your computer.

00:22.380 --> 00:24.150
So, I'm gonna first give you a quick overview

00:24.150 --> 00:25.653
of your operating system.

00:26.850 --> 00:28.620
Okay, so this is a quick overview

00:28.620 --> 00:31.350
of the operating system on your computer.

00:31.350 --> 00:34.440
Most operating systems have something called a kernel.

00:34.440 --> 00:37.500
This kernel is a running software process

00:37.500 --> 00:40.290
that governs access between all the programs

00:40.290 --> 00:41.790
that are running on your computer

00:41.790 --> 00:43.620
and all the physical hardware

00:43.620 --> 00:45.960
that is connected to your computer as well.

00:45.960 --> 00:47.490
So, up here at the top of this diagram,

00:47.490 --> 00:49.620
we have different programs that your computer is running

00:49.620 --> 00:53.298
such as Chrome, or Terminal, Spotify, or Node.js.

00:53.298 --> 00:56.100
If you've ever made use of Node.js before

00:56.100 --> 00:58.500
and you've written a file to the hard drive,

00:58.500 --> 01:00.720
it's technically not Node.js

01:00.720 --> 01:03.810
that is speaking directly to the physical device.

01:03.810 --> 01:06.247
Instead, Node.js says to your kernel,

01:06.247 --> 01:08.910
"Hey, I want to write a file to the hard drive."

01:08.910 --> 01:10.530
The kernel then takes that information

01:10.530 --> 01:12.960
and eventually persists it to the hard disk.

01:12.960 --> 01:15.390
So, the kernel is always kind of this intermediate layer

01:15.390 --> 01:17.520
that governs access between these programs

01:17.520 --> 01:19.470
and your actual hard drive.

01:19.470 --> 01:21.120
The other important thing to understand here

01:21.120 --> 01:24.720
is that these running programs interact with the kernel

01:24.720 --> 01:27.270
through things called system calls.

01:27.270 --> 01:30.162
These are essentially like function invocations.

01:30.162 --> 01:32.767
The kernel exposes different endpoints to say,

01:32.767 --> 01:35.370
"Hey, if you want to write a file to the hard drive,

01:35.370 --> 01:38.010
call this endpoint or this function right here."

01:38.010 --> 01:39.600
It takes some amount of information

01:39.600 --> 01:41.310
and then that information will be eventually written

01:41.310 --> 01:44.283
to the hard disk, or memory, or whatever else is required.

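NOTE
A minimal sketch of the system-call idea described above, assuming a POSIX-like
operating system. Python's os.open, os.write, and os.close are thin wrappers
over the kernel's open, write, and close system calls. The helper name
write_via_syscalls and the file name demo.txt are made up for illustration.
```python
import os
import tempfile
def write_via_syscalls(path: str, data: bytes) -> int:
    """Ask the kernel to persist data; returns the number of bytes written."""
    # open: the kernel hands back a file descriptor, not the device itself.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # write: the kernel, not this program, talks to the disk.
        return os.write(fd, data)
    finally:
        # close: tell the kernel we are done with the descriptor.
        os.close(fd)
if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "demo.txt")
    print(write_via_syscalls(target, b"hello kernel\n"))
```
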
01:45.750 --> 01:48.000
Now, thinking about this entire system right here,

01:48.000 --> 01:51.180
I wanna pose a kind of hypothetical situation to you.

01:51.180 --> 01:53.790
I want you to imagine for just a second

01:53.790 --> 01:57.480
that you and I have two programs running on our computer.

01:57.480 --> 02:00.030
Maybe one of them is Chrome, like Chrome the web browser

02:00.030 --> 02:01.620
and the other is Node.js,

02:01.620 --> 02:04.200
the JavaScript server-side runtime.

02:04.200 --> 02:06.780
I want you to imagine that we're in a crazy world

02:06.780 --> 02:09.150
where Chrome, in order to work properly,

02:09.150 --> 02:12.210
has to have Python version 2 installed

02:12.210 --> 02:15.510
and Node.js has to have Python version 3 installed.

02:15.510 --> 02:17.220
However, on our hard disk,

02:17.220 --> 02:20.250
we only have access to Python version 2.

02:20.250 --> 02:21.900
And for whatever crazy reason,

02:21.900 --> 02:25.290
we are not allowed to have two identical installations

02:25.290 --> 02:26.913
of Python at the same time.

02:27.930 --> 02:30.660
So, as it stands right now, Chrome would work properly

02:30.660 --> 02:32.460
because it has access to version 2,

02:32.460 --> 02:33.780
but Node.js would not

02:33.780 --> 02:36.120
because we do not have a version or a copy

02:36.120 --> 02:38.160
of Python version 3.

02:38.160 --> 02:41.010
Again, this is a completely make-believe situation.

02:41.010 --> 02:43.590
I just want you to kind of consider this for a second

02:43.590 --> 02:46.390
'cause this is kind of leading into what a container is.

02:47.340 --> 02:49.140
So, how could we solve this issue?

02:49.140 --> 02:51.150
Well, one way to do it would be

02:51.150 --> 02:53.970
to make use of an operating system feature

02:53.970 --> 02:56.280
known as namespacing.

02:56.280 --> 02:57.300
With namespacing,

02:57.300 --> 02:59.670
we can look at all of the different hardware resources

02:59.670 --> 03:00.930
connected to our computer

03:00.930 --> 03:03.540
and we can essentially segment out portions

03:03.540 --> 03:05.130
of those resources.

03:05.130 --> 03:07.500
So, we could create a segment of our hard disk

03:07.500 --> 03:11.160
specifically dedicated to housing Python version 2.

03:11.160 --> 03:12.810
And we could make a second segment

03:12.810 --> 03:17.070
specifically dedicated to housing Python version 3.

03:17.070 --> 03:19.350
Then, to make sure that Chrome has access

03:19.350 --> 03:20.430
to this segment over here

03:20.430 --> 03:23.490
and Node.js has access to this segment over here,

03:23.490 --> 03:26.640
anytime that either of them issues a system call

03:26.640 --> 03:28.920
to read information off the hard drive,

03:28.920 --> 03:31.380
the kernel will look at that incoming system call

03:31.380 --> 03:34.560
and try to figure out which process it is coming from.

03:34.560 --> 03:35.857
So, the kernel could say,

03:35.857 --> 03:38.850
"Okay, if Chrome is trying to read some information

03:38.850 --> 03:41.640
off the hard drive, I'm gonna direct that call

03:41.640 --> 03:44.970
over to this little segment of the hard disk over here,

03:44.970 --> 03:49.380
the segment that has Python version 2." And Node.js,

03:49.380 --> 03:51.810
anytime that makes a system call to read the hard drive,

03:51.810 --> 03:54.630
the kernel can redirect that over to this segment

03:54.630 --> 03:56.370
for Python version 3.

03:56.370 --> 03:58.800
And so by making use of this kind of namespacing

03:58.800 --> 04:00.270
or segmenting feature,

04:00.270 --> 04:02.460
we can have the ability to make sure that Chrome

04:02.460 --> 04:05.700
and Node.js are able to work on the same machine.

04:05.700 --> 04:06.930
Now again, in reality,

04:06.930 --> 04:09.300
neither of these actually needs an installation of Python.

04:09.300 --> 04:10.923
This is just a quick example.

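NOTE
A toy model of the routing described above, not real kernel code. It assumes a
made-up ToyKernel class that keeps one private disk segment per process name
and answers each read by first checking which process is asking. Real
namespaces are set up by the operating system, not by a Python dictionary.
```python
from dataclasses import dataclass, field
@dataclass
class ToyKernel:
    """Toy model only: maps each process name to its own disk segment."""
    segments: dict = field(default_factory=dict)
    def mount(self, process: str, files: dict) -> None:
        # Give this process a private view of the "hard disk".
        self.segments[process] = dict(files)
    def read(self, process: str, path: str) -> str:
        # The "system call": look up who is asking, then route to their segment.
        return self.segments[process][path]
kernel = ToyKernel()
kernel.mount("chrome", {"/usr/bin/python": "Python 2"})
kernel.mount("node", {"/usr/bin/python": "Python 3"})
print(kernel.read("chrome", "/usr/bin/python"))  # Python 2
print(kernel.read("node", "/usr/bin/python"))    # Python 3
```
Both programs ask for the same path, yet each one sees its own version.
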
04:12.090 --> 04:15.817
So, this entire process of kind of segmenting a hard,

04:15.817 --> 04:18.330
excuse me, a hardware resource based on the process

04:18.330 --> 04:21.930
that is asking for it is known as namespacing.

04:21.930 --> 04:24.660
With namespacing, we are allowed to isolate resources

04:24.660 --> 04:27.630
per process or group of processes,

04:27.630 --> 04:29.115
and we're essentially saying that anytime

04:29.115 --> 04:32.100
this particular process asks for a resource,

04:32.100 --> 04:35.010
we're gonna direct it to this one little specific area

04:35.010 --> 04:36.990
of the given piece of hardware.

04:36.990 --> 04:39.210
Now, namespacing is not only used for hardware,

04:39.210 --> 04:42.450
it can be also used for software elements as well.

04:42.450 --> 04:44.760
So for example, we can namespace a process

04:44.760 --> 04:48.210
to restrict the area of a hard drive that is available,

04:48.210 --> 04:50.610
or the network devices that are available,

04:50.610 --> 04:53.460
or the ability to talk to other processes,

04:53.460 --> 04:55.890
or the ability to see other processes.

04:55.890 --> 04:57.660
These are all things that we can use namespacing

04:57.660 --> 05:00.120
for to essentially limit the resources

05:00.120 --> 05:02.430
or kind of redirect requests for a resource

05:02.430 --> 05:04.680
from a particular process.

05:04.680 --> 05:07.680
Very closely related to this idea of namespacing

05:07.680 --> 05:10.170
is another feature called control groups.

05:10.170 --> 05:12.960
A control group can be used to limit the amount

05:12.960 --> 05:15.717
of resources that a particular process can use.

05:15.717 --> 05:17.437
So, namespacing is for saying,

05:17.437 --> 05:20.430
"Hey, this area of the hard drive is for this process."

05:20.430 --> 05:23.850
A control group can be used to limit the amount of memory

05:23.850 --> 05:26.490
that a process can use, the amount of CPU,

05:26.490 --> 05:28.835
the amount of hard drive input,

05:28.835 --> 05:29.940
or excuse me, input/output,

05:29.940 --> 05:33.270
and the amount of network bandwidth as well.

05:33.270 --> 05:35.130
So, these two features put together can be used

05:35.130 --> 05:37.830
to really kind of isolate a single process

05:37.830 --> 05:40.800
and limit the amount of resources it can talk to,

05:40.800 --> 05:43.440
and the amount of bandwidth essentially,

05:43.440 --> 05:45.243
that it can make use of.

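NOTE
A toy model of the limiting idea described above, not the real control group
interface. The ToyControlGroup class and its charge method are made up for
illustration, and the sketch simply refuses a request that would exceed the
limit, which is a simplification of what an operating system does.
```python
class ToyControlGroup:
    """Toy model only: caps how much memory a process may be charged for."""
    def __init__(self, memory_limit_bytes: int) -> None:
        self.memory_limit_bytes = memory_limit_bytes
        self.used_bytes = 0
    def charge(self, nbytes: int) -> bool:
        # Refuse any allocation that would push usage past the limit.
        if self.used_bytes + nbytes > self.memory_limit_bytes:
            return False
        self.used_bytes += nbytes
        return True
group = ToyControlGroup(memory_limit_bytes=100)
print(group.charge(60))  # True: 60 of 100 bytes used
print(group.charge(60))  # False: would exceed the 100-byte limit
print(group.charge(40))  # True: exactly reaches the limit
```
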
05:46.560 --> 05:47.670
Now, as you might imagine,

05:47.670 --> 05:49.980
this entire kind of little section right here,

05:49.980 --> 05:53.070
this entire vertical of a running process,

05:53.070 --> 05:56.100
plus this little segment of a resource that it can talk to

05:56.100 --> 05:58.737
is what we refer to as a container.

05:58.737 --> 06:00.037
And so, when people say,

06:00.037 --> 06:02.190
"Oh yeah, I have a Docker container,"

06:02.190 --> 06:03.630
you really should not think of these

06:03.630 --> 06:05.970
as being like a physical construct

06:05.970 --> 06:07.800
that exists inside of your computer.

06:07.800 --> 06:10.140
Instead, a container is really a process

06:10.140 --> 06:13.710
or a set of processes that has a grouping of resources

06:13.710 --> 06:15.543
specifically assigned to it.

06:16.770 --> 06:17.820
And so, this is the diagram

06:17.820 --> 06:19.200
that we're gonna be looking at quite a bit

06:19.200 --> 06:21.180
anytime that we think about a container.

06:21.180 --> 06:22.860
We've got some running process

06:22.860 --> 06:25.620
that sends a system call to a kernel.

06:25.620 --> 06:28.410
The kernel is going to look at that incoming system call

06:28.410 --> 06:33.150
and direct it to a very specific portion of the hard drive,

06:33.150 --> 06:36.093
the RAM, CPU or whatever else it might need.

06:37.029 --> 06:39.690
And a portion of each of these resources

06:39.690 --> 06:42.750
is made available to that singular process.

06:42.750 --> 06:44.497
Now, the last question you might have here is,

06:44.497 --> 06:46.620
"Okay. Well, I get what a container is,

06:46.620 --> 06:48.360
but with that in mind,

06:48.360 --> 06:51.630
what is the real relation between one of those containers

06:51.630 --> 06:54.090
or that kind of singular process

06:54.090 --> 06:56.580
and grouping of resources to an image?

06:56.580 --> 06:59.550
How does that single file eventually create this container?"

06:59.550 --> 07:00.383
That's a good question.

07:00.383 --> 07:01.420
One more quick diagram.

07:02.850 --> 07:04.590
Anytime that we talk about an image,

07:04.590 --> 07:07.830
we're really talking about a file system snapshot.

07:07.830 --> 07:10.590
So, this is essentially kind of like a copy paste

07:10.590 --> 07:13.860
of a very specific set of directories or files.

07:13.860 --> 07:15.270
And so we might have an image

07:15.270 --> 07:18.660
that contains just Chrome and Python.

07:18.660 --> 07:22.230
An image will also contain a specific startup command.

07:22.230 --> 07:23.850
So, here's what happens behind the scenes

07:23.850 --> 07:26.850
when we take an image and turn it into a container.

07:26.850 --> 07:27.683
First off,

07:27.683 --> 07:29.700
the kernel is going to isolate a little section

07:29.700 --> 07:30.533
of the hard drive

07:30.533 --> 07:33.480
and make it available to just this container.

07:33.480 --> 07:34.560
And so we can kind of imagine

07:34.560 --> 07:38.100
that after that little subset is created,

07:38.100 --> 07:41.130
the file system snapshot inside the image is taken

07:41.130 --> 07:44.370
and placed into that little segment of the hard drive.

07:44.370 --> 07:45.210
And so, now,

07:45.210 --> 07:48.720
inside of this very specific grouping of resources,

07:48.720 --> 07:50.850
we've got a little section of the hard drive

07:50.850 --> 07:53.280
that has just Chrome and Python installed

07:53.280 --> 07:55.413
and essentially, nothing else.

07:56.730 --> 07:58.770
The startup command is then executed

07:58.770 --> 07:59.850
which we can kind of imagine

07:59.850 --> 08:01.380
in this case is like start up Chrome,

08:01.380 --> 08:03.330
just Chrome for me.

08:03.330 --> 08:05.212
And so Chrome is invoked,

08:05.212 --> 08:07.740
we create a new instance of that process

08:07.740 --> 08:10.380
and that created process is then isolated

08:10.380 --> 08:13.980
to this set of resources inside the container.

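NOTE
A toy model of the two steps described above: place the image's file system
snapshot into an isolated segment, then run the startup command there. The
names ToyImage, ToyContainer, and create_container are made up for this
sketch, and the plain deep copy is an assumption that stands in for however
the snapshot really gets placed on disk.
```python
import copy
from dataclasses import dataclass
@dataclass(frozen=True)
class ToyImage:
    snapshot: dict         # file system snapshot: path to contents
    startup_command: str   # command to run when a container starts
@dataclass
class ToyContainer:
    filesystem: dict       # isolated segment holding a copy of the snapshot
    running: str           # the process started from the startup command
def create_container(image: ToyImage) -> ToyContainer:
    # Step 1: set aside an isolated segment and place the snapshot in it.
    segment = copy.deepcopy(image.snapshot)
    # Step 2: run the startup command inside that segment.
    return ToyContainer(filesystem=segment, running=image.startup_command)
image = ToyImage({"/bin/chrome": "Chrome", "/bin/python": "Python"}, "run chrome")
container = create_container(image)
container.filesystem["/tmp/cache"] = "scratch data"
print(container.running)                # run chrome
print("/tmp/cache" in image.snapshot)   # False: the image is unchanged
```
Changes made inside the container's segment never touch the image itself.
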
08:13.980 --> 08:15.090
So, that's pretty much it.

08:15.090 --> 08:18.150
That is the relationship between a container and an image,

08:18.150 --> 08:20.670
and it's how an image is eventually taken

08:20.670 --> 08:23.160
and turned into a running container.

08:23.160 --> 08:25.440
Now, there's still a tremendous amount more to learn

08:25.440 --> 08:26.700
about containers and images.

08:26.700 --> 08:27.900
So, let's take a quick break

08:27.900 --> 08:29.553
and continue in the next section.
