WEBVTT

00:00.020 --> 00:00.920
Welcome back everybody.

00:00.920 --> 00:07.220
I always praise Kubernetes because it makes it so easy to manage our cloud native apps.

00:07.220 --> 00:12.740
If you really learn how to navigate the Kubernetes ecosystem, you can harness a very powerful platform

00:12.740 --> 00:17.300
because it just provides so many benefits out of the box.

00:17.300 --> 00:22.430
One of the core benefits that we witnessed was resiliency and self-healing.

00:22.460 --> 00:30.650
You'll remember we saw that when the process, the application itself, terminates,

00:30.650 --> 00:37.280
when it gets fully killed, for whatever reason, whether that's an out of memory exception or an internal

00:37.280 --> 00:43.430
null pointer exception or a stack overflow exception, whatever it may be, when the application gets

00:43.430 --> 00:50.090
fully killed off when the process is dead, Kubernetes will detect that the process was terminated,

00:50.090 --> 00:55.700
so it will restart the container and try to bring it back to life.

00:55.700 --> 01:02.130
Resurrect it back to a healthy state where it can continue to serve traffic.

01:02.130 --> 01:08.460
That's why when we deploy containers to Kubernetes, they automatically become resilient and self-healing.

01:09.570 --> 01:15.420
But now what about scenarios when the application hasn't actually been fully terminated?

01:15.420 --> 01:17.880
It's still technically running.

01:17.880 --> 01:24.600
It's still consuming CPU and memory, but the application itself isn't operational.

01:24.600 --> 01:26.190
It's not live.

01:26.190 --> 01:33.900
In which case what we can do is employ a liveness probe, a Kubernetes liveness probe that continuously

01:33.900 --> 01:42.480
probes the application to see if it's actually operational, and if the liveness probe deems the application

01:42.480 --> 01:49.830
to be not operational, then it's going to restart the container in order to resurrect the application

01:49.830 --> 01:51.240
back to a healthy state.

01:51.240 --> 01:56.940
So the liveness probe allows Kubernetes to go beyond just restarting containers

01:56.940 --> 02:03.440
when the app gets fully killed off. It can actually check if the app itself is working, and restart

02:03.440 --> 02:05.060
the container accordingly.

02:06.080 --> 02:10.580
But now what about scenarios where the app is fully operational?

02:10.580 --> 02:17.180
Everything is fine, but the external components that it relies on, they themselves are down.

02:17.900 --> 02:24.860
That could be an external database or whatever that the app relies on in order to serve traffic.

02:24.980 --> 02:29.930
In that case, what we can do is employ readiness probes.

02:30.770 --> 02:39.530
A Kubernetes readiness probe will continuously probe the application to check and see if it's ready

02:39.530 --> 02:40.640
to serve traffic.

02:40.670 --> 02:46.700
Okay, so an application could be operational but not ready to serve traffic.

02:46.700 --> 02:52.610
If the readiness probe deems the application to be not ready, then what it can do is prevent any traffic

02:52.610 --> 02:56.030
from reaching the pod that the application is in.

02:57.320 --> 03:05.400
If you look at the diagram, whoever programmed the containerized application, whether a single developer

03:05.430 --> 03:07.950
or a team of developers, it doesn't matter:

03:08.880 --> 03:14.640
they have complete authority over how the liveness endpoint gets programmed.

03:14.760 --> 03:19.800
So it could be programmed to be exposed on the path

03:19.830 --> 03:22.560
/health/liveness.

03:22.590 --> 03:24.180
It doesn't really matter.

03:24.330 --> 03:31.170
Ultimately, what this liveness endpoint does is it checks to see if everything within the application

03:31.170 --> 03:35.520
is functioning properly, if everything internal to the app is fine.

03:35.550 --> 03:43.470
So that could be internal components like internal queues that process asynchronous operations, that

03:43.470 --> 03:50.400
could be internal caches that store frequently accessed data. It doesn't really matter;

03:50.400 --> 03:52.110
it depends on what the application is doing.

03:52.110 --> 03:58.080
The liveness endpoint will check if all of the internal components within the app are actually

03:58.080 --> 03:58.950
responsive.

03:58.950 --> 04:02.380
It's going to see if the background tasks,

04:02.380 --> 04:06.610
the vital background tasks running in the app, are actually active.

04:06.640 --> 04:11.320
It could check for resource thresholds like disk space or heap size.

04:11.350 --> 04:16.420
If everything is fine, say we didn't reach a 90% threshold or whatever,

04:16.420 --> 04:23.620
the application can be deemed healthy and fully operational, but if any of these checks were to fail,

04:23.650 --> 04:30.940
then what the liveness endpoint is going to do is send back a negative response, an internal server

04:30.940 --> 04:37.960
error with a status of 500, and when the liveness probe receives a negative response, it's

04:37.960 --> 04:42.580
going to deem the application to be not operational, not working.

04:42.580 --> 04:49.690
And essentially we've signaled to the liveness probe that, hey, please can you restart our application

04:49.690 --> 04:52.690
so that we can potentially bring it back to a healthy state?

04:52.690 --> 05:01.210
And this automated way of restarting the app based on criteria that we use to determine if the app

05:01.210 --> 05:05.670
is healthy or not is what makes Kubernetes really, really powerful.

05:05.670 --> 05:13.590
We can automate this self-healing based on criteria that we deem essential for the application

05:13.590 --> 05:14.220
to run.

05:14.250 --> 05:14.760
These

05:14.910 --> 05:20.490
criteria can vary from one application to the next, and it's up to the team of developers to

05:20.520 --> 05:26.430
determine what constitutes a healthy, operational application.

05:27.870 --> 05:35.400
The readiness endpoint will check if the external components that the app relies on are actually functioning.

05:35.460 --> 05:44.370
If the app relies on external dependencies like databases, Kafka, data stores, or MQ brokers,

05:44.370 --> 05:44.880
whatever,

05:44.880 --> 05:50.580
and any of these external dependencies isn't functioning and the app can't fully serve requests without

05:50.580 --> 05:51.060
them,

05:51.060 --> 05:57.990
then again, we send back that negative response, telling the readiness probe to prevent any traffic from

05:57.990 --> 06:01.050
reaching the pod that this container is in.

06:01.080 --> 06:08.530
Okay, but if everything is fine, the readiness endpoint will continuously send the readiness

06:08.530 --> 06:08.830
probe

06:08.830 --> 06:10.210
positive responses:

06:10.210 --> 06:11.410
everything is fine,

06:11.440 --> 06:12.430
don't do anything.

06:12.460 --> 06:21.070
Okay, so it's pretty cool how, using our own criteria, we can determine if an app is live, and otherwise

06:21.070 --> 06:22.210
restart it,

06:22.210 --> 06:27.040
and if an app is ready to serve traffic, and otherwise prevent any traffic from reaching it.

06:27.250 --> 06:28.240
That's enough theory.

06:28.240 --> 06:29.950
Let's go ahead and implement it.

06:29.980 --> 06:33.820
So what I want to do is copy and paste section five.

06:33.850 --> 06:41.230
We'll call it section six.

06:41.590 --> 06:44.920
We're not quite ready to work with the stateful app.

06:44.920 --> 06:50.590
So we'll revert this to be stateless. Okay.

06:53.050 --> 06:57.220
Yeah, that's pretty much it.

06:57.970 --> 07:05.090
I can cd out of section five and cd into section six.

07:05.750 --> 07:07.430
I'm going to say kubectl.

07:07.430 --> 07:13.370
Let's just delete all the pods for now with dash dash all and the namespace

07:13.370 --> 07:14.570
grade submission.

07:14.570 --> 07:16.220
Why am I saying pods?

07:16.220 --> 07:18.260
I'm going to delete all the deployments.

07:19.790 --> 07:21.590
I just want to start off on a clean slate.

07:21.620 --> 07:29.090
We'll keep the services because I don't anticipate anything will change, and by virtue of deleting

07:29.090 --> 07:34.160
all the deployments, all of their corresponding pods will get deleted as well.

07:34.160 --> 07:35.750
We can verify this.

07:36.050 --> 07:38.270
They're terminating. Okay.

07:39.290 --> 07:41.240
So now, in the deployment:

07:41.420 --> 07:46.310
we've got the grade submission API app running within the confines of a container.

07:46.310 --> 07:51.620
And I programmed this app to have liveness and readiness endpoints.

07:51.620 --> 07:57.740
So it's got all of these endpoints with business logic for storing grade data and retrieving grade data.

07:57.740 --> 08:04.800
And it's also got these endpoints that pretty much monitor the health of the app and monitor the external

08:04.800 --> 08:08.070
dependencies that the app relies on in order to function.

08:08.100 --> 08:11.220
Okay, so we've got the liveness and readiness endpoints.

08:11.250 --> 08:14.580
Let's probe them using Kubernetes

08:14.610 --> 08:17.040
liveness and readiness probes.

08:17.040 --> 08:20.640
Right here I'll specify a liveness probe.

08:21.540 --> 08:29.490
And this probe is going to make continuous GET requests to the liveness endpoint exposed by this grade

08:29.490 --> 08:30.780
submission API.

08:30.810 --> 08:31.560
Okay.

08:31.590 --> 08:38.370
It's going to make continuous GET requests to the path /healthz.

08:38.490 --> 08:39.270
Okay.

08:39.270 --> 08:48.180
So I programmed this application to expose an endpoint at the path /healthz on port 3000.

08:50.010 --> 08:56.430
If the app's endpoint sends back a positive response to our liveness probe, if it keeps sending positive

08:56.430 --> 08:58.350
responses, then nothing happens.

08:58.350 --> 08:59.130
The app is healthy.

08:59.160 --> 08:59.820
We're good.

08:59.850 --> 09:06.800
If it sends back negative responses, three in a row with the default failure threshold, Kubernetes will restart

09:06.830 --> 09:17.990
the container, and we want to modify this probe to have an initial delay of at least five seconds

09:17.990 --> 09:21.350
while the application is starting up.

09:21.380 --> 09:21.830
You know what?

09:21.830 --> 09:25.220
Let's do 15 seconds.

09:25.220 --> 09:31.580
So if the liveness probe tries to make a request in the first second, for example, then

09:31.580 --> 09:36.830
what could happen is, if the app hasn't actually started up, the liveness probe is going to be like,

09:36.830 --> 09:39.080
whoa, I couldn't reach the health endpoint.

09:39.080 --> 09:40.730
We need to restart the container.

09:40.730 --> 09:45.800
And so it's going to go through this vicious process of always restarting the container over and over

09:45.800 --> 09:46.520
and over and over.

09:46.520 --> 09:47.990
And you don't want to do that.

09:47.990 --> 09:54.230
You want to specify a reasonable initial delay, giving the application time to start up before we start

09:54.230 --> 09:55.430
probing its endpoint.

09:55.460 --> 10:00.740
We don't want the liveness probe to think that the app was created, but the healthz endpoint was simply

10:00.740 --> 10:01.550
unresponsive.

10:01.580 --> 10:01.910
Right.

10:01.960 --> 10:04.420
And then we can specify a period in seconds.

10:04.420 --> 10:11.440
We can say, after the initial delay, keep probing this endpoint every five seconds.
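
NOTE
A minimal sketch of the liveness probe described up to this point, as it might sit under a container entry in the Deployment's pod template. The port, delay, and period are the values spoken in the video. The path is written as /healthz on the assumption that this is what the captions' "Health's" refers to, and the container name is a placeholder assumption.
```yaml
containers:
  - name: grade-submission-api   # placeholder name, not confirmed by the video
    livenessProbe:
      httpGet:
        path: /healthz           # assumed spelling of the spoken path
        port: 3000
      initialDelaySeconds: 15    # give the app time to start before the first probe
      periodSeconds: 5           # after that, probe every five seconds
```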

10:12.220 --> 10:13.030
All right.

10:13.060 --> 10:14.260
That's pretty much it.

10:15.460 --> 10:18.850
We've been able to leverage the liveness probe.

10:18.850 --> 10:25.270
If our health endpoint keeps sending back negative responses, our container will automatically restart.

10:26.230 --> 10:31.210
We can set up a readiness probe as well,

10:32.980 --> 10:41.530
one that continuously makes GET requests to the grade submission API at its path /readyz.

10:41.560 --> 10:50.440
Okay, I programmed this app to have a readiness endpoint at the path /readyz, and

10:50.440 --> 10:53.020
the app is running on port 3000.

10:54.940 --> 11:01.450
And whether or not we put an initial delay for the readiness probe isn't as important as setting it

11:01.450 --> 11:09.830
up for the liveness probe because, while the app hasn't started up yet, if the readiness

11:09.860 --> 11:16.430
probe simply can't access this path at all, then it's just going to deem the app not ready to receive

11:16.460 --> 11:19.400
traffic, which isn't a big deal.

11:19.400 --> 11:25.340
And when the app does start up and the probe's requests succeed, it'll deem the app to be ready

11:25.340 --> 11:28.640
to receive traffic and allow requests to go in.

11:28.670 --> 11:30.530
Okay, you know what?

11:30.530 --> 11:32.960
Let me put an initial delay just to show you something.

11:32.960 --> 11:35.210
I'm going to put an initial delay in seconds.

11:35.810 --> 11:39.920
We'll say ten, and we'll put a period of five seconds.

11:39.920 --> 11:42.860
So every five seconds check.

11:43.730 --> 11:50.210
So every five seconds we want the readiness probe to probe the readiness endpoint of our grade

11:50.210 --> 11:51.530
submission API.
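
NOTE
A rough sketch of the readiness probe just described, placed alongside the liveness probe on the same container. The port, initial delay, and period are the values spoken in the video. The path /readyz is an assumption, since the captions render it as "Redis", and the container name is a placeholder.
```yaml
containers:
  - name: grade-submission-api   # placeholder name, not confirmed by the video
    readinessProbe:
      httpGet:
        path: /readyz            # assumed spelling of the spoken path
        port: 3000
      initialDelaySeconds: 10    # pod reports not ready until the first probe succeeds
      periodSeconds: 5           # probe every five seconds
```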

11:51.770 --> 11:55.160
What I'm going to do is copy the following,

11:55.700 --> 12:02.020
but for the grade submission portal deployment. Right under... where did we put it?

12:02.020 --> 12:03.270
Let's be consistent.

12:04.320 --> 12:05.490
Right here.

12:08.610 --> 12:09.240
Nope.

12:11.550 --> 12:13.650
I forgot one thing.

12:15.630 --> 12:16.410
Okay.

12:17.610 --> 12:19.080
I'm not going to mess around here.

12:19.110 --> 12:22.110
Let's just remove the initial delay from the start.

12:22.350 --> 12:32.880
The port is 5001, and I also programmed the grade submission portal to have a liveness endpoint

12:32.880 --> 12:35.700
exposed at the path /healthz.

12:41.580 --> 12:43.080
And that's it.

12:44.460 --> 12:48.840
I'm just going to start by deploying the grade submission API deployment to keep things simple.

12:48.870 --> 12:50.400
Grade submission.

12:50.430 --> 13:01.890
kubectl apply dash f, grade submission API deployment.yaml. Unknown field: spec dot template ... liveness probe

13:01.890 --> 13:03.280
dot initial delay

13:03.310 --> 13:04.930
seconds.

13:17.470 --> 13:19.810
Uh, man.

13:20.830 --> 13:26.680
Sorry if you got frustrated watching this video the whole time while I clearly misspelled initial.

13:27.730 --> 13:29.560
Okay, let's try this again.

13:30.190 --> 13:35.290
If I say kubectl get pods dash n grade submission,

13:36.700 --> 13:46.060
notice how the pod itself is running, but it's deemed not ready to receive traffic, mainly because

13:46.090 --> 13:54.700
our readiness probe is waiting ten seconds before it can probe the application for readiness.

13:54.700 --> 14:02.630
And while both pods are deemed not ready, the readiness probe will block any traffic

14:02.630 --> 14:03.830
from going into them.

14:03.860 --> 14:10.190
Okay, now ten seconds have passed and I'm pretty sure the readiness probe has probed these endpoints by now.

14:10.190 --> 14:14.870
So if I say kubectl get pods again now you can see that both pods are ready.

14:14.900 --> 14:15.290
All right.

14:15.290 --> 14:20.000
We can do the same thing for the grade submission portal: kubectl apply dash f,

14:20.030 --> 14:22.670
grade submission portal deployment.

14:26.180 --> 14:31.820
kubectl get pods dash n grade submission.

14:31.940 --> 14:36.320
And we only want to query the pods that have the following label:

14:38.840 --> 14:41.060
app.kubernetes.io/instance,

14:41.060 --> 14:42.860
grade submission portal.

14:46.220 --> 14:47.870
Put an equal sign here.

14:48.350 --> 14:52.460
And once again we see that our grade submission portal application is running.

14:52.460 --> 14:55.970
And in this case our readiness probe didn't have an initial delay.

14:55.970 --> 15:01.850
So it was able to quickly probe the readiness endpoint, which keeps sending back positive responses.

15:01.850 --> 15:06.820
So the readiness probe has no reason to deem the app not ready.

15:06.820 --> 15:13.060
And as you can see, none of our applications are being restarted, which means they are live as well.

15:13.060 --> 15:14.650
They're fully operational.

15:14.650 --> 15:18.280
The liveness endpoint is not sending back any red flags.

15:18.310 --> 15:21.580
That's it for liveness and readiness.

15:21.580 --> 15:27.310
The implementation of the liveness and readiness endpoints really depends on the team of developers

15:27.310 --> 15:34.390
and how they design the app, and what they feel are the most appropriate criteria for their

15:34.420 --> 15:37.150
app to be deemed live or ready.

15:37.180 --> 15:44.080
All we have to do from the Kubernetes side is set up these liveness and readiness probes in order to

15:44.110 --> 15:47.350
instruct Kubernetes to restart the app accordingly

15:47.350 --> 15:54.250
if the liveness endpoint sends back red flags, or to prevent any traffic from going into the pod if the

15:54.250 --> 16:00.130
readiness endpoint programmed by these developers sends back any red flags. That's it for liveness

16:00.130 --> 16:00.850
and readiness.

16:00.850 --> 16:02.710
I'll see you in the next one.