Understanding Kubernetes Probes and Health Checks in 5 Minutes

Kubernetes is an open-source container orchestration platform that greatly simplifies the creation and management of applications.

Distributed systems like Kubernetes can be challenging to manage because they involve many active components, all of which must work properly for the entire system to run smoothly. Even if a small component fails, it needs to be detected and repaired.

These operations also need to be automated. Kubernetes allows us to accomplish this with readiness probes and liveness probes. This article discusses these probes in detail, starting with an overview of health checks.

01 What is a Health Check?

A health check is a simple way for a system to know whether an application instance is running normally. If an instance is not running correctly, other services should not access it or send requests to it. Instead, requests should go to another instance that is ready, or the caller should retry later.

The system should also be able to keep your application in a healthy state. By default, Kubernetes begins sending traffic to a pod once all of its containers have started, and it restarts a container that crashes. This default behavior is sufficient to get started. Because Kubernetes also lets you define custom health checks, making a deployment more robust is relatively straightforward. Now, let’s discuss the pod lifecycle.

02 Pod Lifecycle

Kubernetes pods follow a defined lifecycle. These are the different stages:

• A pod starts in the Pending phase when it is first created. The scheduler tries to find a node to place the pod on. If it cannot find a suitable node, the pod remains Pending. (To find out why a pod is pending, run the kubectl describe pod <pod name> command.)

• Once the pod is scheduled, it enters the ContainerCreating state, during which the images the application requires are pulled and the containers are started.

• Once the containers have started, the pod enters the Running phase and stays there until the program completes successfully or is terminated.

To check a pod’s status, run the kubectl get pod command and look at the STATUS column. In this case, all pods are in the Running state. The READY column shows how many of each pod’s containers are ready to accept traffic (1/1 here).

# kubectl get pod
NAME                        READY   STATUS    RESTARTS   AGE
my-nginx-6b74b79f57-fldq6   1/1     Running   0          20s
my-nginx-6b74b79f57-n67wp   1/1     Running   0          20s
my-nginx-6b74b79f57-r6pcq   1/1     Running   0          20s

Different Types of Probes in Kubernetes

Kubernetes provides the following types of health checks:

• Readiness Probe: This probe tells you when your application is ready to serve traffic. Kubernetes ensures that the readiness probe passes before allowing traffic to be sent to the pod. If the readiness probe fails, Kubernetes will not send traffic to the pod until the probe passes.

• Liveness Probe: This probe tells Kubernetes whether your application is healthy. If it is healthy, Kubernetes does not interfere with the pod; if it is unhealthy, Kubernetes kills the container and restarts it according to the pod’s restart policy.

To further understand this, consider a practical scenario. Your application needs some time to warm up or download application content from an external source like GitHub. Your application will not accept traffic until it is fully ready. By default, once the process inside the container starts, Kubernetes begins sending traffic. By using the readiness probe, Kubernetes waits until the application is fully started before allowing the service to send traffic to the new replica.

Consider another scenario: your application hits a bug (possibly an edge case) and hangs indefinitely, so it stops serving requests. Because the process is still running, Kubernetes by default keeps sending traffic to the broken pod. With a liveness probe, Kubernetes detects that the application is no longer serving requests and restarts the failed container.

Now let’s look at how to define probes. A probe can use one of three mechanisms:

• HTTP

• TCP

• Command

Note: You can start with defining either the readiness probe or the liveness probe, as both implementations require similar templates. For example, if we define the liveness probe first, we can use it to define the readiness probe, and vice versa.

• HTTP Probe (httpGet): This is the most common type of probe. Even if your application is not an HTTP server, you can typically embed a lightweight HTTP server in it to respond to the probe. Kubernetes sends an HTTP GET request to a specified path (like /healthz) on a specific port (port 8080 in this case). If the response status code is in the 200 or 300 range, the container is marked as healthy; otherwise, it is marked as unhealthy. (To learn more about HTTP response codes, refer to this link https://developer.mozilla.org/en-US/docs/Web/HTTP/Status). Here’s how to define an HTTP liveness probe:

livenessProbe:
  httpGet:
    path: /healthz
    port: 8080

The definition of the HTTP readiness probe is similar to that of the HTTP liveness probe; you just need to replace liveness with readiness.

readinessProbe:
  httpGet:
    path: /healthz
    port: 8080

• TCP Probe (tcpSocket): With a TCP probe, Kubernetes attempts to open a TCP connection on a specified port (port 8080 in the example below). If the connection can be established, the container is considered healthy; if not, the probe fails. These probes are handy when an HTTP or command probe is not a good fit. For example, an FTP service can use this type of probe.

readinessProbe:
  tcpSocket:
    port: 8080

• Command Probe (exec command): With a command probe, Kubernetes runs a command inside your container. If the command exits with code 0, the container is marked as healthy; otherwise, it is marked as unhealthy. This type of probe is useful when you cannot or do not want to run an HTTP server but can run a command to check whether the application is healthy. In the example below, cat succeeds only if the file /tmp/healthy exists, so the container is considered healthy as long as that file is present.

livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
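The mechanisms above can also be mixed on a single container. As a minimal sketch (the pod and container names are hypothetical and do not come from the examples in this article), the manifest below gives an nginx container an httpGet readiness probe and a tcpSocket liveness probe:

```yaml
# Hypothetical example: the names probe-demo and web are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
  - name: web
    image: nginx
    ports:
    - containerPort: 80
    # Readiness: hold traffic until nginx answers GET / with a 2xx or 3xx status.
    readinessProbe:
      httpGet:
        path: /
        port: 80
    # Liveness: restart the container if port 80 stops accepting TCP connections.
    livenessProbe:
      tcpSocket:
        port: 80
```

Pairing a stricter readiness check with a cheaper liveness check is only one possible design. The idea is that a failing readiness probe merely takes the pod out of service traffic, while a failing liveness probe restarts the container.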

Probes can be configured in various ways based on the frequency of execution, success and failure thresholds, and the time to wait for a response.

• initialDelaySeconds (default 0): If you know your application needs n seconds (like 30 seconds) to warm up, you can use initialDelaySeconds to add a delay (in seconds) before the first check is executed.

• periodSeconds (default 10): This defines how often (in seconds) the probe is executed.

• timeoutSeconds (default 1): This defines the maximum number of seconds before the probe operation times out.

• successThreshold (default 1): This is the minimum number of consecutive successes required for the probe to be considered successful after it has failed.

• failureThreshold (default 3): This is the number of consecutive failures after which Kubernetes treats the check as failed and acts on it.

Note: By default, Kubernetes gives up after 3 consecutive failed attempts. If it’s a liveness probe, it restarts the container. If it’s a readiness probe, it marks the pod as not ready, so the pod stops receiving traffic.
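To see how these settings interact, here is a sketch of a liveness probe with every timing field written out. The values are illustrative assumptions rather than recommendations, and the timings in the comments are approximate because the checks do not run on an exact schedule.

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 30   # wait 30 s after the container starts before the first check
  periodSeconds: 10         # then check about every 10 s
  timeoutSeconds: 2         # a check with no response within 2 s counts as a failure
  successThreshold: 1       # one success marks the probe as passing (must be 1 for liveness probes)
  failureThreshold: 3       # three consecutive failures trigger a container restart
  # With these values, an application that hangs while running is restarted
  # after roughly failureThreshold x periodSeconds = 3 x 10 = 30 s of failed checks.
```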

For more information about probe configuration, refer to this link: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#configure-probes.

Now, let’s combine everything we have discussed so far. The key point to note here is the use of readinessProbe with httpGet. The first check will be executed after a delay of 10 seconds, and then repeated every 5 seconds.

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: nginx
  name: nginx
spec:
  containers:
  - image: nginx
    name: nginx
    readinessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 5

• To create a pod, use the kubectl create command and specify the YAML manifest file with the -f flag. You can name this file anything you like, but the extension should be .yaml.

kubectl create -f readinessprobe.yaml
pod/nginx created

• If you check the pod’s status now, it should show the status as Running under the STATUS column. But if you check the READY column, it will still show 0/1, indicating that it is not ready to accept new connections.

kubectl get pod
NAME    READY   STATUS    RESTARTS   AGE
nginx   0/1     Running   0          16s

• Since we set the initial delay to 10 seconds, check the status again a few seconds later. Now, the pod should be ready (1/1 in the READY column).

kubectl get pod
NAME    READY   STATUS    RESTARTS   AGE
nginx   1/1     Running   0          28s

• To check the detailed status of all parameters used when defining the readiness probe (like initialDelaySeconds, periodSeconds, etc.), run the kubectl describe command.

kubectl describe pod nginx | grep -i readiness
Readiness: http-get http://:80/ delay=10s timeout=1s period=5s #success=1 #failure=3

Let’s further deepen our understanding of the concepts of liveness probes and readiness probes with an example. Starting with the liveness probe, in the example below, we execute the command:

touch healthy; sleep 20; rm -rf healthy; sleep 600

With this command, we create a file named “healthy” using the touch command. This file will exist in the container for the first 20 seconds, then we delete it using the rm -rf command. Finally, the container will sleep for 600 seconds.

We then define the liveness probe. It checks for the existence of the file using the cat healthy command. The first check runs after an initial delay of 5 seconds, and the periodSeconds parameter makes the probe run every 5 seconds after that. Once the file is deleted, 20 seconds after the container starts, the probe begins to fail.

apiVersion: v1
kind: Pod
metadata:
  name: liveness-probe-exec
spec:
  containers:
  - name: liveness-probe
    image: busybox
    args:
    - /bin/sh
    - -c
    - touch healthy; sleep 20; rm -rf healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - healthy
      initialDelaySeconds: 5
      periodSeconds: 5
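The timing of this example is easier to follow with the default failureThreshold written out. The fragment below is the probe from the manifest above with failureThreshold: 3 (the default) added explicitly. The times in the comments are approximate and assume the checks run roughly on schedule.

```yaml
livenessProbe:
  exec:
    command:
    - cat
    - healthy
  initialDelaySeconds: 5   # first check at about 5 s: the file exists, so cat exits with 0
  periodSeconds: 5         # the checks at about 10 s and 15 s also pass
  failureThreshold: 3      # default value, written out here for clarity
  # The file is removed at 20 s, so later checks fail with
  # "No such file or directory". After three consecutive failures,
  # roughly 30 to 35 s after the container started, the container
  # is killed and restarted.
```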

• To create the pod, store the above code in a file ending with .yaml (like liveness-probe.yaml), and then execute the kubectl create command with -f <file name> to create the pod.

# kubectl create -f liveness-probe.yaml
pod/liveness-probe-exec created

• Running the kubectl get events command will show that the liveness probe has failed, and the container has been killed and restarted.

54s   Normal    Scheduled   pod/liveness-probe-exec   Successfully assigned default/liveness-probe-exec to controlplane
53s   Normal    Pulling     pod/liveness-probe-exec   Pulling image "busybox"
52s   Normal    Pulled      pod/liveness-probe-exec   Successfully pulled image "busybox" in 384.330188ms
52s   Normal    Created     pod/liveness-probe-exec   Created container liveness-probe
52s   Normal    Started     pod/liveness-probe-exec   Started container liveness-probe
18s   Warning   Unhealthy   pod/liveness-probe-exec   Liveness probe failed: cat: can't open 'healthy': No such file or directory
18s   Normal    Killing     pod/liveness-probe-exec   Container liveness-probe failed liveness probe, will be restarted

• You can also verify this with the kubectl get pods command; as the RESTARTS column shows, the container has restarted once.

# kubectl get pods
NAME                  READY   STATUS    RESTARTS   AGE
liveness-probe-exec   1/1     Running   1          24s

• Now that you understand how the liveness probe works, try adapting the above example into a readiness probe to see how the readiness probe works. In the example below, we execute the command (sleep 20; touch healthy; sleep 600) inside the container, which first sleeps for 20 seconds, then creates the file, and then sleeps for 600 seconds. Since the initial delay is set to 15 seconds, the first check runs about 15 seconds after the container starts, before the file exists, so it fails.

apiVersion: v1
kind: Pod
metadata:
  name: readiness-probe-exec
spec:
  containers:
  - name: readiness-probe
    image: busybox
    args:
    - /bin/sh
    - -c
    - sleep 20;touch healthy;sleep 600
    readinessProbe:
      exec:
        command:
        - cat
        - healthy
      initialDelaySeconds: 15
      periodSeconds: 5

• To create the pod, store the above code in a file ending with .yaml, and execute the kubectl create command, which will create the pod.

# kubectl create -f readiness-probe.yaml
pod/readiness-probe-exec created

• If you run the kubectl get events command here, you will see the probe failed because the file does not exist.

63s   Normal    Scheduled   pod/readiness-probe-exec   Successfully assigned default/readiness-probe-exec to controlplane
62s   Normal    Pulling     pod/readiness-probe-exec   Pulling image "busybox"
62s   Normal    Pulled      pod/readiness-probe-exec   Successfully pulled image "busybox" in 156.57701ms
61s   Normal    Created     pod/readiness-probe-exec   Created container readiness-probe
61s   Normal    Started     pod/readiness-probe-exec   Started container readiness-probe
42s   Warning   Unhealthy   pod/readiness-probe-exec   Readiness probe failed: cat: can't open 'healthy': No such file or directory

• If you check the status of the container initially, it is not in a ready state.

# kubectl get pods
NAME                   READY   STATUS    RESTARTS   AGE
readiness-probe-exec   0/1     Running   0          5s

• However, if you check it again after 20 seconds, it should be ready (1/1 in the READY column).

# kubectl get pods
NAME                   READY   STATUS    RESTARTS   AGE
readiness-probe-exec   1/1     Running   0          27s

Conclusion

Any distributed system requires health checks, and Kubernetes is no exception. Using health checks provides a solid foundation for your Kubernetes services, enhancing reliability and increasing uptime.

Reference: https://www.kubernetes.org.cn/9833.html
