K8s liveness probe behavior when the pod contains more than one container?

3/11/2021

Scenario: A K8s pod has more than one container, and liveness/readiness probes are configured for each of them. Now, if the liveness probe is succeeding on some containers and failing on a few others, what will k8s do? 1) Will it restart only the failing containers, OR 2) will it restart the entire pod?

-- samshers
kubernetes
kubernetes-pod
livenessprobe
readinessprobe

2 Answers

3/11/2021

It will restart only the failing container.

As per the k8s docs:
The kubelet uses readiness probes to know when a container is ready to start accepting traffic. A Pod is considered ready when all of its containers are ready.

To perform a probe, the kubelet sends an HTTP GET request to the server that is running in the container and listening on port 8080. If the handler for the server's /healthz path returns a success code, the kubelet considers the container to be alive and healthy. If the handler returns a failure code, the kubelet kills the container and restarts it.

Whilst a Pod is running, the kubelet is able to restart containers to handle some kind of faults. Within a Pod, Kubernetes tracks different container states and determines what action to take to make the Pod healthy again.

You can check the pod's events to see whether a container was restarted or not.
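For example, with kubectl (where <pod-name> is just a placeholder for your pod's name):

$ kubectl describe pod <pod-name>      # events are listed at the bottom of the output
$ kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[*].restartCount}'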

Ref: k8s doc and probes

-- Sahadat Hossain
Source: StackOverflow

3/12/2021

If the liveness probe is succeeding on some containers and failing on a few others, what will k8s do?

It will restart only the failing containers.

In Pod Lifecycle - Container Probes all 3 probes are listed: liveness, readiness and startup.

livenessProbe: Indicates whether the container is running. If the liveness probe fails, the kubelet kills the container, and the container is subjected to its restart policy. If a Container does not provide a liveness probe, the default state is Success.
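
As a minimal sketch, the probe is configured per container, while the restart policy it refers to is a pod-level field that is applied per container (the pod name, container name, image and probe command below are only illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: restart-policy-demo      # illustrative name
spec:
  restartPolicy: Always          # pod-level field; Always is the default
  containers:
  - name: app                    # illustrative container
    image: nginx
    livenessProbe:               # probes are defined per container
      exec:
        command: ["true"]        # trivial command that always succeeds, only to show the structure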

In Configure Liveness, Readiness and Startup Probes - Define a liveness command there is an example, and it's mentioned that:

If the command succeeds, it returns 0, and the kubelet considers the container to be alive and healthy. If the command returns a non-zero value, the kubelet kills the container and restarts it.
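
That command-based example from the docs looks roughly like this; the kubelet periodically runs cat /tmp/healthy inside the container, and once the file is removed the probe fails and only this container is restarted:

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5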

The same applies to the HTTP request liveness probe:

If the handler for the server's /healthz path returns a success code, the kubelet considers the container to be alive and healthy. If the handler returns a failure code, the kubelet kills the container and restarts it.

And with the TCP liveness probe:

The kubelet will run the first liveness probe 15 seconds after the container starts. Just like the readiness probe, this will attempt to connect to the goproxy container on port 8080. If the liveness probe fails, the container will be restarted.
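
The corresponding TCP example from the docs is roughly:

apiVersion: v1
kind: Pod
metadata:
  name: goproxy
  labels:
    app: goproxy
spec:
  containers:
  - name: goproxy
    image: k8s.gcr.io/goproxy:0.1
    ports:
    - containerPort: 8080
    readinessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 15
      periodSeconds: 20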

Tests

If you would like to create your own test, you can use this example with an HTTP liveness probe:

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http-probe
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/liveness
    args:
    - /server
    readinessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: X-Custom-Header
          value: Awesome
      initialDelaySeconds: 0
      periodSeconds: 5      
      timeoutSeconds: 5
      successThreshold: 1
      failureThreshold: 3
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: X-Custom-Header
          value: Awesome
      initialDelaySeconds: 5
      periodSeconds: 10     
      successThreshold: 1
      failureThreshold: 3 
  - name: nginx
    image: nginx
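
Assuming the manifest is saved as liveness-http-probe.yaml (the file name is just an example), apply it and then watch the pod:

$ kubectl apply -f liveness-http-probe.yaml
pod/liveness-http-probe created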

After a while you will see that the liveness container was restarted and its restart count increased, but the pod itself still exists, as its Age keeps counting.

$ kubectl get po -w
NAME                  READY   STATUS    RESTARTS   AGE
liveness-http-probe   2/2     Running   0          20s
liveness-http-probe   1/2     Running   0          23s
liveness-http-probe   1/2     Running   1          42s
liveness-http-probe   2/2     Running   1          43s
liveness-http-probe   1/2     Running   1          63s
...
liveness-http-probe   1/2     Running   5          3m23s
liveness-http-probe   2/2     Running   5          3m23s
liveness-http-probe   1/2     Running   5          3m43s
liveness-http-probe   1/2     CrashLoopBackOff   5          4m1s
liveness-http-probe   1/2     Running            6          5m25s
liveness-http-probe   2/2     Running            6          5m28s
liveness-http-probe   1/2     Running            6          5m48s
liveness-http-probe   1/2     CrashLoopBackOff   6          6m2s
liveness-http-probe   1/2     Running            7          8m46s
liveness-http-probe   2/2     Running            7          8m48s
...
liveness-http-probe   2/2     Running   11         21m
liveness-http-probe   1/2     Running   11         21m
liveness-http-probe   1/2     CrashLoopBackOff   11         22m
liveness-http-probe   1/2     Running            12         27m
...
liveness-http-probe   1/2     Running            13         28m
liveness-http-probe   1/2     CrashLoopBackOff   13         28m

And in the pod description (kubectl describe pod liveness-http-probe) you will see aggregated warnings with counters like (x8 over 28m), (x84 over 24m) or (x2 over 28m).

  Normal   Pulling    28m (x2 over 28m)     kubelet            Pulling image "k8s.gcr.io/liveness"
  Normal   Killing    28m                   kubelet            Container liveness failed liveness probe, will be restarted
  Normal   Started    28m (x2 over 28m)     kubelet            Started container liveness
  Normal   Created    28m (x2 over 28m)     kubelet            Created container liveness
  Normal   Pulled     28m                   kubelet            Successfully pulled image "k8s.gcr.io/liveness" in 561.418121ms
  Warning  Unhealthy  27m (x8 over 28m)     kubelet            Readiness probe failed: HTTP probe failed with statuscode: 500
  Warning  Unhealthy  27m (x4 over 28m)     kubelet            Liveness probe failed: HTTP probe failed with statuscode: 500
  Normal   Pulled     13m (x2 over 14m)     kubelet            (combined from similar events): Successfully pulled image "k8s.gcr.io/liveness" in 508.892628ms
  Warning  BackOff    3m45s (x84 over 24m)  kubelet            Back-off restarting failed container

Recently I did some tests with liveness and readiness probes in the thread Liveness Probe, Readiness Probe not called in expected duration. It may provide you with additional information.

-- PjoterS
Source: StackOverflow