Scenario: A K8s pod has more than one container, and liveness/readiness probes are configured for each of the containers. Now, if the liveness probe is succeeding on some containers and failing on a few containers, what will k8s do?
1) Will it restart only the failing containers,
OR
2) will it restart the entire pod?
It will restart only the failing container, not the entire pod.
As per the k8s docs:
The kubelet uses readiness probes to know when a container is ready to start accepting traffic. A Pod is considered ready when all of its containers are ready.
To perform a probe, the kubelet sends an HTTP GET request to the server that is running in the container and listening on port 8080. If the handler for the server's /healthz path returns a success code, the kubelet considers the container to be alive and healthy. If the handler returns a failure code, the kubelet kills the container and restarts it.
Whilst a Pod is running, the kubelet is able to restart containers to handle some kind of faults. Within a Pod, Kubernetes tracks different container states and determines what action to take to make the Pod healthy again.
You can check the pod events to see whether the container was restarted or not.
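For example, you can watch the pod and check the per-container restart counts (the pod name below is just a placeholder):

kubectl get pod <pod-name> -w

# restart count of each container, in the order they appear in the spec
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[*].restartCount}'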
If the liveness probe is succeeding on some containers and failing on a few containers, what will k8s do?
It will restart only the failing containers.
In Pod Lifecycle - Container Probes all 3 probes are listed: liveness, readiness and startup.

livenessProbe: Indicates whether the container is running. If the liveness probe fails, the kubelet kills the container, and the container is subjected to its restart policy. If a container does not provide a liveness probe, the default state is Success.
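As a rough sketch, all three probe types could be combined on a single container like this (the pod name, image, port and path below are placeholders chosen for illustration, not taken from the question):

apiVersion: v1
kind: Pod
metadata:
  name: probes-demo            # hypothetical name, for illustration only
spec:
  containers:
  - name: app
    image: nginx               # placeholder image that serves "/" on port 80
    ports:
    - containerPort: 80
    startupProbe:              # gives a slow-starting app time before the other probes kick in
      httpGet:
        path: /
        port: 80
      failureThreshold: 30
      periodSeconds: 10
    readinessProbe:            # controls whether this pod receives traffic
      httpGet:
        path: /
        port: 80
      periodSeconds: 5
    livenessProbe:             # failing this probe restarts only this container
      httpGet:
        path: /
        port: 80
      periodSeconds: 10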
In Configure Liveness, Readiness and Startup Probes - Define a liveness command there is an example, and it is mentioned that:
If the command succeeds, it returns 0, and the kubelet considers the container to be alive and healthy. If the command returns a non-zero value, the kubelet kills the container and restarts it.
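A minimal sketch of such a command-based (exec) liveness probe, along the lines of the linked example (reproduced from memory, so check the linked page for the exact manifest): the container creates /tmp/healthy, deletes it after 30 seconds, and from that point the probe starts failing and the container gets restarted.

apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:               # probe succeeds only while /tmp/healthy exists
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5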
The same applies in the case of the HTTP request liveness probe:
If the handler for the server's /healthz path returns a success code, the kubelet considers the container to be alive and healthy. If the handler returns a failure code, the kubelet kills the container and restarts it.
And with the TCP liveness probe:
The kubelet will run the first liveness probe 15 seconds after the container starts. Just like the readiness probe, this will attempt to connect to the goproxy container on port 8080. If the liveness probe fails, the container will be restarted.
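Sketched after the linked example, such a TCP probe configuration looks roughly like this (the exact image tag and timings may differ from the docs):

apiVersion: v1
kind: Pod
metadata:
  name: goproxy
spec:
  containers:
  - name: goproxy
    image: k8s.gcr.io/goproxy:0.1
    ports:
    - containerPort: 8080
    readinessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 15  # matches the "15 seconds after the container starts" from the quote
      periodSeconds: 20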
If you would like to create your own test, you can use this example of an HTTP liveness probe:
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http-probe
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/liveness
    args:
    - /server
    readinessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: X-Custom-Header
          value: Awesome
      initialDelaySeconds: 0
      periodSeconds: 5
      timeoutSeconds: 5
      successThreshold: 1
      failureThreshold: 3
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: X-Custom-Header
          value: Awesome
      initialDelaySeconds: 5
      periodSeconds: 10
      successThreshold: 1
      failureThreshold: 3
  - name: nginx
    image: nginx
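You can apply the manifest with (the file name is arbitrary):

kubectl apply -f liveness-http-probe.yaml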
After a while you will be able to see that the container was restarted and the restart count increased, but the pod itself still exists, since its Age keeps counting up.
$ kubectl get po -w
NAME                  READY   STATUS             RESTARTS   AGE
liveness-http-probe   2/2     Running            0          20s
liveness-http-probe   1/2     Running            0          23s
liveness-http-probe   1/2     Running            1          42s
liveness-http-probe   2/2     Running            1          43s
liveness-http-probe   1/2     Running            1          63s
...
liveness-http-probe   1/2     Running            5          3m23s
liveness-http-probe   2/2     Running            5          3m23s
liveness-http-probe   1/2     Running            5          3m43s
liveness-http-probe   1/2     CrashLoopBackOff   5          4m1s
liveness-http-probe   1/2     Running            6          5m25s
liveness-http-probe   2/2     Running            6          5m28s
liveness-http-probe   1/2     Running            6          5m48s
liveness-http-probe   1/2     CrashLoopBackOff   6          6m2s
liveness-http-probe   1/2     Running            7          8m46s
liveness-http-probe   2/2     Running            7          8m48s
...
liveness-http-probe   2/2     Running            11         21m
liveness-http-probe   1/2     Running            11         21m
liveness-http-probe   1/2     CrashLoopBackOff   11         22m
liveness-http-probe   1/2     Running            12         27m
...
liveness-http-probe   1/2     Running            13         28m
liveness-http-probe   1/2     CrashLoopBackOff   13         28m
And in the pod description you will see repeated warnings with counters like (x8 over 28m), (x84 over 24m) or (x2 over 28m).
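You can print the pod description, including the events shown below, with:

kubectl describe pod liveness-http-probe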
Normal   Pulling    28m (x2 over 28m)     kubelet  Pulling image "k8s.gcr.io/liveness"
Normal   Killing    28m                   kubelet  Container liveness failed liveness probe, will be restarted
Normal   Started    28m (x2 over 28m)     kubelet  Started container liveness
Normal   Created    28m (x2 over 28m)     kubelet  Created container liveness
Normal   Pulled     28m                   kubelet  Successfully pulled image "k8s.gcr.io/liveness" in 561.418121ms
Warning  Unhealthy  27m (x8 over 28m)     kubelet  Readiness probe failed: HTTP probe failed with statuscode: 500
Warning  Unhealthy  27m (x4 over 28m)     kubelet  Liveness probe failed: HTTP probe failed with statuscode: 500
Normal   Pulled     13m (x2 over 14m)     kubelet  (combined from similar events): Successfully pulled image "k8s.gcr.io/liveness" in 508.892628ms
Warning  BackOff    3m45s (x84 over 24m)  kubelet  Back-off restarting failed container
Lately I did some tests with liveness and readiness probes in the thread Liveness Probe, Readiness Probe not called in expected duration. It can provide you with additional information.