kubernetes: when a pod is in CrashLoopBackOff status, why don't the related events update?

7/15/2021

I'm testing Kubernetes behavior when a pod runs into an error.

I now have a pod in CrashLoopBackOff status caused by a failed liveness probe. From what I can see in the Kubernetes events, the pod goes into CrashLoopBackOff after 3 failed attempts and starts backing off between restarts, but the related Liveness probe failed event doesn't update. Why?

➜  ~ kubectl describe pods/my-nginx-liveness-err-59fb55cf4d-c6p8l
Name:         my-nginx-liveness-err-59fb55cf4d-c6p8l
Namespace:    default
Priority:     0
Node:         minikube/192.168.99.100
Start Time:   Thu, 15 Jul 2021 12:29:16 +0800
Labels:       pod-template-hash=59fb55cf4d
              run=my-nginx-liveness-err
Annotations:  <none>
Status:       Running
IP:           172.17.0.3
IPs:
  IP:           172.17.0.3
Controlled By:  ReplicaSet/my-nginx-liveness-err-59fb55cf4d
Containers:
  my-nginx-liveness-err:
    Container ID:   docker://edc363b76811fdb1ccacdc553d8de77e9d7455bb0d0fb3cff43eafcd12ee8a92
    Image:          nginx
    Image ID:       docker-pullable://nginx@sha256:353c20f74d9b6aee359f30e8e4f69c3d7eaea2f610681c4a95849a2fd7c497f9
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 15 Jul 2021 13:01:36 +0800
      Finished:     Thu, 15 Jul 2021 13:02:06 +0800
    Ready:          False
    Restart Count:  15
    Liveness:       http-get http://:8080/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-r7mh4 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-api-access-r7mh4:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  37m                   default-scheduler  Successfully assigned default/my-nginx-liveness-err-59fb55cf4d-c6p8l to minikube
  Normal   Created    35m (x4 over 37m)     kubelet            Created container my-nginx-liveness-err
  Normal   Started    35m (x4 over 37m)     kubelet            Started container my-nginx-liveness-err
  Normal   Killing    35m (x3 over 36m)     kubelet            Container my-nginx-liveness-err failed liveness probe, will be restarted
  Normal   Pulled     31m (x7 over 37m)     kubelet            Container image "nginx" already present on machine
  Warning  Unhealthy  16m (x32 over 36m)    kubelet            Liveness probe failed: Get "http://172.17.0.3:8080/": dial tcp 172.17.0.3:8080: connect: connection refused
  Warning  BackOff    118s (x134 over 34m)  kubelet            Back-off restarting failed container

The BackOff event was updated 118s ago, but the Unhealthy event was last updated 16m ago. Why?

And why is the Restart Count only 15, while the BackOff event shows 134 occurrences? A quick comparison of the two counters is sketched below.
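
Something like the following should show both counters side by side; it's a sketch that assumes the pod name from the describe output above and uses standard Pod status and Event fields:

# restart count reported in the pod's container status
kubectl get pod my-nginx-liveness-err-59fb55cf4d-c6p8l \
  -o jsonpath='{.status.containerStatuses[0].restartCount}'

# occurrence count carried by the de-duplicated BackOff event
kubectl get events \
  --field-selector involvedObject.name=my-nginx-liveness-err-59fb55cf4d-c6p8l,reason=BackOff \
  -o jsonpath='{.items[0].count}'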

I'm using minikube and my deployment is like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx-liveness-err
spec:
  selector:
    matchLabels:
      run: my-nginx-liveness-err
  replicas: 1
  template:
    metadata:
      labels:
        run: my-nginx-liveness-err
    spec:
      containers:
      - name: my-nginx-liveness-err
        image: nginx
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80
        livenessProbe:
          httpGet:
            path: /
            port: 8080
-- Sean Yu
kubernetes

1 Answer

7/15/2021

I think you might be confusing Status Conditions and Events. Events don't "update", they just exist. It's a stream of event data from the controllers, for debugging or alerting on. The Age column is the time since the most recent instance of that event type, and you can see it does some basic de-duplication. Events also age out after a few hours to keep the database from exploding.
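
For example, listing the raw events for the pod should show that each de-duplicated event carries a count plus first/last timestamps rather than being rewritten in place. A sketch, assuming the pod name from the question and standard Event fields:

kubectl get events \
  --field-selector involvedObject.name=my-nginx-liveness-err-59fb55cf4d-c6p8l \
  --sort-by=.lastTimestamp \
  -o custom-columns=REASON:.reason,COUNT:.count,FIRST:.firstTimestamp,LAST:.lastTimestamp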

So your issue has nothing to do with the liveness probe; your container is crashing on startup.
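
A couple of commands that might help confirm what is happening to the container between restarts (again assuming the pod name from the question):

# logs from the previous, terminated instance of the container
kubectl logs my-nginx-liveness-err-59fb55cf4d-c6p8l --previous

# how the last run ended (reason, exit code, timestamps)
kubectl get pod my-nginx-liveness-err-59fb55cf4d-c6p8l \
  -o jsonpath='{.status.containerStatuses[0].lastState}'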

-- coderanger
Source: StackOverflow