kubernetes pod restart count shows inconsistent values when kubectl get pod -w is run

1/9/2020

I have been playing around with minikube and, after a set of operations, the output of kubectl get pod -w looks like this:

nginx   1/1   Running   2     10m
nginx   1/1   Running   3     10m
nginx   0/1   Completed   2     10m
nginx   0/1   CrashLoopBackOff   2     11m
nginx   1/1   Running   3     11m
nginx   1/1   Running   3     12m

I don't understand the count shown on lines 3 and 4 of the output, where RESTARTS appears to drop from 3 back to 2. What exactly does the restart count convey?

-- Shadja Chaudhari
kubernetes
pod

2 Answers

1/9/2020

About the CrashLoopBackOff Status:

A CrashLoopBackOff means that you have a pod starting, crashing, starting again, and then crashing again.

Failed containers that are restarted by the kubelet are restarted with an exponential back-off delay (10s, 20s, 40s, …) capped at five minutes, and the delay is reset after ten minutes of successful execution.
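The doubling-with-cap behaviour described above can be sketched as a small shell loop (purely illustrative; the real delays are managed internally by the kubelet):

```shell
# Illustrative only: the kubelet doubles the restart delay after each
# failed restart, capping it at 300 seconds (5 minutes).
delay=10
cap=300
for attempt in 1 2 3 4 5 6 7; do
  echo "restart attempt ${attempt}: back-off ${delay}s"
  delay=$((delay * 2))
  [ "$delay" -gt "$cap" ] && delay=$cap
done
```

From the sixth failed restart onward, the pod waits the full five minutes between attempts, which is why a crash-looping pod appears "stuck" in CrashLoopBackOff.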

CrashLoopBackOff events occur for different reasons, most of them related to the following:

- The application inside the container keeps crashing
- Some parameter of the pod or container has been configured incorrectly
- An error was made during the deployment

Whenever you face a CrashLoopBackOff, run kubectl describe to investigate:

kubectl describe pod POD_NAME --namespace NAMESPACE_NAME

user@minikube:~$ kubectl describe pod ubuntu-5d4bb4fd84-8gl67 --namespace default
Name:         ubuntu-5d4bb4fd84-8gl67
Namespace:    default
Priority:     0
Node:         minikube/192.168.39.216
Start Time:   Thu, 09 Jan 2020 09:51:03 +0000
Labels:       app=ubuntu
              pod-template-hash=5d4bb4fd84
Status:       Running
Controlled By:  ReplicaSet/ubuntu-5d4bb4fd84
Containers:
  ubuntu:
    Container ID:   docker://c4c0295e1e050b5e395fc7b368a8170f863159879821dd2562bc2938d17fc6fc
    Image:          ubuntu
    Image ID:       docker-pullable://ubuntu@sha256:250cc6f3f3ffc5cdaa9d8f4946ac79821aafb4d3afc93928f0de9336eba21aa4
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 09 Jan 2020 09:54:37 +0000
      Finished:     Thu, 09 Jan 2020 09:54:37 +0000
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 09 Jan 2020 09:53:05 +0000
      Finished:     Thu, 09 Jan 2020 09:53:05 +0000
    Ready:          False
    Restart Count:  5
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xxxst (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-xxxst:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-xxxst
    Optional:    false
QoS Class:       BestEffort
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  7m16s                  default-scheduler  Successfully assigned default/ubuntu-5d4bb4fd84-8gl67 to minikube
  Normal   Created    5m59s (x4 over 6m52s)  kubelet, minikube  Created container ubuntu
  Normal   Started    5m58s (x4 over 6m52s)  kubelet, minikube  Started container ubuntu
  Normal   Pulling    5m17s (x5 over 7m5s)   kubelet, minikube  Pulling image "ubuntu"
  Normal   Pulled     5m15s (x5 over 6m52s)  kubelet, minikube  Successfully pulled image "ubuntu"
  Warning  BackOff    2m2s (x24 over 6m43s)  kubelet, minikube  Back-off restarting failed container

The Events section will provide you with a detailed explanation of what happened.

-- willrof
Source: StackOverflow

1/9/2020

RestartCount represents the number of times the container inside a pod has been restarted. It is calculated from the number of dead containers that have not yet been removed.
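If you want the raw counter by itself, it can also be read straight from the container status with a JSONPath query (a sketch, assuming a pod named nginx with a single container):

```shell
# Print just the restartCount of the first container in the pod "nginx".
kubectl get pod nginx -o jsonpath='{.status.containerStatuses[0].restartCount}'
```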

The -w flag on the command enables watch mode; the various column headers are listed below.

$ kubectl get pods -w
NAME      READY   STATUS    RESTARTS   AGE
nginx     1/1     Running   0          21m

To get more detailed output, use the -o wide flag:

$ kubectl get pods -o wide
NAME      READY   STATUS    RESTARTS   AGE   IP            NODE                    NOMINATED NODE   READINESS GATES
nginx   1/1     Running   1          21h   10.244.2.36   worker-node-2   <none>           <none>

The READY field represents the containers inside the pod, and their state can be seen in detail with the describe pod command. Refer to the Pod Lifecycle documentation.

$ kubectl describe pod nginx| grep -i -A6 "Conditions"
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True

The RESTARTS field is tracked under Restart Count; grep it from the pod description as below:

$ kubectl describe pod nginx | grep -i "Restart"
    Restart Count:  0

As a test, we now restart the above container and see which fields are updated. We find the node where the container is running and kill the container on that node with the docker command; Kubernetes should restart it automatically.

$ kubectl get pods -o wide
NAME     READY   STATUS    RESTARTS   AGE   IP            NODE                    NOMINATED NODE   READINESS GATES
nginx   1/1     Running   0          21h   10.244.2.36   worker-node-2   <none>           <none>
ubuntu@worker-node-2:~$ sudo docker ps  -a | grep -i nginx
4c8e2e6bf67c        nginx                    "nginx -g 'daemon of…"   22 hours ago        Up 22 hours                                  


ubuntu@worker-node-2:~$ sudo docker kill 4c8e2e6bf67c
4c8e2e6bf67c
  1. Pod STATUS changes to Error
  2. READY count goes to 0/1
ubuntu@cluster-master:~$ kubectl get pods -o wide
NAME   READY   STATUS   RESTARTS   AGE   IP            NODE            NOMINATED NODE   READINESS GATES
nginx   0/1      Error    0          21h   10.244.2.36  worker-node-2   <none>           <none>

Once the pod recovers the failed container:

  1. READY count is 1/1 again
  2. STATUS changes back to Running
  3. RESTARTS count is incremented by 1
ubuntu@cluster-master:~$ kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP            NODE            NOMINATED NODE   READINESS GATES
nginx   1/1     Running   1          21h   10.244.2.36   worker-node-2   <none>           <none>

Check the restart count with the describe command as well:

$ kubectl describe pods nginx  | grep -i "Restart"
    Restart Count:  1

The values in your output are not inconsistent. That is how a pod with a restartPolicy of Always works: it keeps trying to bring the failed container back, backing off for longer between each attempt (the CrashLoopBackOff state) until the back-off delay reaches its cap.

Refer to the Pod state examples:

Pod is running and has one Container. Container exits with success. Log completion event.

If restartPolicy is:

  1. Always: Restart Container; Pod phase stays Running.
  2. OnFailure: Pod phase becomes Succeeded.
  3. Never: Pod phase becomes Succeeded.
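The policy is set at the pod spec level. As a sketch, a minimal manifest making the (default) policy explicit might look like this; the name and image here are just examples:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  restartPolicy: Always   # one of Always (default), OnFailure, Never
  containers:
  - name: nginx
    image: nginx
```

Note that for pods managed by a Deployment/ReplicaSet, only Always is allowed.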
-- DT.
Source: StackOverflow