On what basis does the restart count in Kubernetes increase?

6/20/2018

I have a Kubernetes cluster running fine. It has 4 workers and 1 master, with the dashboard to view the status. After running it for some time, I looked at the restart count of a pod and saw it was 8. I immediately ran the describe command to check the events, but there were no events for that pod. However, when I checked the container logs, I found that the node itself had been powered down and up 4 times, but I don't know why there were no events.
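For context, the restart count and per-pod events I'm referring to can be read from the command line like this (a minimal sketch; <pod-name> is a placeholder for the actual pod name):

    # RESTARTS column holds the per-pod restart count
    kubectl get pods --all-namespaces

    # Events and status for one pod
    kubectl describe pod <pod-name>

    # Raw restartCount field from the pod status
    kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[*].restartCount}'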

On another node, while looking at the restart count, I got a Sandbox changed event, which probably means the node was powered down for some time, so the master lost its connection to it and incremented the restart count by 2.

  1. I want to know how we can get the logs/debug output related to this restart count, to find out why the pod was restarted.
  2. Whenever a pod is recreated, does it take a new name? If so, how can we get the events of the previous pod?
  3. Does the Sandbox changed event actually mean that the master lost its connection?
-- S Andrew
kubernetes

1 Answer

6/20/2018

Step by step:

  1. I'd check the kubelet and Docker daemon logs; these restarts should appear somewhere there, hopefully with more info about what caused them (see the log-checking sketch after this list).

  2. Yes, a pod's name is unique, so it changes every time a pod is destroyed and recreated. You can try to find the old pod with kubectl get po -a. Another option is to get all events with kubectl get events and then filter to find your pod's events (a sketch of both approaches follows this list).

  3. I've seen this error before, and in my case it meant a problem with the Docker daemon networking. But a quick Google search turns up many other possible causes. Again, try to analyse the Docker daemon and kubelet logs, and also dmesg (see the last sketch below). If you have doubts, please add a link to the logs in your question and I'll try to help.
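For step 1, assuming kubelet and the Docker daemon run as systemd services on the node (an assumption; the question doesn't say how they're managed), their logs can be pulled on the affected worker with journalctl:

    # Run on the affected worker node
    journalctl -u kubelet --since "2 hours ago"   # kubelet logs
    journalctl -u docker --since "2 hours ago"    # Docker daemon logs

    # Boot history; unexpected entries here confirm a power cycle
    journalctl --list-boots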
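For step 2, a sketch of tracing a recreated pod (kubectl get po -a is the flag from the answer; it has since been deprecated in newer kubectl releases, where terminated pods show up by default; <pod-name> is a placeholder):

    # List pods, including terminated ones (older kubectl)
    kubectl get po -a

    # All events in the namespace, oldest first
    kubectl get events --sort-by=.metadata.creationTimestamp

    # Only the events that reference a given pod name
    kubectl get events --field-selector involvedObject.name=<pod-name>

    # Logs from the previous instance of a restarted container
    kubectl logs <pod-name> --previous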
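For step 3, the kernel log complements the daemon logs; on a node that was power-cycled, look for boot markers, OOM kills, or network interface resets:

    # Kernel ring buffer for the current boot
    dmesg | tail -n 100

    # Kernel messages from the previous boot (systemd hosts)
    journalctl -k -b -1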

-- Ignacio Millán
Source: StackOverflow