I have a Kubernetes cluster running fine. It has 4 workers and 1 master, with the dashboard to view the status. After running it for some time, I looked at the restart count of a pod and it was 8. I immediately ran the describe command to get the events, but there were no events for that pod. However, when I checked the logs of the containers, I found out that the node itself was powered down and up 4 times, but I don't know why there were no events.
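For reference, the commands I ran were roughly the following (pod and container names are placeholders):

    # Describe the pod to check its events (none were shown in my case)
    kubectl describe pod <pod-name>

    # Logs of the current and the previous container instance
    kubectl logs <pod-name> -c <container-name>
    kubectl logs <pod-name> -c <container-name> --previous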
On another node, while looking at the restart count, I got a Sandbox changed event, which probably means the node was powered down for some time, the master lost connection to it, and so the restart count was incremented by 2.
Does the Sandbox changed event actually mean that the master lost connection to the node?

Step by step:
I'd check the kubelet and Docker daemon logs; these restarts should appear somewhere in those logs and hopefully give more information about what is causing them.
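Assuming the nodes run kubelet and Docker as systemd units (unit names may differ on your distribution), something like this should surface the relevant entries:

    # Kubelet logs on the affected worker
    journalctl -u kubelet --since "2 days ago" | grep -i -E "restart|sandbox|error"

    # Docker daemon logs
    journalctl -u docker --since "2 days ago" | grep -i -E "restart|network|error"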
Yes, the pod's name is unique, so it changes every time a pod is destroyed and recreated. You can try to find the pod with kubectl get po -a. Another option is to get all events with kubectl get events and then filter them to find your pod's events.
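A minimal sketch of how to narrow the events down to one pod (the pod name is a placeholder; the field selector requires a reasonably recent kubectl):

    # All events in the current namespace, oldest first
    kubectl get events --sort-by=.metadata.creationTimestamp

    # Only the events that reference your pod
    kubectl get events --field-selector involvedObject.name=<pod-name>

    # Or simply grep the full output
    kubectl get events | grep <pod-name>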
I've seen this error before, and in my case it meant a problem with the Docker daemon networking. But I searched a bit on Google and saw many other possible causes. Again, try to analyse the Docker daemon and kubelet logs, and also dmesg. If you have doubts, please add a link to the logs in your question and I'll try to help.
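To confirm that the node was actually power-cycled, I'd also look at the kernel messages and boot history on the node itself (the grep patterns are just suggestions):

    # Kernel messages around OOM kills, panics, or resets
    dmesg -T | grep -i -E "oom|panic|reset"

    # Boot history, to confirm the unexpected power cycles
    last reboot

    # Current uptime, to compare against the time of the restarts
    uptime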