Pod controlled by StatefulSet is stuck in ContainerCreating state

    kubectl get pods -o wide

    NAME   READY   STATUS              RESTARTS   AGE   IP              NODE
    md-0   1/1     Running             0          4h    10.242.208.59   node-5
    md-1   1/1     Running             0          4h    10.242.160.36   node-6
    md-2   0/1     ContainerCreating   0          4h    <none>          node-6

    kubectl describe pod md-2

    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    ...
    Events:
      Type     Reason                  Age               From             Message
      ----     ------                  ----              ----             -------
      Warning  FailedCreatePodSandBox  2m (x68 over 4h)  kubelet, node-6  Failed create pod sandbox: rpc error: code = DeadlineExceeded desc = context deadline exceeded
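The same warning can also be watched without rerunning describe, by filtering the event stream for just this pod (a minimal sketch; the exc namespace is taken from the kubelet log below):

    kubectl get events -n exc \
      --field-selector involvedObject.name=md-2 \
      --sort-by=.lastTimestamp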

    kubectl describe statefulset md

    Replicas:      3 desired | 3 total
    Pods Status:   2 Running / 1 Waiting / 0 Succeeded / 0 Failed
    ...
    Events:        <none>
Kubelet log from node-6:

    RunPodSandbox from runtime service failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
    CreatePodSandbox for pod "md-2_exc(a995dd3d-158d-11e9-967b-6cb311235088)" failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
    createPodSandbox for pod "md-2_exc(a995dd3d-158d-11e9-967b-6cb311235088)" failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded
    Error syncing pod a995dd3d-158d-11e9-967b-6cb311235088 ("md-2_exc(a995dd3d-158d-11e9-967b-6cb311235088)"), skipping: failed to "CreatePodSandbox" for "md-2_exc(a995dd3d-158d-11e9-967b-6cb311235088)" with CreatePodSandboxError: "CreatePodSandbox for pod \"md-2_exc(a995dd3d-158d-11e9-967b-6cb311235088)\" failed: rpc error: code = DeadlineExceeded desc = context deadline exceeded"
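For reference, this is roughly how the log above can be collected, plus a check for orphaned containers on the node (a sketch assuming the kubelet runs as a systemd unit and Docker is the container runtime, which the log format suggests):

    # Follow kubelet messages for the stuck pod (systemd-managed kubelet).
    journalctl -u kubelet -f | grep md-2

    # Look for leftover pause/sandbox containers that Kubernetes no longer tracks.
    docker ps -a | grep md-2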
I have two other StatefulSets and they work as expected; for some reason only this one is broken. Also, a direct `kubectl run` and `docker run` both work fine.
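For context, those sanity checks were along these lines (the pod name and busybox image are just examples):

    # A one-off pod goes through the full Kubernetes path and starts fine...
    kubectl run sandbox-test --image=busybox --restart=Never -- sleep 60

    # ...and talking to Docker directly on node-6 works too.
    docker run --rm busybox echo ok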
**Update 2019-01-18**
After reconstructing the timeline of changes, I see that this specific pod's container was deleted with a docker command, bypassing Kubernetes. That probably corrupted the kubelet's state somehow.

After a lot of searching, asking, and troubleshooting I still could not find exactly what was wrong, so I restarted the kubelet (`systemctl restart kubelet`) on the node where the pod was assigned, and the issue is gone.
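Concretely, the workaround was (the second command just verifies the result from any machine with kubectl access):

    # On node-6: restart the kubelet so it rebuilds its view of pod sandboxes.
    systemctl restart kubelet

    # Afterwards md-2 should leave ContainerCreating and become Running.
    kubectl get pod md-2 -o wide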
I had hoped to understand how to check what exactly was wrong with Kubernetes (or the kubelet?), but I could not find any clues. The kubelet's behavior remains a black box in this case.
As alexar mentioned in the update:

> After reconstructing the timeline of changes, I see that this specific pod's container was deleted with a docker command, bypassing Kubernetes. That probably corrupted the kubelet's state somehow.
>
> After a lot of searching, asking, and troubleshooting I still could not find exactly what was wrong, so I restarted the kubelet (`systemctl restart kubelet`) on the node where the pod was assigned, and the issue is gone.
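If restarting the kubelet on the node is not an option, deleting the stuck pod and letting the StatefulSet controller recreate it with the same name is another way to force fresh sandbox creation (a suggestion, not verified for this exact failure):

    # The StatefulSet controller recreates md-2 with the same identity.
    kubectl delete pod md-2 -n exc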