Kubernetes not scheduling failed pod on other node

3/11/2017

I have a 4-node Kubernetes cluster. My application runs with 2 replicas, and I am using a Deployment resource with a ReplicaSet. As per the documentation, a ReplicaSet always ensures that the specified number of application instances is running. If I delete one pod instance, it is restarted on the same or a different node. But when I simulated the failure of a pod instance by stopping the Docker engine on one node, kubectl showed the pod's status as Error and the pod was not restarted on another node. Is this the expected behaviour, or am I missing something?

-- Anand Shaw
kubernetes

2 Answers

3/11/2017

As far as I can see, Kubernetes changed that behavior with version 1.5. If I interpret the docs correctly, the Pods of the failed node are still registered in the apiserver, since the node died abruptly and wasn't able to unregister them. Since the Pod is still registered, the ReplicaSet doesn't replace it.

The reason for this is that Kubernetes cannot tell whether it is a network error (e.g. split-brain) or a node failure. With StatefulSets being introduced, Kubernetes needs to make sure that no Pod is started more than once.
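To make the "still registered" point concrete, here is a toy model of a replica count check. It is only a sketch of the idea described above, not the real ReplicaSet controller; the `Pod` class, the `registered` flag and `pods_to_create` are made-up names for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pod:
    name: str
    node: str
    # Hypothetical flag: True while the pod object still exists in the apiserver.
    registered: bool = True


def pods_to_create(desired: int, pods: List[Pod]) -> int:
    """Toy reconcile step: every registered pod counts toward the target,
    whether or not its node is actually reachable."""
    registered = sum(1 for p in pods if p.registered)
    return max(0, desired - registered)


pods = [Pod("app-1", "node-a"), Pod("app-2", "node-b")]

# node-b dies abruptly: its pod is never unregistered, so nothing is replaced.
print(pods_to_create(2, pods))

# Once the pod is unregistered (for example after the node is removed),
# the same check asks for one replacement.
pods[1].registered = False
print(pods_to_create(2, pods))
```

In this model the dead node's pod blocks the replacement for as long as it stays registered, which matches the behaviour the question describes.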

This may sound like a bug, but if you have a properly configured cloud provider (e.g. for GCE or AWS), Kubernetes can see whether that Node is still running. When you shut down that node, the controller should unregister the Node and its Pods and then create a new Pod on another Node. Together with a Node health check and Node replacement, the cluster is able to heal itself.

How the cloud provider is configured depends heavily on your Kubernetes setup.

-- svenwltr
Source: StackOverflow

4/3/2017

Just wait about 5 minutes after bringing down the node or Docker on it. Kubernetes marks all the pods that were running on that node as 'Unknown' and brings them up on the remaining active, eligible nodes. Once the failed node comes back up, its pods are deleted if Kubernetes has already replaced them on other node(s).
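The waiting period can be pictured with a small sketch. The 5-minute value is taken from "about 5 mins" above and is an assumption, not something read from a cluster; as far as I know, in clusters of that era the window was governed by the controller manager's `--pod-eviction-timeout` setting, so check your own configuration. `describe_pod` is a hypothetical helper, not a Kubernetes API.

```python
from datetime import timedelta

# Assumed timeout, based on the "about 5 mins" figure in this answer.
EVICTION_TIMEOUT = timedelta(minutes=5)


def describe_pod(elapsed_since_failure: timedelta,
                 timeout: timedelta = EVICTION_TIMEOUT) -> str:
    """Toy timeline for a pod on a node that stopped responding."""
    if elapsed_since_failure < timeout:
        return "waiting: pod still counted, no replacement yet"
    return "Unknown: replacement scheduled on another eligible node"


for minutes in (1, 4, 6):
    print(minutes, "min:", describe_pod(timedelta(minutes=minutes)))
```

So if you check kubectl a minute or two after stopping Docker, you would still be in the "waiting" branch, which may be why it looked like nothing was happening.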

-- msbl3004
Source: StackOverflow