Kubernetes pod stays in Terminating state until the worker is up again

5/6/2020

I'm running a Kubernetes cluster with one manager node and 4 worker nodes. When I start a pod, it is correctly assigned to one of the workers and starts running. When I shut down the worker the pod was assigned to, the manager detects the node as NotReady after 40 seconds, and 2 seconds later the pod becomes Terminating. I set these tolerations for my pod:

spec:
  tolerations:
    - key: "node.kubernetes.io/unreachable"
      operator: "Exists"
      effect: "NoExecute"
      tolerationSeconds: 2
    - key: "node.kubernetes.io/not-ready"
      operator: "Exists"
      effect: "NoExecute"
      tolerationSeconds: 2
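
For reference, a minimal self-contained Pod manifest carrying these tolerations looks like the one below; the name and the nginx image are just placeholders for my real workload:

apiVersion: v1
kind: Pod
metadata:
  name: toleration-test            # placeholder name
spec:
  containers:
    - name: app
      image: nginx                 # placeholder image
  tolerations:
    - key: "node.kubernetes.io/unreachable"
      operator: "Exists"
      effect: "NoExecute"
      tolerationSeconds: 2
    - key: "node.kubernetes.io/not-ready"
      operator: "Exists"
      effect: "NoExecute"
      tolerationSeconds: 2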

So the behavior up to that point is what I expected. What I'm not expecting is that the pod remains in Terminating status until the worker comes back to Ready; only when the worker is up again is the pod deleted from my system. My expectation is that once tolerationSeconds expires, the pod gets scheduled on a different worker and runs again. Below is the cluster with the versions:

NAME      STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION               CONTAINER-RUNTIME
docker1   Ready    <none>   21d   v1.17.4   192.168.1.2   <none>        CentOS Linux 7 (Core)   5.5.9-1.el7.elrepo.x86_64    docker://19.3.8
docker2   Ready    <none>   21d   v1.17.4   192.168.1.3   <none>        CentOS Linux 7 (Core)   5.5.11-1.el7.elrepo.x86_64   docker://19.3.8
docker3   Ready    <none>   21d   v1.17.4   192.168.1.4   <none>        CentOS Linux 7 (Core)   5.6.4-1.el7.elrepo.x86_64    docker://19.3.8
docker4   Ready    <none>   19d   v1.17.4   192.168.1.5   <none>        CentOS Linux 7 (Core)   5.6.4-1.el7.elrepo.x86_64    docker://19.3.8
manager   Ready    master   22d   v1.17.4   192.168.1.1   <none>        CentOS Linux 7 (Core)   5.5.9-1.el7.elrepo.x86_64    docker://19.3.8
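
In case it helps to reproduce or diagnose this, while the worker is shut down the eviction taints on the node and the deletion timestamp on the stuck pod can be checked with something like the following (docker1 and mypod are just placeholders for the shut-down worker and the affected pod):

# shows the NoExecute taints the control plane puts on the down node,
# e.g. node.kubernetes.io/unreachable:NoExecute
kubectl describe node docker1 | grep -A3 Taints

# prints a non-empty timestamp while the pod hangs in Terminating
kubectl get pod mypod -o jsonpath='{.metadata.deletionTimestamp}'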

Can anyone suggest what I am missing, or confirm whether this is the correct behavior?

-- Daniele_r81
docker
kubernetes

0 Answers