--default-unreachable-toleration-seconds and --default-not-ready-toleration-seconds are not working

4/6/2020

I updated the kube-apiserver service to set both default toleration limits to 60 seconds:

--default-unreachable-toleration-seconds=60 
--default-not-ready-toleration-seconds=60

I stopped the kubelet on one of my nodes, and the node went to the not-ready state.

The problem is that the pods are evicted only 5 minutes after the node moved to the not-ready state. I expected them to be evicted after 60 seconds.

My Kubernetes version is 1.15.

I restarted the kube-controller-manager and kube-apiserver after I made this change.

What is wrong with the procedure?

-- Karthik
kubernetes
kubernetes-pod

1 Answer

4/6/2020

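With those two default toleration parameters, Kubernetes automatically adds a toleration configuration to every pod that does not already define one. The tolerations are added when a pod is created, so pods that existed before the change keep the previous default of 300 seconds, which matches the 5 minutes you observed. It is therefore important to redeploy all pods so that the new toleration is added to each of them.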
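To confirm whether a given pod has picked up the new values, one option is to inspect the tolerations stored in its spec. The following is only a minimal Python sketch: the helper name `eviction_tolerations` and the `SAMPLE_POD` data are hypothetical, and it assumes the pod has been exported as JSON, for example with `kubectl get pod <pod-name> -o json`.

```python
import json

# Taint keys that the two kube-apiserver flags apply to.
EVICTION_KEYS = (
    "node.kubernetes.io/not-ready",
    "node.kubernetes.io/unreachable",
)


def eviction_tolerations(pod):
    """Return {taint key: tolerationSeconds} for a pod given as a dict."""
    found = {}
    tolerations = pod.get("spec", {}).get("tolerations") or []
    for tol in tolerations:
        if tol.get("key") in EVICTION_KEYS and tol.get("effect") == "NoExecute":
            # tolerationSeconds is absent when the taint is tolerated forever.
            found[tol["key"]] = tol.get("tolerationSeconds")
    return found


# Hypothetical pod that was created before the flags were changed,
# so it still carries the old 300-second tolerations.
SAMPLE_POD = json.loads("""
{
  "spec": {
    "tolerations": [
      {"key": "node.kubernetes.io/not-ready", "operator": "Exists",
       "effect": "NoExecute", "tolerationSeconds": 300},
      {"key": "node.kubernetes.io/unreachable", "operator": "Exists",
       "effect": "NoExecute", "tolerationSeconds": 300}
    ]
  }
}
""")

if __name__ == "__main__":
    for key, seconds in eviction_tolerations(SAMPLE_POD).items():
        print(f"{key}: {seconds}")
```

A pod that still reports 300 for either key was created before the change and needs to be recreated.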

As an alternative to changing the parameters of the Kubernetes control plane components, which is a global change that affects all pods, you can set the toleration time per pod with the following configuration in the pod spec:

tolerations:
  - key: "node.kubernetes.io/unreachable"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 60
  - key: "node.kubernetes.io/not-ready"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 60
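For pods managed by a Deployment, these tolerations belong under the pod template (spec.template.spec). As a rough illustration only, the Python sketch below builds a patch document that places them there. The function name `toleration_patch` is hypothetical, and the sketch assumes the workload is a Deployment.

```python
import json


def toleration_patch(seconds):
    """Build a patch body that sets both eviction tolerations on a pod template."""
    tolerations = [
        {
            "key": key,
            "operator": "Exists",
            "effect": "NoExecute",
            "tolerationSeconds": seconds,
        }
        for key in (
            "node.kubernetes.io/unreachable",
            "node.kubernetes.io/not-ready",
        )
    ]
    # For a Deployment, pod-level fields live under spec.template.spec.
    return {"spec": {"template": {"spec": {"tolerations": tolerations}}}}


if __name__ == "__main__":
    print(json.dumps(toleration_patch(60)))
```

The printed JSON could be passed to `kubectl patch deployment <deployment-name> --type merge -p '<json>'`. Note that a merge patch replaces any tolerations list already present in the template, so existing entries would have to be included as well.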

https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/#taint-based-evictions

-- Arghya Sadhu
Source: StackOverflow