kube-dns. High Availability. Error handling in Kubernetes

2/27/2019

I have a Kubernetes cluster with several nodes, and kube-dns is running on 3 of them.

The issue I'm having is that if one of those 3 nodes goes down, requests between my pods/containers start to fail roughly 1 out of 3 times.

This is because when a container resolves a k8s service hostname, it calls the kube-dns service, and that service has three endpoints, one of which is no longer valid because its node is down. K8s does not update the service until it detects that the node is down (currently I have that detection time set to 60 seconds).

Any idea how to mitigate this? Is there any kind of retry that can be configured outside the application, either in the container or at the k8s level?

Thank you.

-- Jxadro
kube-dns
kubernetes

1 Answer

3/5/2019

The main contributor to communication between the underlying Kubernetes resources on a particular Node and kube-apiserver is kubelet, which acts as the Node agent. kubelet therefore plays a significant role in the cluster life cycle: it manages liveness and readiness probes for the objects on its Node, writes resource metadata back through kube-apiserver into etcd, and periodically reports its own health status to kube-apiserver at the interval specified by the --node-status-update-frequency flag in the kubelet configuration.

--node-status-update-frequency duration Specifies how often kubelet posts node status to master. Note: be cautious when changing the constant, it must work with nodeMonitorGracePeriod in nodecontroller. (default 10s)
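If the kubelet is driven by a configuration file rather than command-line flags, the same setting is exposed as nodeStatusUpdateFrequency. A minimal sketch, assuming the common kubeadm file location (the path and value are illustrative, not recommendations):

    # Sketch: kubelet configuration file, e.g. /var/lib/kubelet/config.yaml (path is an assumption)
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    # How often the kubelet posts its node status to the API server.
    # Keep it consistent with the controller manager's --node-monitor-grace-period.
    nodeStatusUpdateFrequency: "10s"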

However, there is a specific component in Kubernetes called the Node controller. One of its essential roles is to check the status of the worker nodes by monitoring the heartbeats reported by each kubelet. The following flags describe this behavior and are included by default in the kube-controller-manager configuration (a sketch of how they can be tuned follows the list):

  • --node-monitor-period - The interval at which the Node controller checks the kubelet-reported node status (default value 5s);
  • --node-monitor-grace-period - How long the kube-controller-manager continues to consider a kubelet healthy after its last reported status (default value 40s);
  • --pod-eviction-timeout - The grace period before pods on a failed node are deleted (default value 5m).
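In a kubeadm-style cluster these flags are typically set in the kube-controller-manager static pod manifest. A minimal sketch is below; the file path and values are assumptions for illustration, and changing them affects failure detection for the whole cluster, not only kube-dns:

    # Sketch: excerpt of a kube-controller-manager static pod manifest,
    # e.g. /etc/kubernetes/manifests/kube-controller-manager.yaml (path is an assumption).
    spec:
      containers:
      - name: kube-controller-manager
        command:
        - kube-controller-manager
        # Interval at which the Node controller checks node status.
        - --node-monitor-period=5s
        # Lowering this shortens the time before a down node is marked NotReady
        # and its kube-dns endpoint is removed from the Service.
        - --node-monitor-grace-period=40s
        # Grace period before pods on a failed node are evicted.
        - --pod-eviction-timeout=1m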

Whenever you want to mitigate a DNS Pod outage when a Node goes down, you should consider tuning these options. You can also take a look at the DNS horizontal autoscaler in order to maintain a stable replica count for the DNS Pods; however, it adds an extra component to run, which consumes some additional compute resources on the cluster.
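For reference, the DNS horizontal autoscaler (cluster-proportional-autoscaler) is driven by a ConfigMap in kube-system; a sketch with illustrative parameters follows. preventSinglePointFailure keeps at least two DNS replicas whenever the cluster has more than one node, which is what helps in the single-node-failure scenario; the numeric values are assumptions to adapt to your cluster size:

    # Sketch: ConfigMap consumed by the DNS horizontal autoscaler
    # (values are illustrative, not recommendations).
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: kube-dns-autoscaler
      namespace: kube-system
    data:
      linear: |-
        {
          "coresPerReplica": 256,
          "nodesPerReplica": 16,
          "min": 2,
          "preventSinglePointFailure": true
        }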

-- mk_sta
Source: StackOverflow