How to kick out the dead replicas of a Kubernetes Deployment

4/17/2017

We have deployed our services as Kubernetes Deployments with multiple replicas. When a server crashes, Kubernetes migrates its containers to another available server, which takes about 3~5 minutes.

During the migration, clients can still access the Deployment's service because other replicas are running. But some requests fail because the load balancer still routes them to the dead or migrating containers.

It would be great if Kubernetes could kick the dead replicas out automatically and add them back once they are running on other servers. Otherwise, we need to set up a load balancer such as HAProxy in front of multiple Deployment instances to do the same job.
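A minimal sketch of this kind of setup, with an nginx image and port 80 standing in as placeholders for the real service (all names and values here are illustrative):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-service
    spec:
      replicas: 3                 # multiple replicas, as described above
      selector:
        matchLabels:
          app: my-service
      template:
        metadata:
          labels:
            app: my-service
        spec:
          containers:
          - name: my-service
            image: nginx:1.13     # placeholder image
            ports:
            - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: my-service            # clients reach the replicas through this Service
    spec:
      selector:
        app: my-service
      ports:
      - port: 80
        targetPort: 80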

-- tobe
deployment
kubernetes
load-balancing

2 Answers

4/17/2017

You need to configure health checking so that load balancing for a Service works properly. Please have a read of:

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/

The kubelet uses readiness probes to know when a Container is ready to start accepting traffic. A Pod is considered ready when all of its Containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.
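For example, a readiness probe can be added to the container spec in the Deployment, so a replica only receives traffic from the Service while the probe succeeds. A minimal sketch, assuming the container serves HTTP on port 80 and exposes a health endpoint at /healthz (an assumed path, not from the question):

    readinessProbe:
      httpGet:
        path: /healthz          # assumed health-check endpoint
        port: 80
      initialDelaySeconds: 5    # wait before the first probe
      periodSeconds: 10         # probe every 10 seconds
      failureThreshold: 3       # mark not-ready after 3 consecutive failures

When the probe fails, the Pod is marked not ready and its endpoint is removed from the Service until the probe passes again; the Pod does not need to be restarted or rescheduled for that to happen.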

-- Janos Lenart
Source: StackOverflow

4/17/2017

1. kubelet

--node-status-update-frequency duration Specifies how often kubelet posts node status to master. Note: be cautious when changing the constant, it must work with nodeMonitorGracePeriod in nodecontroller. (default 10s)

2. controller-manager

--node-monitor-grace-period duration Amount of time which we allow running Node to be unresponsive before marking it unhealthy. Must be N times more than kubelet's nodeStatusUpdateFrequency, where N means number of retries allowed for kubelet to post node status. (default 40s)

--pod-eviction-timeout duration The grace period for deleting pods on failed nodes. (default 5m0s)
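With the defaults above, an unresponsive node is only marked unhealthy after about 40 seconds, and its pods are evicted and rescheduled after a further 5 minutes, which roughly matches the 3~5 minute delay described in the question. A rough sketch of how these settings might be tightened (the values are illustrative, and how the flags are passed depends on how kubelet and kube-controller-manager are launched in your cluster):

    # kubelet: report node status more frequently (other flags omitted)
    kubelet --node-status-update-frequency=4s

    # controller-manager: detect unresponsive nodes and evict their pods sooner;
    # keep the grace period a multiple of the kubelet update frequency (other flags omitted)
    kube-controller-manager \
        --node-monitor-grace-period=20s \
        --pod-eviction-timeout=30s

Lowering these values speeds up failover, but it also increases control-plane traffic and makes pods more likely to be evicted during short network hiccups.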

-- x1957
Source: StackOverflow