Automatic Pod Deletion Delay in Kubernetes

2/22/2019

Is there a way to automatically delay all Kubernetes pod deletion requests so that endpoint deregistration is signaled immediately, but the pod's SIGTERM is delayed by several seconds?

It would be preferable, but not required, if the delay only affected pods with an Endpoint/Service.

Background:

It is well established that some traffic can continue to reach a pod after it has been sent the SIGTERM termination signal, because endpoint deregistration and the deletion signal are asynchronous. The recommended mitigation is to introduce a delay of a few seconds in the pod's preStop lifecycle hook by invoking sleep.
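For reference, the preStop mitigation described above might look like the following fragment. This is only a sketch: the container name, the image, and the 5-second delay are placeholders, and it assumes the image actually ships a sleep binary on its PATH.

```yaml
# Hypothetical pod spec fragment; name, image, and delay are illustrative only.
spec:
  containers:
    - name: my-application
      image: registry.example.com/my-application:1.0
      lifecycle:
        preStop:
          exec:
            # Runs inside the container before SIGTERM is sent,
            # so it requires a `sleep` executable in the image.
            command: ["sleep", "5"]
```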

The difficulty arises quickly when the pod's deployment is done via helm or another upstream source, or when there are large numbers of deployments and containers to manage. Modifying many deployments in this way may be difficult, or even impossible (e.g. the container may not have a sleep binary, a shell, or anything but the application executable).

I briefly explored a mutating admission controller, but dynamically adding a preStop hook that way seems unworkable: not all images have a /bin/sleep, and some already have a preStop hook that could need image-specific knowledge to merge.

(Of course, all of this could be avoided if the K8S API made the endpoint deregistration synchronous with a timeout to avoid deadlock (hint, hint), but I haven't seen any discussions of such a change. Yes, there are tons of reasons why this isn't synchronous, but that doesn't mean something can't be done.)

-- Eldstone
kubernetes

1 Answer

4/3/2019

The Kubernetes pod termination lifecycle has the following steps:

  • Pod is set to the “Terminating” State and removed from the endpoints list of all Services
  • preStop hook is executed
  • SIGTERM signal is sent to the pod
  • Kubernetes waits for the grace period (30 seconds by default)
  • SIGKILL signal is sent to the pod, and the pod is removed

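The grace period is what you need. It's important to note that the grace period runs in parallel with the preStop hook and the SIGTERM signal: the countdown starts as soon as the pod is marked for deletion. Also, Kubernetes does not wait indefinitely for the preStop hook to finish. SIGTERM is sent once the hook completes, but if the hook is still running when the grace period expires, the container is killed anyway.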

For example, you could set terminationGracePeriodSeconds: 90, which might look like the following:

spec:
  terminationGracePeriodSeconds: 90
  containers:
    # Container names must be lowercase DNS labels, so "myApplication" would be rejected.
    - name: my-application
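Because the grace period also covers the preStop hook, the two settings are usually budgeted together. The fragment below is a rough sketch of that combination, not a recommendation: the image and the 10-second delay are placeholders, and it assumes the image contains a sleep binary.

```yaml
# Hypothetical fragment: a 90-second total budget shared by the hook and the app.
spec:
  terminationGracePeriodSeconds: 90
  containers:
    - name: my-application
      image: registry.example.com/my-application:1.0
      lifecycle:
        preStop:
          exec:
            # Uses about 10 of the 90 seconds before SIGTERM is sent.
            command: ["sleep", "10"]
```

With these values the application has roughly 80 seconds left to shut down after SIGTERM before SIGKILL is sent.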

You can read the Kubernetes docs regarding Termination of Pods. I also recommend the great blog post "Kubernetes best practices: terminating with grace".

-- Crou
Source: StackOverflow