How to send traffic from Service to Pod while Pod is in termination grace period

5/9/2018

I have a Deployment (A pods) with a Service and a HorizontalPodAutoscaler attached. I want to control the scale-down process and do some cleanup before a pod shuts down. The problem is that the cleanup can take a long time, and for it to complete, another service (B pods) must be able to reach the pod that is shutting down.

To accomplish this, I set a long spec.terminationGracePeriodSeconds value on deployment A. When an A pod gets SIGTERM, it starts finishing up and exits the process when it's done.
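For illustration only, the shutdown behavior described above might look roughly like the following Python sketch. It is not the actual application code: the names `handle_sigterm`, `cleanup`, `install_handler` and `serve_until_terminated` are hypothetical, and the cleanup body is a placeholder.

```python
import signal
import threading

# Set by the SIGTERM handler; the serving loop checks it to know when to stop.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record that the container was asked to stop."""
    shutdown_requested.set()


def cleanup():
    """Hypothetical placeholder for the slow cleanup work."""
    return "cleanup finished"


def install_handler():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def serve_until_terminated(poll_interval=0.1):
    """Keep running until SIGTERM was seen, then clean up and return."""
    while not shutdown_requested.wait(poll_interval):
        pass  # stand-in for normal request handling
    return cleanup()
```

The entrypoint would call `install_handler()` and then `serve_until_terminated()`. The process only has until terminationGracePeriodSeconds runs out to return from `cleanup()`.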

From the moment pod A gets SIGTERM, it no longer receives connections from pod B, because the Service removes its IP from the endpoints. This makes it impossible for pod A to finish its cleanup.

I tried both ClusterIP and headless Services; both behave the same.

How can I make the service continue sending traffic to pod A even after it got the SIGTERM? I don't mind requests from B pods getting errors when trying to get to A pods.

-- Idan
kubernetes

1 Answer

5/9/2018

There is no way to do that because of the termination process design.

Here is an extract from the documentation on the termination process:

  1. User sends command to delete Pod, with default grace period (30s)

  2. The Pod in the API server is updated with the time beyond which the Pod is considered “dead” along with the grace period.

  3. Pod shows up as “Terminating” when listed in client commands

  4. (simultaneous with 3) When the Kubelet sees that a Pod has been marked as terminating because the time in 2 has been set, it begins the pod shutdown process.

    1. If the pod has defined a preStop hook, it is invoked inside of the pod. If the preStop hook is still running after the grace period expires, step 2 is then invoked with a small (2 second) extended grace period.
    2. The processes in the Pod are sent the TERM signal.

  5. (simultaneous with 3) Pod is removed from endpoints list for service, and are no longer considered part of the set of running pods for replication controllers. Pods that shutdown slowly can continue to serve traffic as load balancers (like the service proxy) remove them from their rotations.

  6. When the grace period expires, any processes still running in the Pod are killed with SIGKILL.

  7. The Kubelet will finish deleting the Pod on the API server by setting grace period 0 (immediate deletion). The Pod disappears from the API and is no longer visible from the client.
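To make the ordering concrete, here is a small Python model of the quoted steps. It is a simplified sketch, not Kubernetes code: the function name `termination_timeline` and its parameters are made up for illustration, and it assumes the "simultaneous" steps all happen at second 0, whereas a real cluster propagates them asynchronously.

```python
def termination_timeline(grace_period, prestop_seconds, shutdown_seconds):
    """Return (seconds_since_delete, event) pairs for one pod.

    grace_period:     terminationGracePeriodSeconds
    prestop_seconds:  how long the preStop hook runs (0 if there is none)
    shutdown_seconds: how long the process needs after receiving SIGTERM
    """
    events = [
        (0, "pod marked Terminating"),
        (0, "pod removed from Service endpoints"),
    ]
    if prestop_seconds > 0:
        events.append((0, "preStop hook started"))

    if prestop_seconds > grace_period:
        # Hook still running at the deadline: TERM is sent then,
        # with the small 2 second extension from the quoted step 4.1.
        term_at = grace_period
        kill_at = grace_period + 2
    else:
        term_at = prestop_seconds
        kill_at = grace_period
    events.append((term_at, "SIGTERM sent"))

    exit_at = term_at + shutdown_seconds
    if exit_at <= kill_at:
        events.append((exit_at, "process exited on its own"))
    else:
        events.append((kill_at, "SIGKILL sent"))
    return events
```

Whatever values are passed in, the endpoint removal stays at second 0 in this model. A longer grace period only moves the SIGKILL deadline.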

So the Pod is deregistered from the Service at the same time it is sent SIGTERM, and there is no option to avoid that.

-- Anton Kostenko
Source: StackOverflow