Do Kubernetes pods still receive requests after receiving SIGTERM?

11/11/2016

I want to implement graceful shutdown in a Kubernetes Pod. I know I need to listen for SIGTERM, which indicates the start of the shutdown procedure. But what exactly do I do when I receive it?

At the very least I must wait for all running requests to finish before exiting. But can the pod still receive new requests after receiving SIGTERM? (It's exposed using a service.) I can't find any explicit documentation on this.

The docs state:

Pod is removed from endpoints list for service, and are no longer considered part of the set of running pods for replication controllers. Pods that shutdown slowly can continue to serve traffic as load balancers (like the service proxy) remove them from their rotations.

So that seems to imply that new requests can still come in. So how long should I continue to expect new requests before graceful termination? Do I simply ignore the SIGTERM, continue to serve requests as usual and wait for the eventual SIGKILL?

I suppose making future readiness checks fail, then waiting longer than their check interval before terminating, might work?

I'm on Kubernetes 1.2.5, if that makes any difference, and am talking about rolling updates in particular, but also scaling replication controllers down generally.

-- Mr. Wonko
kubernetes

3 Answers

11/15/2017

I recently faced a similar problem. I used a simple preStop hook, which introduces a delay (a sleep) between the start of termination and the delivery of SIGTERM to the underlying process:

lifecycle:
  preStop:
    exec:
      command:
        - "sleep"
        - "60"

This delay:

  1. Gives the load balancer time to remove (sync out) the pod being terminated

  2. Gives the terminating pod a chance to complete requests received before termination began

  3. Lets the pod fulfill requests that arrive between the start of termination and the load balancer update (sync)

The preStop hook can be made more intelligent for pods whose request-serving time is unpredictable.
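One caveat worth noting: the preStop sleep counts against the pod's termination grace period (30 seconds by default), so a 60-second sleep needs terminationGracePeriodSeconds raised accordingly. A minimal sketch, with a placeholder container name and image:

```yaml
spec:
  terminationGracePeriodSeconds: 90   # must exceed the preStop sleep; default is 30
  containers:
  - name: app                         # placeholder
    image: my-app:latest              # placeholder
    lifecycle:
      preStop:
        exec:
          command: ["sleep", "60"]
```

SIGTERM is only sent to the container once the preStop hook returns, but the grace-period countdown runs from the start of termination, covering the hook and the shutdown together.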

-- JRomio
Source: StackOverflow

11/28/2016

I ran some experiments to find out exactly what happens.

The pod will briefly (<1s) continue receiving requests after shutdown is initiated, so you need to either catch SIGTERM or install a preStop hook so you can wait for them (and finish serving current requests).

However, once the shutdown has been initiated, the readiness probe no longer matters: you don't need to change its state to stop receiving requests. (Before that point, though, a failing readiness probe will stop your pod from receiving traffic.)

-- Mr. Wonko
Source: StackOverflow

11/16/2016

You should use the preStop hook along with a livenessProbe health check if you want to cleanly drain traffic before shutting down the pod.

Ideally, what you would have is a preStop hook that forces the pod into an unhealthy livenessProbe check so the pod will be removed from the load balancer and then gracefully shuts down.

This isn't pretty but the example worked in my simple tests.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx
spec:
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        livenessProbe:
          exec:
            command:
            - cat
            - /usr/share/nginx/html/50x.html
          initialDelaySeconds: 15
          timeoutSeconds: 1
        ports:
        - containerPort: 80
        lifecycle:
          preStop:
            exec:
              # SIGTERM alone triggers a quick exit; fail the health check and
              # terminate gracefully instead. Run through a shell, because exec
              # hooks don't interpret ";" separators themselves.
              command: ["/bin/sh", "-c", "rm -f /usr/share/nginx/html/50x.html; sleep 2; /usr/sbin/nginx -s quit"]

In this example the livenessProbe looks for the /usr/share/nginx/html/50x.html file; as long as that file exists, the pod is considered healthy. When the pod is about to be shut down, the preStop hook fires and removes the file, which should get the pod removed from an external load balancer on the next health check (1 second later). The preStop command then sleeps 2 seconds (to make sure that next health check has fired) and tells nginx to stop gracefully with -s quit. SIGTERM is only sent once the preStop hook completes, and everything must finish within the termination grace period (30 seconds by default) before the pod is force killed with SIGKILL; that should give nginx plenty of time to drain connections.

-- Justin Garrison
Source: StackOverflow