Kubernetes Delay pod Termination - FailedPreStopHook

6/12/2020

I have a k8s cluster where I deploy a Spring application using Helm. I would like to set up a "grace period" that lets the old containers finish their jobs before they are terminated and replaced by the new pod.

deployment.yml:

      terminationGracePeriodSeconds: 600    # ~ 10 minutes
      containers:
        - name: receiver
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sleep","600"]

But I see a strange error in the output of kubectl get events:

3m15s       Warning   FailedPreStopHook   pod/robot-7bd4c6956f-ltbpn                Exec lifecycle hook ([/bin/sleep 600]) for Container "receiver" in Pod "robot-7bd4c6956f-ltbpn_rpa-uat(b0d17f4f-4adf-4b8b-a4df-fd84f694b92c)" failed - error: command '/bin/sleep 600' exited with 137: , message: ""

Does anyone know how to make the container / pod wait for those 600 seconds?

-- Filip Niko
kubernetes
kubernetes-helm

1 Answer

6/22/2020


In the k8s docs on container lifecycle hooks you can read:

PreStop - This hook is called immediately before a container is terminated ...

This means that when pod termination starts, the preStop hook is executed before SIGTERM is sent to the container.

At the same moment the preStop hook starts, k8s also starts a countdown timer that waits terminationGracePeriodSeconds seconds before sending SIGKILL to the container.

Notice that in your case the preStop hook sleeps for 600s and terminationGracePeriodSeconds is also set to 600s, which may cause a race condition.

Have a look at this piece of the Kubernetes source code:

select {
case <-time.After(time.Duration(gracePeriod) * time.Second):
	klog.V(2).Infof("preStop hook for container %q did not complete in %d seconds", containerID, gracePeriod)
case <-done:
	klog.V(3).Infof("preStop hook for container %q completed", containerID)
}

As you can see, the kubelet waits for whichever happens first. In your case both take 600s and finish at more or less the same time, which can lead to a race condition. Either preStop finishes successfully first and the countdown then ends and the pod is killed, or the countdown finishes first and SIGKILL is sent to the container, killing everything running inside it. That includes the preStop command, so the hook ends with an error and you get the FailedPreStopHook warning.

Also notice that exited with 137 means the process was killed by k8s with SIGKILL (137 - 128 = 9, where 9 is the signal number of SIGKILL).

And, most importantly, your application never gets a real chance to learn that it is about to be terminated: the sleep uses up the whole grace period before SIGTERM is delivered.


OK, so what can you do? How do you use preStop properly?

preStop should be used either to notify the main process running in the container that it is about to be terminated, so it can start preparing for that, or, for example, to inform other members of an app cluster that this instance is going away. It can also be used as you described (with sleep) to wait for a while so that changes in iptables have time to propagate. You may not have been aware of it, but as soon as the termination process starts, k8s reconfigures the network so that no new connections are routed to the pod. This is why a small delay is sometimes used: it gives k8s time to propagate the changes and lets the app respond to ongoing requests before it is terminated.
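
As a rough sketch of the short-delay variant, the fragment below keeps the 600-second terminationGracePeriodSeconds from the question but shortens the preStop sleep. The 10-second value is only an illustrative assumption, not a figure taken from the docs. The point is that the hook ends well before the grace period does, so the application still receives SIGTERM with most of the time left to finish its work.

```yaml
      terminationGracePeriodSeconds: 600    # total budget: preStop hook + SIGTERM handling
      containers:
        - name: receiver
          lifecycle:
            preStop:
              exec:
                # Assumed value: a short pause so the network changes can propagate.
                command: ["/bin/sleep", "10"]
```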

So the best thing you can do is make your application aware of the termination process and have it react gracefully to the SIGTERM signal. A long sleep won't solve the problem you are facing.
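
On the Spring side, one possible way to get that behaviour is the built-in graceful shutdown, assuming Spring Boot 2.3 or newer with an embedded web server (the question does not say which version is in use). A minimal application.yml sketch follows; the 9m timeout is an assumed value chosen to stay below the 600-second grace period:

```yaml
server:
  shutdown: graceful    # on SIGTERM, stop serving new requests and let in-flight ones finish
spring:
  lifecycle:
    timeout-per-shutdown-phase: 9m    # assumed value; the Spring Boot default is 30s
```

This only covers in-flight web requests. Work that runs outside the web server, such as scheduled or queue-driven jobs, would still need its own shutdown handling in the application.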


Also, here is some information about Spring Boot that you may find useful.

-- acid_fuji
Source: StackOverflow