The pod PreStop hook is hanging in Terminating status and is killed only after terminationGracePeriodSeconds

9/20/2020

I have a preStop hook defined in a StatefulSet pod spec. It runs a bash script that keeps the pod from being killed until a few processes inside the application finish, are cancelled, or error out. Initially I did not set terminationGracePeriodSeconds, and when I deleted the pod I confirmed that the preStop script ran as expected. After I set terminationGracePeriodSeconds to 10 minutes, the script still runs successfully as part of the preStop hook for a couple of minutes, after which the pod should be killed. Instead, the pod hangs in Terminating status and is killed only after the full 10 minutes.

1. Why is the pod hanging? I am unable to find an answer for this.
2. When terminationGracePeriodSeconds was not set, the flow worked as expected: the pod was killed as soon as the script finished, or within 30 seconds, which is the default terminationGracePeriodSeconds. But when I set a grace period of 10 minutes or more, it waits until that time has elapsed and then kills the pod.

How can I solve this issue? Is there a way to send SIGTERM or SIGKILL to the pod? Any ideas? Thank you in advance!

STATEFULSET.YAML

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: appx
  name: appx
spec:
  serviceName: appx
  replicas: 1
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: appx
  template:
    metadata:
      labels:
        app: appx
    spec:
      #removed some of the sensitive info
      terminationGracePeriodSeconds: 600
      containers:
        - image: appx
          imagePullPolicy: IfNotPresent
          name: appx
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 30 && bash /var/tmp/runscript.sh; sleep 10"]

KUBECTL DESCRIBE POD

kubectl describe pod appx
Name:           appx
Namespace:      default
Priority:       0
Node:           docker-desktop/192.168.65.3
Start Time:     Mon, 21 Sep 2020 07:30:55 -0500
Labels:         app=appx
Annotations:    <none>
Status:         Running
IP:             x.x.x.x
Controlled By:  StatefulSet/appx
Containers:
  appx:
    Container ID:   docker://dfdgfgfgfgfgfgfg
    Image:          appx
    Image ID:       docker://sha256:49dfgfgfgfgfgfgfgfgfg96a6fc
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 21 Sep 2020 07:30:56 -0500
    Ready:          True
    Restart Count:  0
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  appx-token-xj6q9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  appx-token-fhfdlf
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age    From                     Message
  ----    ------     ----   ----                     -------
  Normal  Scheduled  2m43s  default-scheduler        Successfully assigned default/appx to docker-desktop
  Normal  Pulled     2m42s  kubelet, docker-desktop  Container image "appx" already present on machine
  Normal  Created    2m42s  kubelet, docker-desktop  Created container appx
  Normal  Started    2m42s  kubelet, docker-desktop  Started container appx
-- acm
kubernetes
sigterm

1 Answer

9/22/2020

The preStop hook and terminationGracePeriodSeconds are asynchronous: the grace period countdown does not wait for the hook. As soon as the kubelet sees that a Pod has been marked as terminating, it begins the local Pod shutdown process. This means that if the container doesn't terminate within the grace period, a SIGKILL signal is sent and the container is killed regardless of whether the commands in the preStop hook have completed.

  1. When the terminationGracePeriodSeconds was not added, the flow was working as expected by killing the pod as soon as finishing the script or within 30 sec which is the terminationGracePeriodSeconds. But when I added the grace period of 10 min or more, it is waiting until that time and then killing the pod.

A grace period always applies: terminationGracePeriodSeconds simply defaults to 30 seconds, as I already mentioned in the comments. So what happens if terminationGracePeriodSeconds is shorter than the time the preStop hook needs to complete?

Then the container is terminated at the end of terminationGracePeriodSeconds, and the preStop hook will not finish running.
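
One way to check this locally, before relying on the cluster, is to run the hook command under a time budget equal to the grace period. The sketch below only approximates what the kubelet does: the function name fits_grace_period is made up for illustration, and it assumes the coreutils (or BusyBox) timeout utility is available.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kubernetes): run a command under a
# time budget to see whether it would fit inside the grace period.
# Usage: fits_grace_period <seconds> <command> [args...]
fits_grace_period() {
  budget=$1
  shift
  rc=0
  # timeout exits with status 124 when the time limit is exceeded.
  timeout "$budget" "$@" || rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "command exceeded ${budget}s and would be cut off" >&2
    return 1
  fi
  # Otherwise pass the command's own exit status through.
  return "$rc"
}
```

For the hook in the question, the leading sleep 30 alone already uses up the whole default 30-second budget, so a call such as fits_grace_period 30 sh -c 'sleep 30 && bash /var/tmp/runscript.sh' would be expected to report a timeout.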

When terminationGracePeriodSeconds is set to 600s, the preStop hook script is hanging. (It is currently unclear whether it ever worked, since it was not properly tested with the default 30s terminationGracePeriodSeconds, where the container was killed before the hook could finish.) This means that some processes are not handling SIGTERM correctly, and the preStop hook does not currently compensate for that, so the container ends up waiting for the SIGKILL that is sent when the 10-minute grace period ends.

If you take a look here, you will see that even though the user specified a preStop hook, they still needed to send SIGTERM to nginx for a graceful shutdown.

In your case, with terminationGracePeriodSeconds set to 10 minutes, Kubernetes waited 10 minutes before terminating your container even though your preStop hook executed successfully, because that is exactly what you told it to do. The termination signal is sent by the kubelet, but it is not passed to the application inside the container. The most common reason is that the container runs a shell which in turn runs the application process, so the signal may be consumed or interrupted by the shell itself instead of being passed on to the child process. Also, since it is unclear what your runscript.sh does, it is difficult to say which processes are failing to handle SIGTERM.
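
The shell-in-front-of-the-application problem can be sketched as follows. This is a minimal illustration, not the asker's actual entrypoint: the function name run_and_forward is hypothetical, and it assumes the container's entrypoint is a POSIX shell script that starts the application as a child process.

```shell
#!/bin/sh
# Hypothetical entrypoint helper: start the application as a child and
# forward SIGTERM to it, instead of letting the shell swallow the signal.
# Usage: run_and_forward <command> [args...]
run_and_forward() {
  "$@" &
  child=$!
  # When the shell receives SIGTERM, pass it on to the application.
  trap 'kill -TERM "$child" 2>/dev/null' TERM
  status=0
  wait "$child" || status=$?
  # A trapped signal interrupts wait with a status above 128 while the
  # child may still be shutting down; wait again for its real status.
  if [ "$status" -gt 128 ] && kill -0 "$child" 2>/dev/null; then
    status=0
    wait "$child" || status=$?
  fi
  trap - TERM
  return "$status"
}
```

If the script has nothing left to do after starting the application, ending it with exec "$@" is simpler: the shell is replaced by the application, which then receives the signal directly.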

What can you do in this case? The options for ending sooner are:

  • Decrease terminationGracePeriodSeconds
  • Trigger a graceful shutdown by ensuring that SIGTERM is handled correctly and that all running processes respond to termination signals. Examples of how to do this are here. You can see that they use the “quit” command for NGINX.
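
As a sketch of the second option, a preStop script can ask the application to shut down and then wait until it has actually exited, so the hook returns as soon as the work is done rather than sleeping for a fixed time. The helper name stop_and_wait, the PID argument, and the 1-second polling interval are assumptions for illustration; how the PID is found depends on the image.

```shell
#!/bin/sh
# Hypothetical preStop helper: send SIGTERM to a process, then poll
# until it has exited or a timeout (in seconds) is reached.
# Usage: stop_and_wait <pid> [timeout_seconds]
stop_and_wait() {
  pid=$1
  limit=${2:-30}
  # If the signal cannot be delivered, the process is already gone.
  kill -TERM "$pid" 2>/dev/null || return 0
  elapsed=0
  # kill -0 only checks whether the process still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$elapsed" -ge "$limit" ]; then
      return 1   # still running: SIGTERM is probably not being handled
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}
```

In a preStop hook this might be called as stop_and_wait 1 60, assuming the application runs as PID 1 and installs a SIGTERM handler. A process running as PID 1 with no handler for the signal will not be terminated by it, which matches the hang described in the question.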

For more information, you can find great articles here and here.

-- acid_fuji
Source: StackOverflow