Kubernetes doesn't wait for all websocket connections to close before terminating during rolling update

4/12/2019

I am trying to achieve zero downtime during rolling updates on EKS (AWS's managed K8s service).

I have one WebSocket server and I want to ensure during the rolling update of this server, existing connections will be kept until the WebSockets are closed after the work is done.

I thought the K8s rolling-update feature would handle this, but it did not: it simply killed the pod while there were still open connections to the WebSocket server.

If I understand the document correctly, then the pod termination goes like this:

  1. User signals pod deletion to K8s API
  2. K8s stops routing new traffic to this pod and sends the SIGTERM signal
  3. The application MUST handle this signal and gracefully terminate itself within a specified grace period (default: 30s)
  4. After the grace period, K8s sends a SIGKILL signal to force-terminate the pod.
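As a minimal sketch of where the grace period in steps 3-4 is configured (the pod name and image here are illustrative, not from this question):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: websocket-server            # hypothetical name
spec:
  # Time between SIGTERM (step 2) and SIGKILL (step 4).
  # Default is 30 seconds; raise it if connections need longer to drain.
  terminationGracePeriodSeconds: 600
  containers:
  - name: server
    image: example/websocket-server:1.0   # hypothetical image
```

Raising this value only extends the deadline; it does not by itself make K8s wait for connections to close.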

If my above understanding is correct, clearly there is no way to tell K8s to:

  1. Don't interrupt current connections
  2. Let them run for as long as they need (they will eventually close but the period varies greatly)
  3. Once all connections are closed, terminate the pod

Question: Is there any way at all to make sure K8s:

  1. Doesn't interrupt WebSocket connections
  2. Doesn't force the application to kill the connections within a fixed grace period
  3. Detects when all WebSocket connections are closed and only then kills the pod

If anyone can assist me that would be greatly appreciated.

-- Tran Triet
kubernetes
websocket

2 Answers

4/12/2019

For mission-critical applications, use a customized blue-green deployment.

First, deploy the new-version Deployment with a new selector label. Once all pod replicas are up and ready to serve traffic, switch the Service selector to point to the new-version Deployment.

After that, send the shutdown signal to the old version, which handles it gracefully and disconnects all clients. Any reconnections are then routed to the new version, which is already serving traffic.
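A minimal sketch of this setup, assuming illustrative names and labels (`ws-server`, `version: v1`/`v2` are placeholders, not from the original answer):

```yaml
# New-version Deployment, distinguished by a version label in its selector.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ws-server-v2
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ws-server
      version: v2
  template:
    metadata:
      labels:
        app: ws-server
        version: v2
    spec:
      containers:
      - name: server
        image: example/ws-server:2.0   # hypothetical image
---
# The Service initially selects version: v1. Once all v2 pods are Ready,
# switch traffic by patching the selector, e.g.:
#   kubectl patch service ws-server \
#     -p '{"spec":{"selector":{"app":"ws-server","version":"v2"}}}'
apiVersion: v1
kind: Service
metadata:
  name: ws-server
spec:
  selector:
    app: ws-server
    version: v1
  ports:
  - port: 80
    targetPort: 8080
```

Because the Service stops selecting the v1 pods the moment the selector changes, existing connections to v1 stay open while all new connections land on v2.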

-- Akash Sharma
Source: StackOverflow

4/12/2019

You can use the preStop hook from the Kubernetes pod lifecycle. This hook runs just before your pod is terminated, i.e. before SIGTERM is delivered to the container.

lifecycle:
  postStart:
    exec:
      command: ["/bin/sh", "-c", "echo Hello from the postStart handler > /usr/share/message"]
  preStop:
    exec:
      command: ["/bin/sh","-c","nginx -s quit; while killall -0 nginx; do sleep 1; done"]

Your script can drain the connections and wait for them to close; only then will the pod terminate. Note that the preStop hook must complete within the pod's terminationGracePeriodSeconds (default 30s), so raise that value for long-lived connections.
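For a WebSocket server, a hedged variant of the same idea might look like the sketch below. The `/drain` and `/connections` endpoints are hypothetical, assuming the server exposes a way to stop accepting new connections and to report how many remain open:

```yaml
spec:
  # The preStop hook counts against this window; if it is still
  # running when the window expires, the pod is SIGKILLed anyway.
  terminationGracePeriodSeconds: 3600
  containers:
  - name: server
    image: example/ws-server:1.0        # hypothetical image
    lifecycle:
      preStop:
        exec:
          # Hypothetical endpoints: tell the server to stop accepting
          # new connections, then poll until none remain open.
          command:
            - /bin/sh
            - -c
            - |
              curl -s localhost:8080/drain
              while [ "$(curl -s localhost:8080/connections)" != "0" ]; do
                sleep 5
              done
```

This still cannot wait forever: requirement 2 in the question (no fixed deadline at all) is not achievable, since K8s always enforces the grace period as an upper bound.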

-- Prafull Ladha
Source: StackOverflow