Kubernetes pods graceful shutdown with TCP connections (Spring Boot)

2/18/2020

I am hosting my services on Azure. Sometimes I get a "BackendConnectionFailure" for no apparent reason. After investigating, I found a correlation between this exception and autoscaling (scaling down): in most cases the two occur within the same second.

According to the documentation, the termination grace period is 30 seconds by default, and that is what I have. The pod is marked Terminating and the load balancer stops considering it, so it receives no more requests. Given that, if my service takes far less than 30 seconds to handle a request, I should not need a preStop hook or any special handling in my application (please correct me if I am wrong).

If the previous paragraph is correct, why does this exception occur relatively frequently? My guess is that it happens when the pod is marked Terminating and the load balancer no longer forwards requests to it, although it should.

Edit 1:

The architecture is simply:

Client -> Firewall (Azure) -> API (Azure APIM) -> Microservices (Spring Boot) -> backend (third party) or Azure RDB, depending on the service

I think the exception comes from APIM. I found two patterns for it:

  1. Message: The underlying connection was closed: The connection was closed unexpectedly.
     Exception type: BackendConnectionFailure
     Failed method: forward-request
     Response time: 10.0 s

  2. Message: The underlying connection was closed: A connection that was expected to be kept alive was closed by the server.
     Exception type: BackendConnectionFailure
     Failed method: forward-request
     Response time: 3.6 ms

-- omar
azure-kubernetes
horizontal-pod-autoscaling
kubernetes
spring-boot

3 Answers

2/19/2020

Spring Boot doesn't do graceful termination by default.

The Spring Boot app and its application container (the servlet container, not the Linux container) control what happens to existing connections during the termination grace period. The protocols in use and how a client reacts to a "close" also play a part.

If you get to the end of the grace period, then everything gets a hard reset.

Kubernetes

When a pod is deleted in k8s, the Pod Endpoint removal from Services is triggered at the same time as the SIGTERM signal to the container(s).

At this point the cluster nodes will be reconfigured to remove any rules directing new traffic to the Pod. Any existing TCP connections to the Pod/containers will remain in connection tracking until they are closed (by the client, server or network stack).

For HTTP keep-alive or HTTP/2 services, the client will keep hitting the same Pod Endpoint until it is told to close the connection (or the connection is forcibly reset).

App

The basic rules are, on SIGTERM the application should:

  • Allow running transactions to complete
  • Do any application cleanup required
  • Stop accepting new connections, just in case
  • Close any inactive connections it can (keep alive requests, websockets)
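
As a rough illustration of the first and third rules, here is a minimal plain-Java sketch. It assumes you count requests yourself; InFlightTracker and its method names are made up for this example and are not part of Spring Boot or any library. New work is refused once shutdown starts, and shutdown waits (with a deadline) for running requests to finish.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper: tracks in-flight requests so shutdown can wait for them. */
final class InFlightTracker {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition drained = lock.newCondition();
    private int inFlight = 0;
    private boolean shuttingDown = false;

    /** Call when a request arrives; returns false once shutdown has begun. */
    boolean tryBegin() {
        lock.lock();
        try {
            if (shuttingDown) {
                return false;
            }
            inFlight++;
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Call when a request that passed tryBegin() has finished. */
    void end() {
        lock.lock();
        try {
            inFlight--;
            if (inFlight == 0) {
                drained.signalAll();
            }
        } finally {
            lock.unlock();
        }
    }

    /**
     * Refuses new requests, then waits up to the timeout for running ones.
     * Returns true if everything finished, false on timeout or interruption.
     */
    boolean shutdownAndAwait(long timeout, TimeUnit unit) {
        long nanos = unit.toNanos(timeout);
        lock.lock();
        try {
            shuttingDown = true;
            while (inFlight > 0) {
                if (nanos <= 0L) {
                    return false; // deadline passed, caller has to force-close
                }
                nanos = drained.awaitNanos(nanos);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }
}
```

In a real service you would call tryBegin()/end() around each request (for example from a filter) and shutdownAndAwait() from the shutdown path, with a timeout shorter than the pod's grace period. Newer Spring Boot versions (2.3 and later) can do the equivalent for the embedded web server with server.shutdown=graceful, so a hand-rolled tracker like this is mostly useful on older versions.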

Some circumstances you might not be able to handle (it depends on the client):

  • A keep-alive connection that doesn't complete a request within the grace period can't be sent a Connection: close header. It will need a TCP-level FIN close.
  • A slow client with a long, one-way HTTP transfer. These either have to be waited for or forcibly closed.

Although keepalive clients should respect a TCP FIN close, every client reacts differently. Microsoft APIM might be sensitive and produce the error even though there was no real world impact. It's best to load test your setup while scaling to see if there is a real world impact.

For more Spring Boot info see:

https://github.com/spring-projects/spring-boot/issues/4657
https://github.com/corentin59/spring-boot-graceful-shutdown
https://github.com/SchweizerischeBundesbahnen/springboot-graceful-shutdown

-- Matt
Source: StackOverflow

2/18/2020

When your application receives a SIGTERM (from the Pod termination), it first needs to stop reporting that it is ready (fail the readinessProbe) while still serving requests as they come in from clients. After a certain time (depending on your readinessProbe settings) you can shut down the application.

For Spring Boot there is a small library doing exactly that: springboot-graceful-shutdown
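
Independent of that library, the sequence itself can be sketched in a few lines of plain Java. This is only an illustration under assumptions: ReadinessGate, probeStatus and onSigterm are hypothetical names, and the drain time is something you would have to derive from your own probe settings.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical readiness switch backing the endpoint the readinessProbe polls. */
final class ReadinessGate {
    private final AtomicBoolean ready = new AtomicBoolean(true);

    /** Status the readiness endpoint would return: 200 while ready, 503 afterwards. */
    int probeStatus() {
        return ready.get() ? 200 : 503;
    }

    /**
     * SIGTERM sequence: fail readiness first, keep serving for drainMillis so
     * the probe has time to notice, and only then run the real shutdown.
     */
    void onSigterm(long drainMillis, Runnable stopApplication) {
        ready.set(false);
        try {
            Thread.sleep(drainMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        stopApplication.run();
    }
}
```

The drain time should be longer than the probe needs to mark the pod unready (roughly periodSeconds times failureThreshold) and shorter than the termination grace period. You could trigger onSigterm from a JVM shutdown hook, but in a Spring Boot app the ordering relative to Spring's own shutdown hook needs care.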

-- koe
Source: StackOverflow

2/18/2020

You can use a preStop sleep if needed. While the pod is removed from the service endpoints immediately, it still takes time (10-100ms) for the endpoint update to be sent to every node and for them to update iptables.
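
One thing to keep in mind when picking the sleep length: the preStop hook runs before SIGTERM is sent, and its duration counts against the termination grace period. A tiny sketch of that arithmetic follows (GraceBudget is a made-up helper, and the 5 second sleep in the note below is just an assumed value).

```java
import java.time.Duration;

/** Hypothetical helper for the grace-period arithmetic. */
final class GraceBudget {
    /** Time left for the application after the preStop hook has run; never negative. */
    static Duration remainingForApp(Duration gracePeriod, Duration preStopSleep) {
        Duration left = gracePeriod.minus(preStopSleep);
        return left.isNegative() ? Duration.ZERO : left;
    }
}
```

With the default 30 second grace period and a 5 second preStop sleep, remainingForApp(Duration.ofSeconds(30), Duration.ofSeconds(5)) gives 25 seconds for the application to finish its in-flight requests after SIGTERM.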

-- coderanger
Source: StackOverflow