Does Kubernetes support connection draining?

11/2/2016

Does Kubernetes support connection draining?

For example, my deployment rolls out a new version of my web app container. In connection draining mode, Kubernetes should spin up a new container from the new image and route all new traffic arriving at my service to this new instance. The old instance should remain alive long enough to finish responding to existing connections.

-- abovesun
kubernetes

2 Answers

6/3/2017

Kubernetes does support connection draining, but how it happens is controlled by the Pods, and is called graceful termination.

Graceful Termination

Let's take an example of a set of Pods serving traffic through a Service. This is a simplified example; the full details can be found in the documentation.

  1. The system (or a user) notifies the API that the Pod needs to stop.
  2. The Pod is set to the Terminating state. This removes it from a Service serving traffic. Existing connections are maintained, but new connections should stop as soon as the load balancers recognize the change.
  3. The system sends SIGTERM to all containers in the Pod.
  4. The system waits terminationGracePeriodSeconds (default 30s), or until the Pod completes on its own.
  5. If containers in the Pod are still running after the grace period, they are sent SIGKILL and the Pod is forcefully terminated.

This covers not only the simple termination case: the exact same process is used in rolling update deployments, where each Pod is terminated in the same way and given the opportunity to clean up.
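
To make the sequence concrete, here is a minimal sketch (my illustration, not from the Kubernetes docs) of a Go HTTP server cooperating with graceful termination: it traps SIGTERM from step 3 and drains in-flight connections before the step 4 deadline, then exits on its own. The port and the 25-second drain budget are arbitrary assumptions:

    package main

    import (
        "context"
        "log"
        "net/http"
        "os"
        "os/signal"
        "syscall"
        "time"
    )

    func main() {
        srv := &http.Server{Addr: ":8080"} // port is an arbitrary example

        // Step 3 above: the kubelet delivers SIGTERM to the container.
        // Without a handler the default action would kill the process
        // immediately, so we trap the signal instead.
        stop := make(chan os.Signal, 1)
        signal.Notify(stop, syscall.SIGTERM)

        go func() {
            if err := srv.ListenAndServe(); err != http.ErrServerClosed {
                log.Fatal(err)
            }
        }()

        <-stop

        // Step 4: drain in-flight connections, then exit on our own. The
        // 25s budget is an assumption; keep it below the Pod's
        // terminationGracePeriodSeconds (30s by default) so draining
        // finishes before SIGKILL arrives.
        ctx, cancel := context.WithTimeout(context.Background(), 25*time.Second)
        defer cancel()
        if err := srv.Shutdown(ctx); err != nil {
            log.Printf("drain incomplete: %v", err)
        }
    }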

Using Graceful Termination For Connection Draining

If you do not handle SIGTERM in your app, your Pods will terminate immediately, because the default action of SIGTERM is to terminate the process; the grace period goes unused since the Pod exits on its own.

If you need "connection draining", this is the basic way you would implement it in Kubernetes:

  1. Handle the SIGTERM signal and clean up your connections in whatever way your application decides. This may simply be "do nothing" and allow in-flight connections to clear out. Long-running connections may be terminated in a way that is (more) friendly to client applications.
  2. Set terminationGracePeriodSeconds long enough for your Pod to clean up after itself (a sketch of both steps follows this list).
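
Step 1 is essentially the SIGTERM handler sketched above; step 2 is a single field on the Pod spec. Here is a minimal sketch of that field using the Go API types from k8s.io/api (the 60-second value and container names are made up for illustration):

    package main

    import (
        "fmt"

        corev1 "k8s.io/api/core/v1"
    )

    func main() {
        // Hypothetical values: allow up to 60s between SIGTERM and SIGKILL
        // (instead of the 30s default) so the app has time to drain.
        grace := int64(60)
        spec := corev1.PodSpec{
            TerminationGracePeriodSeconds: &grace, // *int64 in the API
            Containers: []corev1.Container{
                {Name: "web", Image: "example/web:v2"}, // made-up names
            },
        }
        fmt.Println(*spec.TerminationGracePeriodSeconds)
    }
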
-- Kekoa
Source: StackOverflow

11/3/2016

No, deployments do not support connection draining per se. Draining effectively happens as old pods stop and new pods start: clients connected to old pods have to reconnect to new ones, and because clients connect through the service, this is transparent to them. You do need to ensure that your application can handle different versions running concurrently, but that is a good idea anyway, as it minimises downtime in upgrades and allows you to perform things like A/B testing.

There are a couple of strategies that let you tweak how your upgrades take place: deployments support two update strategies, Recreate and RollingUpdate.

With Recreate, old pods are stopped before new pods are started. This leads to a period of downtime, but ensures that all clients connect to either the old or the new version: there will never be a time when both old and new pods are servicing clients at the same time. If downtime is acceptable, this may be an option for you.

Most of the time, however, downtime is unacceptable for a service, so RollingUpdate is more appropriate. This starts up new pods and stops old pods as it does so. As old pods are stopped, clients connected to them have to reconnect. Eventually there will be no old pods left and all clients will have reconnected to new pods.
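
As a rough illustration, the strategy is selected through a single field on the Deployment; expressed with the Go API types from k8s.io/api/apps/v1 (rather than a YAML manifest), it looks like this:

    package main

    import (
        "fmt"

        appsv1 "k8s.io/api/apps/v1"
    )

    func main() {
        // Recreate: every old pod stops before any new pod starts; a short
        // window of downtime, but the two versions never serve together.
        recreate := appsv1.DeploymentStrategy{
            Type: appsv1.RecreateDeploymentStrategyType,
        }

        // RollingUpdate (the default): new pods come up while old pods go
        // down, so both versions serve briefly and clients reconnect
        // gradually.
        rolling := appsv1.DeploymentStrategy{
            Type: appsv1.RollingUpdateDeploymentStrategyType,
        }

        fmt.Println(recreate.Type, rolling.Type)
    }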

While there is no option to do connection draining as you suggest, you can configure the rolling update via the maxUnavailable and maxSurge options. From http://kubernetes.io/docs/user-guide/deployments/#rolling-update-deployment:

.spec.strategy.rollingUpdate.maxUnavailable is an optional field that specifies the maximum number of Pods that can be unavailable during the update process. The value can be an absolute number (e.g. 5) or a percentage of desired Pods (e.g. 10%). The absolute number is calculated from percentage by rounding up. This can not be 0 if .spec.strategy.rollingUpdate.maxSurge is 0. By default, a fixed value of 1 is used. For example, when this value is set to 30%, the old Replica Set can be scaled down to 70% of desired Pods immediately when the rolling update starts. Once new Pods are ready, old Replica Set can be scaled down further, followed by scaling up the new Replica Set, ensuring that the total number of Pods available at all times during the update is at least 70% of the desired Pods.

.spec.strategy.rollingUpdate.maxSurge is an optional field that specifies the maximum number of Pods that can be created above the desired number of Pods. Value can be an absolute number (e.g. 5) or a percentage of desired Pods (e.g. 10%). This can not be 0 if MaxUnavailable is 0. The absolute number is calculated from percentage by rounding up. By default, a value of 1 is used. For example, when this value is set to 30%, the new Replica Set can be scaled up immediately when the rolling update starts, such that the total number of old and new Pods do not exceed 130% of desired Pods. Once old Pods have been killed, the new Replica Set can be scaled up further, ensuring that the total number of Pods running at any time during the update is at most 130% of desired Pods.
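
These two fields live under .spec.strategy.rollingUpdate in the Deployment spec. Here is a sketch of the same configuration in the Go API types, with the 30% values from the quoted example (given 10 desired pods):

    package main

    import (
        "fmt"

        appsv1 "k8s.io/api/apps/v1"
        "k8s.io/apimachinery/pkg/util/intstr"
    )

    func main() {
        // Both fields take an absolute number or a percentage. With 10
        // desired pods, 30%/30% bounds the update at no fewer than 7
        // available pods and no more than 13 total pods at any moment.
        maxUnavailable := intstr.FromString("30%")
        maxSurge := intstr.FromString("30%")

        strategy := appsv1.DeploymentStrategy{
            Type: appsv1.RollingUpdateDeploymentStrategyType,
            RollingUpdate: &appsv1.RollingUpdateDeployment{
                MaxUnavailable: &maxUnavailable,
                MaxSurge:       &maxSurge,
            },
        }
        fmt.Printf("%+v\n", *strategy.RollingUpdate)
    }

Setting maxUnavailable to 0 (with a non-zero maxSurge) gives a surge-only update in which available capacity never drops below the desired count, which pairs well with the graceful termination described in the other answer.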

Hope that helps.

-- Jimmi Dyson
Source: StackOverflow