Scaling Down Replicas On Completion Without Termination

9/19/2019

I'm using Kubernetes replicas to run 'run-to-completion' tasks.

I currently increase the number of replicas when there is a work item to complete on a queue, and the new container immediately consumes the item. Because the desired replica count does not change when that container finishes, Kubernetes starts a new replica, which finds that there is nothing to do and terminates immediately (and this repeats). If I scale down the replicas before a container has finished (i.e. as soon as a queue item is consumed), one of the replicas that is still doing work gets terminated prematurely.

Is there a way to reduce the replicas and not enforce termination?

-- Matt Brown
amazon-eks
aws-eks
kubernetes
message-queue
queue

2 Answers

9/21/2019

If you find yourself in the same situation as me, have a look at Argo: https://argoproj.github.io/argo/

-- Matt Brown
Source: StackOverflow

9/19/2019

You can set a termination grace period (terminationGracePeriodSeconds) on the pods, so that when a pod is asked to terminate, Kubernetes waits for that period before forcibly killing the container.

You can also handle SIGTERM in the container, or add a preStop hook under lifecycle, so that the terminating pod does not take the next item from the queue. For example, for an nginx container:

lifecycle:
  preStop:
    exec:
      # SIGTERM triggers a quick exit; gracefully terminate instead
      command: ["/usr/sbin/nginx","-s","quit"]
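On the worker side, the "do not take the next item" behaviour could look roughly like the sketch below. It is only an illustration, not the asker's actual worker: `run_worker`, `process_item` and the in-memory `queue.Queue` are hypothetical stand-ins for the real queue consumer. The idea is to record the SIGTERM rather than exit on it, and to check that flag only between items.

```python
import queue
import signal
import threading

# Set once the container receives SIGTERM (e.g. when the pod is scaled down).
stop_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the shutdown request instead of exiting immediately."""
    stop_requested.set()


def run_worker(work_queue, process_item):
    """Process items until the queue is empty or shutdown is requested.

    The stop flag is only checked between items, so the item in progress
    always runs to completion. `work_queue` and `process_item` are
    hypothetical stand-ins for the real message queue and task.
    """
    completed = []
    while not stop_requested.is_set():
        try:
            item = work_queue.get_nowait()
        except queue.Empty:
            break  # nothing left to do, exit normally
        completed.append(process_item(item))
    return completed


if __name__ == "__main__":
    signal.signal(signal.SIGTERM, handle_sigterm)
    q = queue.Queue()
    for n in range(3):
        q.put(n)
    print(run_worker(q, lambda n: n * 2))
```

This only helps if the grace period is longer than the time a single item can take: once terminationGracePeriodSeconds (30 seconds by default) has elapsed, the container is killed regardless.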

You can also have a look at this: https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-terminating-with-grace

-- Harsh Manvar
Source: StackOverflow