I'm using Kubernetes replicas to run 'run-to-completion' tasks.
I currently increase the number of replicas when a work item arrives on a queue, and a container immediately consumes the item. Because the desired replica count never changes, Kubernetes kicks off a new replica, which finds there is nothing to do and terminates immediately (and so on in a loop). If I scale the replicas back down before a container has finished (i.e. as soon as a queue item is consumed), a replica that is still doing work gets terminated prematurely.
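Roughly, the setup looks like this (a sketch assuming a plain Deployment; the names and image are placeholders), with spec.replicas patched from outside as the queue changes:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: queue-worker
spec:
  replicas: 1   # patched up when an item arrives, down once one is consumed
  selector:
    matchLabels:
      app: queue-worker
  template:
    metadata:
      labels:
        app: queue-worker
    spec:
      containers:
      - name: worker
        image: example/queue-worker:latest   # placeholder; consumes one queue item then exits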
Is there a way to reduce the replica count without forcing a pod that is still working to terminate?
If you find yourself in the same situation as me, have a look at Argo: https://argoproj.github.io/argo/
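With Argo, each piece of work can be submitted as its own Workflow that runs to completion, so there is no replica count to scale down mid-task. A minimal sketch (the image and command are placeholders, not anything Argo provides):

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: process-queue-item-
spec:
  entrypoint: process-item
  templates:
  - name: process-item
    container:
      image: example/queue-worker:latest   # placeholder image
      command: ["./process-one-item"]      # placeholder command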
You can add a termination grace period for all pods, so that when a pod is asked to terminate its container is given time to finish its current work before being killed.
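For example (a sketch; the pod name and image are placeholders, and 600 is only an illustrative value; the default grace period is 30 seconds):

apiVersion: v1
kind: Pod
metadata:
  name: queue-worker
spec:
  terminationGracePeriodSeconds: 600   # time allowed between SIGTERM and SIGKILL
  containers:
  - name: worker
    image: example/queue-worker:latest # placeholder image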
You can also handle SIGTERM (or add a preStop hook to the container lifecycle, as below) so that the pod stops taking new items from the queue while it shuts down.
lifecycle:
  preStop:
    exec:
      # SIGTERM triggers a quick exit; gracefully terminate instead
      command: ["/usr/sbin/nginx", "-s", "quit"]
You can also have a look at this: https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-terminating-with-grace