How do I prevent independent jobs (which run to completion) from being evicted by the scheduler/autoscaler?

4/24/2019

I have a K8s cluster which runs independent jobs (each job has one pod) and I expect them to run to completion. The scheduler, however, sometimes reschedules them on a different node. My jobs need to be single-run, and restarting them on a different node is not an acceptable outcome for me.

I looked at Pod Disruption Budgets (PDBs), but from what I understand their selectors match pods by label. Since each of my jobs is different and has its own label, how do I use a PDB to tell K8s that all of my pods have a maxUnavailable of 0?

I have also used this annotation

"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"

but this does not prevent pod evictions under node resource pressure.

Ideally, I should be able to tell K8s that none of my Pods should be evicted unless they are complete.

-- 20kLeagues
google-kubernetes-engine
kubernetes
kubernetes-jobs

1 Answer

4/24/2019

Specify resource requests and limits so that your job pods get the Guaranteed quality of service (QoS) class:

resources:
  limits:
    memory: "200Mi"
    cpu: "700m"
  requests:
    memory: "200Mi"
    cpu: "700m"

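Requests must equal limits for both CPU and memory in every container of the pod. The pod is then assigned the Guaranteed QoS class, which puts it among the last candidates for eviction when a node runs short of resources. This is not an absolute guarantee: the kubelet can still evict Guaranteed pods if system daemons use more resources than were reserved for them.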
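For context, here is a rough sketch of how this could look in a full Job manifest, together with one possible way to cover all jobs with a single PDB. The names example-job, worker and run-to-completion-pdb, the busybox image and command, and the shared label run-to-completion are all made up for illustration. backoffLimit: 0 is an assumption about what "single-run" means for you. Note that a PDB only limits voluntary disruptions such as node drains and autoscaler scale-down; it does not stop kubelet evictions under node pressure, which is what the resources block addresses. On older clusters the PDB apiVersion may need to be policy/v1beta1.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job                  # hypothetical name
spec:
  backoffLimit: 0                    # assumption: never create a replacement pod after a failure
  template:
    metadata:
      labels:
        run-to-completion: "true"    # hypothetical label shared by every job's pod
      annotations:
        # annotation values must be strings, hence the quotes
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    spec:
      restartPolicy: Never
      containers:
      - name: worker                 # hypothetical container
        image: busybox:1.36
        command: ["sh", "-c", "echo working; sleep 30"]
        resources:                   # requests == limits gives the Guaranteed QoS class
          limits:
            memory: "200Mi"
            cpu: "700m"
          requests:
            memory: "200Mi"
            cpu: "700m"
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: run-to-completion-pdb        # hypothetical name
spec:
  maxUnavailable: 0                  # block voluntary evictions of matching pods
  selector:
    matchLabels:
      run-to-completion: "true"      # matches the shared label above
```

After the pod starts, kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}' should print Guaranteed if the resources block was applied to every container.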

Read more: https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod

-- Vasily Angapov
Source: StackOverflow