Disable auto rescheduling for a pod

11/22/2018

On a k8s cluster (GCP), my pods are rescheduled automatically during node auto-scaling. The main problem is that they perform computations and keep the results in memory while auto-scaling happens. Because of the rescheduling, the pods lose all their results and tasks.

I want to disable rescheduling for specified pods. I know a few possible solutions:

  • nodeSelector (not very flexible due to the dynamic nature of a cluster)
  • pod disruption budget PDB

I have tried a PDB with minAvailable = 1, but it didn't work. I found that you can also set maxUnavailable = 0; would that be more effective? I don't understand exactly how maxUnavailable behaves when it's set to 0. Could you explain it in more detail? Thank you!

Link for more details - https://github.com/dask/dask-kubernetes/issues/112

-- Vladyslav Moisieienkov
google-kubernetes-engine
kubernetes

2 Answers

12/17/2018

Are you specifying resource requests and limits?

-- Yuvi
Source: StackOverflow

11/23/2018

Setting maxUnavailable to 0 is the way to go. Using node pools can also be a good workaround.

gcloud container node-pools create <nodepool> --node-labels=app=dask-scheduler --node-taints=app=dask-scheduler:NoSchedule

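This creates the node pool with the label app=dask-scheduler. If the pool also carries the NoSchedule taint, the pod needs a matching toleration to be scheduled there. Then, in the pod spec, you can do this: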

nodeSelector:
  app: dask-scheduler

That way, you can put the dask scheduler on a node pool that doesn't autoscale.
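
To show how the nodeSelector and the taint fit together, here is a minimal sketch of a pod spec. The pod name, container name, and image are placeholders I made up for illustration; only the app=dask-scheduler label and taint come from the commands above.

```yaml
# Hypothetical pod: name, container name, and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: dask-scheduler
spec:
  nodeSelector:
    app: dask-scheduler        # matches the node label app=dask-scheduler
  tolerations:
    - key: app                 # matches the taint app=dask-scheduler:NoSchedule
      operator: Equal
      value: dask-scheduler
      effect: NoSchedule
  containers:
    - name: scheduler
      image: <your-dask-image>
```

The nodeSelector pulls the pod onto the labelled pool, and the toleration lets it past the taint that keeps other pods off.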

There's an object called PodDisruptionBudget (PDB), and in its spec you can set maxUnavailable. With maxUnavailable=1 and 100 matching pods, at most one pod is removed, drained, or rescheduled at a time. With maxUnavailable=0, even if you have only 2 pods, none of them will ever be evicted. "Evicted" here means a voluntary disruption, such as a node drain triggered by the cluster autoscaler or kubectl drain. A PDB does not protect against involuntary disruptions such as a node failure.

apiVersion: policy/v1beta1  # use policy/v1 on Kubernetes 1.21+; v1beta1 was removed in 1.25
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: zookeeper
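
For the case in the question, a sketch of a PDB that blocks every voluntary eviction might look like the following. The name and the app: dask-worker label are assumptions, so replace them with whatever labels your pods really carry. This may also explain why minAvailable = 1 did not help: with several matching pods, it still allows all but one of them to be evicted.

```yaml
apiVersion: policy/v1          # policy/v1beta1 on clusters older than Kubernetes 1.21
kind: PodDisruptionBudget
metadata:
  name: dask-worker-pdb        # hypothetical name
spec:
  maxUnavailable: 0            # no voluntary eviction is ever allowed
  selector:
    matchLabels:
      app: dask-worker         # hypothetical label; must match your pods' labels
```

As far as I know, maxUnavailable only works when the selected pods are managed by a controller (Deployment, StatefulSet, ReplicaSet, and so on). For bare pods, only an integer minAvailable is supported, so you would set it to the number of pods instead. Also keep in mind that a node hosting such a pod cannot be drained or scaled down until the pod finishes or the PDB is removed.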
-- Milad
Source: StackOverflow