How to limit the number of pods with attached managed disks per node

5/25/2018

Imagine there is a cluster with lots of different deployments running on it. Some pods use PersistentVolumes (Azure Disks). Azure limits how many disks can be attached to a VM, and this leads to scheduling errors like:

    Status=409 Code="OperationNotAllowed" Message="The maximum number of data disks allowed to be attached to a VM of this size is 8

Pods stay in the Waiting: ContainerCreating state forever, even though some nodes had far fewer pods with attached disks at the moment of scheduling. It would be great to limit the number of pods with attached disks per node so this error never happens. I believe podAntiAffinity is what I need: I know I can prevent pods with the same label from being scheduled on the same node, but I don't know how to allow them to share a node until that node reaches the maximum number of pods with disks.
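For reference, this is the kind of podAntiAffinity rule I mean (a minimal sketch; app=disk-user is a placeholder label for the pods that mount disks):

    # Sketch: forbids two pods labeled app=disk-user from sharing a node.
    # This is all-or-nothing; it cannot express "at most N such pods per node".
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - disk-user
          topologyKey: kubernetes.io/hostname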

My installation is AKS.

    az acs create \
        --orchestrator-type=kubernetes \
        --orchestrator-version 1.7.9 \
        --resource-group <resource_group_here> \
        --name=<name_here> \
        ...

-- Geslot
azure
kubernetes
limit
quotas

1 Answer

7/25/2018

KUBE_MAX_PD_VOLS is what you are looking for. By default its value is 16 for Azure Disks, so you can either use instance sizes that have the same attached-disk limit (16) or set it to a preferable value. You can see where it's declared on GitHub.

You should set this environment variable in your scheduler declaration. I found my scheduler declaration in /etc/kubernetes/manifests/kube-scheduler.yaml. This is what it looks like now:

    apiVersion: "v1"
    kind: "Pod"
    metadata:
      name: "kube-scheduler"
      ...
    spec:
      containers:
      - name: "kube-scheduler"
        ...
        env:
        - name: KUBE_MAX_PD_VOLS
          value: "8"
        ...

Note the KUBE_MAX_PD_VOLS entry under spec.containers.env: it prevents the scheduler from placing more than 8 Azure Disk volumes on any single node.
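If you want to confirm the running scheduler actually picked up the variable, something like this should work (assuming the scheduler's mirror pod carries the usual component=kube-scheduler label; adjust the selector if your manifest labels it differently):

    kubectl -n kube-system get pods -l component=kube-scheduler \
        -o jsonpath='{.items[*].spec.containers[0].env}'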

This way pods spread among the nodes without any issues, and pods that cannot fit stay in the Pending state until a node with enough capacity becomes available.
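You can watch for those leftover pods with a field selector (supported by newer kubectl versions; on older ones, grep the STATUS column of plain kubectl get pods instead):

    kubectl get pods --all-namespaces --field-selector=status.phase=Pending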

-- Geslot
Source: StackOverflow