I have a Kubernetes cluster, version 1.12, deployed to AWS with kops.
The cluster has several nodes marked with the label 'example.com/myLabel', which takes the values a, b, c and d.
For example:
Node name    example.com/myLabel
instance1    a
instance2    b
instance3    c
instance4    d
And there is a test deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-scheduler
spec:
  replicas: 6
  selector:
    matchLabels:
      app: test-scheduler
  template:
    metadata:
      labels:
        app: test-scheduler
    spec:
      tolerations:
      - key: spot
        operator: Exists
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - preference:
              matchExpressions:
              - key: example.com/myLabel
                operator: In
                values:
                - a
            weight: 40
          - preference:
              matchExpressions:
              - key: example.com/myLabel
                operator: In
                values:
                - b
            weight: 35
          - preference:
              matchExpressions:
              - key: example.com/myLabel
                operator: In
                values:
                - c
            weight: 30
          - preference:
              matchExpressions:
              - key: example.com/myLabel
                operator: In
                values:
                - d
            weight: 25
      containers:
      - name: a
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        image: busybox
        command:
        - 'sleep'
        - '99999'
According to the documentation, the scheduler computes a weight sum for every node the pod could be scheduled on, and the node with the biggest sum is preferred.
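If I read that correctly, the affinity term alone should score my nodes roughly like this (leaving all other scoring factors aside):

instance1 (a): 40
instance2 (b): 35
instance3 (c): 30
instance4 (d): 25
nodes without the label: 0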
I expect all pods to be scheduled to node instance1 with label 'a', but in my case the nodes are chosen randomly.
For example, here are the 5 nodes chosen for the 6 pods of the deployment, including the nodes another1 and another2, which do not carry my label at all (there is one more node with this label, with the value 'd'):
NODE        LABEL
another1    NONE
node1       a
node2       b
node3       c
another2    NONE
All nodes have capacity; they are available and can run pods.
I have 2 questions:
Why does this happen?
Where does the k8s scheduler log information about how a node is chosen for a pod? The events do not contain this information, and the scheduler logs on the masters are empty.
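One way to get more detail should be raising the scheduler's log verbosity, since at a high -v level the scheduler logs its per-node scoring. With kops I believe this can be set in the cluster spec roughly as below (the logLevel field name is from memory, so verify it against your kops version; it also needs kops update cluster --yes and a rolling update of the masters):

# kops edit cluster <cluster-name>
spec:
  kubeScheduler:
    logLevel: 10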
UPDATE:
My nodes carry the correct labels:
example.com/myLabel=a
example.com/myLabel=b
example.com/myLabel=c
example.com/myLabel=d
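A quick way to double-check this (the -L flag prints the label value as an extra column):

kubectl get nodes -L example.com/myLabel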
If you put a label on your nodes with only the value, it won't work; you have to put a label on each node in the key=value form. For example, from one of my clusters on GCP I obtain this by executing kubectl describe
on one node:
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/fluentd-ds-ready=true
beta.kubernetes.io/instance-type=n1-standard-2
beta.kubernetes.io/os=linux
You have to set your labels correctly as key=value, for example:
example.com/myLabel=a
With that, your nodes are correctly classified.
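For example, using the node names from your question (just a sketch, adjust the names to your cluster):

kubectl label nodes instance1 example.com/myLabel=a
kubectl label nodes instance2 example.com/myLabel=b
kubectl label nodes instance3 example.com/myLabel=c
kubectl label nodes instance4 example.com/myLabel=d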
preferredDuringSchedulingIgnoredDuringExecution just means that the scheduler adds the weight you set to the score it computes when choosing which node to schedule to. It is not a hard rule but a preference.
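If you really need every replica to land on the 'a' nodes, a hard rule would look roughly like this (a sketch adapted from your manifest, not tested against your cluster):

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: example.com/myLabel
          operator: In
          values:
          - a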
With the weights you set, you will get a somewhat even spread. You would need to have a very large sample size before you would start to see the spread you are aiming for.
Keep in mind that the "weight" is not determined only by the affinity you set; other node scoring factors carry their own weight as well. If you want to see the effect more clearly, use a much greater weight difference between each affinity.
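For example, something along these lines (the numbers are only meant to illustrate a bigger spread between the preferences, not a recommendation):

preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
  preference:
    matchExpressions:
    - key: example.com/myLabel
      operator: In
      values:
      - a
- weight: 10
  preference:
    matchExpressions:
    - key: example.com/myLabel
      operator: In
      values:
      - b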