K8s pods affinity & anti-affinity, soft (preferredDuringScheduling) not respected in 1.4?

2/16/2017

I am experimenting with K8s 1.4 pod affinity/anti-affinity. I am trying to get K8s to cluster pods of the same service together on the same node as much as possible (i.e. only spill over to the next node when it is not possible to put more pods on the node where the service is already running). To do so, I set up:

  1. A hard (requiredDuringScheduling) anti-affinity to exclude running where a different service is already running (pod_label_xyz not in [value-a])

  2. A soft (preferredDuringScheduling) affinity to try to run where the same service is already running (pod_label_xyz in [value-a]) - weight 100

  3. A soft (preferredDuringScheduling) anti-affinity to try not to run where the same service is not already running (pod_label_xyz not present) - weight 100

When there are 5 nodes and 3 services (pod_label_xyz with values value-a, value-b, value-c), each started with 1 pod via a replication controller, the first pods get scheduled properly, and when scaling any of them up, the 1st (hard) rule is respected by K8s. However, the 2nd and 3rd rules (the 3rd being effectively redundant with the 2nd) are not respected. When I scale up, I see that K8s tries to push pods to empty nodes (not used by any other service) even though there is capacity to schedule more where the service is already running. In fact, if I scale up even more, new pods get created on the original node as well as on the new (previously unused) nodes.

Please advise if I am missing something.

Thank you

Here is the annotation I used:

scheduler.alpha.kubernetes.io/affinity: >
    {  
       "podAffinity":{  
          "preferredDuringSchedulingIgnoredDuringExecution":[  
             {  
                "weight":100,
                "podAffinityTerm":{  
                   "labelSelector":{  
                      "matchExpressions":[  
                         {  
                            "key":"pod_label_xyz",
                            "operator":"Exists"
                         },
                         {  
                            "key":"pod_label_xyz",
                            "operator":"In",
                            "values":[  
                               "value-a"
                            ]
                         }
                      ]
                   },
                   "namespaces":[  
                      "sspni-882-frj"
                   ],
                   "topologyKey":"kubernetes.io/hostname"
                }
             }
          ]
       },
       "podAntiAffinity":{  
          "requiredDuringSchedulingIgnoredDuringExecution":[  
             {  
                "labelSelector":{  
                   "matchExpressions":[  
                      {  
                         "key":"pod_label_xyz",
                         "operator":"Exists"
                      },
                      {  
                         "key":"pod_label_xyz",
                         "operator":"NotIn",
                         "values":[  
                            "value-a"
                         ]
                      }
                   ]
                },
                "namespaces":[  
                   "sspni-882-frj"
                ],
                "topologyKey":"kubernetes.io/hostname"
             }
          ],
          "preferredDuringSchedulingIgnoredDuringExecution":[  
             {  
                "weight":100,
                "podAffinityTerm":{  
                   "labelSelector":{  
                      "matchExpressions":[  
                         {  
                            "key":"pod_label_xyz",
                            "operator":"DoesNotExist"
                         }
                      ]
                   },
                   "namespaces":[  
                      "sspni-882-frj"
                   ],
                   "topologyKey":"kubernetes.io/hostname"
                }
             }
          ]
       }
    }
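For context, since affinity is still alpha in 1.4, this annotation sits on the pod template metadata of each replication controller (alongside the pod_label_xyz label), roughly like the sketch below. Names such as svc-a and the container image are illustrative, not my actual config:

    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: svc-a                         # illustrative name
    spec:
      replicas: 1
      selector:
        pod_label_xyz: value-a
      template:
        metadata:
          labels:
            pod_label_xyz: value-a        # label the affinity rules match on
          annotations:
            scheduler.alpha.kubernetes.io/affinity: >
              { ... the affinity/anti-affinity JSON shown above ... }
        spec:
          containers:
          - name: svc-a
            image: registry.example.com/svc-a:latest   # illustrative image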
-- fares
kubernetes

0 Answers