Preferred inter-pod affinity never respected in Kubernetes

4/28/2020

I have a Jenkins pod with the label app: jenkins-master

It resides in the jenkins namespace.
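
For reference, its metadata looks roughly like this (the pod name is invented for illustration; only the namespace and the app label matter here):

apiVersion: v1
kind: Pod
metadata:
  name: jenkins-master-0     # illustrative name
  namespace: jenkins
  labels:
    app: jenkins-master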

I want the nginx pod of a deployment (in another namespace, default) to be co-located with the above pod.

So I add the following to its spec:

spec:
  affinity:
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          namespaces:
            - all
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - jenkins-master
          topologyKey: "kubernetes.io/os"

I have a GKE cluster of 8 nodes.

Across the 5-6 times I have created and deleted the deployment, the nginx pod has never landed on the same node as jenkins-master.

I know it is preferred scheduling, but is this behavior normal?

Working on GKE with "v1.15.9-gke.24"

edit 1: I have changed the topologyKey to "kubernetes.io/hostname" as suggested in a couple of answers below, but that didn't help much either.

edit 2: These are the allocated resources of the node that the jenkins-master pod is scheduled on:

  Resource                   Requests      Limits
  cpu                        1691m (43%)   5013m (127%)
  memory                     4456Mi (33%)  8902Mi (66%)

Since scheduling is based on requests, I don't understand how the following deployment fails to be co-located; the requests I am making are minimal:

resources:
  limits:
    memory: "1Gi"
    cpu: "100m"
  requests:
    memory: "100Mi"
    cpu: "50m"
-- pkaramol
kubernetes

2 Answers

4/28/2020

I assume you know that the topology key refers to labels that are given to the nodes automatically when the cluster is initialized.

A topology key groups nodes together through these labels, so when you say topologyKey: "kubernetes.io/os", you are telling the scheduler to choose any node that is part of this group and schedule the pod on it. Since all your nodes probably run the same OS, every node in the cluster is a valid choice for the scheduler. So yes, it is intended behavior.

Note that this is only a preference, but the scheduler will still try to place the pod on a matching node if there are enough resources.

What you have to do is what omricoco is suggesting: topologyKey: "kubernetes.io/hostname". You need to let the scheduler group nodes by hostname, so that there is only 1 node per group, and the pod will be scheduled onto the same node as jenkins-master.
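
To illustrate, here are two nodes from the same Linux node pool as they might look (hostnames invented; the label values are what the kubelet typically sets on GKE):

apiVersion: v1
kind: Node
metadata:
  name: gke-pool-node-1
  labels:
    kubernetes.io/os: linux                    # identical on every node
    kubernetes.io/hostname: gke-pool-node-1    # unique per node
---
apiVersion: v1
kind: Node
metadata:
  name: gke-pool-node-2
  labels:
    kubernetes.io/os: linux
    kubernetes.io/hostname: gke-pool-node-2

With topologyKey: "kubernetes.io/os" both nodes share the value linux, so they form a single group and placing nginx on either of them satisfies the affinity term. With topologyKey: "kubernetes.io/hostname" each node is its own group, so only the node actually running jenkins-master satisfies it.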

-- suren
Source: StackOverflow

4/28/2020

I think you made a mistake in using topologyKey: "kubernetes.io/os", which is meant for clusters that mix operating systems (for example, Linux and Windows nodes).

You should be using topologyKey: "kubernetes.io/hostname" instead; the kubelet populates this label with the node's hostname.
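
For example (same selector as in your question; only the topologyKey changes):

spec:
  affinity:
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app
              operator: In
              values:
              - jenkins-master
          topologyKey: "kubernetes.io/hostname"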

-- omricoco
Source: StackOverflow