[k8s] I try to assign one pod to a normal node and the others to spot nodes by using podAntiAffinity

4/28/2020

I have 6 nodes. All of them have the label "group:emp", 4 of them have the label "ikind:spot", and 2 of them have the label "ikind:normal".

I use the following Deployment YAML to assign one pod to a normal node and the others to spot nodes, but it doesn't work.

I started increasing the number of replicas from 1 to 6, but as soon as it reached 2, all the pods were assigned to spot nodes:

kind: Deployment
apiVersion: apps/v1
metadata:
  name: pod-test
  namespace: emp
  labels:
    app: pod-test
spec:
  replicas: 2 
  selector:
    matchLabels:
      app: pod-test
  strategy:
    type: RollingUpdate 
    rollingUpdate:
      maxSurge: 1 
      maxUnavailable: 0 
  template:
    metadata:
      labels:
        app: pod-test
    spec:
      containers:
        - name: pod-test
          image: k8s.gcr.io/busybox
          args: ["sh","-c","sleep 60000"]
          imagePullPolicy: Always
          resources:
            requests:
              cpu: 10m
              memory: 100Mi
            limits:
              cpu: 100m
              memory: 200Mi
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: group
                    operator: In
                    values:
                      - emp
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 70
            preference:
              matchExpressions:
              - key: ikind
                operator: In
                values:
                - spot
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - pod-test
              topologyKey: ikind
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - pod-test
            topologyKey: "kubernetes.io/hostname"
      restartPolicy: Always
      terminationGracePeriodSeconds: 10
      dnsPolicy: ClusterFirst
      schedulerName: default-scheduler
-- scut_yk
affinity
kubernetes
kubernetes-pod

3 Answers

4/28/2020

I added a preferred matchExpressions entry for normal nodes with weight 30, and it works. To avoid being influenced by the number of nodes of each kind, I then swapped the weights of normal and spot.

When replicas is 1, there is 1 pod on a normal node.

When replicas is 2, there is 1 pod on a normal node and 1 pod on a spot node.

When replicas is 3, there are 2 pods on normal nodes and 1 pod on a spot node.

preferredDuringSchedulingIgnoredDuringExecution:
- weight: 70
  preference:
    matchExpressions:
    - key: ikind
      operator: In
      values:
      - normal
- weight: 30
  preference:
    matchExpressions:
    - key: ikind
      operator: In
      values:
      - spot
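For context, this snippet sits under `spec.template.spec.affinity.nodeAffinity` in the Deployment. A sketch of the full section, keeping the hard `group=emp` requirement from the question (weights are the ones chosen above; tune them to taste):

```yaml
affinity:
  nodeAffinity:
    # Hard constraint: only nodes labeled group=emp are candidates.
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: group
          operator: In
          values:
          - emp
    # Soft preferences: normal nodes score higher than spot nodes,
    # so the first pod lands on a normal node when capacity allows.
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 70
      preference:
        matchExpressions:
        - key: ikind
          operator: In
          values:
          - normal
    - weight: 30
      preference:
        matchExpressions:
        - key: ikind
          operator: In
          values:
          - spot
```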

-- scut_yk
Source: StackOverflow

4/28/2020

If you want pods to be deployed on both kinds of nodes, you have to change your preferredDuringSchedulingIgnoredDuringExecution.

Change

preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 70
            preference:
              matchExpressions:
              - key: ikind
                operator: In
                values:
                - spot

to

preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 70
                preference:
                  matchExpressions:
                  - key: ikind
                    operator: In
                    values:
                    - spot
                    - normal

Now pods will be deployed on both kinds of nodes, ikind:spot and ikind:normal; before, they went only to spot nodes.

I have tested it on 3 GKE nodes and everything seems to work just fine.

pod-test-54dc97fbcb-9hvvm   1/1     Running       gke-cluster-1-default-pool-1ffaf1b8-gmhb   <none>           <none>
pod-test-54dc97fbcb-k2hv2   1/1     Running       gke-cluster-1-default-pool-1ffaf1b8-gmhb   <none>           <none>
pod-test-54dc97fbcb-nqd97   1/1     Running       gke-cluster-1-default-pool-1ffaf1b8-7c25   <none>           <none>
pod-test-54dc97fbcb-zq9df   1/1     Running       gke-cluster-1-default-pool-1ffaf1b8-jk6t   <none>           <none>
pod-test-54dc97fbcb-zvwhk   1/1     Running        gke-cluster-1-default-pool-1ffaf1b8-7c25   <none>           <none>

It's well described in the Kubernetes documentation on node affinity:

apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/e2e-az-name
            operator: In
            values:
            - e2e-az1
            - e2e-az2
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: another-node-label-key
            operator: In
            values:
            - another-node-label-value
  containers:
  - name: with-node-affinity
    image: k8s.gcr.io/pause:2.0

This node affinity rule says the pod can only be placed on a node with a label whose key is kubernetes.io/e2e-az-name and whose value is either e2e-az1 or e2e-az2. In addition, among nodes that meet that criteria, nodes with a label whose key is another-node-label-key and whose value is another-node-label-value should be preferred.

-- jt97
Source: StackOverflow

4/28/2020

Since both the spot nodes and the normal nodes have the group=emp label, the Kubernetes scheduler selects a candidate node with the label group=emp, which could be either a spot node or a normal node. After that it applies preferredDuringSchedulingIgnoredDuringExecution to prefer a spot node for the pod. Depending on the capacity of the spot nodes, the scheduler may not be able to place the pod there, so as an alternative it schedules the pod on a normal node.

So you could either use a different label for normal nodes and spot nodes and select on the spot node's label, or you can use requiredDuringSchedulingIgnoredDuringExecution. The latter puts a hard constraint on the scheduler to schedule pods on spot nodes only.
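A minimal sketch of the hard-constraint variant, assuming the ikind label from the question (note this forces every pod onto spot nodes, so on its own it does not keep one pod on a normal node):

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      # All expressions in one matchExpressions list must match,
      # so only nodes with group=emp AND ikind=spot qualify.
      - matchExpressions:
        - key: group
          operator: In
          values:
          - emp
        - key: ikind
          operator: In
          values:
          - spot
```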

-- Arghya Sadhu
Source: StackOverflow