Deployment affinity

6/9/2020

I have a k8s cluster with 3 nodes.

How can I configure a sample deployment with 3 replicas, such as the following, so that each pod gets scheduled on a different node?

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tomcat-deployment
  labels:
    app: tomcat
spec:
  replicas: 3
  selector:
    matchLabels:
      app: tomcat
  template:
    metadata:
      labels:
        app: tomcat
    spec:
      containers:
      - name: tomcat
        image: tomcat:9.0
        ports:
        - containerPort: 80
-- Recruitment Expert
affinity
kubernetes

2 Answers

6/10/2020

You can use podAntiAffinity to ensure that no two pods of the same deployment run on the same node (what counts as "the same" location depends on the topologyKey). See the Kubernetes documentation page "Assigning Pods to Nodes".

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tomcat-deployment
  labels:
    app: tomcat
spec:
  replicas: 3
  selector:
    matchLabels:
      app: tomcat
  template:
    metadata:
      labels:
        app: tomcat
    spec:
      containers:
      - name: tomcat
        image: tomcat:9.0
        ports:
        - containerPort: 80
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "app"
                operator: In
                values:
                - tomcat
            topologyKey: "kubernetes.io/hostname"
-- Hang Du
Source: StackOverflow

6/10/2020

By default, the Kubernetes scheduler tries to place a deployment's replicas on different nodes where possible (as long as a node satisfies the memory/CPU requirements).

If it cannot, 2 (or more) pod replicas can get scheduled on one node, and you can use several techniques to prevent this.

One of these techniques is inter-pod affinity and anti-affinity. The k8s documentation describes it as follows:

Inter-pod affinity and anti-affinity allow you to constrain which nodes your pod is eligible to be scheduled based on labels on pods that are already running on the node rather than based on labels on nodes. The rules are of the form "this pod should (or, in the case of anti-affinity, should not) run in an X if that X is already running one or more pods that meet rule Y"

With a required pod anti-affinity rule, be aware that if a pod cannot be scheduled on any eligible node for some reason (lack of resources or a tainted node), it will stay in the Pending state.
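
If a Pending pod is worse for you than two replicas sharing a node, a soft rule may be a better fit. The fragment below is a minimal sketch, assuming the same app: tomcat label used in the question; it swaps the required rule for preferredDuringSchedulingIgnoredDuringExecution, which asks the scheduler to spread the pods but does not block scheduling when it cannot. The weight of 100 is an illustrative choice, not a value from the original answers.

```yaml
# Goes under spec.template.spec of the Deployment, in place of the
# requiredDuringSchedulingIgnoredDuringExecution block.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100                  # 1-100; illustrative value (assumption)
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - tomcat
        topologyKey: kubernetes.io/hostname   # spread per node
```

With this variant the three replicas still land on separate nodes when capacity allows, but a replica can be co-located with another one instead of staying Pending.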

You should also remember that in a 3-node cluster made up of 1 master + 2 workers, the master node usually carries a NoSchedule taint (this is typical for clusters created with e.g. kubeadm), which prevents pods from being scheduled on it.

If this applies to you and you still want to schedule pods on the master node, you need to either remove the NoSchedule taint:

kubectl taint nodes $(hostname) node-role.kubernetes.io/master:NoSchedule-

Or use toleration:

# Fragment: only the fields relevant to the toleration are shown.
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      tolerations:
        - key: "node-role.kubernetes.io/master"
          operator: "Exists"
          effect: "NoSchedule"

In the comments, @suren mentioned DaemonSets, which can be used in some cases. Keep in mind that a DaemonSet runs one pod per node, so when you scale your cluster your application scales with it, which may not be desired.
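
For comparison, a DaemonSet version of the same workload might look like the sketch below. It reuses the image and label from the question; the name tomcat-daemonset is hypothetical. Note that there is no replicas field: the number of pods follows the number of eligible nodes.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: tomcat-daemonset   # hypothetical name, not from the original post
  labels:
    app: tomcat
spec:
  selector:
    matchLabels:
      app: tomcat
  template:
    metadata:
      labels:
        app: tomcat
    spec:
      containers:
      - name: tomcat
        image: tomcat:9.0
```

A tainted master node is skipped here as well unless the pod template carries a matching toleration.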

-- Matt
Source: StackOverflow