How to fix "pods is not balanced" in Kubernetes cluster

6/27/2019

Pods don't balance across the node pool. Why don't they spread to each node?

I have 9 instances in 1 node pool. In the past, I tried scaling up to 12 instances, but the pods still didn't balance.

I would like to know if there is any solution that can help solve this problem while keeping 9 instances in 1 node pool.

-- jirawat paiboon
cloud
google-cloud-platform
infrastructure
kubernetes

2 Answers

6/27/2019

You should look into inter-pod anti-affinity. This feature lets you constrain where your pods should not be scheduled, based on the labels of the pods already running on a node. In your case, assuming your app carries the label label-key: label-value, you can use anti-affinity to ensure that two pods with that label are never scheduled onto the same node. For example:

apiVersion: apps/v1
kind: Deployment
...
spec:
  selector:
    matchLabels:
      label-key: label-value
  template:
    metadata:
      labels:
        label-key: label-value
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: never schedule this pod onto a node that already
          # runs a pod matching the label selector below.
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: label-key
                operator: In
                values:
                - label-value
            # Treat each node (hostname) as one topology domain.
            topologyKey: "kubernetes.io/hostname"
...

PS: With requiredDuringSchedulingIgnoredDuringExecution, you can have at most as many of these pods as you have nodes. If you expect to have more pods than nodes available, use preferredDuringSchedulingIgnoredDuringExecution instead, which makes the anti-affinity a preference rather than an obligation.
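For reference, a minimal sketch of the preferred variant. The schema differs slightly from the required form: each entry is a weighted term, where weight (1-100) sets how strongly the scheduler favors spreading. The label key and value are placeholders, as above:

      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100   # 1-100; higher = stronger preference for spreading
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: label-key
                  operator: In
                  values:
                  - label-value
              topologyKey: "kubernetes.io/hostname"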

-- Alassane Ndiaye
Source: StackOverflow

6/27/2019

Pods are scheduled onto nodes by the kube-scheduler, and once they are scheduled, they are not rescheduled unless they are removed.

So if you add more nodes, the already-running pods won't move onto them.

There is a project in the Kubernetes incubator that solves exactly this problem (a sample policy is sketched after the list below):

https://github.com/kubernetes-incubator/descheduler

Scheduling in Kubernetes is the process of binding pending pods to nodes, and is performed by a component called kube-scheduler. The scheduler's decisions about whether and where a pod can be scheduled are guided by its configurable policy, which comprises a set of rules called predicates and priorities. The scheduler's decisions are influenced by its view of the Kubernetes cluster at the point in time when a new pod first appears for scheduling. Because Kubernetes clusters are very dynamic and their state changes over time, it may be desirable to move already-running pods to other nodes for various reasons:

  • Some nodes are under- or over-utilized.
  • The original scheduling decision no longer holds true: taints or labels have been added to or removed from nodes, so pod/node
    affinity requirements are no longer satisfied.
  • Some nodes failed and their pods moved to other nodes.
  • New nodes are added to the cluster.
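
As a sketch, here is a descheduler policy in the v1alpha1 format used by the incubator project, targeting the under-/over-utilization case from the list above. The threshold numbers are illustrative assumptions, not recommendations; tune them to your workload:

apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "LowNodeUtilization":
    enabled: true
    params:
      nodeResourceUtilizationThresholds:
        # Nodes below ALL of these are considered underutilized
        # (illustrative values)...
        thresholds:
          "cpu": 20
          "memory": 20
          "pods": 20
        # ...and pods are evicted from nodes above ANY of these,
        # so they can be rescheduled onto the underutilized nodes.
        targetThresholds:
          "cpu": 50
          "memory": 50
          "pods": 50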
-- Abhyudit Jain
Source: StackOverflow