Pods don't balance across the node pool. Why don't they spread to each node?
I have 9 instances in 1 node pool. In the past, I tried increasing it to 12 instances, but the pods still didn't balance.
Is there any solution that can help solve this problem while keeping 9 instances in 1 node pool?
You should look into inter-pod anti-affinity. This feature lets you constrain where your pods should not be scheduled, based on the labels of the pods already running on a node. In your case, if your app's pods carry the label app-label, you can use it to keep a pod off any node that already runs a pod with that label. For example (with label-key: label-value standing in for your own label):
apiVersion: apps/v1
kind: Deployment
...
spec:
  selector:
    matchLabels:
      label-key: label-value
  template:
    metadata:
      labels:
        label-key: label-value
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: label-key
                operator: In
                values:
                - label-value
            topologyKey: "kubernetes.io/hostname"
...
PS: If you use requiredDuringSchedulingIgnoredDuringExecution, you can have at most as many pods as you have nodes, because any additional pod stays Pending. If you expect to have more pods than nodes, you will have to use preferredDuringSchedulingIgnoredDuringExecution, which makes anti-affinity a preference rather than a hard requirement.
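As a rough sketch of the preferred variant, the affinity block of the pod template could look like the following. It reuses the placeholder label-key: label-value from the example above; the weight of 100 is an assumed value (the allowed range is 1 to 100), chosen here only to make the preference as strong as possible.

```yaml
# Pod template fragment only; this replaces the "affinity" block in the
# Deployment shown earlier. label-key / label-value are placeholders.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100              # assumed value; higher means a stronger preference
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: label-key
            operator: In
            values:
            - label-value
        topologyKey: "kubernetes.io/hostname"
```

With this form the scheduler tries to place each pod on a node that has no matching pod yet, but it will still co-locate pods when there are more replicas than nodes.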
Pods are scheduled to run on nodes by the kube-scheduler. Once they are scheduled, they are not rescheduled unless they are removed.
So if you add more nodes, the already running pods won't move to them.
There is a project in the Kubernetes incubator that addresses exactly this problem:
https://github.com/kubernetes-incubator/descheduler
Scheduling in Kubernetes is the process of binding pending pods to nodes, and is performed by a component of Kubernetes called kube-scheduler. The scheduler's decisions, whether or where a pod can or cannot be scheduled, are guided by its configurable policy, which comprises a set of rules called predicates and priorities. The scheduler's decisions are influenced by its view of the Kubernetes cluster at the point in time when a new pod first appears for scheduling. As Kubernetes clusters are very dynamic and their state changes over time, there may be a desire to move already running pods to other nodes for various reasons:
- Some nodes are under or over utilized.
- The original scheduling decision does not hold true any more, as taints or labels are added to or removed from nodes, pod/node affinity requirements are not satisfied any more.
- Some nodes failed and their pods moved to other nodes.
- New nodes are added to clusters.
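For the "under or over utilized" case in the list above, a descheduler policy might look roughly like this. This is a sketch in the older v1alpha1 policy format; the schema has changed between descheduler releases, so check the README of the version you deploy. All threshold percentages are illustrative assumptions, not recommendations.

```yaml
# Sketch of a descheduler policy (older v1alpha1 format; assumed values).
apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
strategies:
  "LowNodeUtilization":
    enabled: true
    params:
      nodeResourceUtilizationThresholds:
        # Nodes below ALL of these percentages count as underutilized.
        thresholds:
          "cpu": 20
          "memory": 20
          "pods": 20
        # Nodes above ANY of these percentages are candidates for eviction.
        targetThresholds:
          "cpu": 50
          "memory": 50
          "pods": 50
```

The descheduler only evicts pods; the kube-scheduler then places the replacement pods, so combining it with the anti-affinity rules above helps the new pods land on the emptier nodes.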