I'm trying to scale up a Kubernetes cluster that I started on Azure. I have several deployments running on it, and one of them is consuming most of the CPU. So I decided to scale the cluster up. However, when I increase the number of nodes on Azure, most of the load stays on the three nodes I started with, while the new ones get little to none.
Can anyone help me with this? Am I missing something?
Pods don't get rescheduled once they are running; the scheduler only places a pod when it is created. You can slowly kill some of your pods, and their replacements should land on the new nodes.
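A gentler alternative to deleting pods by hand (a sketch, assuming the workload is a Deployment; `my-deployment` is a placeholder name) is a rolling restart, which replaces pods a few at a time so each replacement goes through scheduling again and can land on the new nodes:

```shell
# Trigger a rolling restart; old pods are terminated gradually and
# their replacements are scheduled fresh (kubectl >= 1.15).
kubectl rollout restart deployment/my-deployment

# Watch the replacement pods and which nodes they land on.
kubectl get pods -o wide -w
```

This respects the Deployment's rolling-update settings, so you don't lose capacity the way you might by deleting several pods at once.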
This looks odd. One way to move the workload off those three nodes is to apply a taint so that new pods won't be scheduled on them:

kubectl taint nodes <node-name> key=value:NoSchedule

Then delete the pods on the tainted nodes; their replacements should be scheduled on the other nodes.
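An equivalent, more automated route (a sketch; `<node-name>` is a placeholder) is `kubectl drain`, which marks the node unschedulable and evicts its pods in one step so they get rescheduled elsewhere:

```shell
# Cordon the node and evict its pods; DaemonSet pods cannot be
# evicted, so they have to be ignored explicitly.
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

# Once the load has moved, make the node schedulable again.
kubectl uncordon <node-name>
```

Unlike a bare taint, drain waits for evictions to complete, so it's the safer option if the pods need graceful shutdown.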
Scheduling is something AKS manages on its own (as long as you don't provide custom scheduling rules). If you want to shift load onto the other nodes, you can cordon a node so nothing new gets scheduled on it:
kubectl cordon AKS_AGENTPOOL_NODE_NAME
Then check which pods are running on that node:

kubectl get pods -o wide --all-namespaces | grep AKS_AGENTPOOL_NODE_NAME

Kill a few of them, and you should see the load transfer to the other nodes.
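To keep the scheduler from packing replicas back onto the same few nodes in the future, you could also add a topology spread constraint to the Deployment's pod template. A minimal sketch, assuming the pods carry the label `app: my-app` (a placeholder):

```yaml
# Inside the Deployment's pod template spec:
spec:
  topologySpreadConstraints:
    - maxSkew: 1                          # at most 1 more pod on any node than the least-loaded
      topologyKey: kubernetes.io/hostname # spread across individual nodes
      whenUnsatisfiable: ScheduleAnyway   # prefer spreading, but don't block scheduling
      labelSelector:
        matchLabels:
          app: my-app
```

With this in place, newly created pods are biased toward the emptier nodes instead of the original three.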