GKE cluster on GCP doesn't scale nodes properly

5/19/2020

When pods scale it's almost instantaneous (you delete a pod, a new one spawns), and when you set up an HPA (HorizontalPodAutoscaler) with a 50% CPU target or some other custom metric, it just spawns new pods. The problem I have right now is that even with node autoscaling enabled on the cluster, the nodes refuse to scale up. Even when I set 3 nodes per zone (9 total) it stays at 6, and with autoscaling enabled at 1-5 or 3-5 nodes, pods are still throwing this error. Any thoughts? A minimal sketch of the kind of HPA I mean is below (the target Deployment name and replica bounds are placeholders), scaling on 50% average CPU.
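```yaml
# Minimal sketch of an HPA scaling on 50% average CPU utilization.
# The target Deployment name and replica bounds are placeholders.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```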

Warning  FailedScheduling   57s (x28 over 39m)     default-scheduler   0/6 nodes are available: 1 Insufficient memory, 5 node(s) didn't match node selector.


-- potatopotato
google-cloud-platform
google-kubernetes-engine
kubernetes

1 Answer

5/19/2020

One possible reason could be that you're using a fixed nodeSelector for all pods. Take a look at the error message:

... 5 node(s) didn't match node selector.

Out of 6 nodes, your pods are only eligible to be scheduled on a single node, skipping the other 5. And since node auto-scaling is enabled, 6 nodes may simply be enough to hold all of your running workloads, which is why the cluster stays at 6 nodes.

For illustration, a fixed nodeSelector like the one below (the label key/value is hypothetical, not taken from your manifests) pins every pod to nodes carrying that exact label, so every node without it shows up as "didn't match node selector":
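```yaml
# Hypothetical example: a Pod restricted by a fixed nodeSelector.
# Only nodes labeled disktype=ssd are eligible; all other nodes are
# reported by the scheduler as "didn't match node selector".
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  nodeSelector:
    disktype: ssd
  containers:
    - name: web
      image: nginx:1.19
      resources:
        requests:
          memory: "512Mi"
          cpu: "250m"
```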

I'd suggest using Affinity and Anti-affinity to distribute your pods across multiple zones instead of a fixed nodeSelector. A minimal sketch follows (the Deployment name and the app: web label are placeholders); it uses preferred pod anti-affinity on the well-known zone label so replicas spread across zones rather than being pinned to specific nodes:
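```yaml
# Minimal sketch (names and labels are placeholders): a Deployment whose
# pods prefer to land in different zones from each other, using pod
# anti-affinity on the zone topology key instead of a nodeSelector.
# On older clusters the zone label is failure-domain.beta.kubernetes.io/zone.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: web
                topologyKey: topology.kubernetes.io/zone
      containers:
        - name: web
          image: nginx:1.19
```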

-- Kamol Hasan
Source: StackOverflow