Has anyone come across this behavior before?
I have a GKE cluster with 5 nodes, and I have autoscaling enabled, as you can see below:
autoscaling:
  enabled: true
  maxNodeCount: 9
  minNodeCount: 1
config:
  diskSizeGb: 100
  diskType: pd-standard
  imageType: COS
  machineType: n1-standard-1
  oauthScopes:
  - https://www.googleapis.com/auth/devstorage.read_only
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/monitoring
  - https://www.googleapis.com/auth/servicecontrol
  - https://www.googleapis.com/auth/service.management.readonly
  - https://www.googleapis.com/auth/trace.append
  serviceAccount: default
initialNodeCount: 1
instanceGroupUrls:
- xxx
management:
  autoRepair: true
  autoUpgrade: true
name: default-pool
podIpv4CidrSize: 24
selfLink: xxxx
status: RUNNING
version: 1.13.7-gke.8
However, when I try to deploy one of my services, I receive this error:
Warning FailedScheduling 106s default-scheduler 0/5 nodes are available: 3 Insufficient cpu, 4 node(s) didn't match node selector.
Warning FailedScheduling 30s (x3 over 106s) default-scheduler 0/5 nodes are available: 4 node(s) didn't match node selector, 5 Insufficient cpu.
Normal NotTriggerScaleUp 0s (x11 over 104s) cluster-autoscaler pod didn't trigger scale-up (it wouldn't fit if a new node is added): 1 node(s) didn't match node selector
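For reference, here is a minimal sketch of the kind of pod spec that produces this combination of messages, i.e. a nodeSelector plus a CPU request (all names, labels, and values below are placeholders, not my real manifest):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service               # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      nodeSelector:
        role: my-pool            # placeholder label; both the scheduler and the autoscaler look for nodes carrying this label
      containers:
      - name: my-service
        image: gcr.io/my-project/my-service:latest   # placeholder image
        resources:
          requests:
            cpu: 500m            # compared against each node's allocatable CPU, not its current usage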
And when I look at the stats of my resources, I don't see a problem with CPU, right?
kubectl top node
NAME                                           CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
gke-pre-cluster-1-default-pool-17d2178b-4g9f   106m         11%    1871Mi          70%
gke-pre-cluster-1-default-pool-17d2178b-g8l1   209m         22%    3042Mi          115%
gke-pre-cluster-1-default-pool-17d2178b-grvg   167m         17%    2661Mi          100%
gke-pre-cluster-1-default-pool-17d2178b-l9gt   122m         12%    2564Mi          97%
gke-pre-cluster-1-default-pool-17d2178b-ppfw   159m         16%    2830Mi          107%
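In case it is useful: kubectl top reports live usage, while scheduling is decided on requested CPU versus each node's allocatable CPU. That view can be checked with something like the commands below (the node name is just the first one from the list above):

# allocatable CPU/memory per node
kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory

# requests and limits already placed on one node
kubectl describe node gke-pre-cluster-1-default-pool-17d2178b-4g9f | grep -A 10 "Allocated resources"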
So... if CPU is not the problem, what does this message mean?
And the other thing is... if there is a problem with resources, why doesn't the cluster scale up automatically?
Has anyone come across this before and can explain it to me? I don't understand.
Thank you so much
Could you check whether you have the entry "ZONE_RESOURCE_POOL_EXHAUSTED" in Stackdriver Logging?
It is likely that the zone your Kubernetes cluster is running in is out of resources.
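A rough sketch of how you could look for it from the command line; the exact filter fields are an assumption and may need adjusting for your project and zone:

# search recent logs for zone capacity errors (filter fields are a guess, adjust as needed)
gcloud logging read 'protoPayload.status.message:"ZONE_RESOURCE_POOL_EXHAUSTED"' --limit=10 --freshness=7d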
Regards.