Pod stuck in Pending status even though autoscaling is enabled, why doesn't it work?

9/16/2019

Have you seen this behavior before?

I have a GKE cluster with 5 nodes, and I have autoscaling enabled, as you can see below:

autoscaling:
  enabled: true
  maxNodeCount: 9
  minNodeCount: 1
config:
  diskSizeGb: 100
  diskType: pd-standard
  imageType: COS
  machineType: n1-standard-1
  oauthScopes:
  - https://www.googleapis.com/auth/devstorage.read_only
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/monitoring
  - https://www.googleapis.com/auth/servicecontrol
  - https://www.googleapis.com/auth/service.management.readonly
  - https://www.googleapis.com/auth/trace.append
  serviceAccount: default
initialNodeCount: 1
instanceGroupUrls:
- xxx
management:
  autoRepair: true
  autoUpgrade: true
name: default-pool
podIpv4CidrSize: 24
selfLink: xxxx
status: RUNNING
version: 1.13.7-gke.8
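
For reference, that node pool description comes from something like the following command (the zone is a placeholder, and the cluster name is taken from the node names below):

gcloud container node-pools describe default-pool \
  --cluster pre-cluster-1 \
  --zone <zone>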

However, when I try to deploy a service I receive this error:

 Warning  FailedScheduling   106s                default-scheduler   0/5 nodes are available: 3 Insufficient cpu, 4 node(s) didn't match node selector.
  Warning  FailedScheduling   30s (x3 over 106s)  default-scheduler   0/5 nodes are available: 4 node(s) didn't match node selector, 5 Insufficient cpu.
  Normal   NotTriggerScaleUp  0s (x11 over 104s)  cluster-autoscaler  pod didn't trigger scale-up (it wouldn't fit if a new node is added): 1 node(s) didn't match node selector
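
For what it's worth, these are the commands I'd use to compare the pod's nodeSelector with the labels actually present on the nodes (the pod name is a placeholder):

# nodeSelector requested by the pending pod
kubectl get pod <pending-pod> -o jsonpath='{.spec.nodeSelector}'

# labels carried by the nodes in the pool
kubectl get nodes --show-labels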

And if I look at the stats of my resources, I don't see a problem with CPU, right?

kubectl top node
NAME                                                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
gke-pre-cluster-1-default-pool-17d2178b-4g9f   106m         11%    1871Mi          70%       
gke-pre-cluster-1-default-pool-17d2178b-g8l1   209m         22%    3042Mi          115%      
gke-pre-cluster-1-default-pool-17d2178b-grvg   167m         17%    2661Mi          100%      
gke-pre-cluster-1-default-pool-17d2178b-l9gt   122m         12%    2564Mi          97%       
gke-pre-cluster-1-default-pool-17d2178b-ppfw   159m         16%    2830Mi          107%   
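
For reference, this is how I'd check requested vs. allocatable CPU on one of the nodes (picking a node from the list above):

# "Allocated resources" shows the CPU/memory requests already reserved on that node
kubectl describe node gke-pre-cluster-1-default-pool-17d2178b-4g9f | grep -A 8 "Allocated resources"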

So... why this message, if the problem doesn't seem to be CPU?

And the other thing is... if there is a problem with resources, why doesn't the cluster scale up automatically?

Has anyone run into this before who can explain it? I don't understand.

Thank you so much

-- David Oceans
autoscaling
google-kubernetes-engine
kubernetes

1 Answer

9/16/2019

Could you check whether you have the entry "ZONE_RESOURCE_POOL_EXHAUSTED" in Stackdriver Logging?

It is likely that the zone you are using for your Kubernetes cluster is having resource availability problems.
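
You can search for it with something like this (the project is a placeholder):

gcloud logging read "ZONE_RESOURCE_POOL_EXHAUSTED" \
  --project <your-project> \
  --freshness=7d \
  --limit=10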

Regards.

-- Alfredo F.
Source: StackOverflow