GKE doesn't downscale node pool to zero

3/30/2020

The GKE cluster is configured with node auto-provisioning.

I have created a default node pool on which system-specific pods can run. Whenever pods requesting a GPU are scheduled, GKE automatically creates a new GPU-enabled node pool, which is fine.
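For reference, a pod spec like the following (the name and image are illustrative) is enough to make node auto-provisioning spin up a GPU node pool:

    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-test                  # illustrative name
    spec:
      containers:
      - name: cuda
        image: nvidia/cuda:10.0-base  # any GPU-capable image works here
        resources:
          limits:
            nvidia.com/gpu: 1         # requesting a GPU triggers auto-provisioning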

But whenever I delete such pods, GKE doesn't scale the newly created node pool down to zero instances; one instance keeps running. If no GPU is requested, the node pool is supposed to shrink to its minimum size, i.e. zero.

NOTE:

  • For GPU drivers, a DaemonSet has been created in the 'kube-system' namespace; its pods run on each GPU-enabled node.

I edited this DaemonSet and also added '"cluster-autoscaler.kubernetes.io/safe-to-evict": "true"' to its pods (see the snippet below).
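Note that safe-to-evict is an annotation rather than a label, and it belongs on the pod template inside the DaemonSet spec, roughly like this:

    spec:
      template:
        metadata:
          annotations:
            cluster-autoscaler.kubernetes.io/safe-to-evict: "true"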

Can someone help me get the newly created node pool to scale down to zero nodes?

UPDATE:

Pods that are running on the new node are:

  • fluentd-gcp (from a DaemonSet)
  • kube-proxy
  • nvidia-gpu-device-plugin (from a DaemonSet)
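They can be listed per node with something like this (NODE_NAME is a placeholder for the GPU node's name):

    kubectl get pods --all-namespaces --field-selector spec.nodeName=NODE_NAME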

Shouldn't these pods get evicted?

-- AVJ
autoscaling
google-cloud-platform
google-kubernetes-engine
kubernetes

1 Answer

4/5/2020

By default, GKE keeps some spare node capacity around for quick pod scheduling. This is the default behavior, controlled by the cluster autoscaler's autoscaling profile.

That behavior can be changed by setting the profile to 'optimize-utilization'.
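Something along these lines should set it on an existing cluster (CLUSTER_NAME is a placeholder; the flag may require the beta command track depending on your gcloud version):

    gcloud beta container clusters update CLUSTER_NAME \
        --autoscaling-profile optimize-utilization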

https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler

-- AVJ
Source: StackOverflow