We use GKE for one of our services, which is autoscaled. The workload is variable, and depending on the load the cluster scales up to hundreds of nodes. However, when the workload goes down, I see that many of the idle nodes stay alive for a very long time, which increases our bill. Is there a setting where we can specify a time after which an idle node will be terminated and removed from the cluster?
The Kubernetes scale-down process typically includes a delay as protection against traffic spikes that can occur while the resize is being performed.
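If you need idle nodes to be reclaimed more aggressively, GKE exposes an autoscaling profile that changes how eagerly the cluster autoscaler removes under-utilized nodes. A minimal sketch, assuming a cluster named `my-cluster` in `us-central1` (both placeholders) and a GKE version that supports the `optimize-utilization` profile:

```shell
# Switch the cluster autoscaler to the optimize-utilization profile, which
# prioritizes removing under-utilized nodes over keeping spare capacity.
# "my-cluster" and "us-central1" are placeholders for your own cluster and region.
gcloud container clusters update my-cluster \
  --region us-central1 \
  --autoscaling-profile optimize-utilization
```

As far as I know, GKE does not expose an exact per-node idle timeout; the profile only changes how aggressively scale-down is applied.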
There are also several aspects of the autoscaler to consider. Please check the following docs for details:
Furthermore, when using the GKE autoscaler, there are some constraints to take into account:
- When scaling down, cluster autoscaler honors a graceful termination period of 10 minutes for rescheduling the node's Pods onto a different node before forcibly terminating the node.
- Occasionally, cluster autoscaler cannot scale down completely, and an extra node remains after scaling down. This can occur when required system Pods are scheduled onto different nodes, because there is no trigger for any of those Pods to be moved to a different node. See "I have a couple of nodes with low utilization, but they are not scaled down. Why?". To work around this limitation, you can configure a Pod disruption budget, as in the sketch after this list.
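As an illustration of that workaround, you can create a Pod disruption budget for the system Pods that block scale-down, so the autoscaler is allowed to evict them and remove the under-utilized node. A minimal sketch, assuming the standard kube-dns labels; the budget name `kube-dns-pdb` is a placeholder:

```shell
# Allow the autoscaler to evict at most one kube-dns Pod at a time,
# so nodes running only system Pods can still be scaled down.
# "kube-dns-pdb" is a placeholder name; the selector assumes the
# default k8s-app=kube-dns label on the kube-dns Pods.
kubectl create poddisruptionbudget kube-dns-pdb \
  --namespace=kube-system \
  --selector=k8s-app=kube-dns \
  --max-unavailable=1
```

With such a budget in place, the autoscaler can reschedule those Pods onto another node instead of keeping the extra node alive indefinitely.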
Disclaimer: Comments and opinions are my own and not the views of my employer.