GKE Cluster autoscaler profile for older cluster

11/3/2020

Now in GKE there is a new tab while creating a new K8s cluster:

Automation - Set cluster-level criteria for automatic maintenance, autoscaling, and auto-provisioning. Edit the node pool for automation like auto-scaling, auto-upgrades, and repair.

It has two options: Balanced (default) & Optimize utilization (beta).

Can't we set this for an older cluster? Is there any workaround?

We are running old GKE version 1.14 and we want to auto-scale the cluster when resource utilization of existing nodes reaches 70%.

Currently, we have 2 different node pools, and only one has auto node provisioning enabled. During peak hours, when the HPA scales up Pods, a new node takes some time to join the cluster, and sometimes existing nodes start crashing due to resource pressure.

-- chagan
google-cloud-platform
google-kubernetes-engine
kubernetes
kubernetes-pod

1 Answer

11/18/2020

You can set the autoscaling profile by going into:

  • GCP Cloud Console (Web UI) -> Kubernetes Engine -> CLUSTER-NAME -> Edit -> Autoscaling profile

[Screenshot: the Autoscaling profile setting; taken on GKE version 1.14.10-gke.50]

You can also run:

  • gcloud beta container clusters update CLUSTER-NAME --autoscaling-profile optimize-utilization
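To verify which profile a cluster is currently using, you can inspect the cluster resource with `gcloud`. A sketch (the `--zone` value and the exact field path in the describe output are assumptions and may differ on older gcloud releases):

```shell
# Show the active autoscaling profile for a cluster.
# Assumption: the profile is exposed under autoscaling.autoscalingProfile
# in the cluster describe output.
gcloud container clusters describe CLUSTER-NAME \
    --zone ZONE \
    --format="value(autoscaling.autoscalingProfile)"
```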

The official documentation states:

You can specify which autoscaling profile to use when making such decisions. The currently available profiles are:

  • balanced: The default profile.
  • optimize-utilization: Prioritize optimizing utilization over keeping spare resources in the cluster. When selected, the cluster autoscaler scales down the cluster more aggressively: it can remove more nodes, and remove nodes faster. This profile has been optimized for use with batch workloads that are not sensitive to start-up latency. We do not currently recommend using this profile with serving workloads.

-- Cloud.google.com: Kubernetes Engine: Cluster autoscaler: Autoscaling profiles

This setting (optimize-utilization) might not be the best option for serving workloads. It will try to scale down (remove a node) more aggressively, which reduces the amount of spare resources available in your cluster and makes it more vulnerable to workload spikes.


Answering the part of the question:

we are running old GKE version 1.14 we want to auto-scale cluster when 70% of resource utilization of existing nodes.

As stated in the documentation:

Cluster autoscaler increases or decreases the size of the node pool automatically, based on the resource requests (rather than actual resource utilization) of Pods running on that node pool's nodes. It periodically checks the status of Pods and nodes, and takes action:

  • If Pods are unschedulable because there are not enough nodes in the node pool, cluster autoscaler adds nodes, up to the maximum size of the node pool.

-- Cloud.google.com: Kubernetes Engine: Cluster autoscaler: How cluster autoscaler works

You can't directly scale the cluster based on a percentage of resource utilization (70%). The autoscaler acts on the inability of the cluster to schedule Pods on the currently existing nodes.

You can scale the number of replicas of your Deployment by CPU usage with the Horizontal Pod Autoscaler. These Pods could have a buffer to handle an increased amount of traffic; after a specific threshold they would spawn new Pods, and the CA (Cluster Autoscaler) would request a new node if those new Pods are unschedulable. This buffer is the mechanism that prevents sudden spikes the application couldn't otherwise manage.
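As a minimal sketch of the HPA side of this (the Deployment name `my-app` and the replica bounds are hypothetical), on GKE 1.14 you could create a CPU-based HPA with:

```shell
# Hypothetical example: scale the "my-app" Deployment between 2 and 10
# replicas, targeting 70% average CPU utilization. Note: this 70% is
# measured against the Pods' CPU *requests*, not node utilization.
kubectl autoscale deployment my-app --cpu-percent=70 --min=2 --max=10

# Inspect the resulting HorizontalPodAutoscaler:
kubectl get hpa my-app
```

Combined with the Cluster Autoscaler, this indirectly drives node scaling: when the HPA creates Pods that don't fit, the CA adds a node.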

The buffer part and over-provisioning are explained in detail in the official documentation.
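A common way to implement such a buffer is a low-priority "balloon" Deployment of pause Pods. The following is a sketch, not GKE's own mechanism; all names and resource values are hypothetical and should be tuned to the size of your expected spikes:

```shell
# Hypothetical over-provisioning sketch: a negative-priority Deployment of
# pause containers reserves headroom. When real Pods need resources, the
# scheduler preempts these pause Pods; they become unschedulable, which
# triggers the Cluster Autoscaler to add a node *before* real workloads
# are starved.
kubectl apply -f - <<'EOF'
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -10
globalDefault: false
description: "Priority class for placeholder (pause) Pods"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning
spec:
  replicas: 2                      # size of the buffer
  selector:
    matchLabels:
      run: overprovisioning
  template:
    metadata:
      labels:
        run: overprovisioning
    spec:
      priorityClassName: overprovisioning
      containers:
      - name: reserve-resources
        image: k8s.gcr.io/pause    # does nothing; only holds the request
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
EOF
```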
There is also extensive documentation about running cost-effective apps on GKE. I encourage you to check it, as it contains a lot of tips and insights on scaling, over-provisioning, workload spikes, HPA, VPA, etc.


-- Dawid Kruk
Source: StackOverflow