How does the Kubernetes HPA behave with 2 or more metrics, especially the calculation of the number of replicas?

1/22/2019

We have configured our HPA to use 2 metrics:

  1. CPU Utilization
  2. App specific custom metrics

During testing we observed scaling taking place, but it is not clear how the number of replicas is calculated. I have not been able to find any documentation on this.

Questions:

  1. Can someone point to documentation or code that covers the calculation?
  2. Is it good practice to use multiple metrics for scaling?

Thanks in advance!

-- arunk2
autoscaling
google-kubernetes-engine
kubernetes

1 Answer

1/22/2019

From the Kubernetes documentation (https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#how-does-the-horizontal-pod-autoscaler-work):

If multiple metrics are specified in a HorizontalPodAutoscaler, this calculation is done for each metric, and then the largest of the desired replica counts is chosen. If any of those metrics cannot be converted into a desired replica count (e.g. due to an error fetching the metrics from the metrics APIs), scaling is skipped.
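To make the quoted rule concrete, here is a minimal Python sketch. It assumes the per-metric formula given on the same documentation page, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), and leaves out details of the real controller such as the tolerance band and the handling of unready pods. The function names and the sample numbers are made up for illustration.

```python
import math


def desired_replicas(current_replicas, current_value, target_value):
    """Per-metric recommendation: ceil(currentReplicas * current / target)."""
    return math.ceil(current_replicas * current_value / target_value)


def hpa_recommendation(current_replicas, metrics):
    """Take the largest per-metric recommendation.

    `metrics` is a list of (current_value, target_value) pairs, one per
    configured metric. This is an illustrative helper, not controller code.
    """
    return max(
        desired_replicas(current_replicas, current, target)
        for current, target in metrics
    )


# Hypothetical example: 4 replicas, CPU at 90% against a 60% target,
# and a custom metric at 120 against a target of 100.
print(hpa_recommendation(4, [(90, 60), (120, 100)]))
```

With these numbers the CPU metric asks for 6 replicas and the custom metric for 5, so the sketch picks 6.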

Finally, just before HPA scales the target, the scale recommendation is recorded. The controller considers all recommendations within a configurable window choosing the highest recommendation from within that window. This value can be configured using the --horizontal-pod-autoscaler-downscale-stabilization-window flag, which defaults to 5 minutes. This means that scaledowns will occur gradually, smoothing out the impact of rapidly fluctuating metric values.
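The stabilization step can be sketched in the same way. This is only an illustration of "choose the highest recommendation within the window", assuming timestamps in seconds that never go backwards; the class and method names are hypothetical and do not come from the Kubernetes code base.

```python
from collections import deque


class DownscaleStabilizer:
    """Keeps recent recommendations and returns the highest one in the window."""

    def __init__(self, window_seconds=300.0):
        # 300 seconds mirrors the 5 minute default mentioned in the quote.
        self.window_seconds = window_seconds
        self._history = deque()  # (timestamp, recommendation) pairs

    def stabilize(self, now, recommendation):
        """Record a recommendation and return the stabilized replica count."""
        self._history.append((now, recommendation))
        # Drop entries that are older than the window.
        while self._history[0][0] < now - self.window_seconds:
            self._history.popleft()
        return max(rec for _, rec in self._history)


stabilizer = DownscaleStabilizer(window_seconds=300)
print(stabilizer.stabilize(0, 6))    # first recommendation is used as-is
print(stabilizer.stabilize(60, 3))   # the earlier, higher value is still in the window
print(stabilizer.stabilize(301, 3))  # the older value has left the window
```

A drop in the metrics therefore only reduces the replica count once the higher recommendation has aged out of the window, while a rise takes effect immediately.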

-- Janos Lenart
Source: StackOverflow