Kubernetes HPA (with custom metrics) scaling policies

6/25/2020

Starting from Kubernetes v1.18, the autoscaling/v2beta2 API allows scaling behavior to be configured through the Horizontal Pod Autoscaler (HPA) behavior field. I'm planning to apply an HPA with custom metrics to a StatefulSet.

The use case I'm looking at is scaling out on a custom metric (e.g. the number of user sessions on my application) while the HPA does not scale down at all. This use case is also described in the K8s SIG-Autoscaling enhancements - "Configurable scale velocity for HPA >> Story 4: Scale Up As Usual, Do Not Scale Down".

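behavior:
  scaleDown:
    # Disables scale-down entirely. A policy with "value: 0" is rejected,
    # because policy values must be greater than zero.
    selectPolicy: Disabled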

User sessions can stay active for minutes to hours. Starting with 1 replica of the StatefulSet, once the number of user sessions hits an upper limit (exposed through a Prometheus collector and then configured as an HPA custom metric), the application pods will scale out. The new pods will start serving new users.
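
A rough sketch of such an HPA might look like the following. The names my-app and user_sessions, the maxReplicas bound, and the target of 100 sessions per pod are all placeholders, and the sketch assumes the metric is already served through the custom metrics API (for example by a Prometheus adapter):

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app                # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: my-app              # placeholder StatefulSet name
  minReplicas: 1
  maxReplicas: 5              # placeholder upper bound
  metrics:
  - type: Pods
    pods:
      metric:
        name: user_sessions   # placeholder custom metric name
      target:
        type: AverageValue
        averageValue: "100"   # placeholder: average sessions per pod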

Since this is a StatefulSet and it cannot be scaled down abruptly, I'm seeking help on ways to scale down once the user sessions on the new replicas drop to 0. The above link says that the scale down can be controlled by a separate process. I'm not sure how to do this and am looking for some pointers.

Thanks.

-- smulkutk
horizontal-pod-autoscaling
kubernetes
kubernetes-pod
kubernetes-statefulset

1 Answer

6/25/2020

You can use the periodSeconds and stabilizationWindowSeconds values to control how quickly pods are removed, for example:

  behavior:
    scaleDown:
      stabilizationWindowSeconds: 10
      policies:
      - type: Pods
        value: 1
        periodSeconds: 20

This way it will remove at most 1 pod every 20 seconds (periodSeconds). The stabilization window is not added to that period: it holds back scale-down until the lower replica count has been the highest recommendation over the last 10 seconds (stabilizationWindowSeconds), so the actual timing can vary.

periodSeconds is the length of the window over which the policy applies (here, at most 1 pod removed per 20 seconds). It must be greater than zero, and the maximum value is 1800 seconds (30 minutes).

stabilizationWindowSeconds: when metrics indicate that the target should be scaled down, the algorithm looks at the desired states calculated over this interval and uses the highest value. For scale down the default value is 300 seconds, and the maximum value is 3600 seconds (one hour).
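
For sessions that last minutes to hours, the same fields can be pushed to the maximum values mentioned above, which should give the slowest scale-down these settings allow. This is only a sketch; removing 1 pod per period is an assumption to adjust as needed:

  behavior:
    scaleDown:
      stabilizationWindowSeconds: 3600   # maximum: use the highest recommendation from the last hour
      policies:
      - type: Pods
        value: 1                         # remove at most 1 pod...
        periodSeconds: 1800              # ...per 30 minutes (maximum)

Note that this only slows scale-down. It does not check whether the pod being removed still has active sessions, and a StatefulSet always removes the pod with the highest ordinal first.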

-- kool
Source: StackOverflow