We are using the HPA feature of Kubernetes and want to control how quickly it scales up and down. From the Kubernetes documentation, I learned that the properties below should help, but the documentation gives few examples and is hard to understand. Could anyone explain these properties, with an example or additional information on how to control scale-up and scale-down timing in Kubernetes?
--horizontal-pod-autoscaler-initial-readiness-delay
--horizontal-pod-autoscaler-cpu-initialization-period
--horizontal-pod-autoscaler-downscale-stabilization (default in Kubernetes)
To get definitive answers about what these flags do, the best way is to look directly at the source code.
Here are pointers to the relevant source code files:
--horizontal-pod-autoscaler-initial-readiness-delay: kubernetes/pkg/controller/podautoscaler/replica_calculator.go
--horizontal-pod-autoscaler-cpu-initialization-period: kubernetes/pkg/controller/podautoscaler/replica_calculator.go
--horizontal-pod-autoscaler-downscale-stabilization: kubernetes/pkg/controller/podautoscaler/horizontal.go
In general, --horizontal-pod-autoscaler-downscale-stabilization sets the length of the downscale stabilization window. The HPA remembers the replica counts it recommended during this window and only scales down to the highest of them. With the default of 5 minutes, a scale-down therefore happens only once the metrics have suggested the lower replica count for a full 5 minutes. This prevents reacting too quickly to short drops in traffic that would then need to be undone with expensive scale-up operations just a short time later.
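To make the stabilization window concrete, here is a minimal sketch of the idea, not the actual code in horizontal.go. It assumes the controller keeps timestamped replica recommendations and acts on the largest one still inside the window; the class name `DownscaleStabilizer` and its method are hypothetical and exist only for illustration.

```python
from collections import deque


class DownscaleStabilizer:
    """Simplified model of a downscale stabilization window (hypothetical helper)."""

    def __init__(self, window_seconds=300.0):
        # 300 seconds mirrors the 5-minute default mentioned above.
        self.window_seconds = window_seconds
        self._history = deque()  # (timestamp, recommended_replicas)

    def stabilize(self, now, recommended):
        """Record a recommendation and return the replica count to act on."""
        self._history.append((now, recommended))
        # Forget recommendations that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self._history and self._history[0][0] < cutoff:
            self._history.popleft()
        # Act on the largest recommendation seen within the window, so a
        # brief dip in load does not trigger an immediate scale-down.
        return max(replicas for _, replicas in self._history)


s = DownscaleStabilizer(window_seconds=300)
print(s.stabilize(0, 10))    # load is high: 10 replicas recommended
print(s.stabilize(60, 4))    # load drops, but the earlier 10 is still in the window
print(s.stabilize(301, 4))   # the 10 has aged out, so scaling down to 4 is allowed
```

In this model a scale-up is never delayed, because a higher recommendation immediately becomes the maximum in the window.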
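The other two flags mainly define when a newly started Pod should be considered ready and when its CPU metrics should start being taken into account (the defaults are 30 seconds for the initial readiness delay and 5 minutes for the CPU initialization period).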
Also look at the --horizontal-pod-autoscaler-sync-period and --horizontal-pod-autoscaler-tolerance flags (all the flags are defined here).
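As a rough illustration of what the tolerance does: on every sync (every 15 seconds by default), the HPA compares the current metric value with the target and computes a desired replica count from their ratio, but skips scaling when the ratio is within the tolerance of 1.0 (0.1 by default). The sketch below follows the formula given in the Kubernetes documentation; the function name `desired_replicas` and its parameters are hypothetical, and it ignores details such as unready Pods and missing metrics.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Simplified HPA calculation: ceil(current_replicas * current / target)."""
    ratio = current_metric / target_metric
    # Within the tolerance band around 1.0 the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


print(desired_replicas(4, 52, 50))   # ratio 1.04 is within tolerance: no change
print(desired_replicas(4, 100, 50))  # ratio 2.0: scale up
print(desired_replicas(4, 20, 50))   # ratio 0.4: scale down (subject to stabilization)
```

A larger tolerance makes the autoscaler less sensitive to small metric fluctuations, while a shorter sync period makes it react sooner at the cost of more frequent evaluations.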