I have a deployment that I want it (and only it) to have a higher delay when it scales up.
The reason is that it is an initiator for many other services, and if it scales up to fast it starts suffocating and crashing the system, I want it to scale, let the other deployments scale in response, and then scale some more.
(This is a back office streaming process so I do not need seconds response time, I need to scale properly)
I saw that there is --horizontal-pod-autoscaler-downscale-delay
for the kube controller manager and the scaleup
one was even removed in 1.12
So what gives?, it looks like a reasonable request from the autoscaler... how am I to achieve this
Assuming you are not using any managed K8s service (GKE, AKS, EKS, etc.) in which case you have no access to control plane, the flag you might be interested in is
--horizontal-pod-autoscaler-sync-period
which is describes the period, in which the controller manager queries the resource utilization against the metrics specified in each HorizontalPodAutoscaler definition. Default value is 15s.
You can find kube-controller-manager yaml file in /etc/kubernetes/manifests/