My Kubernetes cluster does not scale down

3/27/2019

I have a Kubernetes cluster with one master and one worker. I installed metrics-server for autoscaling and then ran a stress test:

$ kubectl run autoscale-test --image=ubuntu:16.04 --requests=cpu=1000m --command sleep 1800
deployment "autoscale-test" created
$ kubectl autoscale deployment autoscale-test --cpu-percent=25 --min=1 --max=5
deployment "autoscale-test" autoscaled
$ kubectl get hpa
NAME             REFERENCE                   TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
autoscale-test   Deployment/autoscale-test   0% / 25%   1         5         1          1m
$ kubectl get pod
NAME                              READY     STATUS    RESTARTS   AGE
autoscale-test-59d66dcbf7-9fqr8   1/1       Running   0          9m
$ kubectl exec autoscale-test-59d66dcbf7-9fqr8 -- apt-get update
$ kubectl exec autoscale-test-59d66dcbf7-9fqr8 -- apt-get install stress

$ kubectl exec autoscale-test-59d66dcbf7-9fqr8 -- stress --cpu 2 --timeout 600s &
stress: info: [227] dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd
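
While stress runs, the CPU usage the HPA acts on can be sanity-checked directly against metrics-server. A minimal check, assuming kubectl run labeled the pods with run=autoscale-test (the default for this kubectl version):

$ kubectl top pod -l run=autoscale-test

If kubectl top returns no metrics, metrics-server is not feeding the HPA and it cannot scale in either direction.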

Everything worked fine and the deployment was autoscaled. But afterwards, the pods created by the autoscaler kept running; they did not terminate after the stress test. The HPA shows that 0% of CPU is in use, yet the 5 autoscaled pods are still running:

# kubectl get hpa
NAME             REFERENCE                   TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
autoscale-test   Deployment/autoscale-test   0%/25%    1         5         5          74m

# kubectl get pods --all-namespaces
NAMESPACE     NAME                             READY   STATUS    RESTARTS   AGE
default       autoscale-test-8f4d84bbf-7ddjw   1/1     Running   0          61m
default       autoscale-test-8f4d84bbf-bmr59   1/1     Running   0          61m
default       autoscale-test-8f4d84bbf-cxt26   1/1     Running   0          61m
default       autoscale-test-8f4d84bbf-x9jws   1/1     Running   0          61m
default       autoscale-test-8f4d84bbf-zbhvk   1/1     Running   0          71m

I waited for an hour, but nothing happened.
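
The HPA's conditions (AbleToScale, ScalingActive, ScalingLimited) and its recent scaling events, which record why it does or does not scale down, can be inspected with:

$ kubectl describe hpa autoscale-test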

-- yasin lachini
kubernetes

1 Answer

3/27/2019

From the documentation:

--horizontal-pod-autoscaler-downscale-delay: The value for this option is a duration that specifies how long the autoscaler has to wait before another downscale operation can be performed after the current one has completed. The default value is 5 minutes (5m0s).

Note: When tuning these parameter values, a cluster operator should be aware of the possible consequences. If the delay (cooldown) value is set too long, there could be complaints that the Horizontal Pod Autoscaler is not responsive to workload changes. However, if the delay value is set too short, the scale of the replicas set may keep thrashing as usual.
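
As a sketch of how this knob is set on a kubeadm-provisioned control plane, where kube-controller-manager runs as a static pod (this flag applies to pre-1.12 clusters; from 1.12 the stabilization window described below replaces it), the flag is added to the controller manager's manifest, and the kubelet restarts the pod automatically:

# /etc/kubernetes/manifests/kube-controller-manager.yaml (excerpt)
spec:
  containers:
  - command:
    - kube-controller-manager
    - --horizontal-pod-autoscaler-downscale-delay=1m0s   # shorter than the 5m default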

Finally, just before HPA scales the target, the scale recommendation is recorded. The controller considers all recommendations within a configurable window, choosing the highest recommendation from within that window. This value can be configured using the --horizontal-pod-autoscaler-downscale-stabilization flag, which defaults to 5 minutes. This means that scaledowns will occur gradually, smoothing out the impact of rapidly fluctuating metric values.
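
In later Kubernetes releases (the autoscaling/v2beta2 API from v1.18, later autoscaling/v2), the same window can also be set per-HPA rather than cluster-wide, via spec.behavior. This was not yet available when the question was asked; a sketch for the deployment above:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: autoscale-test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: autoscale-test
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 25
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60   # scale down after 1 minute instead of 5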

-- Ijaz Ahmad Khan
Source: StackOverflow