I have an issue with an HPA that is configured to scale on the HTTP request rate. The metric is derived from a Prometheus query, sum(rate(http_server_requests_seconds_count[5m])), but at start-up the HPA scales to the maximum number of pods even though no HTTP requests are being received. The extract below from kubectl describe hpa shows that it is scaling on this metric, and that this happens within seconds of the deployment.
Normal SuccessfulRescale 23m (x4 over 128m) horizontal-pod-autoscaler New size: 2; reason: pods metric rate_5m_http_server_requests_seconds_count above target
Normal SuccessfulRescale 23m (x4 over 128m) horizontal-pod-autoscaler New size: 3; reason: pods metric rate_5m_http_server_requests_seconds_count above target
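For context, my understanding of the documented HPA rule is desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), with a default tolerance of 0.1 around a ratio of 1.0. The sketch below is only an illustration of that rule: the function name, the metric values, and the min/max replica counts are hypothetical placeholders, not my actual configuration.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=3, tolerance=0.1):
    """Rough sketch of the HPA scaling rule (all values are hypothetical)."""
    ratio = current_value / target_value
    # Within the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # The result is clamped to the configured min/max replica bounds.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical readings: any reported value above the target scales up.
print(desired_replicas(1, 2.0, 1.0))
print(desired_replicas(2, 1.5, 1.0))
```

If this is right, the events above imply that the adapter is reporting a value above the target at start-up even with no traffic, which is the part I cannot explain.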
Is it possible to tell Kubernetes not to scale for the first N seconds/minutes after a deployment, or is there another way around this problem?