I have the below Horizontal Pod Autoscaller configuration on Google Kubernets Engine to scale a deployment by a custom metric - RabbitMQ messages ready count
for a specific queue: foo-queue
.
It picks up the metric value correctly.
When inserting 2 messages it scales the deployment to the maximum 10 replicas. I expect it to scale to 2 replicas since the targetValue is 1 and there are 2 messages ready.
Why does it scale so aggressively?
HPA configuration:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: foo-hpa
namespace: development
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: foo
minReplicas: 1
maxReplicas: 10
metrics:
- type: External
external:
metricName: "custom.googleapis.com|rabbitmq_queue_messages_ready"
metricSelector:
matchLabels:
metric.labels.queue: foo-queue
targetValue: 1
I think you did a great job explaining how targetValue
works with HorizontalPodAutoscalers. However, based on your question, I think you're looking for targetAverageValue
instead of targetValue
.
In the Kubernetes docs on HPAs, it mentions that using targetAverageValue
instructs Kubernetes to scale pods based on the average metric exposed by all Pods under the autoscaler. While the docs aren't explicit about it, an external metric (like the number of jobs waiting in a message queue) counts as a single data point. By scaling on an external metric with targetAverageValue
, you can create an autoscaler that scales the number of Pods to match a ratio of Pods to jobs.
Back to your example:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: foo-hpa
namespace: development
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: foo
minReplicas: 1
maxReplicas: 10
metrics:
- type: External
external:
metricName: "custom.googleapis.com|rabbitmq_queue_messages_ready"
metricSelector:
matchLabels:
metric.labels.queue: foo-queue
# Aim for one Pod per message in the queue
targetAverageValue: 1
will cause the HPA to try keeping one Pod around for every message in your queue (with a max of 10 pods).
As an aside, targeting one Pod per message is probably going to cause you to start and stop Pods constantly. If you end up starting a ton of Pods and process all of the messages in the queue, Kubernetes will scale your Pods down to 1. Depending on how long it takes to start your Pods and how long it takes to process your messages, you may have lower average message latency by specifying a higher targetAverageValue
. Ideally, given a constant amount of traffic, you should aim to have a constant number of Pods processing messages (which requires you to process messages at about the same rate that they are enqueued).
According to https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
From the most basic perspective, the Horizontal Pod Autoscaler controller operates on the ratio between desired metric value and current metric value:
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
From the above I understand that as long as the queue has messages the k8 HPA will continue to scale up since currentReplicas
is part of the desiredReplicas
calculation.
For example if:
currentReplicas
= 1
currentMetricValue
/ desiredMetricValue
= 2/1
then:
desiredReplicas
= 2
If the metric stay the same in the next hpa cycle currentReplicas
will become 2 and desiredReplicas
will be raised to 4
Try to follow this instruction that describes horizontal autoscale settings for RabbitMQ
in k8s
Kubernetes Workers Autoscaling based on RabbitMQ queue size
In particular, targetValue: 20
of metric rabbitmq_queue_messages_ready
is recommended instead of targetValue: 1
:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: workers-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1beta1
kind: Deployment
name: my-workers
minReplicas: 1
maxReplicas: 10
metrics:
- type: External
external:
metricName: "custom.googleapis.com|rabbitmq_queue_messages_ready"
metricSelector:
matchLabels:
metric.labels.queue: myqueue
**targetValue: 20
Now our deployment my-workers will grow if RabbitMQ queue myqueue has more than 20 non-processed jobs in total