My application runs as a container on top of Kubernetes. The application consumes messages from RabbitMQ. I can't predict the exact amount of CPU it needs, so I don't want to use CPU as the autoscaling metric, though I did set the prefetch to something that looks reasonable.
Is there a way to follow the number of messages in the queue, and once there are too many, tell k8s to autoscale? Or maybe set the autoscaler to follow the message rate?
You could either use the Horizontal Pod Autoscaler with custom metrics, which need to be provided by a custom metrics API server (boilerplate: https://github.com/kubernetes-incubator/custom-metrics-apiserver),
or use a custom autoscaler (probably outdated: https://github.com/onfido/k8s-rabbit-pod-autoscaler).
So you either provide the metrics to the HPA, or you run an application that reads the metrics itself and sends scaling requests to the Kubernetes API.
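For the second option, here is a minimal sketch in Python using the official `kubernetes` client and the RabbitMQ management HTTP API. The queue name, deployment name, credentials, and thresholds below are placeholders you'd replace with your own:

```python
import os
import time

import requests
from kubernetes import client, config

# Placeholders: point these at your own RabbitMQ management API, deployment, and namespace.
RABBIT_API = os.environ.get("RABBIT_API", "http://rabbitmq:15672/api/queues/%2F/work-queue")
DEPLOYMENT = os.environ.get("DEPLOYMENT", "worker")
NAMESPACE = os.environ.get("NAMESPACE", "default")
MESSAGES_PER_POD = 100           # assumed target backlog per replica
MIN_REPLICAS, MAX_REPLICAS = 1, 10

config.load_incluster_config()   # running inside the cluster; use load_kube_config() locally
apps = client.AppsV1Api()

while True:
    # "messages" is the ready + unacked count reported by the RabbitMQ management plugin
    depth = requests.get(RABBIT_API, auth=("guest", "guest")).json()["messages"]

    # Ceiling division, clamped to the allowed replica range
    desired = max(MIN_REPLICAS, min(MAX_REPLICAS, -(-depth // MESSAGES_PER_POD)))

    scale = apps.read_namespaced_deployment_scale(DEPLOYMENT, NAMESPACE)
    if scale.spec.replicas != desired:
        scale.spec.replicas = desired
        apps.replace_namespaced_deployment_scale(DEPLOYMENT, NAMESPACE, scale)

    time.sleep(30)
```

You'd run this as its own small deployment with a service account that is allowed to read and update the `deployments/scale` subresource.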
I wasn't able to find much content on this that didn't involve an external source such as Stackdriver.
I spent several days working through all the issues and wrote up a demo app, with code, showing how to do it. I hope it will help someone:
This tutorial looks like a ready answer to your question. It describes how to set up a Horizontal Pod Autoscaler based on RabbitMQ queue size.
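Once the queue-size metric is exposed through a metrics adapter, the HPA itself is an ordinary API object. As a sketch of what it might look like (here created through the Python client rather than YAML; the metric name `rabbitmq_queue_messages_ready`, deployment `worker`, and target of 100 messages per pod are assumptions, and the actual metric name depends on your adapter):

```python
from kubernetes import client, config

config.load_kube_config()
autoscaling = client.AutoscalingV2Api()

# HPA that scales the "worker" Deployment on an external RabbitMQ queue-depth metric.
hpa = client.V2HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="worker-hpa"),
    spec=client.V2HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V2CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="worker"
        ),
        min_replicas=1,
        max_replicas=10,
        metrics=[
            client.V2MetricSpec(
                type="External",
                external=client.V2ExternalMetricSource(
                    metric=client.V2MetricIdentifier(name="rabbitmq_queue_messages_ready"),
                    # aim for ~100 ready messages per replica
                    target=client.V2MetricTarget(type="AverageValue", average_value="100"),
                ),
            )
        ],
    ),
)

autoscaling.create_namespaced_horizontal_pod_autoscaler(namespace="default", body=hpa)
```

The equivalent YAML manifest from the tutorial does the same thing; the important part is that the external (or custom) metric must already be served by an adapter registered with the custom/external metrics API.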