How to auto-scale Kubernetes Pods based on number of tasks in celery task queue?

9/23/2019

I have a Celery worker deployed on Kubernetes pods that executes tasks which are not very CPU-intensive but take some time to complete due to HTTP calls. Is there any way to autoscale the pods in K8s based on the number of tasks in the task queue?

-- Gullible Worm
celery
kubernetes

2 Answers

9/23/2019

Yes, by using the Kubernetes metrics registry and Horizontal Pod Autoscaler.

First, you need to collect the "queue length" metric from Celery and expose it through one of the Kubernetes metric APIs. You can do this with a Prometheus-based pipeline:

  1. Since Celery doesn't expose Prometheus metrics, you need to install an exporter that exposes some information about Celery (including the queue length) as Prometheus metrics. For example, this exporter.
  2. Install Prometheus in your cluster and configure it to collect the metric corresponding to the task queue length from the Celery exporter.
  3. Install the Prometheus Adapter in your cluster and configure it to expose the "queue length" metric through the Custom Metrics API by pulling its value from Prometheus (a sketch of such an adapter rule is shown after this list).
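
For illustration, a Prometheus Adapter rule for step 3 could look roughly like the snippet below. This is only a sketch: the series name celery_queue_length and the namespace/deployment labels are assumptions that depend on your exporter and on how Prometheus labels the scraped series.

# Prometheus Adapter rule (in the adapter's config) -- sketch only; the metric
# and label names are assumptions and depend on your exporter and relabeling.
rules:
  - seriesQuery: 'celery_queue_length{namespace!="",deployment!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        deployment: {group: "apps", resource: "deployment"}
    name:
      matches: "celery_queue_length"
      as: "mycelery_queue_length"   # the name the HPA below queries
    metricsQuery: 'max(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'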

Now you can configure the Horizontal Pod Autoscaler to query this metric from the Custom Metrics API and autoscale your app based on it.

For example, to scale the app between 1 and 10 replicas based on a target queue length of 5:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Object
      object:
        metric:
          name: mycelery_queue_length
        target:
          type: Value
          value: 5
        describedObject:
          apiVersion: apps/v1
          kind: Deployment
          name: mycelery
-- weibeld
Source: StackOverflow

9/23/2019

There are two parts to solving this problem: you need to collect the metrics from Celery and make them available to the Kubernetes API (through the Custom Metrics API). The HorizontalPodAutoscaler can then query those metrics in order to scale on them.

You can use Prometheus (for example) to collect metrics from Celery and then expose them to Kubernetes with the Prometheus Adapter, which makes the metrics stored in Prometheus available through the Custom Metrics API.
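
For example, a Prometheus scrape job for the Celery exporter could look like the snippet below; the job name, Service address, and port are placeholders that depend on which exporter you deploy and how you expose it:

# prometheus.yml -- sketch only; the target address and port are placeholders
scrape_configs:
  - job_name: celery-exporter
    static_configs:
      - targets: ['celery-exporter.default.svc:9540']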

You can now define a HorizontalPodAutoscaler for your application:

kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2alpha1
metadata:
  name: sample-metrics-app-hpa
spec:
  scaleTargetRef:
    kind: Deployment
    name: sample-metrics-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      target:
        kind: Service
        name: sample-metrics-app
      metricName: celery_queue_length
      targetValue: 100
-- Blokje5
Source: StackOverflow