Using Celery with multiple workers in different pods

9/25/2017

What I'm trying to do is use Celery with Kubernetes. I'm using Redis as the message broker in a different pod, and I have a separate pod for each Celery queue.

Imagine I have 3 queues; I would then have 3 different pods (i.e. workers) that can accept and handle the requests.

Everything is working fine so far, but my question is: what would happen if I cloned the pod for one of the queues, so that two pods serve a single queue?

I think the client (i.e. Django) creates a new message via Redis to send to the worker and start the job, but it's not clear to me what happens when I have two pods listening to the same queue. Does the first pod accept the request and start the job, preventing the other pod from accepting it?

(I searched the Celery documentation for clues about this but couldn't find any. That's why I'm asking this question.)

-- Afshin Mehrabani
celery
kubernetes

2 Answers

9/26/2017

A task message is not removed from the queue until that message has been acknowledged by a worker. A worker can reserve many messages in advance and even if the worker is killed – by power failure or some other reason – the message will be redelivered to another worker.

More: http://docs.celeryproject.org/en/latest/userguide/tasks.html
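This redelivery behaviour can be modelled in plain Python. The sketch below is only an illustration of at-least-once delivery, not Celery's actual implementation; the class name `UnackedQueue` and its methods are made up for this example:

```python
import collections

class UnackedQueue:
    """Toy model of at-least-once delivery: a reserved message is moved
    to an 'unacked' area rather than deleted; it is only removed for
    good once the worker acknowledges it, and it is requeued if the
    worker dies first."""

    def __init__(self):
        self.ready = collections.deque()
        self.unacked = {}
        self._next_tag = 0

    def publish(self, message):
        self.ready.append(message)

    def reserve(self):
        # A worker takes a message; it is moved to 'unacked', not deleted.
        message = self.ready.popleft()
        self._next_tag += 1
        self.unacked[self._next_tag] = message
        return self._next_tag, message

    def ack(self, tag):
        # Only now is the message really gone from the broker.
        del self.unacked[tag]

    def requeue_lost(self, tag):
        # Simulates a worker dying before acking: the message is redelivered.
        self.ready.append(self.unacked.pop(tag))

q = UnackedQueue()
q.publish("task-1")
tag, msg = q.reserve()    # worker A reserves the task
q.requeue_lost(tag)       # worker A is killed before acknowledging
tag2, msg2 = q.reserve()  # worker B receives the same task again
q.ack(tag2)               # worker B finishes and acknowledges
print(msg2)               # task-1
```

The key point the model captures is that reserving a message and removing it are two separate steps, which is exactly why a killed worker cannot lose a task.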

The two workers (pods) will receive tasks and complete them independently. It's like having a single pod, but processing tasks at twice the speed.

-- danielepolencic
Source: StackOverflow

9/29/2017

I guess you are using the basic task type, which employs a 'direct' queue type rather than a 'fanout' or 'topic' queue; the latter two behave quite differently and will not be discussed here.

When using Redis as the broker transport, Celery/Kombu uses a Redis list object as the queue's storage (source), using the LPUSH command to publish messages and BRPOP to consume them.

In short, BRPOP (doc) blocks the connection when there are no elements to pop from the given lists; if a list is not empty, an element is popped from its tail. The operation is guaranteed to be atomic: no two connections can get the same element.
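The LPUSH-at-the-head / BRPOP-from-the-tail pairing is what makes the list behave as a FIFO queue. A quick sketch using a `deque` as a stand-in for the Redis list (this is an illustration of the command semantics, not redis-py code; the blocking behaviour of BRPOP is omitted):

```python
from collections import deque

redis_list = deque()  # stand-in for the Redis list backing the queue

def lpush(value):
    # LPUSH inserts at the head (left end) of the list.
    redis_list.appendleft(value)

def brpop():
    # BRPOP pops from the tail (right end), so the oldest message
    # comes out first. The real command also blocks while the list
    # is empty; that part is omitted here.
    return redis_list.pop()

lpush("task-1")
lpush("task-2")
lpush("task-3")
print(brpop(), brpop(), brpop())  # task-1 task-2 task-3
```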

Celery leverages this feature to guarantee at-least-once message delivery; the use of acknowledgments doesn't affect this guarantee.

In your case, there are multiple Celery workers across multiple pods, but all of them connect to the same Redis server, block on the same key, and try to pop an element from the same list object. When a new message arrives, one and only one worker will get it.
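The "one and only one worker gets each message" behaviour can be mimicked with two consumer threads sharing a single thread-safe queue. Here `queue.Queue` plays the role of the shared Redis list; in the real setup the atomicity is enforced by the Redis server itself, not by anything in the worker processes:

```python
import queue
import threading

broker = queue.Queue()  # plays the role of the shared Redis list
for i in range(100):
    broker.put(f"task-{i}")

consumed = {"pod-a": [], "pod-b": []}

def worker(name):
    while True:
        try:
            # get_nowait() hands each message to exactly one caller,
            # just as BRPOP hands each element to exactly one connection.
            msg = broker.get_nowait()
        except queue.Empty:
            return
        consumed[name].append(msg)

threads = [threading.Thread(target=worker, args=(n,)) for n in consumed]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every task was handled exactly once, with no duplicates between
# the two "pods".
all_msgs = consumed["pod-a"] + consumed["pod-b"]
print(len(all_msgs), len(set(all_msgs)))  # 100 100
```

Which of the two "pods" handles any given task is nondeterministic, but no task is ever handled twice or dropped.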


-- georgexsh
Source: StackOverflow