RabbitMQ pods are affected after worker-node restart

10/19/2021

I have three RabbitMQ nodes on an internal Kubernetes cluster, deployed as a StatefulSet with the bitnami/rabbitmq Helm chart. The cluster has both classic and quorum queues. If a worker node shuts down unexpectedly, or the Kubernetes admin moves the RabbitMQ pods to other worker nodes, the RabbitMQ cluster shows abnormal behavior. For example, it logs errors like this afterwards:

2021-10-18 05:37:15.203 [error] <0.10784.180> Channel error on connection <0.19689.38> (192.168.7.206:24385 -> 192.168.8.168:5672, vhost: "/", user: "admin"), channel 34: operation queue.declare caused a channel exception not_found: failed to perform operation on queue "queue name" in vhost "/" due to timeout

Also, the consumer count on the RabbitMQ nodes keeps increasing, even though it should be one because only one service consumes the queue.
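
For reference, the deployment is roughly along these lines; the release name, namespace, and credentials below are placeholders rather than my exact values:

# three-node RabbitMQ cluster as a StatefulSet, with persistent storage
helm install rabbitmq bitnami/rabbitmq \
  --namespace messaging \
  --set replicaCount=3 \
  --set auth.username=admin \
  --set auth.password=<password> \
  --set persistence.enabled=true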

I worked around the issue by stopping the StatefulSet and reinitializing RabbitMQ (a sketch of what I mean is below). Is it possible to solve this issue without reinitializing RabbitMQ, or is there a configuration option that I am missing?
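
What I mean by stopping and reinitializing is roughly the following; the StatefulSet name, namespace, and label selector come from a default bitnami release and may differ in your setup:

# stop all RabbitMQ pods
kubectl scale statefulset rabbitmq --replicas=0 -n messaging

# remove the persisted data so the cluster starts fresh
kubectl delete pvc -l app.kubernetes.io/name=rabbitmq -n messaging

# bring the three nodes back up
kubectl scale statefulset rabbitmq --replicas=3 -n messaging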

-- user2775587
bitnami
kubernetes
kubernetes-helm
rabbitmq

0 Answers