If an ActiveMQ Artemis cluster is running in a Kubernetes environment, is ON_DEMAND load balancing required for a backup node to operate?

10/21/2020

Kubernetes Configurations

  • Kubernetes StatefulSet (replicas=2) for the live-nodes:

    live-node1 ( paired with backup-node1 for HA )

    live-node2 ( paired with backup-node2 for HA )

  • Kubernetes Service for the live-nodes:

    live-node

  • Kubernetes StatefulSet (replicas=2) for the backup-nodes:

    backup-node1

    backup-node2

  • Kubernetes Service for the backup-nodes:

    backup-node

Note: Clients (publishers/consumers) always connect to the cluster via the K8s service live-node


Scenario

  • client1 is connected to live-node1
  • live-node1 goes down
  • backup-node1 takes over
  • client1 will try to reconnect via the K8s service - live-node
  • Either it connects back to live-node1 (if it is back up) or it ends up connecting to live-node2
  • How will the existing messages and new messages be consumed and published in the latter case?

My Understanding

  • All clients of live-node1 will connect to live-node2
  • The existing messages will be redistributed to live-node2 because there is no consumer on backup-node1
  • The new messages will be sent to and consumed from live-node2

Please elaborate on this behavior and correct me if I am wrong.

-- Arshal Jain
activemq-artemis
high-availability
kubernetes

1 Answer

10/22/2020

Strictly speaking, the message-load-balancing type configured on your cluster-connection is completely unrelated to how backups work. The message-load-balancing type, as the name suggests, is related to how messages are load-balanced around a cluster. How the backup behaves is determined by the ha-policy you have configured.
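
As a rough illustration of that separation, the ha-policy sits in its own element of broker.xml, apart from any cluster-connection. The fragments below are a minimal sketch, assuming replication (the question does not say whether replication or shared storage is in use) and the master/slave element names used by Artemis releases from around the time of this answer. The group-name value pair-1 is hypothetical and only serves to pair live-node1 with backup-node1.

```xml
<!-- broker.xml on live-node1 (sketch; assumes replication) -->
<ha-policy>
   <replication>
      <master>
         <!-- hypothetical name; the backup with the same group-name pairs with this live -->
         <group-name>pair-1</group-name>
         <check-for-live-server>true</check-for-live-server>
      </master>
   </replication>
</ha-policy>

<!-- broker.xml on backup-node1 (a separate file; sketch) -->
<ha-policy>
   <replication>
      <slave>
         <!-- must match the group-name of the live node it backs up -->
         <group-name>pair-1</group-name>
         <allow-failback>true</allow-failback>
      </slave>
   </replication>
</ha-policy>
```

Nothing in either fragment mentions message-load-balancing, which is the point: the backup's behavior comes from these elements alone.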

The whole point of having a backup is that when the live node fails all the clients connected to the live node will fail-over to the backup node. Furthermore, the backup node will have all the same messages that the live node had (either via replication or shared storage). Therefore, your expectation that all the clients connected to live-node1 will connect to live-node2 when live-node1 fails is misguided.

That said, if clients do actually connect to live-node2 instead of backup-node1 then the message-load-balancing type would need to be ON_DEMAND if you wanted messages to eventually be redistributed from backup-node1 to live-node2. The redistribution-delay would also need to be 0 or greater, since the default of -1 disables redistribution.
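
For reference, the two settings mentioned here could look roughly like this in broker.xml. This is only a sketch: the cluster-connection name, connector name, and discovery group name are hypothetical placeholders that would have to match definitions elsewhere in the file, and the 5000 ms delay is an arbitrary example value.

```xml
<cluster-connections>
   <!-- "my-cluster", "artemis-connector" and "my-discovery-group" are hypothetical names -->
   <cluster-connection name="my-cluster">
      <connector-ref>artemis-connector</connector-ref>
      <!-- only forward messages to nodes that have matching consumers -->
      <message-load-balancing>ON_DEMAND</message-load-balancing>
      <max-hops>1</max-hops>
      <discovery-group-ref discovery-group-name="my-discovery-group"/>
   </cluster-connection>
</cluster-connections>

<address-settings>
   <!-- "#" matches every address -->
   <address-setting match="#">
      <!-- milliseconds to wait after the last consumer on a queue closes
           before redistributing; the default of -1 disables redistribution -->
      <redistribution-delay>5000</redistribution-delay>
   </address-setting>
</address-settings>
```

Both settings would need to be present on the broker holding the messages (backup-node1 once it has taken over), since that is the node doing the redistributing.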

-- Justin Bertram
Source: StackOverflow