Cannot connect to Artemis from another pod in kubernetes

9/5/2020

I have two pods one is my ArtemisMQ pod and another is a consumer service. I have ingress set up so that I can access the console which works, but the issue comes down to when my consumer pod is trying to access port 61616 for connections.

The error I get from my consumer pod

Could not refresh JMS Connection for destination 'PricingSave' - retrying using FixedBackOff{interval=5000, currentAttempts=0, maxAttempts=unlimited}. Cause: Failed to create session factory; nested exception is ActiveMQConnectionTimedOutException[errorType=CONNECTION_TIMEDOUT message=AMQ219013: Timed out waiting to receive cluster topology. Group:null]

My broker config:

<?xml version='1.0'?>

<configuration xmlns="urn:activemq"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xi="http://www.w3.org/2001/XInclude"
               xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">

   <core xmlns="urn:activemq:core" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="urn:activemq:core ">

      <name>Artemis</name>
      
      <broadcast-groups>
        <broadcast-group name="brokerCluster-broadcast">
                <local-bind-address>0.0.0.0</local-bind-address>
                <local-bind-port>10000</local-bind-port>
                <group-address>231.7.7.7</group-address>
                <group-port>9876</group-port>
                <broadcast-period>20</broadcast-period>
                <connector-ref>netty-connector</connector-ref>
            </broadcast-group>
        </broadcast-groups>

        <discovery-groups>
            <discovery-group name="brokerCluster-discovery">
                <local-bind-port>10000</local-bind-port>
                <local-bind-address>0.0.0.0</local-bind-address>
                <group-address>231.7.7.7</group-address>
                <group-port>9876</group-port>
                <refresh-timeout>10</refresh-timeout>
            </discovery-group>
        </discovery-groups>

      <cluster-connections>
            <cluster-connection name="brokerCluster">
                <connector-ref>netty-connector</connector-ref>
                <retry-interval>500</retry-interval>
                <use-duplicate-detection>true</use-duplicate-detection>
                <message-load-balancing>ON_DEMAND</message-load-balancing>
                <max-hops>1</max-hops>
          <discovery-group-ref discovery-group-name="brokerCluster-discovery"/>
        </cluster-connection>
      </cluster-connections>

      <connectors>
        <connector name="netty-connector">tcp://PODS_IP:61616</connector>
      </connectors>

      <persistence-enabled>true</persistence-enabled>

      <!-- this could be ASYNCIO, MAPPED, NIO
           ASYNCIO: Linux Libaio
           MAPPED: mmap files
           NIO: Plain Java Files
       -->
      <journal-type>ASYNCIO</journal-type>

      <paging-directory>data/paging</paging-directory>

      <bindings-directory>data/bindings</bindings-directory>

      <journal-directory>data/journal</journal-directory>

      <large-messages-directory>data/large-messages</large-messages-directory>

      <journal-datasync>true</journal-datasync>

      <journal-min-files>2</journal-min-files>

      <journal-pool-files>10</journal-pool-files>

      <journal-device-block-size>4096</journal-device-block-size>

      <journal-file-size>10M</journal-file-size>

      <!--
       This value was determined through a calculation.
       Your system could perform 2.84 writes per millisecond
       on the current journal configuration.
       That translates as a sync write every 352000 nanoseconds.

       Note: If you specify 0 the system will perform writes directly to the disk.
             We recommend this to be 0 if you are using journalType=MAPPED and journal-datasync=false.
      -->
      <journal-buffer-timeout>352000</journal-buffer-timeout>


      <!--
        When using ASYNCIO, this will determine the writing queue depth for libaio.
       -->
      <journal-max-io>4096</journal-max-io>
      <!--
        You can verify the network health of a particular NIC by specifying the <network-check-NIC> element.
         <network-check-NIC>theNicName</network-check-NIC>
        -->

      <!--
        Use this to use an HTTP server to validate the network
         <network-check-URL-list>http://www.apache.org</network-check-URL-list> -->

      <!-- <network-check-period>10000</network-check-period> -->
      <!-- <network-check-timeout>1000</network-check-timeout> -->

      <!-- this is a comma separated list, no spaces, just DNS or IPs
           it should accept IPV6

           Warning: Make sure you understand your network topology as this is meant to validate if your network is valid.
                    Using IPs that could eventually disappear or be partially visible may defeat the purpose.
                    You can use a list of multiple IPs, and if any successful ping will make the server OK to continue running -->
      <!-- <network-check-list>10.0.0.1</network-check-list> -->

      <!-- use this to customize the ping used for ipv4 addresses -->
      <!-- <network-check-ping-command>ping -c 1 -t %d %s</network-check-ping-command> -->

      <!-- use this to customize the ping used for ipv6 addresses -->
      <!-- <network-check-ping6-command>ping6 -c 1 %2$s</network-check-ping6-command> -->




      <!-- how often we are looking for how many bytes are being used on the disk in ms -->
      <disk-scan-period>5000</disk-scan-period>

      <!-- once the disk hits this limit the system will block, or close the connection in certain protocols
           that won't support flow control. -->
      <max-disk-usage>90</max-disk-usage>

      <!-- should the broker detect dead locks and other issues -->
      <critical-analyzer>true</critical-analyzer>

      <critical-analyzer-timeout>120000</critical-analyzer-timeout>

      <critical-analyzer-check-period>60000</critical-analyzer-check-period>

      <critical-analyzer-policy>HALT</critical-analyzer-policy>


      <page-sync-timeout>484000</page-sync-timeout>


            <!-- the system will enter into page mode once you hit this limit.
           This is an estimate in bytes of how much the messages are using in memory

            The system will use half of the available memory (-Xmx) by default for the global-max-size.
            You may specify a different value here if you need to customize it to your needs.

            <global-max-size>100Mb</global-max-size>

      -->

      <acceptors>

         <!-- Acceptor for every supported protocol -->
         <acceptor name="artemis">tcp://PODS_IP:61616?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;amqpMinLargeMessageSize=102400;protocols=CORE,AMQP,STOMP,HORNETQ,MQTT,OPENWIRE;useEpoll=true;amqpCredits=1000;amqpLowCredits=300;amqpDuplicateDetection=true</acceptor>

         <!-- AMQP Acceptor.  Listens on default AMQP port for AMQP traffic.-->
         <acceptor name="amqp">tcp://PODS_IP:5672?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=AMQP;useEpoll=true;amqpCredits=1000;amqpLowCredits=300;amqpMinLargeMessageSize=102400;amqpDuplicateDetection=true</acceptor>

         <!-- STOMP Acceptor. -->
         <acceptor name="stomp">tcp://PODS_IP:61613?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=STOMP;useEpoll=true</acceptor>

         <!-- HornetQ Compatibility Acceptor.  Enables HornetQ Core and STOMP for legacy HornetQ clients. -->
         <acceptor name="hornetq">tcp://PODS_IP:5445?anycastPrefix=jms.queue.;multicastPrefix=jms.topic.;protocols=HORNETQ,STOMP;useEpoll=true</acceptor>

         <!-- MQTT Acceptor -->
         <acceptor name="mqtt">tcp://PODS_IP:1883?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;protocols=MQTT;useEpoll=true</acceptor>

      </acceptors>


      <security-settings>
         <security-setting match="#">
            <permission type="createNonDurableQueue" roles="amq"/>
            <permission type="deleteNonDurableQueue" roles="amq"/>
            <permission type="createDurableQueue" roles="amq"/>
            <permission type="deleteDurableQueue" roles="amq"/>
            <permission type="createAddress" roles="amq"/>
            <permission type="deleteAddress" roles="amq"/>
            <permission type="consume" roles="amq"/>
            <permission type="browse" roles="amq"/>
            <permission type="send" roles="amq"/>
            <!-- we need this otherwise ./artemis data imp wouldn't work -->
            <permission type="manage" roles="amq"/>
         </security-setting>
      </security-settings>

      <address-settings>
         <!-- if you define auto-create on certain queues, management has to be auto-create -->
         <address-setting match="activemq.management#">
            <dead-letter-address>DLQ</dead-letter-address>
            <expiry-address>ExpiryQueue</expiry-address>
            <redelivery-delay>0</redelivery-delay>
            <!-- with -1 only the global-max-size is in use for limiting -->
            <max-size-bytes>-1</max-size-bytes>
            <message-counter-history-day-limit>10</message-counter-history-day-limit>
            <address-full-policy>PAGE</address-full-policy>
            <auto-create-queues>true</auto-create-queues>
            <auto-create-addresses>true</auto-create-addresses>
            <auto-create-jms-queues>true</auto-create-jms-queues>
            <auto-create-jms-topics>true</auto-create-jms-topics>
         </address-setting>
         <!--default for catch all-->
         <address-setting match="#">
            <dead-letter-address>dlq</dead-letter-address>
            <expiry-address>ExpiryQueue</expiry-address>
            <redelivery-delay>0</redelivery-delay>
            <!-- with -1 only the global-max-size is in use for limiting -->
            <max-size-bytes>-1</max-size-bytes>
            <message-counter-history-day-limit>10</message-counter-history-day-limit>
            <address-full-policy>PAGE</address-full-policy>
            <auto-create-queues>true</auto-create-queues>
            <auto-create-addresses>true</auto-create-addresses>
            <auto-create-jms-queues>true</auto-create-jms-queues>
            <auto-create-jms-topics>true</auto-create-jms-topics>
         </address-setting>
      </address-settings>

      <addresses>
         <address name="PricingSave.dlq">
            <anycast>
               <queue name="PricingSave.dlq"/>
            </anycast>
         </address>
         <address name="PricingSave">
            <anycast>
               <queue name="PricingSave"/>
            </anycast>
         </address>

         <address name="ExpiryQueue">
            <anycast>
               <queue name="ExpiryQueue" />
            </anycast>
         </address>

      </addresses>

      <!-- Uncomment the following if you want to use the Standard LoggingActiveMQServerPlugin pluging to log in events
      <broker-plugins>
         <broker-plugin class-name="org.apache.activemq.artemis.core.server.plugin.impl.LoggingActiveMQServerPlugin">
            <property key="LOG_ALL_EVENTS" value="true"/>
            <property key="LOG_CONNECTION_EVENTS" value="true"/>
            <property key="LOG_SESSION_EVENTS" value="true"/>
            <property key="LOG_CONSUMER_EVENTS" value="true"/>
            <property key="LOG_DELIVERING_EVENTS" value="true"/>
            <property key="LOG_SENDING_EVENTS" value="true"/>
            <property key="LOG_INTERNAL_EVENTS" value="true"/>
         </broker-plugin>
      </broker-plugins>
      -->

   </core>
</configuration>

My Artemis Service config:

kind: Service
apiVersion: v1
metadata:
  labels:
    name: amq
  name: amq
  namespace: default
spec:
  selector:
    app: amq
  ports:
    - name: web
      port: 8161
      protocol: TCP
      targetPort: 8161
    - name: stomp
      port: 61613
      protocol: TCP
      targetPort: 61613
    - name: mqtt
      port: 1883
      protocol: TCP
      targetPort: 1883
    - name: openwire
      port: 61616
      protocol: TCP
      targetPort: 61616
    - name: jmx
      port: 9404
      protocol: TCP
      targetPort: 9404
    - name: hornetq
      port: 5445
      protocol: TCP
      targetPort: 5445
    - name: amqp
      port: 5672
      protocol: TCP
      targetPort: 5672
  sessionAffinity: None
  type: ClusterIP

My Artemis deployment Config:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    name: amq
  name: amq
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: amq
  strategy:
    type: Recreate
  template:
    metadata:
      annotations:
        linkerd.io/inject: enabled
      labels:
        app: amq
    spec:
      containers:
      - image: REPOSITORY/IMAGE
        name: amq
        env:
          - name: KUBERNETES_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          - name: CONTAINER_CORE_LIMIT
            valueFrom:
              resourceFieldRef:
                resource: limits.cpu
          - name: CONTAINER_MAX_MEMORY
            valueFrom:
              resourceFieldRef:
                resource: limits.memory
        ports:
        - containerPort: 8161
          name: web
          protocol: TCP
        - containerPort: 61613
          name: stomp
          protocol: TCP
        - containerPort: 1883
          name: mqtt
          protocol: TCP
        - containerPort: 61616
          name: openwire
          protocol: TCP
        - containerPort: 9404
          name: jmx
          protocol: TCP
        - containerPort: 5445
          name: hornetq
          protocol: TCP
        - containerPort: 5672
          name: amqp
          protocol: TCP
        imagePullPolicy: Always
        resources:
          requests:
            cpu: 200m
            memory: 512Mi
          limits:
            cpu: 400m
            memory: 1012Mi
        securityContext:
          capabilities:
            add:
              - NET_ADMIN
              - NET_RAW
      restartPolicy: Always
      imagePullSecrets:
        - name: my-registry...
-- Andy
activemq-artemis
jms
kubernetes

2 Answers

9/20/2020

After some more testing locally on minikube which worked, I rebuilt my cluster and redeployed. I decided not to deploy nor reconfigure linkerd on this new cluster and everything is now working fine.

So the issue I've had must have been done to some proxy settings to do with linkerd. I'll follow this post up when I find a solution to get connections through linkerd.

-- Andy
Source: StackOverflow

9/5/2020

Using 0.0.0.0 for a connector is not valid. In fact, there should be a WARN level message in the log about it. Your cluster will not form properly with such a configuration. You need to use a hostname or IP address that other hosts can use to reach the broker that broadcasts it. That's the whole point of the broadcast-group - to tell other brokers who may be listening how they can connect to it.

-- Justin Bertram
Source: StackOverflow