Hazelcast cluster: Couldn't discover Hazelcast members using Kubernetes API

11/17/2020

I followed this Hazelcast guide to create a cluster on Kubernetes: https://guides.hazelcast.org/kubernetes-external-client/

First I created a Hazelcast StatefulSet that runs 3 pods. Then I created a LoadBalancer-type Service for each pod, as described in the guide above.
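
For illustration, a per-pod Service of that kind might look like the sketch below. The Service name hazelcast-0 and the namespace are assumptions: the name follows from the StatefulSet name, and the namespace is taken from the discovery log. The selector uses the statefulset.kubernetes.io/pod-name label that Kubernetes sets on StatefulSet pods. This is not a copy of the manifests actually applied.

```yaml
# Sketch only: one LoadBalancer Service per StatefulSet pod.
# Repeat for hazelcast-1 and hazelcast-2 with the matching pod name.
apiVersion: v1
kind: Service
metadata:
  name: hazelcast-0                      # assumed name, mirrors the pod name
  namespace: neo-search-service          # assumed, taken from the discovery log
spec:
  type: LoadBalancer
  selector:
    # Label added automatically by the StatefulSet controller to each pod
    statefulset.kubernetes.io/pod-name: hazelcast-0
  ports:
  - name: hazelcast
    port: 5701
    targetPort: 5701
```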

2020-11-17 17:43:15,784 [ INFO] [main] [c.h.i.c.AbstractConfigLocator]: Loading configuration '/data/hazelcast/hazelcast.xml' from System property 'hazelcast.config'
2020-11-17 17:43:15,789 [ INFO] [main] [c.h.i.c.AbstractConfigLocator]: Using configuration file at /data/hazelcast/hazelcast.xml
2020-11-17 17:43:16,266 [ INFO] [main] [c.h.system]: [172.18.10.169]:5701 [dev] [4.1] Hazelcast 4.1 (20201104 - 2a1a477) starting at [172.18.10.169]:5701
2020-11-17 17:43:16,609 [ INFO] [main] [c.h.s.d.i.DiscoveryService]: [172.18.10.169]:5701 [dev] [4.1] Kubernetes Discovery properties: { service-dns: null, service-dns-timeout: 5, service-name: hazelcast-service, service-port: 5701, service-label: null, service-label-value: true, namespace: neo-search-service, pod-label: null, pod-label-value: null, resolve-not-ready-addresses: true, use-node-name-as-external-address: false, kubernetes-api-retries: 10, kubernetes-master: https://kubernetes.default.svc}
2020-11-17 17:43:16,612 [ INFO] [main] [c.h.s.d.i.DiscoveryService]: [172.18.10.169]:5701 [dev] [4.1] Kubernetes Discovery activated with mode: KUBERNETES_API
2020-11-17 17:43:16,677 [ INFO] [main] [c.h.i.i.Node]: [172.18.10.169]:5701 [dev] [4.1] Using Discovery SPI
2020-11-17 17:43:16,681 [ INFO] [main] [c.h.c.CPSubsystem]: [172.18.10.169]:5701 [dev] [4.1] CP Subsystem is enabled with 3 members.
2020-11-17 17:43:17,001 [ INFO] [main] [c.h.i.d.Diagnostics]: [172.18.10.169]:5701 [dev] [4.1] Diagnostics disabled. To enable add -Dhazelcast.diagnostics.enabled=true to the JVM arguments.
2020-11-17 17:43:17,016 [ INFO] [main] [c.h.c.LifecycleService]: [172.18.10.169]:5701 [dev] [4.1] [172.18.10.169]:5701 is STARTING
2020-11-17 17:43:17,395 [ INFO] [main] [c.h.s.d.i.DiscoveryService]: [172.18.10.169]:5701 [dev] [4.1] Kubernetes plugin discovered availability zone: 4
2020-11-17 17:43:17,440 [ WARN] [main] [c.h.k.RetryUtils]: Couldn't discover Hazelcast members using Kubernetes API, [1] retrying in 1 seconds...
2020-11-17 17:43:18,983 [ WARN] [main] [c.h.k.RetryUtils]: Couldn't discover Hazelcast members using Kubernetes API, [2] retrying in 2 seconds...
2020-11-17 17:43:21,272 [ WARN] [main] [c.h.k.RetryUtils]: Couldn't discover Hazelcast members using Kubernetes API, [3] retrying in 3 seconds...
2020-11-17 17:43:21,658 [ INFO] [hz.boring_sanderson.IO.thread-in-0] [c.h.i.s.t.TcpServerConnection]: [172.18.10.169]:5701 [dev] [4.1] Connection[id=1, /127.0.0.1:5701->/127.0.0.1:47196, qualifier=null, endpoint=null, alive=false, connectionType=NONE, planeIndex=-1] closed. Reason: Connection closed by the other side
2020-11-17 17:43:23,198 [ INFO] [hz.boring_sanderson.IO.thread-in-1] [c.h.i.s.t.TcpServerConnection]: [172.18.10.169]:5701 [dev] [4.1] Connection[id=2, /127.0.0.1:5701->/127.0.0.1:47336, qualifier=null, endpoint=null, alive=false, connectionType=NONE, planeIndex=-1] closed. Reason: Connection closed by the other side
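
For context on the warnings above: in KUBERNETES_API mode the member queries the Kubernetes API from inside the pod, so the pod's service account needs read access to the resources the plugin looks up. The sketch below follows the shape of the RBAC example in the Hazelcast Kubernetes plugin documentation. The role and binding names are hypothetical, and the service account and namespace are assumed from the StatefulSet and the discovery log. The log does not confirm that a missing grant is the cause here.

```yaml
# Sketch only: read access for the service account used by the Hazelcast pods.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: hazelcast-cluster-role           # hypothetical name
rules:
- apiGroups: [""]
  resources: ["endpoints", "pods", "nodes", "services"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: hazelcast-cluster-role-binding   # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: hazelcast-cluster-role
subjects:
- kind: ServiceAccount
  name: neo-search-service               # serviceAccountName from the StatefulSet
  namespace: neo-search-service          # assumed, taken from the discovery log
```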

This is my hazelcast.cluster.yaml file.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: hazelcast
  labels:
    app: hazelcast
  annotations:
    service-per-pod-label: "statefulset.kubernetes.io/pod-name"
    service-per-pod-ports: "5701:5701"
spec:
  replicas: 3
  serviceName: hazelcast-service
  selector:
    matchLabels:
      app: hazelcast
  template:
    metadata:
      labels:
        app: hazelcast
    spec:
      serviceAccountName: neo-search-service
      containers:
      - name: hazelcast
        image: hazelcast/hazelcast:4.1
        ports:
        - name: hazelcast
          containerPort: 5701
        volumeMounts:
        - name: hazelcast-storage
          mountPath: /data/hazelcast
        env:
        - name: JAVA_OPTS
          value: "-Dhazelcast.rest.enabled=true -Dhazelcast.config=/data/hazelcast/hazelcast.xml"
      volumes:
      - name: hazelcast-storage
        configMap:
          name: hazelcast-configuration

---

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: hazelcast
  namespace: neo-search-service
spec:
  maxUnavailable: 0
  selector:
    matchLabels:
      app: hazelcast


---

apiVersion: v1
kind: ConfigMap
metadata:
  name: hazelcast-configuration
data:
   hazelcast.xml: |-
     <?xml version="1.0" encoding="UTF-8"?>
         <hazelcast xmlns="http://www.hazelcast.com/schema/config"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:schemaLocation="http://www.hazelcast.com/schema/config
                http://www.hazelcast.com/schema/config/hazelcast-config-4.1.xsd">
           <network>
             <rest-api enabled="true"></rest-api>
             <join>
               <!-- deactivate normal discovery -->
               <multicast enabled="false"/>
               <tcp-ip enabled="false" />
               <!-- activate the Kubernetes plugin -->
               <kubernetes enabled="true">
                 <service-name>hazelcast-service</service-name>
                 <service-port>5701</service-port>
                 <namespace>neo-search-service</namespace>
                 <kubernetes-api-retries>10</kubernetes-api-retries>
               </kubernetes>
             </join>
           </network>
           <user-code-deployment enabled="true">
             <class-cache-mode>ETERNAL</class-cache-mode>
             <provider-mode>LOCAL_AND_CACHED_CLASSES</provider-mode>
           </user-code-deployment>
           <reliable-topic name="ConfirmationTimeout">
             <read-batch-size>10</read-batch-size>
             <topic-overload-policy>DISCARD_OLDEST</topic-overload-policy>
             <statistics-enabled>true</statistics-enabled>
           </reliable-topic>
           <ringbuffer name="ConfirmationTimeout">
             <capacity>10000</capacity>
             <backup-count>1</backup-count>
             <async-backup-count>0</async-backup-count>
             <time-to-live-seconds>0</time-to-live-seconds>
             <in-memory-format>BINARY</in-memory-format>
             <merge-policy batch-size="100">com.hazelcast.spi.merge.PutIfAbsentMergePolicy</merge-policy>
           </ringbuffer>
           <scheduled-executor-service name="ConfirmationTimeout">
             <capacity>100</capacity>
             <capacity-policy>PER_NODE</capacity-policy>
             <pool-size>32</pool-size>
             <durability>3</durability>
             <merge-policy batch-size="100">com.hazelcast.spi.merge.PutIfAbsentMergePolicy</merge-policy>
           </scheduled-executor-service>
           <cp-subsystem>
             <cp-member-count>3</cp-member-count>
             <group-size>3</group-size>
             <session-time-to-live-seconds>300</session-time-to-live-seconds>
             <session-heartbeat-interval-seconds>5</session-heartbeat-interval-seconds>
             <missing-cp-member-auto-removal-seconds>14400</missing-cp-member-auto-removal-seconds>
             <fail-on-indeterminate-operation-state>false</fail-on-indeterminate-operation-state>
             <raft-algorithm>
                 <leader-election-timeout-in-millis>15000</leader-election-timeout-in-millis>
                 <leader-heartbeat-period-in-millis>5000</leader-heartbeat-period-in-millis>
                 <max-missed-leader-heartbeat-count>10</max-missed-leader-heartbeat-count>
                 <append-request-max-entry-count>100</append-request-max-entry-count>
                 <commit-index-advance-count-to-snapshot>10000</commit-index-advance-count-to-snapshot>
                 <uncommitted-entry-count-to-reject-new-appends>100</uncommitted-entry-count-to-reject-new-appends>
                 <append-request-backoff-timeout-in-millis>100</append-request-backoff-timeout-in-millis>
             </raft-algorithm>
             <locks>
                 <fenced-lock>
                     <name>TimeoutLock</name>
                     <lock-acquire-limit>1</lock-acquire-limit>
                 </fenced-lock>
             </locks>
           </cp-subsystem>
           <metrics enabled="true">
             <management-center>
                 <retention-seconds>30</retention-seconds>
             </management-center>
             <jmx enabled="false"/>
             <collection-frequency-seconds>10</collection-frequency-seconds>
           </metrics>
        </hazelcast>
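
Both the StatefulSet (serviceName) and the discovery configuration (service-name) refer to a Service called hazelcast-service, which is not among the manifests shown here. If it does not already exist in the cluster, a minimal sketch could look like the following. Making it headless (clusterIP: None) and placing it in the neo-search-service namespace are assumptions, not something the post confirms.

```yaml
# Sketch only: the Service that <service-name>hazelcast-service</service-name>
# and the StatefulSet's serviceName both point at.
apiVersion: v1
kind: Service
metadata:
  name: hazelcast-service
  namespace: neo-search-service          # assumed, matches <namespace> in hazelcast.xml
spec:
  clusterIP: None                        # headless; assumption, usual for a StatefulSet governing Service
  selector:
    app: hazelcast                       # pod label from the StatefulSet template
  ports:
  - name: hazelcast
    port: 5701
```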

Please help me resolve this.

-- Aayushi Khandelwal
hazelcast
java
kubernetes

0 Answers