Hazelcast cluster unstable on Kubernetes

9/28/2018

I'm trying to run Hazelcast on a Kubernetes cluster (provisioned with Kubespray's default configuration). The cluster assembles successfully but is unstable: members leave and rejoin for no apparent reason. Under load it happens almost immediately; otherwise it takes seconds, sometimes minutes.
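As the logs below show, members get suspected after 60 seconds without heartbeats. If this is just a matter of tuning, my understanding is that the relevant knobs look like this (a sketch with arbitrary values, not something I have verified to help):

<properties>
    <!-- how often members send heartbeats to each other; default is 5 seconds -->
    <property name="hazelcast.heartbeat.interval.seconds">5</property>
    <!-- how long a member may stay silent before it is suspected; default is 60 seconds -->
    <property name="hazelcast.max.no.heartbeat.seconds">300</property>
</properties>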

Here are the logs:

WARNING: [10.233.90.5]:5701 [kpts-cluster] [3.10.4] Member [10.233.71.21]:5701 - 5585d841-f30f-44e5-8290-4f777a9f9a5e is suspected to be dead for reason: Suspecting Member [10.233.71.21]:5701 - 5585d841-f30f-44e5-8290-4f777a9f9a5e because it has not sent any heartbeats since 2018-09-28 18:16:55.957. Now: 2018-09-28 18:17:57.413, heartbeat timeout: 60000 ms, suspicion level: 1.00

Sep 28, 2018 6:17:57 PM com.hazelcast.nio.tcp.TcpIpConnection
INFO: [10.233.90.5]:5701 [kpts-cluster] [3.10.4] Connection[id=4, /10.233.90.5:5701->/10.233.71.21:41018, endpoint=[10.233.71.21]:5701, alive=false, type=MEMBER] closed. Reason: Suspecting Member [10.233.71.21]:5701 - 5585d841-f30f-44e5-8290-4f777a9f9a5e because it has not sent any heartbeats since 2018-09-28 18:16:55.957. Now: 2018-09-28 18:17:57.413, heartbeat timeout: 60000 ms, suspicion level: 1.00

Sep 28, 2018 6:17:59 PM com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager
WARNING: [10.233.90.5]:5701 [kpts-cluster] [3.10.4] This node does not have a connection to Member [10.233.71.21]:5701 - 5585d841-f30f-44e5-8290-4f777a9f9a5e

Sep 28, 2018 6:17:59 PM com.hazelcast.internal.cluster.ClusterService
INFO: [10.233.90.5]:5701 [kpts-cluster] [3.10.4]
Members {size:4, ver:6} [
    Member [10.233.97.132]:5701 - edec2d4b-2038-4d4e-a07a-d949c5eddb73
    Member [10.233.75.68]:5701 - df4eefa7-5829-4da6-9cf5-0efcfe7aa1e7
    Member [10.233.90.5]:5701 - a87ec39a-9df9-45b2-8be3-5a01d9c3e5a7 this
    Member [10.233.71.21]:5701 - 5585d841-f30f-44e5-8290-4f777a9f9a5e
]

Sep 28, 2018 6:17:59 PM com.hazelcast.nio.tcp.TcpIpConnector
INFO: [10.233.90.5]:5701 [kpts-cluster] [3.10.4] Connecting to /10.233.71.21:5701, timeout: 0, bind-any: true

Sep 28, 2018 6:17:59 PM com.hazelcast.transaction.TransactionManagerService
INFO: [10.233.90.5]:5701 [kpts-cluster] [3.10.4] Committing/rolling-back live transactions of [10.233.100.131]:5701, UUID: d386005a-40fc-4d2d-aeb9-f5f58216e55b

Sep 28, 2018 6:17:59 PM com.hazelcast.nio.tcp.TcpIpConnectionManager
INFO: [10.233.90.5]:5701 [kpts-cluster] [3.10.4] Established socket connection between /10.233.90.5:41613 and /10.233.71.21:5701

Sep 28, 2018 6:17:59 PM com.hazelcast.nio.tcp.TcpIpConnection
INFO: [10.233.90.5]:5701 [kpts-cluster] [3.10.4] Connection[id=5, /10.233.90.5:41613->/10.233.71.21:5701, endpoint=[10.233.71.21]:5701, alive=false, type=MEMBER] closed. Reason: Member left event received from master

Sep 28, 2018 6:17:59 PM com.hazelcast.transaction.TransactionManagerService
INFO: [10.233.90.5]:5701 [kpts-cluster] [3.10.4] Committing/rolling-back live transactions of [10.233.71.21]:5701, UUID: 5585d841-f30f-44e5-8290-4f777a9f9a5e

Sep 28, 2018 6:17:59 PM com.hazelcast.internal.cluster.ClusterService
INFO: [10.233.90.5]:5701 [kpts-cluster] [3.10.4]
Members {size:3, ver:7} [
    Member [10.233.97.132]:5701 - edec2d4b-2038-4d4e-a07a-d949c5eddb73
    Member [10.233.75.68]:5701 - df4eefa7-5829-4da6-9cf5-0efcfe7aa1e7
    Member [10.233.90.5]:5701 - a87ec39a-9df9-45b2-8be3-5a01d9c3e5a7 this
]

Sep 28, 2018 6:18:00 PM com.hazelcast.nio.tcp.TcpIpAcceptor
INFO: [10.233.90.5]:5701 [kpts-cluster] [3.10.4] Accepting socket connection from /10.233.71.21:45044

Sep 28, 2018 6:18:00 PM com.hazelcast.nio.tcp.TcpIpConnectionManager
INFO: [10.233.90.5]:5701 [kpts-cluster] [3.10.4] Established socket connection between /10.233.90.5:5701 and /10.233.71.21:45044

(Logs from 9/28/18 6:16 PM to 9/28/18 6:18 PM UTC.)

Here is the resource definition:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: iga
spec:
  selector:
    matchLabels:
      app: iga-worker
  replicas: 4
  template:
    metadata:
      labels:
        app: iga-worker
    spec:
      containers:
        - name: iga-worker
          image: "kbhit/iga-adi-cl-worker:latest"
          imagePullPolicy: Always
          ports:
            - containerPort: 5701
          env:
            - name: JAVA_OPTS
              value: "
                -XX:+UnlockExperimentalVMOptions
                -XX:+UseCGroupMemoryLimitForHeap
                -XX:MaxRAMFraction=4
                -XshowSettings:vm
                -Dk8s.service.name=iga-adi-cl-workers
                -Dk8s.namespace=iga-adi-cl
                -Dmancenter.url=http://iga-management-center.iga-adi-cl:8080/hazelcast-mancenter
                -Dhazelcast.diagnostics.enabled=true
                -Dhazelcast.diagnostics.metric.level=info
                -Dhazelcast.diagnostics.invocation.sample.period.seconds=30
                -Dhazelcast.diagnostics.pending.invocations.period.seconds=30
                -Dhazelcast.diagnostics.slowoperations.period.seconds=30
                -Dhazelcast.log.state=true
              "
          resources:
            limits:
              cpu: 3
              memory: 5Gi
            requests:
              cpu: 3
              memory: 5Gi

---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: iga-management-center
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: iga-management-center
    spec:
      containers:
      - name: hazelcast
        image: hazelcast/management-center

---
kind: Service
apiVersion: v1
metadata:
  name: iga-adi-cl-workers
spec:
  selector:
    app: iga-worker
  ports:
  - protocol: TCP
    port: 5701
    targetPort: 5701

---
kind: Service
apiVersion: v1
metadata:
  name: iga-management-center
spec:
  type: NodePort
  selector:
    app: iga-management-center
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
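A note on sizing: with -XX:+UseCGroupMemoryLimitForHeap and -XX:MaxRAMFraction=4, the 5Gi container limit should translate to a max heap of roughly 1.25 GiB, so the JVM stays well under the memory limit. The workers currently run without probes; in case it matters, this is a sketch of liveness/readiness probes against Hazelcast's HTTP health check (it requires starting members with -Dhazelcast.http.healthcheck.enabled=true; I have not applied this yet):

          livenessProbe:
            httpGet:
              path: /hazelcast/health
              port: 5701
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /hazelcast/health
              port: 5701
            initialDelaySeconds: 30
            periodSeconds: 10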

And here is the complete Hazelcast config I'm using:

<?xml version="1.0" encoding="UTF-8"?>
<hazelcast xsi:schemaLocation="http://www.hazelcast.com/schema/config hazelcast-config-3.11.xsd"
           xmlns="http://www.hazelcast.com/schema/config"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">


    <properties>
        <property name="hazelcast.discovery.enabled">true</property>
        <property name="service-name">kpts-worker</property>
        <property name="hazelcast.partition.count">271</property>
        <property name="hazelcast.diagnostics.enabled">true</property>
    </properties>

    <group>
        <name>kpts-cluster</name>
        <password>kpts-cluster-pass</password>
    </group>

    <management-center enabled="true">${mancenter.url}</management-center>

    <network>
        <join>
            <!-- deactivate normal discovery -->
            <multicast enabled="false"/>
            <tcp-ip enabled="false" />

            <!-- activate the Kubernetes plugin -->
            <discovery-strategies>
                <discovery-strategy enabled="true"
                                    class="com.hazelcast.kubernetes.HazelcastKubernetesDiscoveryStrategy">

                    <properties>
                        <property name="service-name">${k8s.service.name}</property>
                        <!--<property name="service-label-name">cluster01</property>-->
                        <!--<property name="service-label-value">true</property>-->
                        <property name="namespace">${k8s.namespace}</property>
                    </properties>
                </discovery-strategy>
            </discovery-strategies>
        </join>
    </network>

    <partition-group enabled="false"/>

    <executor-service name="default">
        <pool-size>4</pool-size>
        <queue-capacity>0</queue-capacity>
    </executor-service>

    <map name="commons">

        <in-memory-format>BINARY</in-memory-format>

        <backup-count>0</backup-count>
        <async-backup-count>0</async-backup-count>

        <near-cache>

            <in-memory-format>OBJECT</in-memory-format>

            <!--
              Maximum size of the near cache. When max size is reached,
              cache is evicted based on the policy defined.
              Any integer between 0 and Integer.MAX_VALUE. 0 means
              Integer.MAX_VALUE. Default is 0.
            -->
            <max-size>0</max-size>

            <!--
              Maximum number of seconds for each entry to stay in the near cache. Entries that are
              older than <time-to-live-seconds> will get automatically evicted from the near cache.
              Any integer between 0 and Integer.MAX_VALUE. 0 means infinite. Default is 0.
            -->
            <time-to-live-seconds>0</time-to-live-seconds>

            <!--
              Maximum number of seconds each entry can stay in the near cache as untouched (not-read).
              Entries that are not read (touched) more than <max-idle-seconds> value will get removed
              from the near cache.
              Any integer between 0 and Integer.MAX_VALUE. 0 means
              Integer.MAX_VALUE. Default is 0.
            -->
            <max-idle-seconds>0</max-idle-seconds>

            <!--
              Valid values are:
              NONE (no extra eviction, <time-to-live-seconds> may still apply),
              LRU  (Least Recently Used),
              LFU  (Least Frequently Used).
              NONE is the default.
              Regardless of the eviction policy used, <time-to-live-seconds> will still apply.
            -->
            <eviction-policy>NONE</eviction-policy>

            <!--
              Should the cached entries get evicted if the entries are changed (updated or removed).
true or false. Default is true.
            -->
            <invalidate-on-change>true</invalidate-on-change>

            <!--
              You may want also local entries to be cached.
              This is useful when in memory format for near cache is different than the map's one.
              By default it is disabled.
            -->
            <cache-local-entries>true</cache-local-entries>
        </near-cache>
    </map>


    <map name="vertices">
        <!--
Data type that will be used for storing records.
           Possible values:
           BINARY (default): keys and values will be stored as binary data
           OBJECT : values will be stored in their object forms
           NATIVE : values will be stored in non-heap region of JVM
        -->
        <in-memory-format>BINARY</in-memory-format>

        <!--
            Number of backups. If 1 is set as the backup-count for example,
            then all entries of the map will be copied to another JVM for
            fail-safety. 0 means no backup.
        -->
        <backup-count>0</backup-count>
        <!--
            Number of async backups. 0 means no backup.
        -->
        <async-backup-count>0</async-backup-count>
        <!--
            Maximum number of seconds for each entry to stay in the map. Entries that are
            older than <time-to-live-seconds> and not updated for <time-to-live-seconds>
            will get automatically evicted from the map.
            Any integer between 0 and Integer.MAX_VALUE. 0 means infinite. Default is 0.
        -->
        <time-to-live-seconds>0</time-to-live-seconds>
        <!--
            Maximum number of seconds for each entry to stay idle in the map. Entries that are
            idle(not touched) for more than <max-idle-seconds> will get
            automatically evicted from the map. Entry is touched if get, put or containsKey is called.
            Any integer between 0 and Integer.MAX_VALUE. 0 means infinite. Default is 0.
        -->
        <max-idle-seconds>0</max-idle-seconds>
        <!--
            Valid values are:
            NONE (no eviction),
            LRU (Least Recently Used),
            LFU (Least Frequently Used).
            NONE is the default.
        -->
        <eviction-policy>NONE</eviction-policy>
        <!--
            Maximum size of the map. When max size is reached,
            map is evicted based on the policy defined.
            Any integer between 0 and Integer.MAX_VALUE. 0 means
            Integer.MAX_VALUE. Default is 0.
        -->
        <max-size policy="PER_NODE">0</max-size>
        <!--
            While recovering from split-brain (network partitioning),
            map entries in the small cluster will merge into the bigger cluster
        based on the policy set here. When an entry merges into the
        cluster, there might already be an existing entry with the same key.
        Values of these entries might be different for the same key.
        Which value should be set for the key? The conflict is resolved by
        the policy set here. The default policy is PutIfAbsentMapMergePolicy.

            There are built-in merge policies such as
            com.hazelcast.map.merge.PassThroughMergePolicy; entry will be overwritten if merging entry exists for the key.
            com.hazelcast.map.merge.PutIfAbsentMapMergePolicy ; entry will be added if the merging entry doesn't exist in the cluster.
            com.hazelcast.map.merge.HigherHitsMapMergePolicy ; entry with the higher hits wins.
            com.hazelcast.map.merge.LatestUpdateMapMergePolicy ; entry with the latest update wins.
        -->
        <merge-policy>com.hazelcast.map.merge.PutIfAbsentMapMergePolicy</merge-policy>

        <!--
       Control caching of de-serialized values. Caching makes query evaluation faster, but it costs memory.
           Possible Values:
                        NEVER: Never cache deserialized object
                        INDEX-ONLY: Caches values only when they are inserted into an index.
                        ALWAYS: Always cache deserialized values.
        -->
        <cache-deserialized-values>NEVER</cache-deserialized-values>

    </map>

    <serialization>
        <data-serializable-factories>
            <data-serializable-factory factory-id="1">
                com.agh.iet.komplastech.solver.factories.HazelcastProductionFactory
            </data-serializable-factory>
            <data-serializable-factory factory-id="2">
                com.agh.iet.komplastech.solver.factories.HazelcastGeneralFactory
            </data-serializable-factory>
            <data-serializable-factory factory-id="3">
                com.agh.iet.komplastech.solver.factories.HazelcastProblemFactory
            </data-serializable-factory>
        </data-serializable-factories>
    </serialization>

    <services enable-defaults="true"/>

    <lite-member enabled="false"/>

</hazelcast>
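The three factories registered in the serialization section live in the application jar and aren't shown here. For context, each of them implements Hazelcast's DataSerializableFactory along these lines (a hypothetical sketch; the real type ids and classes differ):

import com.hazelcast.nio.ObjectDataInput;
import com.hazelcast.nio.ObjectDataOutput;
import com.hazelcast.nio.serialization.DataSerializableFactory;
import com.hazelcast.nio.serialization.IdentifiedDataSerializable;

import java.io.IOException;

// Hypothetical example, mirroring factory-id="2" from the config above.
public class ExampleFactory implements DataSerializableFactory {

    public static final int FACTORY_ID = 2;
    public static final int VERTEX_TYPE_ID = 1; // made-up type id

    @Override
    public IdentifiedDataSerializable create(int typeId) {
        // Hazelcast asks the factory for an empty instance,
        // then populates it via readData(...)
        return typeId == VERTEX_TYPE_ID ? new Vertex() : null;
    }

    // Minimal made-up entry type, e.g. for the "vertices" map.
    public static class Vertex implements IdentifiedDataSerializable {
        private long id;

        @Override
        public int getFactoryId() {
            return FACTORY_ID;
        }

        @Override
        public int getId() {
            return VERTEX_TYPE_ID;
        }

        @Override
        public void writeData(ObjectDataOutput out) throws IOException {
            out.writeLong(id);
        }

        @Override
        public void readData(ObjectDataInput in) throws IOException {
            id = in.readLong();
        }
    }
}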

Client logs:

WARNING: hz.client_0 [kpts-cluster] [3.10.4] Error while fetching cluster partition table!
java.util.concurrent.ExecutionException: com.hazelcast.spi.exception.TargetDisconnectedException: Heartbeat timed out to owner connection ClientConnection{alive=true, connectionId=139, channel=NioChannel{/10.233.74.77:45202->/10.233.100.138:5701}, remoteEndpoint=[10.233.100.138]:5701, lastReadTime=2018-09-30 11:59:50.747, lastWriteTime=2018-09-30 12:00:48.140, closedTime=never, lastHeartbeatRequested=2018-09-30 12:00:47.934, lastHeartbeatReceived=2018-09-30 11:59:47.936, connected server version=3.10.4}
    at com.hazelcast.client.spi.impl.ClientInvocationFuture.resolve(ClientInvocationFuture.java:73)
    at com.hazelcast.spi.impl.AbstractInvocationFuture$1.run(AbstractInvocationFuture.java:250)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:64)
    at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:80)
Caused by: com.hazelcast.spi.exception.TargetDisconnectedException: Heartbeat timed out to owner connection ClientConnection{alive=true, connectionId=139, channel=NioChannel{/10.233.74.77:45202->/10.233.100.138:5701}, remoteEndpoint=[10.233.100.138]:5701, lastReadTime=2018-09-30 11:59:50.747, lastWriteTime=2018-09-30 12:00:48.140, closedTime=never, lastHeartbeatRequested=2018-09-30 12:00:47.934, lastHeartbeatReceived=2018-09-30 11:59:47.936, connected server version=3.10.4}
    at com.hazelcast.client.spi.impl.AbstractClientInvocationService$CleanResourcesTask.notifyException(AbstractClientInvocationService.java:224)
    at com.hazelcast.client.spi.impl.AbstractClientInvocationService$CleanResourcesTask.run(AbstractClientInvocationService.java:213)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    ... 5 more
Caused by: com.hazelcast.spi.exception.TargetDisconnectedException: Heartbeat timed out to owner connection ClientConnection{alive=true, connectionId=139, channel=NioChannel{/10.233.74.77:45202->/10.233.100.138:5701}, remoteEndpoint=[10.233.100.138]:5701, lastReadTime=2018-09-30 11:59:50.747, lastWriteTime=2018-09-30 12:00:48.140, closedTime=never, lastHeartbeatRequested=2018-09-30 12:00:47.934, lastHeartbeatReceived=2018-09-30 11:59:47.936, connected server version=3.10.4}
    at com.hazelcast.client.connection.nio.DefaultClientConnectionStrategy.onHeartbeatStopped(DefaultClientConnectionStrategy.java:117)
    at com.hazelcast.client.connection.nio.ClientConnectionManagerImpl.heartbeatStopped(ClientConnectionManagerImpl.java:730)
    at com.hazelcast.client.connection.nio.HeartbeatManager.fireHeartbeatStopped(HeartbeatManager.java:139)
    at com.hazelcast.client.connection.nio.HeartbeatManager.checkConnection(HeartbeatManager.java:98)
    at com.hazelcast.client.connection.nio.HeartbeatManager.run(HeartbeatManager.java:85)
    ... 9 more
 Sep 30, 2018 12:00:53 PM com.hazelcast.client.spi.ClientPartitionService
WARNING: hz.client_0 [kpts-cluster] [3.10.4] Error while fetching cluster partition table!
java.util.concurrent.ExecutionException: com.hazelcast.spi.exception.TargetDisconnectedException: Heartbeat timed out to owner connection ClientConnection{alive=true, connectionId=139, channel=NioChannel{/10.233.74.77:45202->/10.233.100.138:5701}, remoteEndpoint=[10.233.100.138]:5701, lastReadTime=2018-09-30 11:59:50.747, lastWriteTime=2018-09-30 12:00:48.140, closedTime=never, lastHeartbeatRequested=2018-09-30 12:00:47.934, lastHeartbeatReceived=2018-09-30 11:59:47.936, connected server version=3.10.4}
    at com.hazelcast.client.spi.impl.ClientInvocationFuture.resolve(ClientInvocationFuture.java:73)
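For what it's worth, the client uses the same 60-second heartbeat timeout by default; it can be raised independently of the members, e.g. (sketch, arbitrary values):

import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;

public class ClientTuning {

    public static void main(String[] args) {
        ClientConfig config = new ClientConfig();
        config.getGroupConfig().setName("kpts-cluster").setPassword("kpts-cluster-pass");
        // How long the client tolerates missing heartbeats on its owner connection
        // before it considers the member disconnected (default 60000 ms).
        config.setProperty("hazelcast.client.heartbeat.timeout", "120000");
        // How often the client pings members (default 5000 ms).
        config.setProperty("hazelcast.client.heartbeat.interval", "5000");
        HazelcastInstance client = HazelcastClient.newHazelcastClient(config);
    }
}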

The nodes can easily accommodate these resource requirements, and there are no other applications deployed on them. What might be the issue here?

-- kboom
hazelcast
kubernetes

0 Answers