Communication in k8s cluster

5/8/2018

I have a k8s cluster of 2 hazelcast instances and one client application. Target is to have many clients and at least 2 hazelcast members. I've set up a LoadBalancer type service in k8s to expose hazelcast instances

apiVersion: v1
kind: Service
metadata:
  name: hazelcast-service
  labels:
    app: hazelcast-service
spec:
  type: LoadBalancer
  ports:
  - port: 10236
    targetPort: 5701
  selector:
    app: hazelcast 

And when it comes for client to start with given config:

clientConfig.getNetworkConfig().addAddress("127.0.0.1:10236");

in recognizes a hazelcast members:

May 08, 2018 11:25:21 AM com.hazelcast.core.LifecycleService
INFO: hz.client_0 [dev] [3.9.3] HazelcastClient 3.9.3 (20180216 - 539b124) is STARTING
May 08, 2018 11:25:22 AM com.hazelcast.core.LifecycleService
INFO: hz.client_0 [dev] [3.9.3] HazelcastClient 3.9.3 (20180216 - 539b124) is STARTED
May 08, 2018 11:25:22 AM com.hazelcast.client.connection.ClientConnectionManager
INFO: hz.client_0 [dev] [3.9.3] Trying to connect to [127.0.0.1]:10236 as owner member
May 08, 2018 11:25:22 AM com.hazelcast.client.connection.ClientConnectionManager
INFO: hz.client_0 [dev] [3.9.3] Authenticated with server [10.1.0.151]:5701, server version:3.10 Local address: /127.0.0.1:60102
May 08, 2018 11:25:22 AM com.hazelcast.client.spi.impl.ClientMembershipListener
INFO: hz.client_0 [dev] [3.9.3]

Members [2] {
    Member [10.1.0.148]:5701 - b0e4a52f-0170-47f2-8ff3-74d9b67f45f5
    Member [10.1.0.151]:5701 - 1355caa4-5c2b-4366-bd5b-b504f4f0ae4f
}

May 08, 2018 11:25:22 AM com.hazelcast.client.connection.ClientConnectionManager
INFO: hz.client_0 [dev] [3.9.3] Setting ClientConnection{alive=true, connectionId=1, channel=NioChannel{/127.0.0.1:60102->/127.0.0.1:10236}, remoteEndpoint=[10.1.0.151]:5701, lastReadTime=2018-05-08 11:25:22.420, lastWriteTime=2018-05-08 11:25:22.418, closedTime=never, lastHeartbeatRequested=never, lastHeartbeatReceived=never, connected server version=3.10} as owner with principal ClientPrincipal{uuid='28696aaf-e678-47ee-8c7d-a79ba7a0079a', ownerUuid='1355caa4-5c2b-4366-bd5b-b504f4f0ae4f'}
May 08, 2018 11:25:22 AM com.hazelcast.core.LifecycleService
INFO: hz.client_0 [dev] [3.9.3] HazelcastClient 3.9.3 (20180216 - 539b124) is CLIENT_CONNECTED
May 08, 2018 11:25:22 AM com.hazelcast.internal.diagnostics.Diagnostics
INFO: hz.client_0 [dev] [3.9.3] Diagnostics disabled. To enable add -Dhazelcast.diagnostics.enabled=true to the JVM arguments.

and when it tries to connect to second instance (10.1.0.151) it also seems to be fine:

May 08, 2018 11:25:29 AM com.hazelcast.core.LifecycleService
INFO: hz.client_1 [dev] [3.9.3] HazelcastClient 3.9.3 (20180216 - 539b124) is STARTING
May 08, 2018 11:25:29 AM com.hazelcast.core.LifecycleService
INFO: hz.client_1 [dev] [3.9.3] HazelcastClient 3.9.3 (20180216 - 539b124) is STARTED
May 08, 2018 11:25:29 AM com.hazelcast.client.connection.ClientConnectionManager
INFO: hz.client_1 [dev] [3.9.3] Trying to connect to [127.0.0.1]:10236 as owner member
May 08, 2018 11:25:29 AM com.hazelcast.client.connection.ClientConnectionManager
INFO: hz.client_1 [dev] [3.9.3] Authenticated with server [10.1.0.148]:5701, server version:3.10 Local address: /127.0.0.1:60113
May 08, 2018 11:25:29 AM com.hazelcast.client.spi.impl.ClientMembershipListener
INFO: hz.client_1 [dev] [3.9.3]

Members [2] {
    Member [10.1.0.148]:5701 - b0e4a52f-0170-47f2-8ff3-74d9b67f45f5
    Member [10.1.0.151]:5701 - 1355caa4-5c2b-4366-bd5b-b504f4f0ae4f
}

May 08, 2018 11:25:29 AM com.hazelcast.client.connection.ClientConnectionManager
INFO: hz.client_1 [dev] [3.9.3] Setting ClientConnection{alive=true, connectionId=1, channel=NioChannel{/127.0.0.1:60113->/127.0.0.1:10236}, remoteEndpoint=[10.1.0.148]:5701, lastReadTime=2018-05-08 11:25:29.455, lastWriteTime=2018-05-08 11:25:29.453, closedTime=never, lastHeartbeatRequested=never, lastHeartbeatReceived=never, connected server version=3.10} as owner with principal ClientPrincipal{uuid='a04aa2ca-626d-4d1a-a366-38c0dbc4781f', ownerUuid='b0e4a52f-0170-47f2-8ff3-74d9b67f45f5'}
May 08, 2018 11:25:29 AM com.hazelcast.core.LifecycleService
INFO: hz.client_1 [dev] [3.9.3] HazelcastClient 3.9.3 (20180216 - 539b124) is CLIENT_CONNECTED
May 08, 2018 11:25:29 AM com.hazelcast.internal.diagnostics.Diagnostics
INFO: hz.client_1 [dev] [3.9.3] Diagnostics disabled. To enable add -Dhazelcast.diagnostics.enabled=true to the JVM arguments.

but immediately after above message I got another one (seems to be connection problem regarding first member that my client connected to):

Constructor threw exception; nested exception is com.hazelcast.core.OperationTimeoutException: ClientInvocation{clientMessage = ClientMessage{length=72, correlationId=272, operation=Client.createProxy, messageType=5, partitionId=-1, isComplete=true, isRetryable=false, isEvent=false, writeOffset=0}, objectName = hz:impl:mapService, target = address [10.1.0.151]:5701, sendConnection = null} timed out because exception occurred after client invocation timeout 120000 ms. Current time: 2018-05-08 11:27:29.913. Start time: 2018-05-08 11:25:29.458. Total elapsed time: 120455 ms.

Sometimes it cannot even connect to the first member as I got OperationTimeoutException after client says that it connected to 10.1.0.151 member. Funny thing is that sometimes it works all fine:( And when I only have one replica of hazelcast pods it works predictable & fine. Thus, I believe this is because of LoadBalancer service which distributes requests equally among target pods and that there is something wrong with this set-up.

I suppose that client should be able to connect to any node it wants as any node can store requested item in its map but I don't know how to set up such a configuration in k8s.

Question is: how should I configure services in k8s so that client apps can talk to all members? Or this is not the case and it should work in a different way?

Am I missing something?

-- MichaƂ S
hazelcast
java
kubernetes

2 Answers

5/8/2018

If your hazelcast client is inside the kubernetes cluster, you dont really need LoadBalancer type. A simple service ClusterIP or headless would suffice. Hazelcast supports kubernetes discovery mode. I suggest try using ClusterIP or none.

-- Bal Chua
Source: StackOverflow

5/8/2018

Thus, I believe this is because of LoadBalancer service which distributes requests equally among target pods and that there is something wrong with this set-up.

Yes, it is true.

how should I configure services in k8s so that client apps can talk to all members?

Service can only balance requests between available pods.

Or this is not the case and it should work in a different way?

Yes, that is a good idea. The simplest way is to create one service per hazelcast replica. And that is practically the only way to do that.

-- Anton Kostenko
Source: StackOverflow