Facing performance issues in Hazelcast with client-server

5/18/2020

I am facing performance issues with Hazelcast configured as client-server. I have one K8S cluster consisting of 5 nodes and 1 master node, where each node has 64 GB of RAM and 16 cores (Hazelcast version 3.12.4). The Hazelcast server is deployed on K8S, with one POD on one of the nodes in the cluster. My client application is also deployed on K8S and connects to the above Hazelcast as a smart client (Hazelcast discovery enabled for K8S). There are a total of 10 PODs of my application, with 2 PODs on each node.

I am running various APIs and performing load testing of my application (approx. 110 threads at a time, shared across all 10 PODs).

I have the following piece of code in my application to get a cache:

public Map<Object, Object> get(String cacheId, Long lTenantId) {
    String strMethodName = "get";
    long t1 = System.currentTimeMillis();
    Map<Object, Object> cacheDataMap = hazelcastInstance.getMap(cacheId);
    long totalTimeTaken = System.currentTimeMillis() - t1;
    if (totalTimeTaken > 10) {
        logger.warnLog(CLASSNAME, strMethodName,
                "Total time taken by " + cacheId + " identifier for get operation is : " + totalTimeTaken + " ms");
    }
    return cacheDataMap;
}
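(Note that getMap() only looks up a client-side proxy and does not itself fetch any data, so the timing above measures the proxy lookup rather than an actual cache read. A variant that times the read itself might look like the following sketch; it reuses the hazelcastInstance and logger fields from above, and the key parameter is illustrative.)

public Object getValue(String cacheId, Object key) {
    // getMap() only looks up/creates a client-side proxy; map.get() is the
    // call that actually goes over the network (or hits the near cache).
    IMap<Object, Object> map = hazelcastInstance.getMap(cacheId);
    long t1 = System.currentTimeMillis();
    Object value = map.get(key);
    long totalTimeTaken = System.currentTimeMillis() - t1;
    if (totalTimeTaken > 10) {
        logger.warnLog(CLASSNAME, "getValue",
                "get(" + key + ") on " + cacheId + " took " + totalTimeTaken + " ms");
    }
    return value;
}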

The ways my application uses this map vary, for example:

1)

map.get(key);

2)

Set keys = map.keySet();
// I changed to a keySet() iterator because entrySet() was causing a lot of performance issues
Iterator iterator = keys.iterator();
while (iterator.hasNext()) {
    // doing stuff
}

When all my APIs are started for load, these logs ("Total time taken by ...") get printed in the application, with each cache access taking more than 10 milliseconds. This is causing performance issues, and hence I am not able to achieve my desired TPS for all APIs.

There are approx. 300 maps stored in the cache, and the total size of the cache is 4.22 MB.

I am using a near cache configuration, and the Management Center shows the near cache effectiveness as 100%. (This was taken while hazelcast.client.statistics.enabled was set to true.)

I have also tried with 8 PODs deployed on 4 nodes and 1 node dedicated to the Hazelcast server, but the issue remains the same. There are no issues when I connect to Hazelcast as embedded, and there I am able to achieve my desired TPS for all APIs.

Am I missing any configuration, or is there anything else that could be causing this problem?

Here is my hazelcast-client.xml

<hazelcast-client
        xmlns="http://www.hazelcast.com/schema/client-config"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.hazelcast.com/schema/client-config
                            http://hazelcast.com/schema/client-config/hazelcast-client-config-3.11.xsd">

    <group>
        <name>dev</name>
    </group>
    <instance-name>hazelcast</instance-name>

    <properties>
        <property name="hazelcast.client.shuffle.member.list">true</property>
        <property name="hazelcast.client.heartbeat.timeout">600000</property>
        <property name="hazelcast.client.heartbeat.interval">180000</property>
        <property name="hazelcast.client.event.queue.capacity">1000000</property>
        <property name="hazelcast.client.invocation.timeout.seconds">120</property>
        <property name="hazelcast.client.statistics.enabled">false</property>
        <property name="hazelcast.discovery.enabled">true</property>
        <property name="hazelcast.map.invalidation.batch.enabled">false</property>
    </properties>

    <network>
        <discovery-strategies>
            <discovery-strategy enabled="true"
                    class="com.hazelcast.kubernetes.HazelcastKubernetesDiscoveryStrategy">
                <properties>
                    <property name="service-name"><service-name></property>
                    <property name="namespace"><namespace></property>
                </properties>
            </discovery-strategy>
        </discovery-strategies>
        <smart-routing>true</smart-routing>
        <redo-operation>true</redo-operation>
        <connection-timeout>90000</connection-timeout>
        <connection-attempt-period>100</connection-attempt-period>
        <connection-attempt-limit>0</connection-attempt-limit>
    </network>

    <near-cache name="default">
        <in-memory-format>OBJECT</in-memory-format>
        <serialize-keys>true</serialize-keys>
        <invalidate-on-change>true</invalidate-on-change>
        <eviction eviction-policy="NONE" max-size-policy="ENTRY_COUNT"/>
    </near-cache>
</hazelcast-client>

Here is my hazelcast.xml

<?xml version="1.0" encoding="UTF-8"?>

<hazelcast xsi:schemaLocation="http://www.hazelcast.com/schema/config hazelcast-config-3.11.xsd"
           xmlns="http://www.hazelcast.com/schema/config"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

  <management-center enabled="${hazelcast.mancenter.enabled}">${hazelcast.mancenter.url}</management-center>

</hazelcast>
-- Pavan Mulani
Tags: hazelcast, hazelcast-imap, java, kubernetes, performance

1 Answer

5/18/2020

The goal of a cache is to get the value for a key as fast as possible. In general, you already have the key and request the value. That means you send a request to a node, which looks up in the partition table which partition the key belongs to and forwards the query to the relevant node.
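To make that routing concrete, here is a minimal sketch (the map/key names are illustrative) that uses the public PartitionService API to show which partition a key hashes to and which member owns it; with smart routing enabled, the client holds a copy of the partition table and sends single-key operations straight to that owner:

import com.hazelcast.client.HazelcastClient;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.Partition;

public class PartitionLookupSketch {
    public static void main(String[] args) {
        HazelcastInstance client = HazelcastClient.newHazelcastClient();

        // Every key deterministically hashes to one of the (by default 271)
        // partitions; each partition has exactly one owning member.
        Partition partition = client.getPartitionService().getPartition("someKey");
        System.out.println("Key 'someKey' -> partition " + partition.getPartitionId()
                + ", owned by " + partition.getOwner());

        client.shutdown();
    }
}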

In your second use case, you try to get all keys from all nodes:

Set keys = map.keySet();
Iterator iterator = keys.iterator();
while (iterator.hasNext()) {
    // doing stuff
}

To return as fast as possible, Hazelcast will return a lazy implementation of the Iterator. Each call to next() first needs to retrieve the next key via the process described above. Plus, I assume the // doing stuff code actually loads the value for each key.

In conclusion, please avoid using map.keySet() at all costs. Without knowing more about your context and your use case, I unfortunately cannot provide a relevant alternative.
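That said, if the goal is simply to read many entries in one go, a bulk read via IMap.getAll() avoids the per-key round trips, since it groups the requested keys by partition and fetches each batch in a single operation. A minimal sketch, with illustrative map and key names:

import com.hazelcast.client.HazelcastClient;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;

import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class BulkReadSketch {
    public static void main(String[] args) {
        HazelcastInstance client = HazelcastClient.newHazelcastClient();
        IMap<String, String> map = client.getMap("someCache");

        // One batched network call per partition instead of one per key.
        Set<String> wantedKeys = new HashSet<>(Arrays.asList("k1", "k2", "k3"));
        Map<String, String> values = map.getAll(wantedKeys);

        values.forEach((k, v) -> System.out.println(k + " -> " + v));
        client.shutdown();
    }
}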

-- Nicolas
Source: StackOverflow