Kafka unable to connect to zookeeper ensemble on EKS

7/6/2020

I am trying to run Kafka cluster on AWS EKS cluster (v1.16). I am using bitnami helm charts.

https://github.com/bitnami/charts/tree/master/bitnami/kafka https://github.com/bitnami/charts/tree/master/bitnami/zookeeper

I have deployed zookeeper ensemble successfully using below command:

helm install zookeeper bitnami/zookeeper --set replicaCount=3 --set auth.enabled=false --set allowAnonymousLogin=true --set persistence.storageClass=ebs --set persistence.accessModes={ReadWriteOnce} --set persistence.size=1Gi --set podLabels."app\.kubernetes\.io/version"="1.0"

It outputs:

ZooKeeper can be accessed via port 2181 on the following DNS name from within your cluster:

    zookeeper.pulse.svc.cluster.local

Now, I am trying to deploy Kafka cluster with below command:

helm install kafka bitnami/kafka --set replicaCount=3 --set zookeeper.enabled=false --set externalZookeeper.servers=zookeeper.pulse.svc.cluster.local --set autoCreateTopicsEnable=true --set persistence.storageClass=ebs --set persistence.accessModes={ReadWriteOnce} --set persistence.size=1Gi --set podLabels."app\.kubernetes\.io/version"="1.0"

It outputs:

Kafka can be accessed by consumers via port 9092 on the following DNS name from within your cluster:

    kafka.pulse.svc.cluster.local

Each Kafka broker can be accessed by producers via port 9092 on the following DNS name(s) from within your cluster:
    kafka-0.kafka-headless.pulse.svc.cluster.local
    kafka-1.kafka-headless.pulse.svc.cluster.local
    kafka-2.kafka-headless.pulse.svc.cluster.local

It creates 3 pods but none of the pod is able to connect to zookeeper. I am not getting what is the issue here.
Kafka pod logs:

2020-07-06T11:22:40.506134648Z  11:22:40.50 Welcome to the Bitnami kafka container
2020-07-06T11:22:40.507301179Z  11:22:40.50 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-kafka
2020-07-06T11:22:40.508519907Z  11:22:40.50 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-kafka/issues
2020-07-06T11:22:40.51039472Z  11:22:40.50 
2020-07-06T11:22:40.511630347Z  11:22:40.51 INFO  ==> ** Starting Kafka setup **
2020-07-06T11:22:40.55379314Z  11:22:40.55 WARN  ==> You set the environment variable ALLOW_PLAINTEXT_LISTENER=yes. For safety reasons, do not use this flag in a production environment.
2020-07-06T11:22:40.561203295Z  11:22:40.56 INFO  ==> Initializing Kafka...
2020-07-06T11:22:40.565054949Z  11:22:40.56 INFO  ==> No injected configuration files found, creating default config files
2020-07-06T11:22:40.723721499Z  11:22:40.72 INFO  ==> Configuring Kafka for inter-broker communications with PLAINTEXT authentication.
2020-07-06T11:22:40.726161543Z  11:22:40.72 WARN  ==> Inter-broker communications are configured as PLAINTEXT. This is not safe for production environments.
2020-07-06T11:22:40.727497832Z  11:22:40.72 INFO  ==> Configuring Kafka for client communications with PLAINTEXT authentication.
2020-07-06T11:22:40.731790674Z  11:22:40.73 WARN  ==> Client communications are configured using PLAINTEXT listeners. For safety reasons, do not use this in a production environment.
2020-07-06T11:22:40.73699684Z  11:22:40.73 INFO  ==> ** Kafka setup finished! **
2020-07-06T11:22:40.737001986Z 
2020-07-06T11:22:40.746297253Z  11:22:40.74 INFO  ==> ** Starting Kafka **
2020-07-06T11:22:41.512303802Z [2020-07-06 11:22:41,511] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
2020-07-06T11:22:42.008231959Z [2020-07-06 11:22:42,007] INFO starting (kafka.server.KafkaServer)
2020-07-06T11:22:42.009112085Z [2020-07-06 11:22:42,008] INFO Connecting to zookeeper on zookeeper.pulse.svc.cluster.local (kafka.server.KafkaServer)
2020-07-06T11:22:42.028233655Z [2020-07-06 11:22:42,028] INFO [ZooKeeperClient Kafka server] Initializing a new session to zookeeper.pulse.svc.cluster.local. (kafka.zookeeper.ZooKeeperClient)
2020-07-06T11:22:42.032763227Z [2020-07-06 11:22:42,032] INFO Client environment:zookeeper.version=3.5.7-f0fdd52973d373ffd9c86b81d99842dc2c7f660e, built on 02/10/2020 11:30 GMT (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.032776511Z [2020-07-06 11:22:42,032] INFO Client environment:host.name=kafka-0.kafka-headless.pulse.svc.cluster.local (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.03283528Z [2020-07-06 11:22:42,032] INFO Client environment:java.version=11.0.7 (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.032984414Z [2020-07-06 11:22:42,032] INFO Client environment:java.vendor=BellSoft (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.033005091Z [2020-07-06 11:22:42,032] INFO Client environment:java.home=/opt/bitnami/java (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.03312054Z [2020-07-06 11:22:42,032] INFO Client environment:java.class.path=/opt/bitnami/kafka/bin/../libs/activation-1.1.1.jar:/opt/bitnami/kafka/bin/../libs/aopalliance-repackaged-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/argparse4j-0.7.0.jar:/opt/bitnami/kafka/bin/../libs/audience-annotations-0.5.0.jar:/opt/bitnami/kafka/bin/../libs/commons-cli-1.4.jar:/opt/bitnami/kafka/bin/../libs/commons-lang3-3.8.1.jar:/opt/bitnami/kafka/bin/../libs/connect-api-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/connect-basic-auth-extension-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/connect-file-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/connect-json-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/connect-mirror-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/connect-mirror-client-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/connect-runtime-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/connect-transforms-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/hk2-api-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/hk2-locator-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/hk2-utils-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/jackson-annotations-2.10.2.jar:/opt/bitnami/kafka/bin/../libs/jackson-core-2.10.2.jar:/opt/bitnami/kafka/bin/../libs/jackson-databind-2.10.2.jar:/opt/bitnami/kafka/bin/../libs/jackson-dataformat-csv-2.10.2.jar:/opt/bitnami/kafka/bin/../libs/jackson-datatype-jdk8-2.10.2.jar:/opt/bitnami/kafka/bin/../libs/jackson-jaxrs-base-2.10.2.jar:/opt/bitnami/kafka/bin/../libs/jackson-jaxrs-json-provider-2.10.2.jar:/opt/bitnami/kafka/bin/../libs/jackson-module-jaxb-annotations-2.10.2.jar:/opt/bitnami/kafka/bin/../libs/jackson-module-paranamer-2.10.2.jar:/opt/bitnami/kafka/bin/../libs/jackson-module-scala_2.12-2.10.2.jar:/opt/bitnami/kafka/bin/../libs/jakarta.activation-api-1.2.1.jar:/opt/bitnami/kafka/bin/../libs/jakarta.annotation-api-1.3.4.jar:/opt/bitnami/kafka/bin/../libs/jakarta.inject-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/jakarta.ws.rs-api-2.1.5.jar:/opt/bitnami/kafka/bin/../libs/jakarta.xml.bind-api-2.3.2.jar:/opt/bitnami/kafka/bin/../libs/javassist-3.22.0-CR2.jar:/opt/bitnami/kafka/bin/../libs/javassist-3.26.0-GA.jar:/opt/bitnami/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/opt/bitnami/kafka/bin/../libs/javax.ws.rs-api-2.1.1.jar:/opt/bitnami/kafka/bin/../libs/jaxb-api-2.3.0.jar:/opt/bitnami/kafka/bin/../libs/jersey-client-2.28.jar:/opt/bitnami/kafka/bin/../libs/jersey-common-2.28.jar:/opt/bitnami/kafka/bin/../libs/jersey-container-servlet-2.28.jar:/opt/bitnami/kafka/bin/../libs/jersey-container-servlet-core-2.28.jar:/opt/bitnami/kafka/bin/../libs/jersey-hk2-2.28.jar:/opt/bitnami/kafka/bin/../libs/jersey-media-jaxb-2.28.jar:/opt/bitnami/kafka/bin/../libs/jersey-server-2.28.jar:/opt/bitnami/kafka/bin/../libs/jetty-client-9.4.24.v20191120.jar:/opt/bitnami/kafka/bin/../libs/jetty-continuation-9.4.24.v20191120.jar:/opt/bitnami/kafka/bin/../libs/jetty-http-9.4.24.v20191120.jar:/opt/bitnami/kafka/bin/../libs/jetty-io-9.4.24.v20191120.jar:/opt/bitnami/kafka/bin/../libs/jetty-security-9.4.24.v20191120.jar:/opt/bitnami/kafka/bin/../libs/jetty-server-9.4.24.v20191120.jar:/opt/bitnami/kafka/bin/../libs/jetty-servlet-9.4.24.v20191120.jar:/opt/bitnami/kafka/bin/../libs/jetty-servlets-9.4.24.v20191120.jar:/opt/bitnami/kafka/bin/../libs/jetty-util-9.4.24.v20191120.jar:/opt/bitnami/kafka/bin/../libs/jopt-simple-5.0.4.jar:/opt/bitnami/kafka/bin/../libs/kafka-clients-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/kafka-log4j-appender-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/kafka-streams-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/kafka-streams-examples-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/kafka-streams-scala_2.12-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/kafka-streams-test-utils-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/kafka-tools-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/kafka_2.12-2.5.0-sources.jar:/opt/bitnami/kafka/bin/../libs/kafka_2.12-2.5.0.jar:/opt/bitnami/kafka/bin/../libs/log4j-1.2.17.jar:/opt/bitnami/kafka/bin/../libs/lz4-java-1.7.1.jar:/opt/bitnami/kafka/bin/../libs/maven-artifact-3.6.3.jar:/opt/bitnami/kafka/bin/../libs/metrics-core-2.2.0.jar:/opt/bitnami/kafka/bin/../libs/netty-buffer-4.1.45.Final.jar:/opt/bitnami/kafka/bin/../libs/netty-codec-4.1.45.Final.jar:/opt/bitnami/kafka/bin/../libs/netty-common-4.1.45.Final.jar:/opt/bitnami/kafka/bin/../libs/netty-handler-4.1.45.Final.jar:/opt/bitnami/kafka/bin/../libs/netty-resolver-4.1.45.Final.jar:/opt/bitnami/kafka/bin/../libs/netty-transport-4.1.45.Final.jar:/opt/bitnami/kafka/bin/../libs/netty-transport-native-epoll-4.1.45.Final.jar:/opt/bitnami/kafka/bin/../libs/netty-transport-native-unix-common-4.1.45.Final.jar:/opt/bitnami/kafka/bin/../libs/osgi-resource-locator-1.0.1.jar:/opt/bitnami/kafka/bin/../libs/paranamer-2.8.jar:/opt/bitnami/kafka/bin/../libs/plexus-utils-3.2.1.jar:/opt/bitnami/kafka/bin/../libs/reflections-0.9.12.jar:/opt/bitnami/kafka/bin/../libs/rocksdbjni-5.18.3.jar:/opt/bitnami/kafka/bin/../libs/scala-collection-compat_2.12-2.1.3.jar:/opt/bitnami/kafka/bin/../libs/scala-java8-compat_2.12-0.9.0.jar:/opt/bitnami/kafka/bin/../libs/scala-library-2.12.10.jar:/opt/bitnami/kafka/bin/../libs/scala-logging_2.12-3.9.2.jar:/opt/bitnami/kafka/bin/../libs/scala-reflect-2.12.10.jar:/opt/bitnami/kafka/bin/../libs/slf4j-api-1.7.30.jar:/opt/bitnami/kafka/bin/../libs/slf4j-log4j12-1.7.30.jar:/opt/bitnami/kafka/bin/../libs/snappy-java-1.1.7.3.jar:/opt/bitnami/kafka/bin/../libs/validation-api-2.0.1.Final.jar:/opt/bitnami/kafka/bin/../libs/zookeeper-3.5.7.jar:/opt/bitnami/kafka/bin/../libs/zookeeper-jute-3.5.7.jar:/opt/bitnami/kafka/bin/../libs/zstd-jni-1.4.4-7.jar (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.033182063Z [2020-07-06 11:22:42,033] INFO Client environment:java.library.path=/usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.033188827Z [2020-07-06 11:22:42,033] INFO Client environment:java.io.tmpdir=/tmp (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.03322714Z [2020-07-06 11:22:42,033] INFO Client environment:java.compiler=<NA> (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.033253354Z [2020-07-06 11:22:42,033] INFO Client environment:os.name=Linux (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.033280159Z [2020-07-06 11:22:42,033] INFO Client environment:os.arch=amd64 (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.033324405Z [2020-07-06 11:22:42,033] INFO Client environment:os.version=4.14.181-140.257.amzn2.x86_64 (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.033355581Z [2020-07-06 11:22:42,033] INFO Client environment:user.name=? (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.033399449Z [2020-07-06 11:22:42,033] INFO Client environment:user.home=? (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.03340494Z [2020-07-06 11:22:42,033] INFO Client environment:user.dir=/ (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.033468947Z [2020-07-06 11:22:42,033] INFO Client environment:os.memory.free=1015MB (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.033509114Z [2020-07-06 11:22:42,033] INFO Client environment:os.memory.max=1024MB (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.033536891Z [2020-07-06 11:22:42,033] INFO Client environment:os.memory.total=1024MB (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.035251257Z [2020-07-06 11:22:42,035] INFO Initiating client connection, connectString=zookeeper.pulse.svc.cluster.local sessionTimeout=18000 watcher=kafka.zookeeper.ZooKeeperClient$ZooKeeperClientWatcher$@6ee6f53 (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:22:42.038953719Z [2020-07-06 11:22:42,038] INFO jute.maxbuffer value is 4194304 Bytes (org.apache.zookeeper.ClientCnxnSocket)
2020-07-06T11:22:42.043407452Z [2020-07-06 11:22:42,043] INFO zookeeper.request.timeout value is 0. feature enabled= (org.apache.zookeeper.ClientCnxn)
2020-07-06T11:22:42.045196444Z [2020-07-06 11:22:42,045] INFO [ZooKeeperClient Kafka server] Waiting until connected. (kafka.zookeeper.ZooKeeperClient)
2020-07-06T11:22:42.053941415Z [2020-07-06 11:22:42,053] INFO Opening socket connection to server zookeeper.pulse.svc.cluster.local/172.20.162.36:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
2020-07-06T11:22:42.057906383Z [2020-07-06 11:22:42,057] INFO Socket connection established, initiating session, client: /100.64.5.213:52738, server: zookeeper.pulse.svc.cluster.local/172.20.162.36:2181 (org.apache.zookeeper.ClientCnxn)
2020-07-06T11:22:42.061035524Z [2020-07-06 11:22:42,060] INFO Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
2020-07-06T11:22:43.632054003Z [2020-07-06 11:22:43,631] INFO Opening socket connection to server zookeeper.pulse.svc.cluster.local/172.20.162.36:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
2020-07-06T11:22:43.632596098Z [2020-07-06 11:22:43,632] INFO Socket connection established, initiating session, client: /100.64.5.213:52756, server: zookeeper.pulse.svc.cluster.local/172.20.162.36:2181 (org.apache.zookeeper.ClientCnxn)
2020-07-06T11:22:43.634993004Z [2020-07-06 11:22:43,634] INFO Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
2020-07-06T11:22:44.760870715Z [2020-07-06 11:22:44,760] INFO Opening socket connection to server zookeeper.pulse.svc.cluster.local/172.20.162.36:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
2020-07-06T11:22:44.761283232Z [2020-07-06 11:22:44,761] INFO Socket connection established, initiating session, client: /100.64.5.213:52772, server: zookeeper.pulse.svc.cluster.local/172.20.162.36:2181 (org.apache.zookeeper.ClientCnxn)
2020-07-06T11:22:44.763353195Z [2020-07-06 11:22:44,763] INFO Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
2
2020-07-06T11:23:12.738834004Z [2020-07-06 11:23:12,738] INFO Session: 0x0 closed (org.apache.zookeeper.ZooKeeper)
2020-07-06T11:23:12.738918322Z [2020-07-06 11:23:12,738] INFO EventThread shut down for session: 0x0 (org.apache.zookeeper.ClientCnxn)
2020-07-06T11:23:12.740751654Z [2020-07-06 11:23:12,740] INFO [ZooKeeperClient Kafka server] Closed. (kafka.zookeeper.ZooKeeperClient)
2020-07-06T11:23:12.745313347Z [2020-07-06 11:23:12,743] ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
2020-07-06T11:23:12.745331011Z kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
2020-07-06T11:23:12.745335139Z 	at kafka.zookeeper.ZooKeeperClient.$anonfun$waitUntilConnected$3(ZooKeeperClient.scala:262)
2020-07-06T11:23:12.745338245Z 	at kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:258)
2020-07-06T11:23:12.745340837Z 	at kafka.zookeeper.ZooKeeperClient.<init>(ZooKeeperClient.scala:119)
2020-07-06T11:23:12.745343374Z 	at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1863)
2020-07-06T11:23:12.745345577Z 	at kafka.server.KafkaServer.createZkClient$1(KafkaServer.scala:378)
2020-07-06T11:23:12.745347726Z 	at kafka.server.KafkaServer.initZkClient(KafkaServer.scala:403)
2020-07-06T11:23:12.745349947Z 	at kafka.server.KafkaServer.startup(KafkaServer.scala:210)
2020-07-06T11:23:12.745352077Z 	at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44)
2020-07-06T11:23:12.745354263Z 	at kafka.Kafka$.main(Kafka.scala:82)
2020-07-06T11:23:12.745368643Z 	at kafka.Kafka.main(Kafka.scala)
2020-07-06T11:23:12.745806818Z [2020-07-06 11:23:12,745] INFO shutting down (kafka.server.KafkaServer)
2020-07-06T11:23:12.752833659Z [2020-07-06 11:23:12,752] INFO shut down completed (kafka.server.KafkaServer)
2020-07-06T11:23:12.753305908Z [2020-07-06 11:23:12,753] ERROR Exiting Kafka. (kafka.server.KafkaServerStartable)
2020-07-06T11:23:12.757961524Z [2020-07-06 11:23:12,757] INFO shutting down (kafka.server.KafkaServer)
Close

Also from Kafka pod machine, curl gives below o/p:

istio-proxy@kafka-0:/$ curl zookeeper.pulse.svc.cluster.local:2181
curl: (52) Empty reply from server
istio-proxy@kafka-0:/$

Note: I am having istio sidecars with mTLS disabled.

Please help.

UPDATE
HI, It comes out to be the Istio porxy issue. I uninstalled Istio and it worked out. https://github.com/bitnami/bitnami-docker-kafka/issues/38#issuecomment-451381003

-- NumeroUno
amazon-eks
apache-kafka
apache-zookeeper
kubernetes

1 Answer

7/7/2020

This works fine for me on my local cluster. Since you are using EKS, you are most likely using AWS CNI(?). CNI allocates IP addresses in your VPC and if you are not allowing your security groups access to the VPC range it will not be able to access the pods. (172.20.162.36:2181 looks like VPC address for example).

Another thing you can check if is if you have some NetworkPolicy preventing access:

kubectl get netpol

It's kind of odd that you get the expected response from Zookeeper:

curl zookeeper.pulse.svc.cluster.local:2181
curl: (52) Empty reply from server

So it could be possible that zookeeper.pulse.svc.cluster.local is resolving to an 'accessible' <IP address>:2181. In any case, it looks like a firewall/network policy issue.

-- Rico
Source: StackOverflow