I have a 3-node Kafka cluster running on Kubernetes, using the image wurstmeister/kafka:0.10.1.1.
The Zookeeper cluster consists of 3 nodes running version 3.4.8.
I noticed that the Kafka broker with id 2 is endlessly printing the message:
[2017-05-08 13:51:16,748] ERROR [ReplicaFetcherThread-0-0], Error for partition [partition_name,5] to broker 0:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)
This message is printed for many partitions every second. Broker 2's logs have grown to more than 10 GB.
Looking into Zookeeper, I can see that broker 2 is not listed under the znode /brokers/ids.
Each Zookeeper and Kafka node has its own k8s Deployment and Service (like zk-service-1 -> zk-deployment-1, zk-service-2 -> zk-deployment-2...). The Zookeeper nodes find each other through the k8s Service names. For example, in the properties file, server 1 has the line: server.1=zk-service-1:2888:3888.
The same applies to Kafka: broker X has the advertised.host.name property set to kafka-X, where kafka-X is the name of the Service associated with that broker's pod. Each broker's zookeeper.connect property is zk-service-1,zk-service-2,zk-service-3.
I set the hostname of each pod to the name of the Service attached to it.
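To make the setup easier to check, here is a minimal Python sketch that rebuilds the property lines described above. The function names are hypothetical, and the broker.id entry is an assumption based on the broker ids seen in the log; the Service names and ports are the ones quoted in this question.

```python
# Hypothetical helpers: they only rebuild the property lines quoted above,
# so the values can be compared against what each pod actually runs with.

def zk_server_lines(count, service_prefix="zk-service-"):
    """Return one server.N line per Zookeeper node, using its Service name as host."""
    return [
        f"server.{i}={service_prefix}{i}:2888:3888"
        for i in range(1, count + 1)
    ]


def kafka_broker_properties(broker_id, zk_count, zk_prefix="zk-service-"):
    """Return the broker properties described above for broker X."""
    zk_hosts = ",".join(f"{zk_prefix}{i}" for i in range(1, zk_count + 1))
    return {
        "broker.id": str(broker_id),  # assumption: ids match the kafka-X suffix
        "advertised.host.name": f"kafka-{broker_id}",
        "zookeeper.connect": zk_hosts,
    }


if __name__ == "__main__":
    for line in zk_server_lines(3):
        print(line)
    for key, value in kafka_broker_properties(2, 3).items():
        print(f"{key}={value}")
```

Comparing this output with the properties actually loaded in each pod is one way to rule out a typo in a Service name.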
I don't know how to debug this properly or which information would help me understand what the issue is. Do you have any clue?
As I see it, you want your nodes to have consistent hostnames.
I am pretty sure that deploying Zookeeper with the StatefulSet controller solves the problem of consistent hostnames, so you don't need any of the pod-naming hacks. A Deployment creates pods whose names are not stable, whereas a StatefulSet creates pods with consistent names, which you expose via a headless Service. So basically you are talking directly to the pods.
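As a rough illustration of what "consistent names" means here: a StatefulSet names its pods <statefulset-name>-<ordinal>, starting at 0, and behind a headless Service each pod gets its own DNS entry. The sketch below only computes those names; the StatefulSet name zk and the Service name zk-headless are hypothetical, and the default namespace and cluster.local domain are assumptions about your cluster.

```python
def statefulset_pod_dns(statefulset, headless_service, replicas,
                        namespace="default", cluster_domain="cluster.local"):
    """Return the stable DNS name of every pod in a StatefulSet.

    Pods are named <statefulset>-<ordinal>, with ordinals starting at 0,
    and each one is resolvable under the governing headless Service.
    """
    names = []
    for ordinal in range(replicas):
        pod = f"{statefulset}-{ordinal}"
        names.append(
            f"{pod}.{headless_service}.{namespace}.svc.{cluster_domain}"
        )
    return names


if __name__ == "__main__":
    # Hypothetical names: a 3-replica StatefulSet "zk" behind "zk-headless".
    for name in statefulset_pod_dns("zk", "zk-headless", 3):
        print(name)
```

Because these names survive pod restarts, they could be used in the server.N lines and in zookeeper.connect instead of one Service per pod.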
Read more about StatefulSets here and the basics here.
For the configuration, you can refer to the following zookeeper configurations and kafka configurations.
The above configurations can be found here.