I'm having the same problem with kafka-streams and spring-kafka applications. The first one uses the kafka-clients:1.0.0 library, while the other one uses version 1.0.2.
There is just one broker instance running in Kubernetes (KAFKA_ADVERTISED_LISTENERS="PLAINTEXT://${POD_IP}:9092"). It's a StatefulSet, and the client app reaches it via the headless service's internal endpoint (although I've also tried a ClusterIP service and the issue is the same).
Once I delete the Kafka pod and it's recreated, my client application can't reconnect. The pod does come back with a different IP address, but since I'm accessing it via the service's internal endpoint, I expect my client app to resolve the new address. That isn't happening.
The kafka-clients library keeps logging "Found least loaded node [old_ip]:9092 (id: 0 rack: null)" even though nothing is running at that address anymore.
The JVM DNS TTL cache is not the problem, as I've set it to refresh periodically.
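For context, this is roughly what I mean by a periodic refresh: a minimal sketch that lowers the JDK's address-cache TTL programmatically at startup (the values are arbitrary examples; the same properties can also be set in the java.security file):

```java
import java.security.Security;

public class DnsCacheSettings {
    public static void main(String[] args) {
        // Cache successful DNS lookups for 30 seconds instead of indefinitely
        Security.setProperty("networkaddress.cache.ttl", "30");
        // Keep failed lookups for only a few seconds
        Security.setProperty("networkaddress.cache.negative.ttl", "5");

        // ... start the Kafka client / Spring context after this point,
        // since the properties must be set before the first name lookup
    }
}
```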
Restarting the client application solves the problem.
If providing ${POD_IP} in KAFKA_ADVERTISED_LISTENERS causes this, would advertising the pod's hostname instead solve it? Or is there a way to make my client re-resolve the new address?
It seems to be related to KAFKA-7755. Updating the client version to 2.2.0 / 2.1.1 should help.
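For illustration, a minimal sketch of a consumer built against the upgraded client, still pointed at the headless-service DNS name (the service/namespace names, topic, and group id below are placeholders I made up for the example):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReconnectingConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Headless-service endpoint; per KAFKA-7755, the 2.1.1/2.2.0 client
        // re-resolves this hostname on reconnect instead of reusing the
        // cached pod IP.
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,
                "kafka-0.kafka-headless.default.svc.cluster.local:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "example-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("example-topic"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            // ... process records and keep polling as usual
        }
    }
}
```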