We have a two-node Kafka cluster running in OpenShift. We created a topic with its retention policy set to 30 days. This works as expected, including across individual broker restarts. However, when the Kafka cluster is restarted by scaling the Kafka StatefulSet down to 0 replicas and back up to 2 replicas, all topic messages are gone.
Before cluster restart:
/usr/bin/kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:29092 --topic platforms.openshift.events --time -1 --offsets 1
platforms.openshift.events:0:73387
After cluster restart:
/usr/bin/kafka-run-class kafka.tools.GetOffsetShell --broker-list localhost:29092 --topic platforms.openshift.events --time -1 --offsets 1
platforms.openshift.events:0:0
Is this expected behavior? We use a mounted volume for Kafka topic storage.
What I noticed is that kafka.properties sets log.dirs=/var/lib/kafka/data, not /var/lib/kafka. After changing the volume mount point from /var/lib/kafka to /var/lib/kafka/data, the problem went away.
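
For anyone hitting the same thing, here is a rough sketch of how one might check, from inside a broker pod, which mount each log.dirs entry actually lives on. The function names kafka_log_dirs and show_log_dir_mounts are my own (hypothetical), and the sketch assumes a plain "log.dirs=..." line in the properties file and mount points without spaces in their names.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Kafka.

# Print the value of log.dirs from a Kafka properties file.
# Commented-out lines are skipped; the last matching line wins.
kafka_log_dirs() {
    sed -n 's/^[[:space:]]*log\.dirs[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# For each comma-separated log directory, report the mount point it is on.
show_log_dir_mounts() {
    kafka_log_dirs "$1" | tr ',' '\n' | while IFS= read -r dir; do
        [ -n "$dir" ] || continue
        if [ -d "$dir" ]; then
            # df -P prints POSIX-format output; on line 2 the last field
            # is the mount point (assumes no spaces in the mount path).
            mount_point=$(df -P "$dir" | awk 'NR==2 {print $NF}')
            echo "$dir is on mount $mount_point"
        else
            echo "$dir does not exist"
        fi
    done
}
```

It would be called as show_log_dir_mounts /path/to/kafka.properties (the path depends on your image). If the reported mount point is not the persistent volume you expect, the topic data is probably being written somewhere that does not survive pod deletion.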