I have a Kubernetes cluster with one ZooKeeper pod and three Kafka broker pods.
The Deployment descriptor for ZooKeeper is:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: zookeeper
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: zookeeper
    spec:
      containers:
      - env:
        - name: ZOOKEEPER_ID
          value: "1"
        - name: ZOOKEEPER_SERVER_1
          value: zookeeper
        - name: ZOOKEEPER_CLIENT_PORT
          value: "2181"
        - name: ZOOKEEPER_TICK_TIME
          value: "2000"
        name: zookeeper
        image: confluentinc/cp-zookeeper:5.0.1
        ports:
        - containerPort: 2181
        volumeMounts:
        - mountPath: /var/lib/zookeeper/
          name: zookeeper-data
      nodeSelector:
        noderole: kafka1
      restartPolicy: Always
      volumes:
      - name: zookeeper-data
        persistentVolumeClaim:
          claimName: zookeeper-volume-claims
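The PersistentVolumeClaims themselves are not shown here; as a rough sketch, assuming dynamic provisioning with the cluster's default storage class on Azure, the ZooKeeper claim looks roughly like this (name matches the descriptor above, the size is a placeholder):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: zookeeper-volume-claims
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi   # placeholder size, not the real value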
And the descriptors for the Kafka brokers are like the following (one for each broker, with the corresponding broker names, listeners and persistent volume claims):
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kafka1
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        name: kafka1
    spec:
      containers:
      - env:
        - name: KAFKA_AUTO_CREATE_TOPICS_ENABLE
          value: "true"
        - name: KAFKA_ADVERTISED_LISTENERS
          value: "PLAINTEXT://<ip>:9092"
        - name: KAFKA_LISTENERS
          value: "PLAINTEXT://0.0.0.0:9092"
        - name: KAFKA_ZOOKEEPER_CONNECT
          value: <ip>:2181
        - name: KAFKA_BROKER_ID
          value: "1"
        name: kafka1
        image: confluentinc/cp-enterprise-kafka:5.0.1
        ports:
        - containerPort: 9092
        volumeMounts:
        - mountPath: /var/lib/kafka
          name: kafka1-data
      nodeSelector:
        noderole: kafka2
      restartPolicy: Always
      volumes:
      - name: kafka1-data
        persistentVolumeClaim:
          claimName: kafka1-volume-claim
The cluster is up and running and I'm able to create topics and publish and consume messages.
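(For reference, a topic can be created and verified from inside one of the broker pods with the Kafka CLI tools shipped in the Confluent image; the pod name, topic settings and ZooKeeper address below are placeholders, not the exact commands used:)

kubectl exec -it kafka1-<pod> -- kafka-topics --zookeeper <ip>:2181 --create --topic test1 --partitions 2 --replication-factor 1
kubectl exec -it kafka1-<pod> -- kafka-topics --zookeeper <ip>:2181 --list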
The file log.1 exists in /var/lib/zookeeper/log/version-2
-rw-r--r-- 1 root root 67108880 Jan 18 11:34 log.1
And if I exec into one of the brokers:
kubectl exec -it kafka3-97454b745-wddpv bash
I can see the two partitions of the topic:
drwxr-xr-x 2 root root 4096 Jan 21 10:34 test1-1
drwxr-xr-x 2 root root 4096 Jan 21 10:35 test1-0
The issue comes when I restart the virtual machines where ZooKeeper and the brokers are allocated: one for ZooKeeper and one for each broker (the three VMs that make up my Kubernetes cluster).
After the restart, there are no topics in any of the brokers:
root@kafka3-97454b745-wddpv:/var/lib/kafka/data# ls -lrt
total 24
-rw-r--r-- 1 root root 0 Jan 21 10:56 cleaner-offset-checkpoint
-rw-r--r-- 1 root root 54 Jan 21 10:56 meta.properties
drwxr-xr-x 2 root root 4096 Jan 21 10:56 __confluent.support.metrics-0
drwxr-xr-x 2 root root 4096 Jan 21 10:56 _schemas-0
-rw-r--r-- 1 root root 49 Jan 21 11:10 recovery-point-offset-checkpoint
-rw-r--r-- 1 root root 4 Jan 21 11:10 log-start-offset-checkpoint
-rw-r--r-- 1 root root 49 Jan 21 11:11 replication-offset-checkpoint
And in zookeeper:
root@zookeeper-84bb68d45b-cklwm:/var/lib/zookeeper/log/version-2# ls -lrt
total 16
-rw-r--r-- 1 root root 67108880 Jan 21 10:56 log.1
And if I list the topics, they are gone.
Kubernetes cluster is running on Azure.
I assume the problem is not related to the persistent volumes, since if I manually create a file in there, it is still there after restarting. I think it is something related to my Kafka config. As you can see, I'm using the Confluent Docker images for this.
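(The check was just writing a marker file onto the mounted path from inside a pod and confirming it survives the VM restart, roughly like this, with placeholder pod names:)

kubectl exec -it kafka1-<pod> -- touch /var/lib/kafka/marker
# restart the VM, wait for the pod to be rescheduled, then:
kubectl exec -it kafka1-<new-pod> -- ls -l /var/lib/kafka/marker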
Any help would be really appreciated.
It was simply a wrong configuration of the mount paths. The paths have to point to the data and transaction log folders, instead of to their parent folders.
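In other words, the volumes have to be mounted one level deeper. A sketch of the corrected volumeMounts, assuming the default data directories of the Confluent 5.0.1 images (/var/lib/kafka/data for the broker, /var/lib/zookeeper/data and /var/lib/zookeeper/log for ZooKeeper; the separate zookeeper-log volume is an addition backed by its own claim, not part of the original descriptor):

# Kafka broker container (kafka1): mount the data directory itself, not its parent
volumeMounts:
- mountPath: /var/lib/kafka/data
  name: kafka1-data

# ZooKeeper container: mount the snapshot dir and the transaction log dir separately
# (zookeeper-log is a second volume/claim added for this purpose)
volumeMounts:
- mountPath: /var/lib/zookeeper/data
  name: zookeeper-data
- mountPath: /var/lib/zookeeper/log
  name: zookeeper-log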