I am currently running Kafka and the required ZooKeeper on a Kubernetes cluster, but I keep getting connection errors when Kafka tries to connect to ZooKeeper, and I can't resolve them.
Other pods within the cluster can reach ZooKeeper, but for some reason Kafka cannot. Why is that? (The kind of reachability check I mean is sketched right below.)
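Roughly, the reachability test from another pod looks like this (busybox is just an example image and the default namespace is assumed; any pod with a TCP client would do):

# throwaway pod that tries the ZooKeeper client port through the Service
# (assumes busybox's telnet applet and the default namespace)
kubectl run zk-test --rm -it --restart=Never --image=busybox -- \
  telnet leader-zookeeper 2181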
Here is my Kubernetes manifest, which includes the Kafka and ZooKeeper setup and their Services:
apiVersion: v1
kind: Service
metadata:
  name: leader-zookeeper
spec:
  ports:
  - name: client
    port: 2181
    protocol: TCP
    targetPort: client
  selector:
    app: leader-zookeeper
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: v1
kind: Service
metadata:
  name: zookeeper-headless
spec:
  clusterIP: None
  ports:
  - name: client
    port: 2181
    protocol: TCP
    targetPort: 2181
  - name: follower
    port: 2888
    protocol: TCP
    targetPort: 2888
  - name: election
    port: 3888
    protocol: TCP
    targetPort: 3888
  - name: admin-server
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: zookeeper
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zookeeper
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: zookeeper
  serviceName: zookeeper-headless
  template:
    metadata:
      labels:
        app: zookeeper
    spec:
      containers:
      - name: zookeeper
        imagePullPolicy: Always
        image: "gcr.io/google_containers/kubernetes-zookeeper:1.0-3.4.10"
        resources:
          requests:
            memory: "100Mi"
            cpu: "0.5"
        ports:
        - containerPort: 2181
          name: client
        - containerPort: 2888
          name: server
        - containerPort: 3888
          name: leader-election
        - containerPort: 8080
          name: admin-server
        command:
        - sh
        - -c
        - "start-zookeeper \
          --servers=1 \
          --data_dir=/var/lib/zookeeper/data \
          --data_log_dir=/var/lib/zookeeper/data/log \
          --conf_dir=/opt/zookeeper/conf \
          --client_port=2181 \
          --election_port=3888 \
          --server_port=2888 \
          --tick_time=2000 \
          --init_limit=10 \
          --sync_limit=5 \
          --heap=512M \
          --max_client_cnxns=200 \
          --snap_retain_count=3 \
          --purge_interval=12 \
          --max_session_timeout=60000 \
          --min_session_timeout=4000 \
          --log_level=DEBUG"
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      terminationGracePeriodSeconds: 30
      volumes:
      - emptyDir: {}
        name: data
  updateStrategy:
    type: OnDelete
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  labels:
    app: zookeeper-pdb
  name: zookeeper-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: zookeeper
---
apiVersion: v1
kind: Service
metadata:
  name: kafka-headless
spec:
  clusterIP: None
  ports:
  - name: broker
    port: 9092
    protocol: TCP
    targetPort: 9092
  selector:
    app: kafka
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: kafka
  name: kafka
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: kafka
  serviceName: kafka-headless
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - command:
        - sh
        - -exc
        - |
          unset KAFKA_PORT && \
          export KAFKA_BROKER_ID=${HOSTNAME##*-} && \
          export KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://${POD_IP}:9092 && \
          exec /etc/confluent/docker/run
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.podIP
        - name: KAFKA_HEAP_OPTS
          value: -Xmx1G -Xms1G
        - name: KAFKA_ZOOKEEPER_CONNECT
          value: leader-zookeeper:2181
        - name: KAFKA_LOG_DIRS
          value: /opt/kafka/data/logs
        - name: KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR
          value: "1"
        image: confluentinc/cp-kafka:latest
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - sh
            - -ec
            - /usr/bin/jps | /bin/grep -q SupportedKafka
          failureThreshold: 3
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: kafka-broker
        ports:
        - containerPort: 9092
          name: kafka
          protocol: TCP
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /opt/kafka/data
          name: datadir
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 60
  updateStrategy:
    type: OnDelete
  volumeClaimTemplates:
  - metadata:
      name: datadir
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  labels:
    app: kafka-pdb
  name: kafka-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: kafka
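For reference, this is how the leader-zookeeper Service can be cross-checked against the pods it is supposed to select (a sketch, assuming everything runs in the default namespace):

# does the Service have any backing endpoints?
kubectl get endpoints leader-zookeeper

# which labels do the ZooKeeper pods actually carry?
kubectl get pods -l app=zookeeper --show-labels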
And here is the log from the Kafka pod:
2019-12-18T22:59:44.6809224Z [main] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=leader-zookeeper:2181 sessionTimeout=40000 watcher=io.confluent.admin.utils.ZookeeperConnectionWatcher@1ddc4ec2
2019-12-18T22:59:44.6980203Z [main-SendThread(leader-zookeeper:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server leader-zookeeper/10.104.189.16:2181. Will not attempt to authenticate using SASL (unknown error)
2019-12-18T22:59:45.8097553Z [main-SendThread(leader-zookeeper:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket error occurred: leader-zookeeper/10.104.189.16:2181: Connection refused
2019-12-18T22:59:46.91975Z [main-SendThread(leader-zookeeper:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server leader-zookeeper/10.104.189.16:2181. Will not attempt to authenticate using SASL (unknown error)
2019-12-18T22:59:47.9634735Z [main-SendThread(leader-zookeeper:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket error occurred: leader-zookeeper/10.104.189.16:2181: Connection refused
2019-12-18T22:59:49.0653514Z [main-SendThread(leader-zookeeper:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server leader-zookeeper/10.104.189.16:2181. Will not attempt to authenticate using SASL (unknown error)
How do I get them to connect, since this connection is required for Kafka to work?
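For completeness, the same readiness check that the liveness probe uses can also be run by hand against the ZooKeeper pod (the pod name zookeeper-0 follows from the StatefulSet above):

# re-run the probe command from the manifest inside the ZooKeeper pod
kubectl exec zookeeper-0 -- sh -c "zookeeper-ready 2181"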