I'm having a hard time setting up Kafka on GKE. What is the best way to set it up?

5/22/2019

I was trying to use a StatefulSet to deploy ZooKeeper and Kafka in a GKE cluster, but communication between Kafka and ZooKeeper fails with an error message in the logs. I'd like to know the easiest way to set up Kafka on Kubernetes.

I've tried the following configurations, and Kafka fails to communicate with ZooKeeper, but I am not sure why. I know that I may need a headless service because the communication is handled by Kafka and ZooKeeper themselves.

For Zookeeper

apiVersion: v1
kind: Service
metadata:
  name: zookeeper
spec:
  type: LoadBalancer
  selector:
    app: zookeeper
  ports:
  - port: 2181
    targetPort: client
    name: zk-port
  - port: 2888
    targetPort: leader
    name: zk-leader
  - port: 3888
    targetPort: election
    name: zk-election
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zookeeper
spec:
  replicas: 3
  selector:
    matchLabels:
      app: zookeeper
  serviceName: zookeeper
  podManagementPolicy: Parallel
  template:
    metadata:
      labels:
        app: zookeeper
    spec:
      containers:
        - name: zk-pod
          image: zookeeper:latest
          imagePullPolicy: Always
          ports:
            - name: client
              containerPort: 2181
            - name: leader
              containerPort: 2888
            - name: election
              containerPort: 3888
          env:
            - name: ZOO_MY_ID
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: ZOO_TICK_TIME
              value: "2000"
            - name: ZOO_INIT_LIMIT
              value: "5"
            - name: ZOO_SYNC_LIMIT
              value: "2"
            - name: ZOO_SERVERS
              value: zookeeper:2888:3888

For Kafka

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  serviceName: kafka-svc
  podManagementPolicy: Parallel
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - name: kafka
        image: confluentinc/cp-kafka:latest
        ports:
          - containerPort: 9092
            name: client
        env:
          - name: KAFKA_ZOOKEEPER_CONNECT
            value: zookeeper:2181
          - name: KAFKA_ADVERTISED_LISTENERS
            value: kafka.default.svc.cluster.local:9092
---
apiVersion: v1
kind: Service
metadata:
  name: kafka-svc
spec:
  type: LoadBalancer
  selector:
    app: kafka
  ports:
  - port: 9092
    targetPort: client
    name: kfk-port
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: kafka-pdb
spec:
  selector:
    matchLabels:
      app: kafka
  minAvailable: 2

I'd like to be able to send messages to a topic and read them back. I've been using kafkacat to test the connection.

-- Ani Aggarwal
apache-kafka
google-kubernetes-engine
kubernetes

1 Answer

5/22/2019

This is one of the limitations listed in the official Kubernetes documentation for StatefulSets:

  • StatefulSets currently require a Headless Service to be responsible for the network identity of the Pods. You are responsible for creating this Service.

So, as you mentioned, you need a headless Service. You can add a headless Service manifest like the one below to the top of your configuration for each of your StatefulSets:

apiVersion: v1
kind: Service
metadata:
  name: zookeeper
  labels:
    app: zookeeper
spec:
  ports:
  - port: 2181
    name: someport
  clusterIP: None
  selector:
    app: zookeeper
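
The same pattern applies to the Kafka StatefulSet. As a rough sketch (not part of the original answer), a headless Service for it might look like the following. The name kafka-svc, the app: kafka label, and port 9092 are taken from the question's manifests; the port name client is only illustrative. Since a Service name must be unique within a namespace, a manifest like this would take the place of the LoadBalancer Service of the same name rather than sit alongside it.

```yaml
# Sketch only: headless Service for the Kafka StatefulSet.
# Name, label, and port are taken from the question's manifests.
apiVersion: v1
kind: Service
metadata:
  name: kafka-svc          # matches serviceName in the kafka StatefulSet
  labels:
    app: kafka
spec:
  clusterIP: None          # headless: no virtual IP, DNS returns the pod IPs
  selector:
    app: kafka
  ports:
  - port: 9092
    name: client           # illustrative port name
```

With headless Services in place, each pod gets a stable DNS name of the form <pod-name>.<service-name>.<namespace>.svc.<cluster-domain>, for example zookeeper-0.zookeeper.default.svc.cluster.local or kafka-0.kafka-svc.default.svc.cluster.local, assuming the default namespace and the default cluster.local domain. One possible way to use those names in the Kafka container's env section is sketched below. POD_NAME is a hypothetical variable name introduced here, and the PLAINTEXT:// prefix is the listener format Kafka expects for advertised listeners.

```yaml
# Fragment for the kafka container spec; a sketch under the assumptions above.
env:
  - name: POD_NAME                       # hypothetical helper variable
    valueFrom:
      fieldRef:
        fieldPath: metadata.name         # e.g. kafka-0
  - name: KAFKA_ZOOKEEPER_CONNECT
    value: zookeeper-0.zookeeper:2181,zookeeper-1.zookeeper:2181,zookeeper-2.zookeeper:2181
  - name: KAFKA_ADVERTISED_LISTENERS
    # $(POD_NAME) is expanded by Kubernetes because it is defined earlier in this list
    value: PLAINTEXT://$(POD_NAME).kafka-svc.default.svc.cluster.local:9092
```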

Hope it helps!

-- coolinuxoid
Source: StackOverflow