Error while deploying zookeeper services using kubernetes

3/13/2018

I am getting the following error while deploying ZooKeeper services in our bare-metal cluster using Kubernetes. The kube-dns service is running, but it still does not resolve the zookeeper-1 service.

2018-03-13 12:11:32,217 [myid:3] - WARN  [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumPeer$QuorumServer@155] - Failed to resolve address: zookeeper-1
java.net.UnknownHostException: zookeeper-1: Name or service not known
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:922)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
        at java.net.InetAddress.getAllByName0(InetAddress.java:1269)
        at java.net.InetAddress.getAllByName(InetAddress.java:1185)
        at java.net.InetAddress.getAllByName(InetAddress.java:1119)
        at java.net.InetAddress.getByName(InetAddress.java:1069)
        at org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer.recreateSocketAddresses(QuorumPeer.java:148)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:407)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:426)
        at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:843)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:822)

How can we resolve this issue?

-- rishi007bansod
dns
kubernetes

1 Answer

3/14/2018

In a StatefulSet, a Pod's DNS name does not exist until the related Pod has been created. In this case that probably means the Pod zookeeper-1 does not exist yet.

By default, a Pod in a StatefulSet is only started once the previous one has started and its readiness probe has succeeded. This suggests that zookeeper-0 is not ready yet because it cannot contact zookeeper-1. The startup behavior can be changed by setting the pod management policy to Parallel, as sketched below.
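For example, only the relevant field needs to be added to the StatefulSet spec (the names here simply match the manifests further below):

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: zookeeper
spec:
  serviceName: zookeeper-discovery
  podManagementPolicy: Parallel   # start all Pods at once instead of one after another
  replicas: 3
<-- snip -->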

It may be necessary to keep the ZooKeeper cluster unready (i.e. not receiving any traffic) until the cluster nodes have connected to each other, while at the same time the nodes must be able to reach each other before they can actually become ready.

We solved this by introducing a second service for ZooKeeper, which we call zookeeper-discovery. It carries the annotation service.alpha.kubernetes.io/tolerate-unready-endpoints set to "true", which means it ignores readiness for this purpose. All dependent services still use the default zookeeper service. It is important to reference the zookeeper-discovery service in the StatefulSet:

---
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
  name: zookeeper-discovery
  labels:
    app: zookeeper
spec:
  clusterIP: None
  selector:
    app: zookeeper
  ports:
  - port: 2181
    name: client
  - port: 2888
    name: peer
  - port: 3888
    name: leader-election
---
apiVersion: v1
kind: Service
metadata:
  name: zookeeper
  labels:
    app: zookeeper
spec:
  selector:
    app: zookeeper
  ports:
  - port: 2181
    name: client
---
apiVersion: apps/v1beta1
kind: StatefulSet

metadata:
  name: zookeeper
  labels:
    app: zookeeper

spec:
  serviceName: zookeeper-discovery
  replicas: 3

  template:
    metadata:
      name: zookeeper
      labels:
        app: zookeeper
    spec:
      containers:
<-- snip -->
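
Note that with serviceName: zookeeper-discovery, each Pod gets a stable DNS name of the form zookeeper-1.zookeeper-discovery.<namespace>.svc.cluster.local (reachable as zookeeper-1.zookeeper-discovery from Pods in the same namespace). Depending on how your ZooKeeper server list is generated, it may need to use these names rather than the bare Pod name.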
-- svenwltr
Source: StackOverflow