I am getting the following error while deploying ZooKeeper services on our bare-metal cluster with Kubernetes. The kube-dns service is running, but the zookeeper-1 service still cannot be resolved.
2018-03-13 12:11:32,217 [myid:3] - WARN [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumPeer$QuorumServer@155] - Failed to resolve address: zookeeper-1
java.net.UnknownHostException: zookeeper-1: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:922)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
at java.net.InetAddress.getAllByName0(InetAddress.java:1269)
at java.net.InetAddress.getAllByName(InetAddress.java:1185)
at java.net.InetAddress.getAllByName(InetAddress.java:1119)
at java.net.InetAddress.getByName(InetAddress.java:1069)
at org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer.recreateSocketAddresses(QuorumPeer.java:148)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:407)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:426)
at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:843)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:822)
How can we resolve this issue?
In a StatefulSet, a Pod's DNS name does not exist until the Pod itself has been created. Here that most likely means that the Pod zookeeper-1 does not exist yet.
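For example, with a headless service named zookeeper-discovery (as defined below) and assuming the default namespace, each Pod gets a DNS record of the form:

zookeeper-1.zookeeper-discovery.default.svc.cluster.local

This record only appears once the Pod exists (and, unless the service tolerates unready endpoints, is ready), so resolving the name before that fails with exactly the UnknownHostException shown above.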
By default, a Pod in a StatefulSet is only started once the previous one has started and its readiness probe has succeeded. The error therefore may mean that zookeeper-0 never becomes ready because it cannot contact zookeeper-1, which in turn is never created because zookeeper-0 is not ready. This startup behavior can be changed by setting the Pod Management Policy to Parallel, as sketched below.
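For reference, a minimal sketch of that setting; podManagementPolicy sits directly under the StatefulSet spec, next to serviceName and replicas:

spec:
  serviceName: zookeeper-discovery
  replicas: 3
  podManagementPolicy: Parallel  # create all Pods at once instead of waiting for each predecessor to become ready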
It might be necessary to keep the ZooKeeper cluster unready (i.e., receiving no traffic) until the cluster nodes have connected to each other, while at the same time the nodes must be able to reach each other before they can become ready. We solved this by introducing a second service for ZooKeeper, which we call zookeeper-discovery. It carries the annotation service.alpha.kubernetes.io/tolerate-unready-endpoints set to "true", meaning its endpoints are published regardless of readiness (newer Kubernetes versions use the service spec field publishNotReadyAddresses: true instead of this annotation). All dependent services still use the default zookeeper service. It is important to reference the zookeeper-discovery service in the StatefulSet:
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
  name: zookeeper-discovery
  labels:
    app: zookeeper
spec:
  clusterIP: None
  selector:
    app: zookeeper
  ports:
    - port: 2181
      name: client
    - port: 2888
      name: peer
    - port: 3888
      name: leader-election
---
apiVersion: v1
kind: Service
metadata:
  name: zookeeper
  labels:
    app: zookeeper
spec:
  selector:
    app: zookeeper
  ports:
    - port: 2181
      name: client
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: zookeeper
  labels:
    app: zookeeper
spec:
  serviceName: zookeeper-discovery
  replicas: 3
  template:
    metadata:
      name: zookeeper
      labels:
        app: zookeeper
    spec:
      containers:
        <-- snip -->
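The container section is snipped above; purely for illustration, here is a minimal sketch of what it could look like, assuming the official zookeeper:3.4 image (whose entrypoint builds zoo.cfg from the ZOO_SERVERS environment variable) and the default namespace. The readiness probe uses ZooKeeper's ruok four-letter command and assumes nc is available in the image; none of these details come from the original answer:

      containers:
        - name: zookeeper
          image: zookeeper:3.4  # assumed image
          ports:
            - containerPort: 2181
              name: client
            - containerPort: 2888
              name: peer
            - containerPort: 3888
              name: leader-election
          env:
            # Peers are addressed through the zookeeper-discovery headless
            # service, so the names resolve even while the Pods are unready.
            # Each server additionally needs a unique myid (ZOO_MY_ID), usually
            # derived from the Pod ordinal in a startup script; omitted here.
            - name: ZOO_SERVERS
              value: "server.1=zookeeper-0.zookeeper-discovery.default.svc.cluster.local:2888:3888 server.2=zookeeper-1.zookeeper-discovery.default.svc.cluster.local:2888:3888 server.3=zookeeper-2.zookeeper-discovery.default.svc.cluster.local:2888:3888"
          readinessProbe:
            exec:
              command: ["sh", "-c", "echo ruok | nc localhost 2181 | grep imok"]
            initialDelaySeconds: 10

With Parallel Pod management or the tolerant discovery service in place, all three Pods can come up, their DNS records appear, and leader election can complete.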