I'm trying to write an application where all pods are connected to each other. I have read in the documentation:
> The Pods’ ordinals, hostnames, SRV records, and A record names have not changed, but the IP addresses associated with the Pods may have changed. In the cluster used for this tutorial, they have. This is why it is important not to configure other applications to connect to Pods in a StatefulSet by IP address.
>
> If you need to find and connect to the active members of a StatefulSet, you should query the CNAME of the Headless Service (`nginx.default.svc.cluster.local`). The SRV records associated with the CNAME will contain only the Pods in the StatefulSet that are Running and Ready. If your application already implements connection logic that tests for liveness and readiness, you can use the SRV records of the Pods (`web-0.nginx.default.svc.cluster.local`, `web-1.nginx.default.svc.cluster.local`), as they are stable, and your application will be able to discover the Pods’ addresses when they transition to Running and Ready.
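For illustration, querying those SRV records from inside the cluster could look like this minimal Go sketch (the `nginx` and `web` names come from the quoted documentation, not from my manifests):

```go
package main

import (
	"fmt"
	"net"
)

func main() {
	// LookupSRV("web", "tcp", "nginx.default.svc.cluster.local") queries
	// _web._tcp.nginx.default.svc.cluster.local, which lists only the
	// Pods of the StatefulSet that are Running and Ready.
	cname, srvs, err := net.LookupSRV("web", "tcp", "nginx.default.svc.cluster.local")
	if err != nil {
		fmt.Println("SRV lookup failed:", err)
		return
	}
	fmt.Println("CNAME:", cname)
	for _, srv := range srvs {
		// Each target is a stable per-Pod name such as
		// web-0.nginx.default.svc.cluster.local.
		fmt.Printf("%s:%d\n", srv.Target, srv.Port)
	}
}
```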
I thought I could do it in the following way, with the manifests below. However, when I started implementing it on minikube, it seemed to be racy when I queried the A/SRV records (the exact queries are shown in the edit below). It seems to me that there is a race between the updating of DNS records and StatefulSet startup. I'm not entirely sure what I'm doing wrong or how I misunderstood the documentation.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: hello-world-lb
  labels:
    app: hello-world-lb
spec:
  ports:
  - port: 8080
    name: web
  type: LoadBalancer
  selector:
    app: hello-world
---
apiVersion: v1
kind: Service
metadata:
  name: hello-world
  labels:
    app: hello-world
spec:
  ports:
  - port: 8080
    name: web
  - port: 3080
    name: hello-world
  clusterIP: None
  selector:
    app: hello-world
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: hello-world
spec:
  selector:
    matchLabels:
      app: hello-world
  serviceName: "hello-world"
  replicas: 3
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: hello-world
        image: hello-world
        imagePullPolicy: Never
        ports:
        - containerPort: 8080
          name: web
        - containerPort: 3080
          name: hello-world
        livenessProbe:
          tcpSocket:
            port: 8080
```
EDIT: Currently the code queries the following records:

```
hello-world.default.svc.cluster.local.
_hello-world._tcp.hello-world.default.svc.cluster.local.
```

and prints them for debugging (a sketch of these lookups is shown below). I expected that the A/SRV records for `hello-world-0` would be empty, that those for `hello-world-1` would contain `hello-world-0`, and that those for `hello-world-N+1` would contain `hello-world-0` through `hello-world-N`. During a rolling update the A/SRV records would contain all other peers.
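A minimal sketch of what that debug code might look like (assuming Go and the default in-cluster resolver; the real application code was not shown in the question):

```go
package main

import (
	"fmt"
	"net"
)

func main() {
	// A records of the headless Service: one address per Ready Pod.
	addrs, err := net.LookupHost("hello-world.default.svc.cluster.local")
	fmt.Println("A records:", addrs, "err:", err)

	// SRV records: this queries
	// _hello-world._tcp.hello-world.default.svc.cluster.local.
	_, srvs, err := net.LookupSRV("hello-world", "tcp", "hello-world.default.svc.cluster.local")
	if err != nil {
		fmt.Println("SRV lookup failed:", err)
		return
	}
	for _, srv := range srvs {
		fmt.Printf("SRV: %s:%d\n", srv.Target, srv.Port)
	}
}
```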
However, it seems that the DNS records are updated asynchronously, so even when the liveness of pod `n` has been detected and pod `n + 1` has been started, it is not guaranteed that pod `n + 1` will see the address of pod `n` in DNS.
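Until the records converge, one workaround on the application side is to poll DNS for the expected peer instead of assuming it resolves immediately. A hedged sketch (the `waitForPeer` helper, timeout, and poll interval are illustrative choices, not part of the question's code):

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// waitForPeer polls DNS until the given per-Pod name resolves, since
// endpoint changes propagate to cluster DNS asynchronously.
func waitForPeer(name string, timeout time.Duration) ([]string, error) {
	deadline := time.Now().Add(timeout)
	for {
		addrs, err := net.LookupHost(name)
		if err == nil && len(addrs) > 0 {
			return addrs, nil
		}
		if time.Now().After(deadline) {
			return nil, fmt.Errorf("peer %s not resolvable after %s: %v", name, timeout, err)
		}
		time.Sleep(2 * time.Second)
	}
}

func main() {
	addrs, err := waitForPeer("hello-world-0.hello-world.default.svc.cluster.local", time.Minute)
	fmt.Println(addrs, err)
}
```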
Add the annotation below to the headless Service definition, so that DNS records are published even for Pods that are not yet Ready:

```yaml
annotations:
  service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
```
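For reference, this alpha annotation has since been deprecated in favor of the `publishNotReadyAddresses` field in the Service spec; applied to the headless Service from the question it would look roughly like this:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: hello-world
  labels:
    app: hello-world
spec:
  clusterIP: None
  # Publish per-Pod DNS records before the Pods become Ready, so peers
  # can discover each other during ordered StatefulSet startup.
  publishNotReadyAddresses: true
  ports:
  - port: 8080
    name: web
  - port: 3080
    name: hello-world
  selector:
    app: hello-world
```

This removes the chicken-and-egg problem during ordered startup: each Pod can resolve its peers even before they pass their readiness checks.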