How to solve race between pod startup in stateful set and service DNS lookup

5/26/2019

I'm trying to write an application where all pods are connected to each other. I have read in the documentation:

The Pods’ ordinals, hostnames, SRV records, and A record names have not changed, but the IP addresses associated with the Pods may have changed. In the cluster used for this tutorial, they have. This is why it is important not to configure other applications to connect to Pods in a StatefulSet by IP address.

If you need to find and connect to the active members of a StatefulSet, you should query the CNAME of the Headless Service (nginx.default.svc.cluster.local). The SRV records associated with the CNAME will contain only the Pods in the StatefulSet that are Running and Ready.

If your application already implements connection logic that tests for liveness and readiness, you can use the SRV records of the Pods ( web-0.nginx.default.svc.cluster.local, web-1.nginx.default.svc.cluster.local), as they are stable, and your application will be able to discover the Pods’ addresses when they transition to Running and Ready.

I thought I could do it in the following way:

  • Look up the SRV records of the service to check which pods are ready
  • Connect to all ready pods
  • Open the port which signifies readiness

However, when I started implementing it on minikube, it seemed to be racy. When I query the A/SRV records:

  • On the first pod I get a "no records found" error (which sounds OK) before I open the port
  • On the second pod I sometimes get no records and sometimes one record
  • On the third pod I sometimes get one record and sometimes two records

It seems to me that there is a race between the updating of the DNS records and the StatefulSet startup. I'm not entirely sure what I'm doing wrong or how I misunderstood the documentation.

apiVersion: v1
kind: Service
metadata:
  name: hello-world-lb
  labels:
    app: hello-world-lb
spec:
  ports:
  - port: 8080
    name: web
  type: LoadBalancer
  selector:
    app: hello-world
---
apiVersion: v1
kind: Service
metadata:
  name: hello-world
  labels:
    app: hello-world
spec:
  ports:
  - port: 8080
    name: web
  - port: 3080
    name: hello-world
  clusterIP: None
  selector:
    app: hello-world
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: hello-world
spec:
  selector:
    matchLabels:
      app: hello-world
  serviceName: "hello-world"
  replicas: 3
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: hello-world
        image: hello-world
        imagePullPolicy: Never
        ports:
        - containerPort: 8080
          name: web
        - containerPort: 3080
          name: hello-world
        livenessProbe:
          tcpSocket:
            port: 8080

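For context: the quoted documentation says the headless service's records only include Pods that are Running and Ready, and the manifest above defines only a livenessProbe, so nothing ties Readiness to the port that is supposed to signal it. A minimal sketch of the container fragment that would do that, assuming (as in the EDIT below) that opening port 8080 is the readiness signal:

        readinessProbe:        # sketch: pod becomes Ready only once the signalling port is open
          tcpSocket:
            port: 8080
          periodSeconds: 1
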
EDIT: Currently the code does the following:

  • Query the A/SRV records of hello-world.default.svc.cluster.local. / _hello-world._tcp.hello-world.default.svc.cluster.local. and print them for debugging (see the sketch below)
  • Bind to port 3080 and start listening (connection logic not implemented yet)
  • Open port 8080

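For debugging, the same records can also be checked from a throwaway pod that has dig/nslookup; a minimal sketch (the image is only an example of something that ships dnsutils):

apiVersion: v1
kind: Pod
metadata:
  name: dns-debug
spec:
  containers:
  - name: dns-debug
    image: tutum/dnsutils        # assumption: any image with dig/nslookup works
    command: ["sleep", "3600"]

Running kubectl exec dns-debug -- dig SRV _hello-world._tcp.hello-world.default.svc.cluster.local then shows how the records change as the pods come up.
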
I expected that the A/SRV records would be empty for hello-world-0, would contain hello-world-0 for hello-world-1, and would contain hello-world-0 through hello-world-N for hello-world-N+1. During a rolling update the A/SRV records would contain all the other peers.

However, it seems that the DNS records are updated asynchronously, so even when the liveness of pod n has been detected and pod n + 1 has been started, it is not guaranteed that pod n + 1 will see the address of pod n in DNS.

-- Maciej Piechotka
dns
kubernetes
kubernetes-service
kubernetes-statefulset

1 Answer

5/27/2019

Add the annotation below to the headless service definition:

  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
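
For reference, on current Kubernetes versions the same switch is also available as the spec field publishNotReadyAddresses; the annotation above is the older, deprecated spelling of it. The headless service could then look roughly like this:

apiVersion: v1
kind: Service
metadata:
  name: hello-world
  labels:
    app: hello-world
spec:
  publishNotReadyAddresses: true   # publish pod addresses in DNS even before they are Ready
  ports:
  - port: 8080
    name: web
  - port: 3080
    name: hello-world
  clusterIP: None
  selector:
    app: hello-world
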
-- P Ekambaram
Source: StackOverflow