DNS entries for pods in not-ready state

1/15/2020

I'm trying to build a simple MongoDB replica set cluster in Kubernetes.

I have a StatefulSet of mongod instances, with

      livenessProbe:
        initialDelaySeconds: 60
        exec:
          command:
            - mongo
            - --eval
            - "db.adminCommand('ping')"
      readinessProbe:
        initialDelaySeconds: 60
        exec:
          command:
            - /bin/sh
            - -c
            # the check is a shell pipeline, so it needs a shell wrapper to run under an exec probe
            - /usr/bin/mongo --quiet --eval 'rs.status()' | grep ok | cut -d ':' -f 2 | tr -dc '0-9' | awk '{ if($0=="0"){ exit 127 }else{ exit 0 } }'

As you can see, my readinessProbe checks whether the mongo replica set is working correctly.

However, I get a circular dependency, with the (existing) cluster reporting:

        "lastHeartbeatMessage" : "Error connecting to mongo-2.mongo:27017 :: caused by :: Could not find address for mongo-2.mongo:27017: SocketException: Host not found (authoritative)",

(where mongo-2 was undergoing a rolling update).

Looking further:

$ kubectl  run --generator=run-pod/v1 tmp-shell --rm -i --tty --image nicolaka/netshoot -- /bin/bash

bash-5.0# nslookup mongo-2.mongo
Server:     10.96.0.10
Address:    10.96.0.10#53

** server can't find mongo-2.mongo: NXDOMAIN

bash-5.0# nslookup mongo-0.mongo
Server:     10.96.0.10
Address:    10.96.0.10#53

Name:   mongo-0.mongo.cryoem-logbook-dev.svc.cluster.local
Address: 10.27.137.6

So the question is whether there is a way to get Kubernetes to always keep the DNS entries for the mongo pods present. It appears I have a chicken-and-egg situation: if a pod hasn't passed its readiness and liveness checks, its DNS entry is not created, and hence the other mongod instances cannot reach it.
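
For reference, the mongo-N.mongo hostnames imply the StatefulSet is governed by a headless Service named mongo (not shown in the question). Below is a minimal sketch of such a Service, assuming a role: mongo label as used in the answer further down; the publishNotReadyAddresses field is a standard Service spec field that publishes DNS records for pods before they become Ready, shown here as an option rather than as the configuration actually in use:

apiVersion: v1
kind: Service
metadata:
  name: mongo
spec:
  clusterIP: None                  # headless: gives each pod its own mongo-N.mongo DNS record
  publishNotReadyAddresses: true   # publish DNS records even for pods that are not yet Ready
  ports:
  - port: 27017
    targetPort: 27017
  selector:
    role: mongo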

-- yee379
dns
kubernetes
mongodb

2 Answers

1/16/2020

I believe you are misinterpreting the error.

Could not find address for mongo-2.mongo:27017: SocketException: Host not found (authoritative)

The pod is created with an IP attached, and then it is registered in DNS:

Pod-0 has the IP 10.0.0.10 and now its FQDN is Pod-0.servicename.namespace.svc.cluster.local

Pod-1 has the IP 10.0.0.11 and now its FQDN is Pod-1.servicename.namespace.svc.cluster.local

Pod-2 has the IP 10.0.0.12 and now its FQDN is Pod-2.servicename.namespace.svc.cluster.local

But DNS is a live service: IPs are dynamically assigned and can't be duplicated. So whenever it receives a request:

"Connect me with Pod-A.servicename.namespace.svc.cluster.local"

It tries to reach the registered IP. If the pod is offline due to a rolling update, it will consider the pod unavailable and return "Could not find the address (IP) for Pod-0.servicename" until the pod is online again, or until the IP reservation expires and the DNS record is recycled.

DNS is not discarding the registered name; it is only answering that the pod is currently offline.

You can either ignore the errors during the rolling update, or rethink your script and use the mongo shell's internal JS environment, as mentioned in the comments, for continuous monitoring of the mongo status.
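
For example, a readiness check that stays inside the mongo shell's JS environment instead of shelling out to grep/cut/awk could look roughly like this (a sketch only, mirroring the probe layout from the question, not a tested configuration):

      readinessProbe:
        initialDelaySeconds: 60
        exec:
          command:
            - mongo
            - --quiet
            - --eval
            - "if (rs.status().ok != 1) { quit(1) }"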

EDIT:

  • When Pods from a StatefulSet with N replicas are being deployed, they are created sequentially, in order from {0..N-1}.
  • When Pods are being deleted, they are terminated in reverse order, from {N-1..0}.
  • This is the expected/desired default behavior.
  • So the error is expected, since the rolling update makes the pod temporarily unavailable (see the sketch below).
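
For context, these defaults correspond to the following StatefulSet spec fields (values shown are the Kubernetes defaults, not taken from the asker's manifest):

spec:
  podManagementPolicy: OrderedReady   # create pods one at a time, 0..N-1; delete in reverse order
  updateStrategy:
    type: RollingUpdate               # replace pods one at a time, from N-1 down to 0
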
-- willrof
Source: StackOverflow

1/27/2020

I ended up just putting in a ClusterIP Service for each of the StatefulSet instances, with a selector for the specific instance:

i.e.

apiVersion: v1
kind: Service
metadata:
  name: mongo-0
spec:
  clusterIP: 10.101.41.87
  ports:
  - port: 27017
    protocol: TCP
    targetPort: 27017
  selector:
    role: mongo
    statefulset.kubernetes.io/pod-name: mongo-0
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

and repeat for the other StatefulSet pods. The key here is the selector:

statefulset.kubernetes.io/pod-name: mongo-0
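
The Services for the remaining members would presumably differ only in the metadata name and the pod-name selector (the hard-coded clusterIP can also be left unset so Kubernetes assigns one). A sketch for mongo-1:

apiVersion: v1
kind: Service
metadata:
  name: mongo-1
spec:
  ports:
  - port: 27017
    protocol: TCP
    targetPort: 27017
  selector:
    role: mongo
    statefulset.kubernetes.io/pod-name: mongo-1
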
-- yee379
Source: StackOverflow