I have the following Kubernetes YAML with a StatefulSet I use to deploy a PostgreSQL cluster with Patroni. However, my question is about how Kubernetes registers Pod names in CoreDNS.
According to this documentation, in the Stable Network ID section, if I create a Headless Service called spilodemo-svc
for my Pods, I can access them using the short hostname (podname.servicename):
spilodemo-0.spilodemo-svc
My setup worked properly for a long time on a K8s cluster deployed with kubeadm on VirtualBox and Vagrant. Today I tried to deploy it on IBM Cloud, but the hostname above didn't resolve. The strange thing is that when I repeated my tests on Vagrant/VirtualBox, the hostname no longer resolved there either, and I do not know why.
The YAML deploys Spilo, an open-source project developed by Zalando that packages Patroni and PostgreSQL in a Docker image. My code comes from their example here.
Basically, they create a ClusterIP Service (not a Headless one) with no selector. Under these conditions, Kubernetes doesn't create an Endpoints object for it. For this reason, the YAML also contains an Endpoints object with the same name as the Service (it seems this naming convention is the binding Kubernetes expects).
Spilo has Python code that keeps this Endpoints object updated with the IP of the primary node.
The StatefulSet has the field serviceName equal to the name of the Service:
serviceName: spilodemo-svc
and, according to the documentation, this guarantees that Kubernetes creates an entry in CoreDNS for this short hostname (podname.servicename):
spilodemo-0.spilodemo-svc
and it worked for a long time, until today, even though nothing changed in the meanwhile. To be honest, I never fully understood how the DNS name spilodemo-0.spilodemo-svc
worked so far, since it uses a ClusterIP Service instead of a Headless one.
Another strange thing is that the Zalando team uses a second Headless Service, which I called spilodemo-config
and which, according to a comment in their code, should prevent Kubernetes from deleting the Endpoints object, but this doesn't make much sense to me.
However, today I also tried to convert the Service into a Headless one and remove the spilodemo-config
Service, but no luck. Kubernetes only creates the CoreDNS entry for the Service itself:
spilodemo.spilons.svc.cluster.local
but not the one for each Pod:
spilodemo-0.spilodemo-svc
spilodemo-1.spilodemo-svc
spilodemo-2.spilodemo-svc
Can anyone help me figure out what's going on with my YAML file and how I can get the three short hostnames above working in CoreDNS?
PS: On Stack Overflow I found these discussions: Hostname of pods in same StatefulSet can not be resolved and Stateful Pod hostname doesn't resolve, but they don't address my issue.
1: https://github.com/sasadangelo/patroni-k8s/blob/main/kustomize/spilo/iks/spilo.yaml
2: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
3: https://github.com/zalando/spilo/blob/master/kubernetes/spilo_kubernetes.yaml
4: https://stackoverflow.com/questions/67285052/hostname-of-pods-in-same-statefulset-can-not-be-resolved
5: https://stackoverflow.com/questions/58815171/stateful-pod-hostname-doesnt-resolve
After almost three days of tests, I found a solution. It depends on understanding two things: 1. how Kubernetes works; 2. how Patroni works.
How Kubernetes Works
When you create a StatefulSet (but this is also true for a Deployment), let's say with 3 Pods, Kubernetes registers in CoreDNS three DNS names of the form:
<pod-IP-with-dashes>.<namespace>.pod.cluster.local
(for example, a Pod with IP 10.244.1.5 in namespace spilons would get 10-244-1-5.spilons.pod.cluster.local). However, these names are useless for me because I cannot know them in advance in my YAML files: they depend on the IPs Kubernetes assigns to the Pods.
However, for StatefulSets, according to this documentation (Stable Network ID section), if I create a Headless Service for my Pods, I can access them using the short hostname (podname.servicename) or the FQDN (<podname>.<servicename>.<namespace>.svc.cluster.local).
Here is the Headless service I needed to create:
---
apiVersion: v1
kind: Service
metadata:
  name: spilodemo-svc
  labels:
    application: spilo
    spilo-cluster: spilodemo
spec:
  clusterIP: None
  selector:
    application: spilo
    spilo-cluster: spilodemo
It's important here to set the selector so that it matches all three Pods. Another important thing is to set the following field in your StatefulSet to the name of the Headless Service (see the sketch right after it):
serviceName: spilodemo-svc
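For reference, this is roughly how the relevant parts of the StatefulSet fit together. It is only a sketch: the container spec is a placeholder, and everything else comes from the Spilo example unchanged.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: spilodemo
spec:
  serviceName: spilodemo-svc      # must match the Headless Service name
  replicas: 3
  selector:
    matchLabels:
      application: spilo
      spilo-cluster: spilodemo
  template:
    metadata:
      labels:
        application: spilo        # these are the labels the Headless Service selector matches
        spilo-cluster: spilodemo
    spec:
      containers:
      - name: spilo
        image: spilo              # placeholder; use the Spilo image from the Zalando example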
This is the Kubernetes part. Now you can reference your Pods with DNS names:
spilodemo-0.spilodemo-svc
spilodemo-1.spilodemo-svc
spilodemo-2.spilodemo-svc
or the FQDNs:
spilodemo-0.spilodemo-svc.<namespace>.svc.cluster.local
spilodemo-1.spilodemo-svc.<namespace>.svc.cluster.local
spilodemo-2.spilodemo-svc.<namespace>.svc.cluster.local
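If you want to verify that the records actually exist, one way is to run a throwaway Pod that resolves one of the names from inside the cluster. This is only a sketch, with busybox used as a convenient image and spilons assumed as the namespace:
apiVersion: v1
kind: Pod
metadata:
  name: dnscheck                  # throwaway Pod, delete it after checking
  namespace: spilons
spec:
  restartPolicy: Never
  containers:
  - name: dnscheck
    image: busybox:1.36
    # using the FQDN avoids depending on the Pod's DNS search list
    command: ["nslookup", "spilodemo-0.spilodemo-svc.spilons.svc.cluster.local"]
Then kubectl logs dnscheck shows whether the name resolves to the IP of spilodemo-0.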
How Patroni Works
However, the Pods' DNS names are not enough for clients, because clients need a single point of access. For this reason, the Patroni team suggests creating a ClusterIP Service like this:
---
apiVersion: v1
kind: Service
metadata:
  name: spilodemo
  labels:
    application: spilo
    spilo-cluster: spilodemo
spec:
  type: ClusterIP
  ports:
  - name: postgresql
    port: 5432
    targetPort: 5432
Note: there is no selector. This is not an error. When you create a Service like this, Kubernetes creates a ClusterIP Service (so it can be referenced using an IP or a hostname) but without an Endpoints object. This means that if you connect to its IP or to its DNS name, spilodemo.<namespace>.svc.cluster.local, the connection hangs.
For this reason, the Patroni team asks you to add to your YAML file the following Endpoints object, with the same name as the ClusterIP Service:
apiVersion: v1
kind: Endpoints
metadata:
  name: spilodemo
  labels:
    application: spilo
    spilo-cluster: spilodemo
subsets: []
Internally, Patroni has a piece of Python code that, via the Kubernetes API, keeps this Endpoints object updated with the IP of the master Pod. Patroni finds the Endpoints object to update using the labels above (application, spilo-cluster), which you can even customize.
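To make the mechanism concrete, this is roughly what the Endpoints object looks like once Patroni has filled it in; the IP and Pod name below are made up for illustration:
apiVersion: v1
kind: Endpoints
metadata:
  name: spilodemo
  labels:
    application: spilo
    spilo-cluster: spilodemo
subsets:
- addresses:
  - ip: 10.244.1.5                # hypothetical IP of the current primary Pod
    targetRef:
      kind: Pod
      name: spilodemo-0           # assuming spilodemo-0 is currently the primary
  ports:
  - name: postgresql
    port: 5432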
At this point, clients of the Patroni cluster only need to use this DNS name (the ClusterIP one) or the corresponding IP:
spilodemo.spilons.svc.cluster.local
and the connection is automatically routed to the IP of the master Pod.
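For example, a client only needs to point at that Service name. Here is a minimal sketch of a hypothetical client Pod; the name, image, and namespace are just examples:
apiVersion: v1
kind: Pod
metadata:
  name: psql-client               # hypothetical client Pod, for illustration only
  namespace: spilons
spec:
  containers:
  - name: psql
    image: postgres:14            # any image that ships the psql client works
    command: ["sleep", "infinity"]
    env:
    - name: PGHOST
      value: spilodemo.spilons.svc.cluster.local   # the ClusterIP Service; traffic goes to the primary via the Endpoints Patroni maintains
    - name: PGPORT
      value: "5432"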
So far so good. Now the confusing part. If you look at the Patroni Kubernetes sample file in the Spilo code, you'll notice that another Headless Service was already present.
---
# headless service to avoid deletion of patronidemo-config endpoint
apiVersion: v1
kind: Service
metadata:
  name: spilodemo-config
  labels:
    application: spilo
    spilo-cluster: spilodemo
spec:
  clusterIP: None
What confused me was the presence of this Headless Service; I didn't understand its purpose. At first I thought it was the Headless Service required to get the Pod DNS names mentioned above, but I was wrong: the purpose of this Service is different. Basically, the Zalando team doesn't know how users write their YAML files to deploy Patroni. If a user creates the Endpoints object but forgets to associate a Service with it, Kubernetes sees it as an orphan and deletes it. For this reason, the Patroni code creates this Service on its own: if you don't define it in the YAML file, Patroni will create it for you.
So, if Patroni creates it for you, why do they include it in the sample YAML above? The reason is permissions: if the Pod doesn't have permission to create Services, it cannot create it, so they ship it in the YAML. It's a bit confusing, but this is the whole story.
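For completeness, letting Patroni create the spilodemo-config Service itself requires the Pod's ServiceAccount to have RBAC permissions along these lines. This is only a sketch of the kind of Role involved (the exact resources and verbs should be taken from the Patroni/Spilo documentation), and it still needs a RoleBinding to the ServiceAccount used by the StatefulSet:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: spilodemo-patroni         # hypothetical name, adjust to your deployment
rules:
- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
- apiGroups: [""]
  resources: ["services"]
  verbs: ["get", "create"]        # needed if Patroni has to create the -config Service itself
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch", "patch"]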