kubernetes dns search domain appended in some queries

6/24/2019

I have a Kubernetes cluster running with a master and 2 worker nodes.

root@kube-master:~# kubectl get nodes
NAME           STATUS   ROLES    AGE     VERSION
kube-master    Ready    master   4d19h   v1.14.3
kube-node-01   Ready    <none>   4d18h   v1.14.3
kube-node-02   Ready    <none>   6h3m    v1.14.3

Now my Traefik ingress controller is not able to resolve DNS queries.

/ # nslookup acme-v02.api.letsencrypt.org
nslookup: can't resolve '(null)': Name does not resolve

Name:      acme-v02.api.letsencrypt.org
Address 1: <my.public.ip> mail.xxx.xxx

With tcpdump on my OPNsense box I can see the queries arriving with my internal search domain appended, and they resolve to my public IP, which is wrong.
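
For reference, a capture along these lines is enough to see it (the interface name igb0 is only an example):

tcpdump -ni igb0 udp port 53
# the queries show up as e.g. acme-v02.api.letsencrypt.org.<internal-domain>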

But for some reason, the same lookup from a freshly spun-up busybox test pod works:

/ # nslookup acme-v02.api.letsencrypt.org
Server:    10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local

Name:      acme-v02.api.letsencrypt.org
Address 1: 2a02:26f0:ef:197::3a8e g2a02-26f0-00ef-0197-0000-0000-0000-3a8e.deploy.static.akamaitechnologies.com
Address 2: 2a02:26f0:ef:181::3a8e g2a02-26f0-00ef-0181-0000-0000-0000-3a8e.deploy.static.akamaitechnologies.com
Address 3: 104.74.120.43 a104-74-120-43.deploy.static.akamaitechnologies.com

Both /etc/resolv.conf files are the same except for the namespace.
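
For completeness, the two files can be dumped like this (namespace and pod names are placeholders):

kubectl -n <traefik-namespace> exec <traefik-pod> -- cat /etc/resolv.conf
kubectl exec <busybox-pod> -- cat /etc/resolv.conf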

Since Kubernetes 1.11, CoreDNS is the default DNS resolver. The documentation page on debugging DNS resolution with CoreDNS says I should use

root@kube-master:~# kubectl get pods --namespace=kube-system -l k8s-app=coredns
No resources found.

But this does not return anything! Using the k8s-app=kube-dns label instead returns the CoreDNS pods:

root@kube-master:~# kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME                      READY   STATUS    RESTARTS   AGE
coredns-fb8b8dccf-jmhdm   1/1     Running   5          4d19h
coredns-fb8b8dccf-tfw7v   1/1     Running   5          4d19h

What's going on here?! Is the documentation wrong, or is something broken inside my cluster?

-- Pascal
coredns
kubernetes

2 Answers

7/30/2019

The default ndots value is 5. This means that if the name contains fewer than 5 dots, the resolver will try it sequentially against all local search domains first and, only if none of those succeed, will finally resolve it as an absolute name.
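
As an illustration, assume a pod /etc/resolv.conf like this (values are only an example):

search default.svc.cluster.local svc.cluster.local cluster.local <internal-domain>
nameserver 10.96.0.10
options ndots:5

A lookup for acme-v02.api.letsencrypt.org contains only 4 dots, so it is expanded with every search domain before the absolute name is tried:

acme-v02.api.letsencrypt.org.default.svc.cluster.local
acme-v02.api.letsencrypt.org.svc.cluster.local
acme-v02.api.letsencrypt.org.cluster.local
acme-v02.api.letsencrypt.org.<internal-domain>   <- the query that hits the upstream resolver
acme-v02.api.letsencrypt.org.                    <- absolute name, tried last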

-- Hang Du
Source: StackOverflow

6/27/2019

I will show and explain this using an nginx ingress controller as an example. I believe the situation with the Traefik ingress controller is the same.

First of all, regarding the kube-dns and CoreDNS confusion you are describing: this is by design. You can refer to the GitHub issue "coredns is still labeled as kube-dns" to read more.

In my cluster I also have a CoreDNS service that is called kube-dns, and it refers to the CoreDNS pods that carry the k8s-app=kube-dns label:

kubectl describe service kube-dns -n kube-system
Name:              kube-dns
Namespace:         kube-system
Labels:            k8s-app=kube-dns
                   kubernetes.io/cluster-service=true
                   kubernetes.io/name=KubeDNS
Annotations:       prometheus.io/port: 9153
                   prometheus.io/scrape: true
Selector:          k8s-app=kube-dns
Type:              ClusterIP
IP:                10.96.0.10
Port:              dns  53/UDP
TargetPort:        53/UDP
Endpoints:         10.32.0.2:53,10.32.0.9:53
Port:              dns-tcp  53/TCP
TargetPort:        53/TCP
Endpoints:         10.32.0.2:53,10.32.0.9:53
Port:              metrics  9153/TCP
TargetPort:        9153/TCP
Endpoints:         10.32.0.2:9153,10.32.0.9:9153
Session Affinity:  None
Events:            <none>

kubectl get pods -n kube-system -l k8s-app=kube-dns -o wide
NAME                      READY   STATUS    RESTARTS   AGE     IP          NODE                     NOMINATED NODE   READINESS GATES
coredns-fb8b8dccf-42285   1/1     Running   0          3h26m   10.32.0.9   kubernetessandbox-1-vm   <none>           <none>
coredns-fb8b8dccf-87j5v   1/1     Running   0          3h26m   10.32.0.2   kubernetessandbox-1-vm   <none>           <none>

When I spin up a new busybox pod, it has an /etc/resolv.conf that points to the kube-dns service (10.96.0.10) and has the correct search domains:

cat /etc/resolv.conf
search kube-system.svc.cluster.local svc.cluster.local cluster.local c.myproj.internal. google.internal.
nameserver 10.96.0.10
options ndots:5

But at the same time my nginx ingress controller pod has nameserver 169.254.169.254 and is not even able to nslookup kubernetes.default:

cat /etc/resolv.conf
search c.myproj.internal. google.internal.
nameserver 169.254.169.254

I'm not sure what you have in /etc/resolv.conf on the Traefik pod, but the issue is there, and that /etc/resolv.conf comes from your node.

Setting dnsPolicy: ClusterFirstWithHostNet instead of dnsPolicy: ClusterFirst should solve this issue if the ingress controller uses hostNetwork.

From dns-pod-service documentation:

"ClusterFirstWithHostNet": For Pods running with hostNetwork, you should explicitly set its DNS policy "ClusterFirstWithHostNet".

After editing the nginx-ingress-controller deployment from

  dnsPolicy: ClusterFirst
  hostNetwork: true

to

  dnsPolicy: ClusterFirstWithHostNet
  hostNetwork: true

the pod was recreated with the desired /etc/resolv.conf:

cat /etc/resolv.conf
search kube-system.svc.cluster.local svc.cluster.local cluster.local c.myproj.internal. google.internal.
nameserver 10.96.0.10
options ndots:5

nslookup kubernetes.default
Server:         10.96.0.10
Address:        10.96.0.10#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.96.0.1
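
For Traefik the same change should work. A sketch, assuming the deployment is called traefik-ingress-controller and lives in kube-system (adjust both to your setup):

kubectl -n kube-system patch deployment traefik-ingress-controller \
  --type=json \
  -p='[{"op": "replace", "path": "/spec/template/spec/dnsPolicy", "value": "ClusterFirstWithHostNet"}]'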

A few URLs for you with hostNetwork/dnsPolicy-related issues and explanations. This is an important part of configuring Traefik correctly:

1) Traefik on k8s not listening externally without changing deployment

2) Stack traefik question

3) Ingress with Traefik article:

dnsPolicy: ClusterFirstWithHostNet

This setting is important. It will configure the Traefik pods to use the Kubernetes cluster-internal DNS server (most likely KubeDNS or maybe CoreDNS). That means the pods' /etc/resolv.conf will be configured to use the Kubernetes DNS server. Otherwise the DNS server of the Kubernetes node will be used (basically the worker node's /etc/resolv.conf, which can't resolve cluster.local DNS names, for example).
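
After Traefik comes back up with that policy, the original lookup can be re-checked from inside the pod (namespace and pod name are placeholders):

kubectl -n <traefik-namespace> exec -it <traefik-pod> -- nslookup acme-v02.api.letsencrypt.org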

Hope it helps

-- VKR
Source: StackOverflow