I am running a Kubernetes cluster with a master and 2 worker nodes.
root@kube-master:~# kubectl get nodes
NAME           STATUS   ROLES    AGE     VERSION
kube-master    Ready    master   4d19h   v1.14.3
kube-node-01   Ready    <none>   4d18h   v1.14.3
kube-node-02   Ready    <none>   6h3m    v1.14.3
Now my Traefik ingress controller is not able to resolve DNS queries.
/ # nslookup acme-v02.api.letsencrypt.org
nslookup: can't resolve '(null)': Name does not resolve
Name: acme-v02.api.letsencrypt.org
Address 1: <my.public.ip> mail.xxx.xxx
With tcpdump on my OPNsense box I can see the queries arrive with my internal search domain appended, and they resolve to my public IP, which is wrong.
But for some reason ... spinning up a busybox test pod is working ...
/ # nslookup acme-v02.api.letsencrypt.org
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: acme-v02.api.letsencrypt.org
Address 1: 2a02:26f0:ef:197::3a8e g2a02-26f0-00ef-0197-0000-0000-0000-3a8e.deploy.static.akamaitechnologies.com
Address 2: 2a02:26f0:ef:181::3a8e g2a02-26f0-00ef-0181-0000-0000-0000-3a8e.deploy.static.akamaitechnologies.com
Address 3: 104.74.120.43 a104-74-120-43.deploy.static.akamaitechnologies.com
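For reference, a throwaway test pod like this can be spun up with something along these lines (a sketch; busybox:1.28 is the image the Kubernetes DNS debugging docs use, since nslookup output differs in newer busybox builds):

kubectl run busybox --image=busybox:1.28 --rm -it --restart=Never -- nslookup acme-v02.api.letsencrypt.org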
Both /etc/resolv.conf files are the same except for the namespace.
Since Kubernetes 1.11, CoreDNS is the default DNS system. The documentation page on debugging DNS resolution with CoreDNS says I should use:
root@kube-master:~# kubectl get pods --namespace=kube-system -l k8s-app=coredns
No resources found.
But this does not return anything! Using the kube-dns label returns the coredns pods:
root@kube-master:~# kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME                      READY   STATUS    RESTARTS   AGE
coredns-fb8b8dccf-jmhdm   1/1     Running   5          4d19h
coredns-fb8b8dccf-tfw7v   1/1     Running   5          4d19h
What's going on here?! Is the documentation wrong, or is something off inside my cluster?
The default ndots value in a Kubernetes pod's resolv.conf is 5. This means that if a name contains fewer than 5 dots, the resolver first tries it sequentially against all local search domains and, only if none of those succeed, finally resolves it as an absolute name.
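As a concrete sketch, assuming the cluster search path and options ndots:5 from the busybox pod's resolv.conf shown further down: acme-v02.api.letsencrypt.org contains only 3 dots, so the resolver tries the search domains first and the absolute name last:

acme-v02.api.letsencrypt.org.kube-system.svc.cluster.local
acme-v02.api.letsencrypt.org.svc.cluster.local
acme-v02.api.letsencrypt.org.cluster.local
acme-v02.api.letsencrypt.org.c.myproj.internal
acme-v02.api.letsencrypt.org.google.internal
acme-v02.api.letsencrypt.org

If an upstream server answers one of the search-domain variants (for example via a wildcard record for the internal domain), the absolute name is never queried, which matches the wrong public IP seen in the question.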
I will show and explain this using an nginx ingress controller as an example. I believe the situation with the Traefik ingress controller is the same.
So first of all, regarding the kube-dns and coredns mess you are describing: this is implemented by design. You can refer to the GitHub issue "coredns is still labeled as kube-dns" to read more. In my cluster I also have a coredns service that is called kube-dns, and it refers to the coredns pods that carry the k8s-app=kube-dns label:
kubectl describe service kube-dns -n kube-system
Name: kube-dns
Namespace: kube-system
Labels: k8s-app=kube-dns
kubernetes.io/cluster-service=true
kubernetes.io/name=KubeDNS
Annotations: prometheus.io/port: 9153
prometheus.io/scrape: true
Selector: k8s-app=kube-dns
Type: ClusterIP
IP: 10.96.0.10
Port: dns 53/UDP
TargetPort: 53/UDP
Endpoints: 10.32.0.2:53,10.32.0.9:53
Port: dns-tcp 53/TCP
TargetPort: 53/TCP
Endpoints: 10.32.0.2:53,10.32.0.9:53
Port: metrics 9153/TCP
TargetPort: 9153/TCP
Endpoints: 10.32.0.2:9153,10.32.0.9:9153
Session Affinity: None
Events: <none>
kubectl get pods -n kube-system -l k8s-app=kube-dns -o wide
NAME                      READY   STATUS    RESTARTS   AGE     IP          NODE                     NOMINATED NODE   READINESS GATES
coredns-fb8b8dccf-42285   1/1     Running   0          3h26m   10.32.0.9   kubernetessandbox-1-vm   <none>           <none>
coredns-fb8b8dccf-87j5v   1/1     Running   0          3h26m   10.32.0.2   kubernetessandbox-1-vm   <none>           <none>
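You can confirm the labels directly on the pods (the --show-labels flag adds a LABELS column). The coredns-* pods carry k8s-app=kube-dns, not k8s-app=coredns, which is why the label from the documentation returns nothing:

kubectl get pods -n kube-system --show-labels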
When I spin up a new busybox pod, it gets an /etc/resolv.conf that points to the kube-dns service (10.96.0.10) and has the correct search domains:
cat /etc/resolv.conf
search kube-system.svc.cluster.local svc.cluster.local cluster.local c.myproj.internal. google.internal.
nameserver 10.96.0.10
options ndots:5
But at the same time my nginx ingress controller pod has nameserver 169.254.169.254 and is not able to nslookup even kubernetes.default:
cat /etc/resolv.conf
search c.myproj.internal. google.internal.
nameserver 169.254.169.254
I'm not sure what you have in /etc/resolv.conf on the Traefik pod, but the issue is there. That /etc/resolv.conf comes from your node.
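You can check it by exec'ing into the Traefik pod (a sketch; the app=traefik label and the kube-system namespace are assumptions, adjust them to wherever your Traefik deployment actually runs):

kubectl get pods -n kube-system -l app=traefik
kubectl exec -n kube-system <traefik-pod-name> -- cat /etc/resolv.conf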
Setting dnsPolicy: ClusterFirstWithHostNet instead of dnsPolicy: ClusterFirst should solve this issue if the ingress controller uses hostNetwork.
From the dns-pod-service documentation:
"ClusterFirstWithHostNet": For Pods running with hostNetwork, you should explicitly set its DNS policy "ClusterFirstWithHostNet".
After editing the nginx-ingress-controller deployment from
dnsPolicy: ClusterFirst
hostNetwork: true
to
dnsPolicy: ClusterFirstWithHostNet
hostNetwork: true
the pod was recreated with the desired /etc/resolv.conf:
cat /etc/resolv.conf
search kube-system.svc.cluster.local svc.cluster.local cluster.local c.myproj.internal. google.internal.
nameserver 10.96.0.10
options ndots:5
nslookup kubernetes.default
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1
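The same change can be applied to a Traefik deployment with a one-line patch (a sketch; the deployment name traefik-ingress-controller and the kube-system namespace are assumptions, adjust them to your setup):

kubectl patch deployment traefik-ingress-controller -n kube-system -p '{"spec":{"template":{"spec":{"dnsPolicy":"ClusterFirstWithHostNet"}}}}'

The patch triggers a rolling restart; the new pod should then show the in-cluster nameserver 10.96.0.10 in its /etc/resolv.conf.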
A few URLs for you with hostNetwork/dnsPolicy-related issues and explanations. This is an important part of configuring Traefik correctly:
1) Traefik on k8s not listening externally without changing deployment
2) Ingress with Traefik article:
dnsPolicy: ClusterFirstWithHostNet
This setting is important. It will configure the Traefik pods to use the Kubernetes cluster internal DNS server (most likely KubeDNS or maybe CoreDNS). That means the pods' /etc/resolv.conf will be configured to use the Kubernetes DNS server. Otherwise the DNS server of the Kubernetes node will be used (basically the /etc/resolv.conf of the worker node, but that can't resolve cluster.local DNS names, for example).
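Putting it all together, the relevant part of a Traefik Deployment pod spec would look roughly like this (a sketch, not a complete manifest; the container name, image tag and ports are placeholders):

spec:
  template:
    spec:
      hostNetwork: true                    # pod uses the node's network namespace
      dnsPolicy: ClusterFirstWithHostNet   # but still resolves through the in-cluster DNS service
      containers:
      - name: traefik
        image: traefik:v1.7                # placeholder image tag
        ports:
        - containerPort: 80
        - containerPort: 443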
Hope it helps