After installing the basic Kubernetes packages and setting up minikube, I have started only the basic kube-system pods. I'm trying to investigate why kube-dns is not able to resolve domain names.
Here are the versions I'm using:
Docker:
Client:
Version: 18.06.1-ce
API version: 1.38
Go version: go1.10.3
Git commit: e68fc7a
Built: Tue Aug 21 17:24:56 2018
OS/Arch: linux/amd64
Experimental: false
Server:
Engine:
Version: 18.06.1-ce
API version: 1.38 (minimum version 1.12)
Go version: go1.10.3
Git commit: e68fc7a
Built: Tue Aug 21 17:23:21 2018
OS/Arch: linux/amd64
Experimental: false
minikube version: v0.28.2
Kubectl:
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:17:28Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Kubeadm:
kubeadm version: &version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
VirtualBox:
Version 5.2.18 r124319 (Qt5.6.2)
Here are the system pods I have deployed:
NAMESPACE     NAME                                    READY   STATUS    RESTARTS   AGE
default       busybox                                 1/1     Running   0          31m
kube-system   etcd-minikube                           1/1     Running   0          32m
kube-system   kube-addon-manager-minikube             1/1     Running   0          33m
kube-system   kube-apiserver-minikube                 1/1     Running   0          33m
kube-system   kube-controller-manager-minikube        1/1     Running   0          33m
kube-system   kube-dns-86f4d74b45-xjfmv               3/3     Running   2          33m
kube-system   kube-proxy-2kkzk                        1/1     Running   0          33m
kube-system   kube-scheduler-minikube                 1/1     Running   0          33m
kube-system   kubernetes-dashboard-5498ccf677-pz87g   1/1     Running   0          33m
kube-system   storage-provisioner                     1/1     Running   0          33m
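(The listing above is the output of kubectl get pods --all-namespaces.)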
I've also deployed a busybox pod to allow me to execute commands inside the cluster.
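(For reference, such a busybox pod can be created with something like the following; the exact image tag is not important:)
kubectl run busybox --image=busybox:1.28 --restart=Never -- sleep 3600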
kubectl exec busybox -- cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local mapleworks.com
options ndots:5
and
kubectl exec busybox nslookup google.com
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
nslookup: can't resolve 'google.com'
command terminated with exit code 1
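(A useful check at this point is whether cluster-internal names still resolve while external ones fail, for example:)
kubectl exec busybox -- nslookup kubernetes.default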
The same commands run on the VM itself yield the following:
cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 127.0.1.1
search mapleworks.com <<< our local DNS search domain
nslookup google.com
Server: 127.0.1.1
Address: 127.0.1.1#53
Non-authoritative answer:
Name: google.com
Address: 172.217.13.174
Questions: kube-dns is using the default nameserver 10.96.0.10, whereas I would have expected the VM's nameserver to be imported into Kubernetes.
The same setup on a native Windows or Mac platform is able to resolve domain names properly, yet this VM has an issue with it.
Is this some sort of firewall issue, as I've seen mentioned in some other posts?
I have inspected the kube-dns containers' logs; the most relevant entries are from the sidecar container:
I0910 15:47:17.667100 1 main.go:51] Version v1.14.8
I0910 15:47:17.667195 1 server.go:45] Starting server (options {DnsMasqPort:53 DnsMasqAddr:127.0.0.1 DnsMasqPollIntervalMs:5000 Probes:[{Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33} {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}] PrometheusAddr:0.0.0.0 PrometheusPort:10054 PrometheusPath:/metrics PrometheusNamespace:kubedns})
I0910 15:47:17.667240 1 dnsprobe.go:75] Starting dnsProbe {Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}
I0910 15:47:17.668244 1 dnsprobe.go:75] Starting dnsProbe {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}
W0910 15:50:04.780281 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:34535->127.0.0.1:53: i/o timeout
W0910 15:50:11.781236 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:50887->127.0.0.1:53: i/o timeout
W0910 15:50:24.844065 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:52865->127.0.0.1:53: i/o timeout
W0910 15:50:31.845587 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:42053->127.0.0.1:53: i/o timeout
The i/o timeouts correspond, I think, to the manual DNS queries I performed for google.com.
Otherwise, all I see here is the localhost address and port 53.
I just don't know what is going on...
Check the ConfigMap for your kube-dns server. Do you have upstreamNameservers configured? More info: https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/
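An upstream-nameserver ConfigMap for kube-dns looks roughly like this (the 8.8.8.8/8.8.4.4 addresses are just placeholders):
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  upstreamNameservers: |
    ["8.8.8.8", "8.8.4.4"]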
I managed to resolve this with help from a colleague. It turns out that there is a difference between Desktop and Server installations of Ubuntu. On a Server installation, /etc/network/interfaces
lists the primary interface, which prevents the NetworkManager process from running a local dnsmasq service.
When I added the following lines to this file on a Desktop installation:
#Primary Network Interfaces
auto enp0s3
iface enp0s3 inet dhcp
then the kube-dns dnsmasq container was passed the upstream nameserver address and was able to resolve DNS requests.
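(For the change to take effect, the network interfaces have to be brought up again, e.g. with sudo systemctl restart networking or simply by rebooting the machine.)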
Here are the NetworkManager processes running after the change:
gilles@gilles-VirtualBox:~$ ps -ef | grep Network
root 870 1 0 16:52 ? 00:00:00 /usr/sbin/NetworkManager --no-daemon
gilles 6991 5316 0 16:55 pts/17 00:00:00 grep --color=auto Network
Here is an example of the dnsmasq container logs after the change:
I0911 20:52:47.878050 1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I0911 20:52:47.878063 1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0911 20:52:47.878070 1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0911 20:52:47.878080 1 nanny.go:116] dnsmasq[10]: reading /etc/resolv.conf
I0911 20:52:47.878086 1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I0911 20:52:47.878092 1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0911 20:52:47.878097 1 nanny.go:116] dnsmasq[10]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0911 20:52:47.878103 1 nanny.go:116] dnsmasq[10]: using nameserver 172.28.1.3#53
I0911 20:52:47.878109 1 nanny.go:116] dnsmasq[10]: using nameserver 172.28.1.4#53
The last two lines were present only after the change
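(For anyone reproducing this, the dnsmasq container's logs can be viewed with kubectl -n kube-system logs kube-dns-86f4d74b45-xjfmv -c dnsmasq, using the kube-dns pod name from the listing above.)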
And then
kubectl exec busybox -- nslookup google.com
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: google.com
Address 1: 2607:f8b0:4020:804::200e yul02s04-in-x0e.1e100.net
Address 2: 172.217.13.110 yul02s04-in-f14.1e100.net
I hope this can be of value to others
Each kubelet in a k8s cluster has a --cluster-dns option. This option, in fact, points to the Service (its ClusterIP) of the kube-dns Deployment. Each kube-dns Pod, in turn, has a dnsmasq container, which uses a list of nameservers taken from the k8s node. You can check this in the dnsmasq container's logs:
I0720 03:49:51.081031 1 nanny.go:116] dnsmasq[13]: reading /etc/resolv.conf
I0720 03:49:51.081068 1 nanny.go:116] dnsmasq[13]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I0720 03:49:51.081099 1 nanny.go:116] dnsmasq[13]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0720 03:49:51.081130 1 nanny.go:116] dnsmasq[13]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0720 03:49:51.081160 1 nanny.go:116] dnsmasq[13]: using nameserver <nameserver_1>#53
I0720 03:49:51.081190 1 nanny.go:116] dnsmasq[13]: using nameserver <nameserver_2>#53
I0720 03:49:51.081222 1 nanny.go:116] dnsmasq[13]: using nameserver <nameserver_N>#53
When any Pod is created, by default it gets a nameserver <CLUSTER_DNS_IP> entry in /etc/resolv.conf. That is how any Pod can (or cannot) resolve a certain domain name: through the kube-dns service.
For example, my cluster-dns is 10.233.0.3:
$ kubectl -n test run -it --image=alpine:3.6 alpine -- sh
If you don't see a command prompt, try pressing enter.
/ # cat /etc/resolv.conf
nameserver 10.233.0.3
search test.svc.cluster.local svc.cluster.local cluster.local test.kz
/ # nslookup kubernetes-charts.storage.googleapis.com 10.233.0.3
Server: 10.233.0.3
Address 1: 10.233.0.3 kube-dns.kube-system.svc.cluster.local
Name: kubernetes-charts.storage.googleapis.com
Address 1: 74.125.131.128 lu-in-f128.1e100.net
Address 2: 2a00:1450:4010:c05::80 li-in-x80.1e100.net
So, if the Node where kube-dns is scheduled can resolve a certain domain name, then any Pod can do the same.
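To see which cluster DNS address your kubelet is actually configured with, something like this works (assuming the kubelet was started with command-line flags rather than a config file):
ps aux | grep kubelet | tr ' ' '\n' | grep -- --cluster-dns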