Can't resolve home DNS from inside k8s pod

11/13/2019

So I recently set up a single-node Kubernetes cluster on my home network. I have a DNS server running on my router (DD-WRT, dnsmasq) that resolves a bunch of local domains for convenience; server1.lan, for example, resolves to 192.168.1.11.

Server 1 was set up as my single-node Kubernetes cluster. Excited about the possibilities of local DNS, I spun up my first deployment using a Docker image called netshoot, which has a bunch of helpful network debugging tools bundled in. I exec'd into the container, ran a ping, and got the following...

bash-5.0# ping server1.lan
ping: server1.lan: Try again

It failed. Then I tried pinging Google's DNS (8.8.8.8), and that worked fine.

Next I tried to resolve the Kubernetes default service domain, and that worked fine:

bash-5.0# nslookup kubernetes.default
Server:     10.96.0.10
Address:    10.96.0.10#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.96.0.1

The /etc/resolv.conf file looks fine from inside the pod:

bash-5.0# cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
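(An aside, not from the original post: with ndots:5, a name like server1.lan has fewer than five dots, so the stub resolver walks the search list before trying the bare name. A minimal sketch of that expansion order, assuming the resolv.conf above; the expand helper is mine, for illustration only:)

```shell
# Sketch: the order in which the resolver tries "server1.lan" given
# "search default.svc.cluster.local svc.cluster.local cluster.local"
# and ndots:5 (the bare name has < 5 dots, so suffixes come first).
expand() {
  name=$1
  for suffix in default.svc.cluster.local svc.cluster.local cluster.local; do
    echo "$name.$suffix"
  done
  echo "$name."   # finally, the name as-is (treated as absolute)
}
expand server1.lan
```

Only the last query, the bare "server1.lan.", falls outside the cluster zones and gets forwarded to the upstream nameserver.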

I then started tailing the CoreDNS logs and saw some interesting output...

2019-11-13T03:01:23.014Z [ERROR] plugin/errors: 2 server1.lan. AAAA: read udp 192.168.156.140:37521->192.168.1.1:53: i/o timeout
2019-11-13T03:01:24.515Z [ERROR] plugin/errors: 2 server1.lan. A: read udp 192.168.156.140:41964->192.168.1.1:53: i/o timeout
2019-11-13T03:01:24.515Z [ERROR] plugin/errors: 2 server1.lan. AAAA: read udp 192.168.156.140:33455->192.168.1.1:53: i/o timeout
2019-11-13T03:01:25.015Z [ERROR] plugin/errors: 2 server1.lan. AAAA: read udp 192.168.156.140:48864->192.168.1.1:53: i/o timeout
2019-11-13T03:01:25.015Z [ERROR] plugin/errors: 2 server1.lan. A: read udp 192.168.156.140:35328->192.168.1.1:53: i/o timeout

It seems like Kubernetes is trying to reach 192.168.1.1 from inside the cluster network and failing. CoreDNS forwards non-cluster queries to whatever nameserver is in the host's /etc/resolv.conf, so here is what that looks like:

nameserver 192.168.1.1
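(For context: this forwarding comes from the stock kubeadm CoreDNS ConfigMap, named coredns in kube-system, whose Corefile looks roughly like the excerpt below; anything outside the cluster zones goes to the host's upstream resolvers.)

```
.:53 {
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . /etc/resolv.conf
}
```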

I can resolve server1.lan from everywhere else on the network, except from inside these pods. My router's IP is 192.168.1.1, and that is what responds to DNS queries.

Any help on this would be greatly appreciated. It seems like some kind of IP routing issue between the Kubernetes network and my real home network, or that's my theory anyway. Thanks in advance.

-- Kulix
coredns
dns
docker
kubernetes
networking

1 Answer

11/14/2019

So it turns out the issue was that when I initialized the cluster, I specified a pod CIDR that conflicted with the addresses on my home network. My kubeadm command was:

sudo kubeadm init --pod-network-cidr=192.168.0.0/16 --apiserver-cert-extra-sans=server1.lan

Since my home network (192.168.1.0/24) falls inside that CIDR, and my DNS upstream was 192.168.1.1, the cluster treated that address as part of the pod network rather than my home network, and the DNS packets were never routed out appropriately.
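The overlap is easy to verify mechanically. A small bash sketch (the ip2int and in_cidr helpers are mine, not from the post) showing that the router's address sits inside the old pod CIDR but not the new one:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip2int() { local IFS=.; set -- $1; echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 )); }

# Succeed if address $1 falls inside CIDR $2 (e.g. 10.200.0.0/16).
in_cidr() {
  local ip net bits mask
  ip=$(ip2int "$1"); net=$(ip2int "${2%/*}"); bits=${2#*/}
  mask=$(( bits == 0 ? 0 : (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

in_cidr 192.168.1.1 192.168.0.0/16 && echo "192.168.0.0/16 swallows the router"
in_cidr 192.168.1.1 10.200.0.0/16  || echo "10.200.0.0/16 leaves the router alone"
```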

The solution was to recreate the cluster with a pod CIDR that doesn't overlap my home network:

sudo kubeadm init --pod-network-cidr=10.200.0.0/16 --apiserver-cert-extra-sans=server1.lan

And when I applied my Calico YAML manifest, I made sure to replace the default 192.168.0.0/16 CIDR with the new 10.200.0.0/16 CIDR.
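(For reference: in the stock Calico manifest, that value is set via an environment variable on the calico-node container; the edit amounts to changing, or un-commenting in some manifest versions, this block so it matches the --pod-network-cidr passed to kubeadm init:)

```
- name: CALICO_IPV4POOL_CIDR
  value: "10.200.0.0/16"
```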

Hope this helps someone. Thanks.

-- Kulix
Source: StackOverflow