I'm seeing a weird issue on Kubernetes and I'm not sure how to debug it. The k8s environment was installed by kube-up for vSphere using the 2016-01-08 kube.vmdk.
The symptom is that DNS for a container in a pod is not working correctly. When I log in to the kube-dns pod to check the settings, everything looks correct. Pinging addresses outside the local network works as it should, but pinging hosts inside my local network fails: none of them can be reached.
In what follows, my host network is 10.1.1.x and the gateway / DNS server is 10.1.1.1.
(From the pod I can ping outside the network by IP, and I can ping the pod's default gateway, 10.244.2.1, just fine. DNS isn't working because the nameserver, 10.1.1.1, is unreachable.)
kube@kubernetes-master:~$ kubectl --namespace=kube-system exec -ti kube-dns-v20-in2me -- /bin/sh
/ # cat /etc/resolv.conf
nameserver 10.1.1.1
options ndots:5
/ # ping google.com
^C
/ # ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=54 time=13.542 ms
64 bytes from 8.8.8.8: seq=1 ttl=54 time=13.862 ms
^C
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 13.542/13.702/13.862 ms
/ # ping 10.1.1.1
PING 10.1.1.1 (10.1.1.1): 56 data bytes
^C
--- 10.1.1.1 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
/ # netstat -r
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
default         10.244.2.1      0.0.0.0         UG        0 0          0 eth0
10.244.2.0      *               255.255.255.0   U         0 0          0 eth0
/ # ping 10.244.2.1
PING 10.244.2.1 (10.244.2.1): 56 data bytes
64 bytes from 10.244.2.1: seq=0 ttl=64 time=0.249 ms
64 bytes from 10.244.2.1: seq=1 ttl=64 time=0.091 ms
^C
--- 10.244.2.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.091/0.170/0.249 ms
kube@kubernetes-master:~$ netstat -r
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
default         10.1.1.1        0.0.0.0         UG        0 0          0 eth0
10.1.1.0        *               255.255.255.0   U         0 0          0 eth0
10.244.0.0      kubernetes-mini 255.255.255.0   UG        0 0          0 eth0
10.244.1.0      kubernetes-mini 255.255.255.0   UG        0 0          0 eth0
10.244.2.0      kubernetes-mini 255.255.255.0   UG        0 0          0 eth0
10.244.3.0      kubernetes-mini 255.255.255.0   UG        0 0          0 eth0
10.246.0.0      *               255.255.255.0   U         0 0          0 cbr0
172.17.0.0      *               255.255.0.0     U         0 0          0 docker0
kube@kubernetes-master:~$ ping 10.1.1.1
PING 10.1.1.1 (10.1.1.1) 56(84) bytes of data.
64 bytes from 10.1.1.1: icmp_seq=1 ttl=64 time=0.409 ms
64 bytes from 10.1.1.1: icmp_seq=2 ttl=64 time=0.481 ms
^C
--- 10.1.1.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.409/0.445/0.481/0.036 ms
kube@kubernetes-master:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.5", GitCommit:"5a0a696437ad35c133c0c8493f7e9d22b0f9b81b", GitTreeState:"clean", BuildDate:"2016-10-29T01:38:40Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.5", GitCommit:"5a0a696437ad35c133c0c8493f7e9d22b0f9b81b", GitTreeState:"clean", BuildDate:"2016-10-29T01:32:42Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}
(Per @der's response, adding info from the node that owns 10.244.2.1.)
kube@kubernetes-minion-2:~$ ip addr show cbr0
5: cbr0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc htb state UP group default
link/ether 8a:ef:b5:fc:28:f4 brd ff:ff:ff:ff:ff:ff
inet 10.244.2.1/24 scope global cbr0
valid_lft forever preferred_lft forever
inet6 fe80::38b5:44ff:fe8a:6d79/64 scope link
valid_lft forever preferred_lft forever
kube@kubernetes-minion-2:~$ ping google.com
PING google.com (216.58.192.14) 56(84) bytes of data.
64 bytes from nuq04s29-in-f14.1e100.net (216.58.192.14): icmp_seq=1 ttl=52 time=11.8 ms
64 bytes from nuq04s29-in-f14.1e100.net (216.58.192.14): icmp_seq=2 ttl=52 time=11.6 ms
64 bytes from nuq04s29-in-f14.1e100.net (216.58.192.14): icmp_seq=3 ttl=52 time=10.4 ms
^C
--- google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 10.477/11.343/11.878/0.624 ms
kube@kubernetes-minion-2:~$ ping 10.1.1.1
PING 10.1.1.1 (10.1.1.1) 56(84) bytes of data.
64 bytes from 10.1.1.1: icmp_seq=1 ttl=64 time=0.369 ms
64 bytes from 10.1.1.1: icmp_seq=2 ttl=64 time=0.456 ms
64 bytes from 10.1.1.1: icmp_seq=3 ttl=64 time=0.442 ms
^C
--- 10.1.1.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1998ms
rtt min/avg/max/mdev = 0.369/0.422/0.456/0.041 ms
kube@kubernetes-minion-2:~$ netstat -r
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
default         10.1.1.1        0.0.0.0         UG        0 0          0 eth0
10.1.1.0        *               255.255.255.0   U         0 0          0 eth0
10.244.0.0      kubernetes-mini 255.255.255.0   UG        0 0          0 eth0
10.244.1.0      kubernetes-mini 255.255.255.0   UG        0 0          0 eth0
10.244.2.0      *               255.255.255.0   U         0 0          0 cbr0
10.244.3.0      kubernetes-mini 255.255.255.0   UG        0 0          0 eth0
172.17.0.0      *               255.255.0.0     U         0 0          0 docker0
kube@kubernetes-minion-2:~$ routel
target gateway source proto scope dev tbl
default 10.1.1.1 eth0
10.1.1.0 24 10.1.1.86 kernel link eth0
10.244.0.0 24 10.1.1.88 eth0
10.244.1.0 24 10.1.1.87 eth0
10.244.2.0 24 10.244.2.1 kernel link cbr0
10.244.3.0 24 10.1.1.85 eth0
172.17.0.0 16 172.17.0.1 kernel linkdocker0
10.1.1.0 broadcast 10.1.1.86 kernel link eth0 local
10.1.1.86 local 10.1.1.86 kernel host eth0 local
10.1.1.255 broadcast 10.1.1.86 kernel link eth0 local
10.244.2.0 broadcast 10.244.2.1 kernel link cbr0 local
10.244.2.1 local 10.244.2.1 kernel host cbr0 local
10.244.2.255 broadcast 10.244.2.1 kernel link cbr0 local
127.0.0.0 broadcast 127.0.0.1 kernel link lo local
127.0.0.0 8 local 127.0.0.1 kernel host lo local
127.0.0.1 local 127.0.0.1 kernel host lo local
127.255.255.255 broadcast 127.0.0.1 kernel link lo local
172.17.0.0 broadcast 172.17.0.1 kernel linkdocker0 local
172.17.0.1 local 172.17.0.1 kernel hostdocker0 local
172.17.255.255 broadcast 172.17.0.1 kernel linkdocker0 local
::1 local kernel lo
fe80:: 64 kernel eth0
fe80:: 64 kernel cbr0
fe80:: 64 kernel veth6129284
default unreachable kernel lo unspec
::1 local none lo local
fe80::250:56ff:fe8e:d580 local none lo local
fe80::38b5:44ff:fe8a:6d79 local none lo local
fe80::88ef:b5ff:fefc:28f4 local none lo local
ff00:: 8 eth0 local
ff00:: 8 cbr0 local
ff00:: 8 veth6129284 local
default unreachable kernel lo unspec
How can I diagnose what is going on here?
thanks!
First, figure out what's up with kubernetes-mini. Run on it the same checks you ran on the two nodes you've shown us. All traffic between 10.1.1.0 and 10.244.2.0 goes through it. It may, however, have a bad route for the 10.1.1.0 net.
It turns out this is an issue with the default NAT rules on the minions:
$ iptables -t nat -vnxL
...
...
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
...
80 4896 MASQUERADE all -- * * 0.0.0.0/0 !10.0.0.0/8 /* kubelet: SNAT outbound cluster traffic */ ADDRTYPE match dst-type !LOCAL
...
...
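This shows that the rule only masquerades traffic whose destination is outside 10.0.0.0/8. Packets from a pod to 10.1.1.1 fall inside that range, so they are not SNATed and leave the node with the pod's 10.244.x.x source address, which the gateway presumably has no route back to. That is why the pod can reach 8.8.8.8 but not 10.1.1.1.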
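To make the destination match of the MASQUERADE rule above concrete, here is a minimal sketch in plain sh. The function name would_masquerade is hypothetical, and it models only the "!10.0.0.0/8" destination test; the ADDRTYPE "dst-type !LOCAL" condition is left out, so treat it as an illustration rather than a replica of what the kernel does.

```shell
#!/bin/sh
# Hypothetical helper (not part of iptables or Kubernetes).
# Models only the destination column of the rule: MASQUERADE applies
# when the destination is NOT inside 10.0.0.0/8.
# The ADDRTYPE "dst-type !LOCAL" condition is deliberately not modelled.
would_masquerade() {
    case "$1" in
        10.*) return 1 ;;  # inside 10.0.0.0/8: rule does not match, no SNAT
        *)    return 0 ;;  # outside 10.0.0.0/8: rule matches, source is rewritten
    esac
}

# The two destinations pinged from the pod in the question.
for dst in 8.8.8.8 10.1.1.1; do
    if would_masquerade "$dst"; then
        echo "$dst: masqueraded (leaves with the node's address)"
    else
        echo "$dst: not masqueraded (leaves with the pod's 10.244.x.x address)"
    fi
done
```

The pattern is a string match on the first octet, which is enough for a /8 but would not generalise to other prefix lengths.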
If anyone runs across this, fix it with:
$ iptables -t nat -I POSTROUTING 1 -s 10.244.0.0/16 -d 10.1.1.1/32 -j MASQUERADE
where 10.244.0.0/16 is the container network and 10.1.1.1 is the gateway IP.
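Running the insert more than once adds duplicate rules, and rules added this way are not saved across a reboot unless you persist them separately. As a hedged sketch, the hypothetical helper below (masq_fix_cmd is my own name, not an existing tool) prints a variant that first tests for the rule with iptables -C and only inserts it when it is missing. It only prints the command, so nothing changes until you run the output as root.

```shell
#!/bin/sh
# Hypothetical helper: prints an idempotent form of the fix above.
# $1 = container network in CIDR form, $2 = gateway IP (assumed values below).
masq_fix_cmd() {
    spec="-s $1 -d $2/32 -j MASQUERADE"
    # -C exits non-zero when the rule is absent, so the insert runs only then.
    echo "iptables -t nat -C POSTROUTING $spec 2>/dev/null || iptables -t nat -I POSTROUTING 1 $spec"
}

# Values from this thread; adjust for your own cluster.
masq_fix_cmd 10.244.0.0/16 10.1.1.1
```

To apply it, review the printed line and then run it as root on each minion.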