I have installed Kubernetes v1.13.10 on a group of VMs running CentOS 7. When I deploy pods, they can connect to one another but cannot connect to anything outside of the cluster. The CoreDNS pods have these errors in the log:
[ERROR] plugin/errors: 2 app.harness.io.xentaurs.com. A: unreachable backend: read udp 172.21.0.33:48105->10.20.10.52:53: i/o timeout
[ERROR] plugin/errors: 2 app.harness.io.xentaurs.com. AAAA: unreachable backend: read udp 172.21.0.33:49098->10.20.10.51:53: i/o timeout
[ERROR] plugin/errors: 2 app.harness.io.xentaurs.com. AAAA: unreachable backend: read udp 172.21.0.33:53113->10.20.10.51:53: i/o timeout
[ERROR] plugin/errors: 2 app.harness.io.xentaurs.com. A: unreachable backend: read udp 172.21.0.33:39648->10.20.10.51:53: i/o timeout
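For context, those upstream addresses come from the CoreDNS forwarding configuration; in a typical kubeadm-style deployment CoreDNS forwards to the resolvers listed in the node's /etc/resolv.conf. The configured upstreams can be checked with:

# Show the Corefile to confirm which upstream resolvers CoreDNS forwards to
kubectl -n kube-system get configmap coredns -o yaml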
The IPs 10.20.10.51 and 10.20.10.52 are the internal DNS servers and are reachable from the nodes. I did a Wireshark capture on the DNS servers, and I can see the traffic arriving from the CoreDNS pod IP address 172.21.0.33. There is no route for the DNS servers to get back to that IP, as it isn't routable outside of the Kubernetes cluster.
My understanding is that an iptables rule should be in place to NAT the pod IPs to the node's address when a pod communicates outbound (is that correct?). Below is the POSTROUTING chain in iptables:
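The same thing can be seen from the node side with a capture on the outbound interface; if the source address of the outgoing DNS packets is a pod IP rather than the node IP, SNAT is not being applied (interface name and DNS IPs below are from my environment):

# Watch DNS traffic leaving the node toward the internal DNS servers;
# a 172.21.0.x source address means the packets are not being masqueraded
tcpdump -ni ens192.152 '(host 10.20.10.51 or host 10.20.10.52) and port 53'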
[root@kube-aci-1 ~]# iptables -t nat -L POSTROUTING -v --line-number
Chain POSTROUTING (policy ACCEPT 23 packets, 2324 bytes)
num pkts bytes target prot opt in out source destination
1 1990 166K KUBE-POSTROUTING all -- any any anywhere anywhere /* kubernetes postrouting rules */
2 0 0 MASQUERADE all -- any ens192.152 172.21.0.0/16 anywhere
Line 1 was added by kube-proxy, and line 2 is a rule I added manually to try to NAT anything coming from the pod subnet 172.21.0.0/16 out the node interface ens192.152, but that didn't help.
Here are the kube-proxy logs:
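For reference, that second rule was added with a command along these lines (pod CIDR and interface name are specific to my cluster):

# Masquerade anything from the pod subnet going out the node's uplink interface
iptables -t nat -A POSTROUTING -s 172.21.0.0/16 -o ens192.152 -j MASQUERADE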
[root@kube-aci-1 ~]# kubectl logs kube-proxy-llq22 -n kube-system
W1117 16:31:59.225870 1 proxier.go:498] Failed to load kernel module ip_vs with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1117 16:31:59.232006 1 proxier.go:498] Failed to load kernel module ip_vs_rr with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1117 16:31:59.233727 1 proxier.go:498] Failed to load kernel module ip_vs_wrr with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1117 16:31:59.235700 1 proxier.go:498] Failed to load kernel module ip_vs_sh with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1117 16:31:59.255278 1 server_others.go:296] Flag proxy-mode="" unknown, assuming iptables proxy
I1117 16:31:59.289360 1 server_others.go:148] Using iptables Proxier.
I1117 16:31:59.296021 1 server_others.go:178] Tearing down inactive rules.
I1117 16:31:59.324352 1 server.go:484] Version: v1.13.10
I1117 16:31:59.335846 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I1117 16:31:59.336443 1 config.go:102] Starting endpoints config controller
I1117 16:31:59.336466 1 controller_utils.go:1027] Waiting for caches to sync for endpoints config controller
I1117 16:31:59.336493 1 config.go:202] Starting service config controller
I1117 16:31:59.336499 1 controller_utils.go:1027] Waiting for caches to sync for service config controller
I1117 16:31:59.436617 1 controller_utils.go:1034] Caches are synced for service config controller
I1117 16:31:59.436739 1 controller_utils.go:1034] Caches are synced for endpoints config controller
I have tried flushing the iptables NAT table as well as restarting kube-proxy on all nodes, but the problem persists. Any clues in the output above, or thoughts on further troubleshooting?
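For completeness, the flush and restart were done roughly like this (the label selector assumes the standard kubeadm kube-proxy DaemonSet):

# Flush the NAT table on the node; kube-proxy and the CNI re-create their rules
iptables -t nat -F
# Restart the kube-proxy pods so they rebuild the KUBE-* chains
kubectl -n kube-system delete pod -l k8s-app=kube-proxy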
Output of kubectl get nodes:
[root@kube-aci-1 ~]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kube-aci-1 Ready master 85d v1.13.10 10.10.52.217 <none> CentOS Linux 7 (Core) 3.10.0-957.el7.x86_64 docker://1.13.1
kube-aci-2 Ready <none> 85d v1.13.10 10.10.52.218 <none> CentOS Linux 7 (Core) 3.10.0-957.el7.x86_64 docker://1.13.1
It turns out that, with the CNI in use, the pod subnet needs to be routable on the surrounding network if pods need outbound communication. I made the pod subnet routable on the external network, and the pods can now communicate outbound.
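As a quick check after the change, a throwaway pod can resolve an external name through the cluster DNS (the pod name is arbitrary; busybox:1.28 is used because nslookup in newer busybox images is unreliable for this kind of test):

# Run a one-off pod and resolve an external name via CoreDNS
kubectl run -it --rm dnstest --image=busybox:1.28 --restart=Never -- nslookup app.harness.io.xentaurs.com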