DNS does not work over TCP from pod

11/21/2016

I have an OpenShift Origin installation (v1.2.1; I also reproduced this issue on 1.3.0), and I'm trying to get pods' IPs from DNS by service name. Assume my node has IP 192.168.58.6, and I'm looking up the pods of the headless service 'hz' in project 'hz-test'. When I send a DNS request over UDP to dnsmasq (which is installed on the nodes and forwards requests to Kubernetes' SkyDNS), everything works:

# dig +notcp +noall +answer hz.hz-test.svc.cluster.local @192.168.58.6  
hz.hz-test.svc.cluster.local. 14 IN A   10.1.2.5
<and so on...>

However, when I switch the transport protocol to TCP, I receive the following error:

# dig +tcp +noall +answer hz.hz-test.svc.cluster.local @192.168.58.6  
;; communications error to 192.168.58.6#53: end of file

Looking at the tcpdump output, I discovered that right after the TCP connection is established (SYN, SYN/ACK, ACK), dnsmasq immediately sends back FIN/ACK, and when the DNS client then tries to send its request over this connection, dnsmasq replies with an RST packet instead of a DNS answer. I ran the same DNS query over TCP from the node itself, and dnsmasq responded as usual, i.e. it worked normally over TCP; the problem arose only when the request came from a pod. I also sent the same query over TCP directly from the pod to Kubernetes' DNS (bypassing dnsmasq), and that query succeeded too.
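To reproduce the failure without dig, the TCP exchange described above can be sketched in Python. This is a minimal sketch, not what dig does internally: the helper names (`build_tcp_query`, `recv_exact`, `query_over_tcp`) and the fixed query ID are made up for illustration. The only protocol detail it relies on is that DNS over TCP prefixes each message with a 2-byte length field (RFC 1035, section 4.2.2).

```python
import socket
import struct


def build_tcp_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build an A-record query, prefixed with the 2-byte length used over TCP."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts zero.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label preceded by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    message = header + question
    return struct.pack("!H", len(message)) + message


def recv_exact(conn: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes, or raise if the peer closes early."""
    data = b""
    while len(data) < count:
        chunk = conn.recv(count - len(data))
        if not chunk:
            raise ConnectionError("peer closed the connection before answering")
        data += chunk
    return data


def query_over_tcp(server: str, name: str, port: int = 53, timeout: float = 3.0) -> bytes:
    """Send one query over TCP and return the raw (unparsed) DNS response."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(build_tcp_query(name))
        (length,) = struct.unpack("!H", recv_exact(conn, 2))
        return recv_exact(conn, length)
```

Calling `query_over_tcp("192.168.58.6", "hz.hz-test.svc.cluster.local")` from a pod should fail with a `ConnectionError` (or a connection reset) when the server closes the connection as in the tcpdump trace, and return raw response bytes when run somewhere the query is accepted.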

So why does dnsmasq on the nodes reject TCP requests from pods, while all other communication works? Is this the expected behavior?

Any help and ideas are appreciated.

-- Danila Kiver
dnsmasq
kubernetes
openshift-origin

1 Answer

11/23/2016

In the end, the cause was that dnsmasq was configured to listen on the node's IP (listen-address=192.168.58.6). With this configuration dnsmasq binds to all of the node's network interfaces, but tries to reject "wrong" connections in userspace (i.e. on its own).

I don't really understand why dnsmasq decided that requests from a pod to 192.168.58.6 were forbidden under this configuration, but I got it working by replacing "listen-address" with

interface=eth0
bind-interfaces

which forced dnsmasq to actually bind only to the NIC with IP 192.168.58.6. After that, dnsmasq started accepting all TCP requests.
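The general pattern of "bind to everything, then filter in userspace" can be illustrated with a small Python sketch. This is not dnsmasq's code, and it does not explain why dnsmasq classified the pod's connections as "wrong"; the function names and the address-based check are assumptions chosen only to show why a rejected client sees a completed handshake followed by an immediate close.

```python
import socket


def accept_one(listener: socket.socket, allowed_local_ips: set) -> bool:
    """Accept one connection; keep it only if it arrived on an allowed local IP."""
    conn, _peer = listener.accept()  # the kernel has already finished the handshake
    with conn:
        local_ip = conn.getsockname()[0]  # address the client actually connected to
        if local_ip not in allowed_local_ips:
            # Returning closes the socket: the client sees FIN right after the handshake.
            return False
        conn.sendall(b"ok")
        return True


def demo(allowed_local_ips: set) -> bool:
    """Listen on the wildcard address, connect via loopback, report the decision."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("0.0.0.0", 0))  # wildcard bind on an ephemeral port
        listener.listen(1)
        listener.settimeout(3)
        port = listener.getsockname()[1]
        with socket.create_connection(("127.0.0.1", port), timeout=3):
            return accept_one(listener, allowed_local_ips)
```

With a wildcard bind the kernel accepts the connection on any local address and the process decides afterwards whether to keep it. Binding the listening socket to a specific address instead leaves that decision to the kernel, which is roughly the effect described for bind-interfaces above.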

-- Danila Kiver
Source: StackOverflow