kubernetes.default not resolved by DNS

1/12/2016

I am a bit at a loss as to what is going on with DNS. I followed the instructions outlined here: https://github.com/kubernetes/kubernetes/tree/v1.2.0-alpha.5/cluster/addons/dns

When my test pod runs on the node with the DNS pod, resolution seems to work; however, when the test pod runs on the node WITHOUT the DNS pod, it doesn't. This leads me to believe that something isn't configured right, but I am not sure what.

I have a master node and two minion nodes running CentOS with Kubernetes 1.2.0-alpha.5. The DNS pod is running on minion-2. The test pod I am using is the one from the example in the link above.
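For reference, the test pod is essentially the busybox pod from those docs (reproduced from memory, so it may differ slightly from the linked manifest):

$ cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: busybox
    command:
      - sleep
      - "3600"
EOF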

If the test pod is run on minion-1 (the node WITHOUT the DNS pod), then testing DNS yields:

$ kubectl exec busybox -- nslookup kubernetes.default
Server:    10.0.0.10
Address 1: 10.0.0.10

nslookup: can't resolve 'kubernetes.default'
error: error executing remote command: Error executing command in container: Error executing in Docker Container: 1
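For comparison, on a working cluster I would expect the lookup to succeed with output roughly like the following (the 10.0.0.1 address is my assumption for the kubernetes service IP, given the 10.0.0.0/24 service range):

$ kubectl exec busybox -- nslookup kubernetes.default
Server:    10.0.0.10
Address 1: 10.0.0.10

Name:      kubernetes.default
Address 1: 10.0.0.1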

The test pod's resolv.conf is:

search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.0.0.10
options ndots:5
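In case it is relevant, the DNS service and its endpoints can be checked with the following (assuming the addon service is named kube-dns in the kube-system namespace, as in the linked docs):

$ kubectl get svc kube-dns --namespace=kube-system
$ kubectl get ep kube-dns --namespace=kube-system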

Service Cluster IP Range: 10.0.0.0/24
Flannel Network Configuration: 18.16.0.0/16
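The flannel network configuration can be read back out of etcd to verify it (assuming it lives under the default /coreos.com/network/config key of the v2 keys API; 172.28.128.8:4001 is the etcd endpoint flannel is pointed at, as shown further down):

$ curl -s http://172.28.128.8:4001/v2/keys/coreos.com/network/config | python -mjson.tool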

I have already looked over most of the DNS issues in the Kubernetes repo and most of the documentation related to DNS, but I am still at a loss. If anyone could provide any insight or point me at additional documentation for debugging this issue, it would be greatly appreciated.

UPDATE: Okay, it appears the issue may be due to flannel not being set up properly. I have started over, setting up and verifying flannel following the instructions outlined here: https://github.com/kubernetes/kubernetes/blob/v1.2.0-alpha.5/docs/getting-started-guides/fedora/flannel_multi_node_cluster.md

From minion-1 I get:

$ ip -4 a|grep inet
inet 127.0.0.1/8 scope host lo
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic enp0s3
inet 172.28.128.9/24 brd 172.28.128.255 scope global enp0s8
inet 18.16.45.0/16 scope global flannel0
inet 18.16.45.1/24 scope global docker0

From minion-2 I get:

$ ip -4 a|grep inet
inet 127.0.0.1/8 scope host lo
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic enp0s3
inet 172.28.128.9/24 brd 172.28.128.255 scope global enp0s8
inet 18.16.45.0/16 scope global flannel0
inet 18.16.45.1/24 scope global docker0

This seems to match what is documented so far. However, at the next step I am not getting one subnet entry for each node; I am only seeing one:

$ curl -s http://172.28.128.8:4001/v2/keys/coreos.com/network/subnets | python -mjson.tool
    {
        "action": "get",
        "node": {
            "createdIndex": 15,
            "dir": true,
            "key": "/coreos.com/network/subnets",
            "modifiedIndex": 15,
            "nodes": [
                {
                    "createdIndex": 18,
                    "expiration": "2016-01-13T19:59:09.489854201Z",
                    "key": "/coreos.com/network/subnets/18.16.45.0-24",
                    "modifiedIndex": 18,
                    "ttl": 86272,
                    "value": "{\"PublicIP\":\"10.0.2.15\"}"
                }
            ]
        }
    }
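For comparison, with both minions registered I would expect two entries under that key, one per node, each with its own subnet and that node's public IP, roughly like the sketch below (the second subnet value and the IP placeholders are made up):

    "key": "/coreos.com/network/subnets/18.16.45.0-24",
    "value": "{\"PublicIP\":\"<minion-1 IP>\"}"
    ...
    "key": "/coreos.com/network/subnets/18.16.80.0-24",
    "value": "{\"PublicIP\":\"<minion-2 IP>\"}"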

All the subnet.env files seem to match up. From minion-1:

$ cat /run/flannel/subnet.env
    FLANNEL_NETWORK=18.16.0.0/16
    FLANNEL_SUBNET=18.16.45.1/24
    FLANNEL_MTU=1472
    FLANNEL_IPMASQ=true

From minion-2:

$ cat /run/flannel/subnet.env
    FLANNEL_NETWORK=18.16.0.0/16
    FLANNEL_SUBNET=18.16.45.1/24
    FLANNEL_MTU=1472
    FLANNEL_IPMASQ=true

If I continue on to the next step to verify cross-host container communication, the containers on minion-1 and minion-2 end up with the same IP address. This is obviously not right.

[root@ecf25fd80d85 /]# ip -4 a l eth0 | grep inet
    inet 18.16.45.2/24 scope global eth0

[root@9941f62e621d /]# ip -4 a l eth0 | grep inet
    inet 18.16.45.2/24 scope global eth0
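Once each node has its own subnet, the cross-host check from the guide should be a simple ping from one container to the other, along these lines (the 18.16.80.2 address is hypothetical, standing in for whatever IP the container on the other minion actually gets):

[root@ecf25fd80d85 /]# ping -c 3 18.16.80.2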

The flannel service is being started with the following parameters, in case it helps:

flanneld --ip-masq -etcd-endpoints=http://172.28.128.8:4001 -etcd-prefix=/coreos.com/network 
-- tbs
kubernetes

1 Answer

1/12/2016

The issue was due to me not explicitly telling flannel which network interface to use for inter-host communication. It seems to default to the first interface it finds, which on both of my minions was enp0s3 at 10.0.2.15; that would explain why both nodes registered the same PublicIP and ended up sharing a single subnet lease in etcd. To fix the issue I just had to tell flannel explicitly which interface to use with the --iface command line option.
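Concretely, the fix amounts to adding --iface to the flanneld command line, something along the lines of the following (enp0s8 is my guess at the right interface here, since it carries the 172.28.128.0/24 host-only network; use whichever interface should carry inter-host traffic in your setup):

flanneld --ip-masq -etcd-endpoints=http://172.28.128.8:4001 -etcd-prefix=/coreos.com/network -iface=enp0s8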

-- tbs
Source: StackOverflow