kube-dns gets kube-proxy Failed to list *core.Endpoints


I stood up a fresh Kubernetes (1.10.0) cluster using kubeadm (1.10.0) on RHEL7 bare metal VMs:

Linux 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Dec 28 14:23:39 EST 2017 x86_64 x86_64 x86_64 GNU/Linux

kubeadm.x86_64                       1.10.0-0                installed
kubectl.x86_64                       1.10.0-0                installed
kubelet.x86_64                       1.10.0-0                installed
kubernetes-cni.x86_64                0.6.0-0                 installed

with Docker 1.12:

docker-engine.x86_64                 1.12.6-1.el7.centos     installed
docker-engine-selinux.noarch         1.12.6-1.el7.centos     installed

The Flannel v0.9.1 pod network is installed with:

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml

The kubeadm init command I ran is:

kubeadm init --pod-network-cidr= --kubernetes-version stable-1.10

which completes successfully, and kubeadm join on the worker node also succeeds. I can deploy a busybox pod on the master and nslookups are successful, but as soon as I deploy anything to the worker node I see failed API calls from the worker node to the master:

E0331 03:28:44.368253       1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Service: Get dial tcp getsockopt: connection refused
E0331 03:28:44.368987       1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: Get dial tcp getsockopt: connection refused
E0331 03:28:44.735886       1 event.go:209] Unable to write event: 'Post dial tcp getsockopt: connection refused' (may retry after sleeping)
E0331 03:28:51.980131       1 reflector.go:205] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:86: Failed to list *core.Endpoints: endpoints is forbidden: User "system:serviceaccount:kube-system:kube-proxy" cannot list endpoints at the cluster scope
I0331 03:28:52.048995       1 controller_utils.go:1026] Caches are synced for service config controller
I0331 03:28:53.049005       1 controller_utils.go:1026] Caches are synced for endpoints config controller
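
The log above mixes two different kinds of failure: "connection refused" (the API server could not be reached at all) and "is forbidden" (the API server answered but RBAC rejected the request). A rough way to tell how much of each there is, assuming the kube-proxy log is piped in on stdin; the function name classify_proxy_log is made up for illustration:

```shell
# Hypothetical helper: count the two failure types seen in a kube-proxy log.
# Reads log text on stdin and prints one summary line.
classify_proxy_log() {
    awk '
        /connection refused/ { refused++ }     # API server unreachable
        /is forbidden/       { forbidden++ }   # reached, but denied by RBAC
        END {
            printf "connection_refused=%d forbidden=%d\n", refused, forbidden
        }
    '
}
```

It could be fed from kubectl -n kube-system logs for the kube-proxy pod on the worker. If "forbidden" lines keep appearing, kubectl auth can-i list endpoints --as=system:serviceaccount:kube-system:kube-proxy shows whether the service account is actually allowed to list endpoints.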

and nslookup times out

kubectl exec -it busybox -- nslookup kubernetes
Address 1:

nslookup: can't resolve 'kubernetes'
command terminated with exit code 1

I have looked at many similar posts on Stack Overflow and GitHub, and all seem to be resolved by setting iptables -A FORWARD -j ACCEPT, but not this time. I have also included the iptables rules from the worker node:

Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  anywhere             anywhere             /* kubernetes service portals */
DOCKER     all  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  anywhere             anywhere             /* kubernetes service portals */
DOCKER     all  --  anywhere            !loopback/8           ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-POSTROUTING  all  --  anywhere             anywhere             /* kubernetes postrouting rules */
MASQUERADE  all  --        anywhere
RETURN     all  --
MASQUERADE  all  --       !base-address.mcast.net/4
RETURN     all  -- !        box2.ara.ac.nz/24
MASQUERADE  all  -- !

Chain DOCKER (2 references)
target     prot opt source               destination
RETURN     all  --  anywhere             anywhere

Chain KUBE-MARK-DROP (0 references)
target     prot opt source               destination
MARK       all  --  anywhere             anywhere             MARK or 0x8000

Chain KUBE-MARK-MASQ (6 references)
target     prot opt source               destination
MARK       all  --  anywhere             anywhere             MARK or 0x4000

Chain KUBE-NODEPORTS (1 references)
target     prot opt source               destination

Chain KUBE-POSTROUTING (1 references)
target     prot opt source               destination
MASQUERADE  all  --  anywhere             anywhere             /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000

Chain KUBE-SEP-HZC4RESJCS322LXV (1 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  --          anywhere             /* kube-system/kube-dns:dns-tcp */
DNAT       tcp  --  anywhere             anywhere             /* kube-system/kube-dns:dns-tcp */ tcp to:

Chain KUBE-SEP-JNNVSHBUREKVBFWD (1 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  --          anywhere             /* kube-system/kube-dns:dns */
DNAT       udp  --  anywhere             anywhere             /* kube-system/kube-dns:dns */ udp to:

Chain KUBE-SEP-U3UDAUPXUG5BP2NG (2 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  --  box1.ara.ac.nz       anywhere             /* default/kubernetes:https */
DNAT       tcp  --  anywhere             anywhere             /* default/kubernetes:https */ recent: SET name: KUBE-SEP-U3UDAUPXUG5BP2NG side: source mask: tcp to:

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination
KUBE-MARK-MASQ  tcp  -- !            /* default/kubernetes:https cluster IP */ tcp dpt:https
KUBE-SVC-NPX46M4PTMTKRN6Y  tcp  --  anywhere               /* default/kubernetes:https cluster IP */ tcp dpt:https
KUBE-MARK-MASQ  udp  -- !           /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-SVC-TCOU7JCQXEZGVUNU  udp  --  anywhere              /* kube-system/kube-dns:dns cluster IP */ udp dpt:domain
KUBE-MARK-MASQ  tcp  -- !           /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-SVC-ERIFXISQEP7F7OF4  tcp  --  anywhere              /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:domain
KUBE-NODEPORTS  all  --  anywhere             anywhere             /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL

Chain KUBE-SVC-ERIFXISQEP7F7OF4 (1 references)
target     prot opt source               destination
KUBE-SEP-HZC4RESJCS322LXV  all  --  anywhere             anywhere             /* kube-system/kube-dns:dns-tcp */

Chain KUBE-SVC-NPX46M4PTMTKRN6Y (1 references)
target     prot opt source               destination
KUBE-SEP-U3UDAUPXUG5BP2NG  all  --  anywhere             anywhere             /* default/kubernetes:https */ recent: CHECK seconds: 10800 reap name: KUBE-SEP-U3UDAUPXUG5BP2NG side: source mask:
KUBE-SEP-U3UDAUPXUG5BP2NG  all  --  anywhere             anywhere             /* default/kubernetes:https */

Chain KUBE-SVC-TCOU7JCQXEZGVUNU (1 references)
target     prot opt source               destination
KUBE-SEP-JNNVSHBUREKVBFWD  all  --  anywhere             anywhere             /* kube-system/kube-dns:dns */

Chain WEAVE (0 references)
target     prot opt source               destination

Chain cali-OUTPUT (0 references)
target     prot opt source               destination

Chain cali-POSTROUTING (0 references)
target     prot opt source               destination

Chain cali-PREROUTING (0 references)
target     prot opt source               destination

Chain cali-fip-dnat (0 references)
target     prot opt source               destination

Chain cali-fip-snat (0 references)
target     prot opt source               destination

Chain cali-nat-outgoing (0 references)
target     prot opt source               destination

Also, I can see packets getting dropped on the flannel interface:

flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1450
        inet  netmask  broadcast
        inet6 fe80::a096:47ff:fe58:e438  prefixlen 64  scopeid 0x20<link>
        ether a2:96:47:58:e4:38  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 198  bytes 14747 (14.4 KiB)
        TX errors 0  dropped 27 overruns 0  carrier 0  collisions 0
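
To see whether the TX drop counter is still climbing rather than being a one-off, the number can be pulled out of the ifconfig output and sampled a few times. A minimal sketch, assuming the net-tools output format shown above; the function name tx_dropped is hypothetical:

```shell
# Hypothetical helper: print the TX "dropped" counter from ifconfig-style
# output read on stdin (format as in the flannel.1 listing above).
tx_dropped() {
    awk '$1 == "TX" && $2 == "errors" {
        # Scan the TX errors line for the word "dropped" and print
        # the value that follows it.
        for (i = 3; i < NF; i++)
            if ($i == "dropped") { print $(i + 1); exit }
    }'
}
```

Running ifconfig flannel.1 | tx_dropped before and after a failing nslookup would show whether the lookups are the packets being dropped.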

I have installed the same versions of Kubernetes, Docker and Flannel on other VMs and it works, so I am not sure why I am getting these failed API calls to the master proxy from the worker nodes in this install. I have done several fresh installs and tried the Weave and Calico pod networks as well, with the same result.

-- Martin Arndt

1 Answer


Right, so I got this going by changing from flannel to weave container networking, along with a kubeadm reset and a restart of the VMs.

I am not sure what the problem was with flannel and my VMs, but I am happy to have got it going.
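
For anyone wanting to repeat this, the steps can be sketched roughly as below. This is an outline under assumptions, not the exact commands that were run: the run, reset_node and install_weave names are hypothetical, the order of reset and reboot is a guess, and the Weave manifest is left as a parameter because the post does not say which one was used. By default it only prints the commands.

```shell
# Sketch only: prints each command instead of running it unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on every node (master and workers): wipe kubeadm state, then reboot.
reset_node() {
    run kubeadm reset
    run systemctl reboot
}

# Run on the master once kubeadm init has completed again.
# $1 is the Weave Net manifest URL or file (assumption: supplied by the caller).
install_weave() {
    run kubectl apply -f "$1"
}
```

After the master is back up with weave applied, the workers rejoin with the kubeadm join command printed by kubeadm init.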

-- Martin Arndt
Source: StackOverflow