I set up a 3-node Kubernetes (v1.9.3) cluster on Ubuntu 16.04.
Prior to setup I cleared the iptables rules and followed the k8s documentation for flannel, using the following commands to initialize the cluster:
# kubeadm init --apiserver-advertise-address 192.168.56.20 --pod-network-cidr=10.244.0.0/16 --kubernetes-version 1.9.3
# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.10.0/Documentation/kube-flannel.yml
These commands seemed to succeed:
# kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
etcd-master 1/1 Running 0 3m
kube-apiserver-master 1/1 Running 0 2m
kube-controller-manager-master 1/1 Running 0 2m
kube-dns-6f4fd4bdf-4c76v 3/3 Running 0 3m
kube-flannel-ds-wbx97 1/1 Running 0 1m
kube-proxy-x65lv 1/1 Running 0 3m
kube-scheduler-master 1/1 Running 0 2m
But the problem is that kube-dns seems to have been assigned the wrong service endpoint addresses, as can be seen with the following commands:
# kubectl get ep kube-dns --namespace=kube-system
NAME ENDPOINTS AGE
kube-dns 172.17.0.2:53,172.17.0.2:53 3m
root@master:~# kubectl describe service kube-dns -n kube-system
Name: kube-dns
Namespace: kube-system
Labels: k8s-app=kube-dns
kubernetes.io/cluster-service=true
kubernetes.io/name=KubeDNS
Annotations: <none>
Selector: k8s-app=kube-dns
Type: ClusterIP
IP: 10.96.0.10
Port: dns 53/UDP
TargetPort: 53/UDP
Endpoints: 172.17.0.2:53
Port: dns-tcp 53/TCP
TargetPort: 53/TCP
Endpoints: 172.17.0.2:53
Session Affinity: None
Events: <none>
172.17.0.2 is an IP address assigned by the Docker bridge (docker0) to the kube-dns container. On a working k8s network setup, kube-dns should have endpoints with addresses from the podSubnet (10.244.0.0/16).
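A quick way to confirm where the address comes from (the interface names here assume a default Docker and flannel setup) is to compare the pod IP reported by kubectl with the bridge subnets on the node:
# kubectl -n kube-system get pods -o wide
# ip addr show docker0
# ip addr show cni0
On this cluster the kube-dns pod IP falls inside the docker0 range (172.17.0.0/16) rather than a 10.244.x.x/24 subnet.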
The effect of the current setup is that none of the pods have functioning DNS, while IP communication between pods works fine.
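For example (the busybox image and pod name here are just for illustration), a throwaway pod cannot resolve the cluster DNS name, although pinging another pod's IP directly works:
# kubectl run -it --rm dns-test --image=busybox --restart=Never -- nslookup kubernetes.default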
I tried deleting the kube-dns pod to see whether the new kube-dns containers would pick up endpoints from the podSubnet, but they don't.
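Roughly, that was (selecting the pod by its k8s-app label):
# kubectl -n kube-system delete pod -l k8s-app=kube-dns
# kubectl -n kube-system get ep kube-dns
The recreated pod still shows a 172.17.0.x endpoint.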
The startup logs of the 3 kube-dns containers don't show any error messages.
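(The three containers in the kube-dns pod are the standard kubedns, dnsmasq and sidecar; their logs can be pulled per container like this, using the pod name from the listing above:)
# kubectl -n kube-system logs kube-dns-6f4fd4bdf-4c76v -c kubedns
# kubectl -n kube-system logs kube-dns-6f4fd4bdf-4c76v -c dnsmasq
# kubectl -n kube-system logs kube-dns-6f4fd4bdf-4c76v -c sidecar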
I think I have found the root cause. The previous kubeadm reset did not remove the cni and flannel.1 interfaces, so the next kubeadm init made kube-dns believe the Kubernetes network plugin was already in place before I applied the flannel YAML.
After checking for and removing any virtual NICs created by the flannel plugin when tearing down the Kubernetes cluster, the next kubeadm init succeeds without this issue.
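For reference, the cleanup after kubeadm reset looks roughly like this (the interface names may differ depending on the CNI plugin in use):
# ip link show | grep -E 'cni|flannel'
# ip link delete cni0
# ip link delete flannel.1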
The same applies to Weave Net, which requires running weave reset to remove the leftover virtual weave NICs.