I have multiple KVM nodes in different networks. All of these nodes have two interfaces, eth0: 10.0.2.15/24 and eth1: 10.201.(14|12|11).0/24, plus a few manual routes between the DCs.
root@k8s-hv09:~# ip r
default via 10.0.2.2 dev eth0 proto dhcp src 10.0.2.15 metric 100
10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15
10.0.2.2 dev eth0 proto dhcp scope link src 10.0.2.15 metric 100
10.201.12.0/24 dev eth1 proto kernel scope link src 10.201.12.179
10.201.14.0/24 via 10.201.12.2 dev eth1 proto static
10.201.11.0/24 via 10.201.12.2 dev eth1 proto static
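For reference, the static routes on this node could have been added with something like the following (illustrative commands; the gateway 10.201.12.2 is the inter-DC router on this node's segment, and the other nodes use their own subnets and gateways accordingly):
# manual inter-DC routes on k8s-hv09 (example, adjust per node)
ip route add 10.201.14.0/24 via 10.201.12.2 dev eth1
ip route add 10.201.11.0/24 via 10.201.12.2 dev eth1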
Setup common to all nodes:
Ubuntu 16.04/18.04
Kubernetes 1.13.2
Kubernetes-cni 0.6.0
docker-ce 18.06.1
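A sketch of how a node with exactly these versions is typically installed and pinned on Ubuntu (illustrative commands, not necessarily the exact ones used here; the hold matches the "hi" flags in the dpkg output further down):
apt-get update
apt-get install -y docker-ce=18.06.1~ce~3-0~ubuntu kubeadm=1.13.2-00 kubelet=1.13.2-00 kubectl=1.13.2-00 kubernetes-cni=0.6.0-00
apt-mark hold docker-ce kubeadm kubelet kubectl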
Master node (k8s-hv06) kubeadm configuration:
apiVersion: kubeadm.k8s.io/v1beta1
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: 10.201.14.176:6443
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  external:
    caFile: ""
    certFile: ""
    endpoints:
    - http://10.201.14.176:2379
    - http://10.201.12.180:2379
    - http://10.201.11.171:2379
    keyFile: ""
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.13.2
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
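The control plane was then initialized from this file, presumably with something along these lines (the file name is illustrative):
kubeadm init --config kubeadm-config.yaml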
Flannel v0.10.0 was deployed with RBAC and the additional argument --iface=eth1. One or more master nodes work fine.
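The --iface=eth1 argument is passed to the kube-flannel container of the DaemonSet; a minimal sketch of that part of the manifest, assuming the stock v0.10.0 kube-flannel.yml layout:
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.10.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        - --iface=eth1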
root@k8s-hv06:~# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-86c58d9df4-b4tf9 1/1 Running 2 23h
kube-system coredns-86c58d9df4-h6nq8 1/1 Running 2 23h
kube-system kube-apiserver-k8s-hv06 1/1 Running 3 23h
kube-system kube-controller-manager-k8s-hv06 1/1 Running 5 23h
kube-system kube-flannel-ds-amd64-rsmhj 1/1 Running 0 21h
kube-system kube-proxy-s5n8l 1/1 Running 3 23h
kube-system kube-scheduler-k8s-hv06 1/1 Running 4 23h
But I can't add any worker node to the cluster. For example, I have a clean installation of Ubuntu 18.04 with docker-ce, kubeadm and kubelet
root@k8s-hv09:~# dpkg -l | grep -E 'kube|docker' | awk '{print $1,$2,$3}'
hi docker-ce 18.06.1~ce~3-0~ubuntu
hi kubeadm 1.13.2-00
hi kubectl 1.13.2-00
hi kubelet 1.13.2-00
ii kubernetes-cni 0.6.0-00
and I'm trying to add the worker node (k8s-hv09) to the cluster
root@k8s-hv06:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-hv06 Ready master 23h v1.13.2
k8s-hv09 Ready <none> 31s v1.13.2
root@k8s-hv06:~# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-86c58d9df4-b4tf9 1/1 Running 2 23h
kube-system coredns-86c58d9df4-h6nq8 1/1 Running 2 23h
kube-system kube-apiserver-k8s-hv06 1/1 Running 3 23h
kube-system kube-controller-manager-k8s-hv06 1/1 Running 5 23h
kube-system kube-flannel-ds-amd64-cqw5p 0/1 CrashLoopBackOff 3 113s
kube-system kube-flannel-ds-amd64-rsmhj 1/1 Running 0 22h
kube-system kube-proxy-hbnpq 1/1 Running 0 113s
kube-system kube-proxy-s5n8l 1/1 Running 3 23h
kube-system kube-scheduler-k8s-hv06 1/1 Running 4 23h
The cni0 and flannel.1 interfaces were not created, and the connection to the master node cannot be established.
root@k8s-hv09:~# ip a | grep -E '(flannel|cni|cbr|eth|docker)'
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether e2:fa:99:0d:3b:05 brd ff:ff:ff:ff:ff:ff
inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic eth0
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether c6:da:44:d9:2e:15 brd ff:ff:ff:ff:ff:ff
inet 10.201.12.179/24 brd 10.201.12.255 scope global eth1
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:30:71:67:92 brd ff:ff:ff:ff:ff:ff
inet 172.172.172.2/24 brd 172.172.172.255 scope global docker0
root@k8s-hv06:~# kubectl logs kube-flannel-ds-amd64-cqw5p -n kube-system -c kube-flannel
I0129 13:02:09.244309 1 main.go:488] Using interface with name eth1 and address 10.201.12.179
I0129 13:02:09.244498 1 main.go:505] Defaulting external address to interface address (10.201.12.179)
E0129 13:02:09.246907 1 main.go:232] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-amd64-cqw5p': Get https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-amd64-cqw5p: dial tcp 10.96.0.1:443: getsockopt: connection refused
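Here 10.96.0.1 is the kubernetes Service ClusterIP (the first address of the 10.96.0.0/12 serviceSubnet), which kube-proxy DNATs to the real apiserver address, so illustrative checks like these show where it actually points:
# on the master: the real endpoint behind the kubernetes Service
kubectl get endpoints kubernetes
# on the worker: the NAT rule kube-proxy programmed for the ClusterIP
iptables -t nat -L KUBE-SERVICES -n | grep 10.96.0.1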
root@k8s-hv09:~# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
64a9b21607cb quay.io/coreos/flannel "cp -f /etc/kube-fla…" 23 minutes ago Exited (0) 23 minutes ago k8s_install-cni_kube-flannel-ds-amd64-4k2dt_kube-system_b8f510e3-23c7-11e9-85a5-1a05eef25a13_0
2e0145137449 f0fad859c909 "/opt/bin/flanneld -…" About a minute ago Exited (1) About a minute ago k8s_kube-flannel_kube-flannel-ds-amd64-4k2dt_kube-system_b8f510e3-23c7-11e9-85a5-1a05eef25a13_9
90271ee02f68 k8s.gcr.io/kube-proxy "/usr/local/bin/kube…" 23 minutes ago Up 23 minutes k8s_kube-proxy_kube-proxy-6zgjq_kube-system_b8f50ef6-23c7-11e9-85a5-1a05eef25a13_0
b6345e9d8087 k8s.gcr.io/pause:3.1 "/pause" 23 minutes ago Up 23 minutes k8s_POD_kube-proxy-6zgjq_kube-system_b8f50ef6-23c7-11e9-85a5-1a05eef25a13_0
dca408f8a807 k8s.gcr.io/pause:3.1 "/pause" 23 minutes ago Up 23 minutes k8s_POD_kube-flannel-ds-amd64-4k2dt_kube-system_b8f510e3-23c7-11e9-85a5-1a05eef25a13_0
I can see the command /opt/bin/flanneld --iface=eth1 --ip-masq --kube-subnet-mgr running on the worker node, but it terminates after the k8s_install-cni_kube-flannel-ds-amd64 container stops. The file /etc/cni/net.d/10-flannel.conflist and the directory /opt/cni/bin are present.
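For comparison, the 10-flannel.conflist dropped by the install-cni container normally looks roughly like this (content taken from the stock flannel v0.10.0 ConfigMap, shown only for reference):
{
  "name": "cbr0",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "hairpinMode": true,
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}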
I don't understand the reason. If I add a new master node to the cluster, it works fine.
root@k8s-hv06:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-hv01 Ready master 17s v1.13.2
k8s-hv06 Ready master 22m v1.13.2
k8s-hv09 Ready <none> 6m22s v1.13.2
root@k8s-hv06:~# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-86c58d9df4-b8th2 1/1 Running 0 23m
kube-system coredns-86c58d9df4-hmm8q 1/1 Running 0 23m
kube-system kube-apiserver-k8s-hv01 1/1 Running 0 2m16s
kube-system kube-apiserver-k8s-hv06 1/1 Running 0 23m
kube-system kube-controller-manager-k8s-hv01 1/1 Running 0 2m16s
kube-system kube-controller-manager-k8s-hv06 1/1 Running 0 23m
kube-system kube-flannel-ds-amd64-92kmc 0/1 CrashLoopBackOff 6 8m20s
kube-system kube-flannel-ds-amd64-krdgt 1/1 Running 0 2m16s
kube-system kube-flannel-ds-amd64-lpgkt 1/1 Running 0 10m
kube-system kube-proxy-7ck7f 1/1 Running 0 23m
kube-system kube-proxy-nbkvg 1/1 Running 0 8m20s
kube-system kube-proxy-nvbcw 1/1 Running 0 2m16s
kube-system kube-scheduler-k8s-hv01 1/1 Running 0 2m16s
kube-system kube-scheduler-k8s-hv06 1/1 Running 0 23m
But not a worker node.
Update:
I don't have a problem connecting to the API server. My issue is the two missing interfaces (cni0, flannel.1); without them there is no sync between the master and worker nodes. Let's take an additional node and add it to the cluster. If I use kubeadm init with my config file, everything works fine and the flannel interfaces are present. Now let's do a kubeadm reset** and kubeadm join of this node to the same cluster: the network interfaces are absent. But why? In both cases the node should get its network configuration from the master API in the same way. If I had found any errors or warnings I would have a clue.
** kubectl delete node <node name> (on the master), then
kubeadm reset && docker system prune -a && reboot
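The rejoin itself was the usual kubeadm join against the control-plane endpoint, along these lines (token and CA hash are placeholders):
kubeadm join 10.201.14.176:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>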
Fixed. The API server was bound to eth0 instead of eth1. This is my mistake and I'm very embarrassed.
An additional master node works fine because it health-checks its own local apiserver interface, but that doesn't work for a worker node.
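A minimal sketch of pinning the apiserver to the eth1 address at init time, assuming the v1beta1 config above (this InitConfiguration is appended to the same file as the ClusterConfiguration as a separate YAML document; 10.201.14.176 is taken to be the master's eth1 address, as the config suggests):
---
apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.201.14.176
  bindPort: 6443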
/Close
There is a similar GitHub issue to your case, and it was solved by manually editing /etc/kubernetes/manifests/kube-apiserver.yaml on the master and changing the liveness probe:
livenessProbe:
  failureThreshold: 8
  httpGet:
    host: 127.0.0.1
    path: /healthz
    port: 443 # was 6443
    scheme: HTTPS
I hope it helps.