kube-system containers continuously crash

6/3/2019

I initialize a new cluster on the master node with kubeadm init --pod-network-cidr=10.1.0.0/16 and then install Calico. At that point everything seems to be working:

sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
[sudo] password for sysadm:
NAMESPACE     NAME                                            READY   STATUS    RESTARTS   AGE     IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2                               2/2     Running   0          4m9s    192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2                         1/1     Running   0          4m9s    10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5                         1/1     Running   0          4m9s    10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   etcd-localhost.localdomain                      1/1     Running   0          3m4s    192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   kube-apiserver-localhost.localdomain            1/1     Running   0          3m18s   192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   kube-controller-manager-localhost.localdomain   1/1     Running   0          3m23s   192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb                                1/1     Running   0          4m9s    192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   kube-scheduler-localhost.localdomain            1/1     Running   0          3m11s   192.168.0.249   localhost.localdomain   <none>           <none>

But the moment I try to join a worker node to the master with kubeadm join "$api_server_endpoint" --token "$token" --discovery-token-ca-cert-hash "$hash", all of the kube-system containers start to crash:

sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                      READY   STATUS             RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES   
kube-system   calico-node-ntzn2         2/2     Running            0          10m   192.168.0.182   localhost.localdomain   <none>           <none>            
kube-system   coredns-fb8b8dccf-hqmn2   0/1     CrashLoopBackOff   2          10m   10.1.0.2        localhost.localdomain   <none>           <none>            
kube-system   coredns-fb8b8dccf-nfgr5   0/1     CrashLoopBackOff   1          10m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb          1/1     Running            0          10m   192.168.0.166   localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                            READY   STATUS             RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2                               2/2     Running            0          11m   192.168.0.166   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2                         0/1     CrashLoopBackOff   2          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5                         0/1     CrashLoopBackOff   2          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   etcd-localhost.localdomain                      0/1     Pending            0          1s    <none>          localhost.localdomain   <none>           <none>
kube-system   kube-apiserver-localhost.localdomain            0/1     Pending            0          1s    <none>          localhost.localdomain   <none>           <none>
kube-system   kube-controller-manager-localhost.localdomain   0/1     Pending            0          1s    <none>          localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb                                1/1     Running            0          11m   192.168.0.249   localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2                      2/2     Running   0          11m   192.168.0.182   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2                0/1     Running   3          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5                0/1     Running   2          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb                       1/1     Running   0          11m   192.168.0.166   localhost.localdomain   <none>           <none>
kube-system   kube-scheduler-localhost.localdomain   0/1     Pending   0          0s    <none>          localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                      READY   STATUS    RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2         2/2     Running   0          11m   192.168.0.182   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2   1/1     Running   0          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5   1/1     Running   0          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb          1/1     Running   0          11m   192.168.0.166   localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2                      2/2     Running   0          11m   192.168.0.166   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2                0/1     Error     2          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5                1/1     Running   0          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb                       1/1     Running   0          11m   192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   kube-scheduler-localhost.localdomain   0/1     Pending   0          0s    <none>          localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                         READY   STATUS             RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2            2/2     Running            0          11m   192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2      0/1     CrashLoopBackOff   2          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5      1/1     Running            0          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   etcd-localhost.localdomain   0/1     Pending            0          1s    <none>          localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb             1/1     Running            0          11m   192.168.0.166   localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2                      2/2     Running   0          11m   192.168.0.182   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2                0/1     Error     3          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5                0/1     Error     2          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   kube-apiserver-localhost.localdomain   0/1     Pending   0          0s    <none>          localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb                       1/1     Running   0          11m   192.168.0.249   localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                      READY   STATUS             RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2         2/2     Running            0          11m   192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2   1/1     Running            0          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5   0/1     CrashLoopBackOff   2          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb          1/1     Running            0          11m   192.168.0.166   localhost.localdomain   <none>           <none>
sysadm@master$ sudo kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                      READY   STATUS             RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2         2/2     Running            0          11m   192.168.0.249   localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-hqmn2   0/1     Running            3          11m   10.1.0.2        localhost.localdomain   <none>           <none>
kube-system   coredns-fb8b8dccf-nfgr5   0/1     CrashLoopBackOff   2          11m   10.1.0.3        localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb          1/1     Running            0          11m   192.168.0.166   localhost.localdomain   <none>           <none>

Any ideas what might be going on, and how I can troubleshoot this? I tried kubectl describe pods, but the pods keep crashing, and when I do get some information back, nothing in it tells me where to investigate next.

Sorry for the vague details. If you can point me to where else to look, I can post more information.

Thank you for your time :)

-- Zhao Li
kubernetes

1 Answer

6/4/2019

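The issue is with the hostname. Check the NODE column: every pod shows the node name localhost.localdomain. The master and the worker appear to be registering under that same name, which would explain why the same pods report different host IPs (192.168.0.249, 192.168.0.182, 192.168.0.166) from one listing to the next.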
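
A quick way to confirm this is to list the distinct node names that appear in the pod listing. The following is only a sketch: list_node_names is a hypothetical helper, and it assumes the column layout shown in the question, where NODE is the eighth field.

```shell
#!/bin/sh
# Sketch: list the distinct NODE names found in
# `kubectl get pods --all-namespaces -o wide` output read from stdin.
# Assumes the column layout from the question (NODE is field 8).

list_node_names() {
    # Skip header rows (first field "NAMESPACE") and short lines.
    awk '$1 != "NAMESPACE" && NF >= 8 { print $8 }' | sort -u
}

# Sample rows taken from the question; on a live cluster, pipe in the
# real kubectl output instead of this variable.
sample='NAMESPACE     NAME                READY   STATUS    RESTARTS   AGE   IP              NODE                    NOMINATED NODE   READINESS GATES
kube-system   calico-node-ntzn2   2/2     Running   0          11m   192.168.0.182   localhost.localdomain   <none>           <none>
kube-system   kube-proxy-xgnlb    1/1     Running   0          11m   192.168.0.166   localhost.localdomain   <none>           <none>'

printf '%s\n' "$sample" | list_node_names
```

If this prints a single name even though more than one machine has joined, the nodes are most likely colliding. On a live cluster, kubectl get nodes -o wide shows the registered nodes directly.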

Change the master's hostname to something like k8s-master or master, and it should work. Each worker node should also have a unique hostname, such as node1, node2, node3, and so on.
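
As a rough sketch of that check and fix, assuming a systemd-based distribution where hostnamectl is available: is_placeholder_hostname is a hypothetical helper, and k8s-master is only an example name, not something kubeadm requires.

```shell
#!/bin/sh
# Sketch: warn when this machine still has an installer-default hostname,
# which would collide with other nodes that have the same default.

is_placeholder_hostname() {
    # Returns 0 (true) for common default names or an empty name.
    case "$1" in
        localhost|localhost.localdomain|"") return 0 ;;
        *) return 1 ;;
    esac
}

current="$(uname -n)"
if is_placeholder_hostname "$current"; then
    echo "hostname '$current' is a default; set a unique one, for example:"
    echo "  sudo hostnamectl set-hostname k8s-master"
else
    echo "hostname '$current' is not a known default"
fi
```

Run it on the master and on each worker. Because a node that has already joined is registered under its old name, changing the hostname will probably also require kubeadm reset followed by a fresh kubeadm init (and kubeadm join on the workers). Passing --node-name to kubeadm init or kubeadm join is another way to register under a different name.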

-- P Ekambaram
Source: StackOverflow