kube-dns and kubernetes-dashboard pods are stuck in CrashLoopBackOff


I set up a multi-node Kubernetes cluster (3 etcd nodes, 2 masters and 2 worker nodes) in OpenStack, following the guide at https://coreos.com/kubernetes/docs/latest/getting-started.html

All VMs run CoreOS 1185.3.0.

kubectl version
Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.4.3", GitCommit:"ae4550cc9c89a593bcda6678df201db1b208133b", GitTreeState:"clean", BuildDate:"2016-08-26T18:13:23Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.0+coreos.0", GitCommit:"278a1f7034bdba61cba443722647da1a8204a6fc", GitTreeState:"clean", BuildDate:"2016-09-26T20:48:37Z", GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

kubectl get nodes reports that the cluster is healthy:

NAME            STATUS                     AGE
                Ready,SchedulingDisabled   1d
                Ready,SchedulingDisabled   1d
                Ready                      1d
                Ready                      1d

kubectl get pods --namespace=kube-system shows that the kube-dns and kubernetes-dashboard pods are in CrashLoopBackOff status:

NAME                                   READY     STATUS             RESTARTS   AGE
heapster-v1.2.0-3646253287-xweg5       2/2       Running            0          2h
kube-apiserver-            1/1       Running            2          1d
kube-apiserver-            1/1       Running            1          1d
kube-controller-manager-   1/1       Running            2          1d
kube-controller-manager-   1/1       Running            1          1d
kube-dns-v19-h7qyh                     2/3       CrashLoopBackOff   13         2h
kube-proxy-                1/1       Running            2          36m
kube-proxy-                1/1       Running            2          37m
kube-proxy-                1/1       Running            2          1d
kube-proxy-                1/1       Running            1          1d
kube-scheduler-            1/1       Running            2          1d
kube-scheduler-            1/1       Running            1          1d
kubernetes-dashboard-v1.4.0-t2lpu      0/1       CrashLoopBackOff   12         2h

Can somebody tell me how to figure out the exact issue here?
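As a starting point for narrowing this down, here is a minimal sketch that picks the non-Running pods out of the listing above and prints their events. The function names `not_running` and `describe_failing` are hypothetical helpers introduced only for illustration; the sketch assumes kubectl is already configured for the cluster.

```shell
# Read `kubectl get pods` output on stdin and print the names of pods
# whose STATUS column (third field) is anything other than "Running".
not_running() {
  awk 'NR > 1 && $3 != "Running" { print $1 }'
}

# Show the events and container states for every non-Running pod
# in the namespace given as $1.
describe_failing() {
  ns="$1"
  kubectl get pods --namespace="$ns" | not_running | while read -r pod; do
    kubectl describe pod "$pod" --namespace="$ns"
  done
}

# Usage:
#   describe_failing kube-system
```

For a pod with several containers, such as kube-dns, the output of the crashed container can then be fetched with `kubectl logs <pod> -c <container> --previous --namespace=kube-system`.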


I was able to get the logs of the kube-dns and kubernetes-dashboard containers. It appears to be a certificate issue when they try to call the Kubernetes API. I have recreated all the certificates and replaced them.

I followed these instructions to set up the masters and workers: https://coreos.com/kubernetes/docs/latest/deploy-master.html and https://coreos.com/kubernetes/docs/latest/deploy-workers.html

Masters are fronted by a load balancer.

Finally, I restarted the 2 Kubernetes master VMs and the 2 node VMs, but the problem still persists for kube-dns and kubernetes-dashboard.

kube-dns container logs

docker logs c8c82e68cde9
I1111 16:28:25.097452       1 server.go:94] Using for kubernetes master, kubernetes API: <nil>
I1111 16:28:25.103598       1 server.go:99] v1.4.0-alpha.2.1652+c69e3d32a29cfa-dirty
I1111 16:28:25.103789       1 server.go:101] FLAG: --alsologtostderr="false"
I1111 16:28:25.103928       1 server.go:101] FLAG: --dns-port="10053"
I1111 16:28:25.104185       1 server.go:101] FLAG: --domain="cluster.local."
I1111 16:28:25.104301       1 server.go:101] FLAG: --federations=""
I1111 16:28:25.104465       1 server.go:101] FLAG: --healthz-port="8081"
I1111 16:28:25.104607       1 server.go:101] FLAG: --kube-master-url=""
I1111 16:28:25.104718       1 server.go:101] FLAG: --kubecfg-file=""
I1111 16:28:25.104831       1 server.go:101] FLAG: --log-backtrace-at=":0"
I1111 16:28:25.104945       1 server.go:101] FLAG: --log-dir=""
I1111 16:28:25.105056       1 server.go:101] FLAG: --log-flush-frequency="5s"
I1111 16:28:25.105188       1 server.go:101] FLAG: --logtostderr="true"
I1111 16:28:25.105302       1 server.go:101] FLAG: --stderrthreshold="2"
I1111 16:28:25.105412       1 server.go:101] FLAG: --v="0"
I1111 16:28:25.105520       1 server.go:101] FLAG: --version="false"
I1111 16:28:25.105632       1 server.go:101] FLAG: --vmodule=""
I1111 16:28:25.105853       1 server.go:138] Starting SkyDNS server. Listening on port:10053
I1111 16:28:25.106185       1 server.go:145] skydns: metrics enabled on : /metrics:
I1111 16:28:25.106367       1 dns.go:167] Waiting for service: default/kubernetes
I1111 16:28:25.108281       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp:// [rcache 0]
I1111 16:28:25.108469       1 logs.go:41] skydns: ready for queries on cluster.local. for udp:// [rcache 0]
E1111 16:28:25.176270       1 reflector.go:214] pkg/dns/dns.go:155: Failed to list *api.Endpoints: the server has asked for the client to provide credentials (get endpoints)
I1111 16:28:25.176774       1 dns.go:173] Ignoring error while waiting for service default/kubernetes: the server has asked for the client to provide credentials (get services kubernetes). Sleeping 1s before retrying.

kubernetes-dashboard container logs

docker logs b1d3b0fa617a
Starting HTTP server on port 9090
Creating API server client for
Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: the server has asked for the client to provide credentials

kubernetes node logs

journalctl -u kubelet -f
Failed to list *api.Node: Get https://{load_balancer_ip}/api/v1/nodes?fieldSelector=metadata.name%3D172.24.0.121&resourceVersion=0: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kube-ca")
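The kubelet error names the CA "kube-ca" and mentions an RSA verification error, which would be consistent with a node trusting a CA file that has the same name as, but a different key from, the CA that signed the API server certificate. A minimal sketch for checking this on each VM follows. The function names are hypothetical, and the paths in the usage comments are assumptions that should be adjusted to wherever the certificates were actually placed.

```shell
# Return 0 if the certificate in $2 verifies against the CA in $1,
# non-zero otherwise. Relies on `openssl verify` printing "<file>: OK".
signed_by_ca() {
  openssl verify -CAfile "$1" "$2" 2>&1 | grep -q ': OK$'
}

# Print the SHA-256 fingerprint of a certificate, so that the CA copies
# on the masters and the nodes can be compared with each other.
cert_fingerprint() {
  openssl x509 -in "$1" -noout -fingerprint -sha256
}

# Usage (assumed paths):
#   signed_by_ca /etc/kubernetes/ssl/ca.pem /etc/kubernetes/ssl/apiserver.pem && echo "chain OK"
#   cert_fingerprint /etc/kubernetes/ssl/ca.pem
```

If the fingerprints of the CA file differ between VMs, or a certificate does not verify against the CA stored next to it, some files were probably left over from an earlier generation run.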

I followed https://coreos.com/kubernetes/docs/latest/openssl.html when generating certs.

The API server certificates were generated with the following openssl config:

[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster.local
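
The subjectAltName entries determine which names (and which IP addresses, if any IP.n entries are listed) clients may use to reach the API server. A small sketch for confirming what actually ended up in the generated certificate; `cert_sans` is a hypothetical helper name and the path in the usage comment is an assumption.

```shell
# Print the Subject Alternative Name entries of the certificate in $1
# on one line, or nothing if the certificate has no SAN extension.
cert_sans() {
  openssl x509 -in "$1" -noout -text \
    | grep -A1 'Subject Alternative Name' \
    | tail -n 1 \
    | sed 's/^ *//'
}

# Usage (assumed path):
#   cert_sans /etc/kubernetes/ssl/apiserver.pem
```

Since the masters sit behind a load balancer, clients that connect through it would typically expect the load balancer address to appear in this list as well.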

Am I missing something here?


-- Indika Sampath

