restore destroyed kubeadm master

12/29/2020

I created a 1-master, 2-worker Kubernetes cluster using kubeadm 1.20 and backed up etcd. I then destroyed the master on purpose to test how to get the cluster back to a running state.

Kubernetes version: 1.20
Installation method: kubeadm
Host OS: Windows 10 Pro
Guest OS: Ubuntu 18 on VirtualBox 6
CNI and version: weave-net
CRI and version: Docker 19

I'm partially successful, in that a secret I created before destroying the master is visible after the etcd restore, so that part seems to work.
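
For reference, the backup and restore were roughly along these lines (using kubeadm's default etcd certificate paths; the snapshot file and restore data dir below are illustrative, not necessarily the exact paths I used):

ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /root/etcd-snapshot.db

ETCDCTL_API=3 etcdctl snapshot restore /root/etcd-snapshot.db \
  --data-dir=/var/lib/etcd-from-backup

The etcd static pod manifest (/etc/kubernetes/manifests/etcd.yaml) then has to point its hostPath volume at the restored data directory.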

HOWEVER, the CoreDNS pods are unauthorized to make requests to the API server, based on their logs:

[INFO] plugin/ready: Still waiting on: "kubernetes"
E1229 21:42:25.892580       1 reflector.go:178] pkg/mod/k8s.io/client-go@v0.18.3/tools/cache/reflector.go:125: Failed to list *v1.Namespace: Unauthorized
E1229 21:42:29.680620       1 reflector.go:178] pkg/mod/k8s.io/client-go@v0.18.3/tools/cache/reflector.go:125: Failed to list *v1.Endpoints: Unauthorized
[INFO] plugin/ready: Still waiting on: "kubernetes"
E1229 21:42:39.492521       1 reflector.go:178] pkg/mod/k8s.io/client-go@v0.18.3/tools/cache/reflector.go:125: Failed to list *v1.Service: Unauthorized

I'm guessing it has something to do with service account tokens, and that there's a step I'm missing to re-authorize pods to authenticate to the API server after the etcd database replacement.

What am I missing?

-- Oliver
etcd
etcdctl
kubeadm
kubernetes

1 Answer

12/30/2020

If you only backed up the contents of etcd, then kubeadm will have generated new certificates for signing the ServiceAccount JWTs, so the old tokens no longer verify. Since this is not something that happens during routine maintenance, I don't think the ServiceAccount controller knows to reissue the tokens on its own. If you delete all of the underlying token secrets, though, it should reissue them.
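
For illustration, a minimal sketch of that cleanup, assuming a working admin kubeconfig for the rebuilt control plane and the default kubeadm CoreDNS Deployment name:

# delete every service-account token secret so the controller reissues them,
# signed by the control plane's new service account key
for ns in $(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}'); do
  kubectl delete secrets -n "$ns" --field-selector type=kubernetes.io/service-account-token
done

# restart workloads that mounted the old tokens, e.g. CoreDNS
kubectl -n kube-system rollout restart deployment coredns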

-- coderanger
Source: StackOverflow