New master with new etcd volume does not join the cluster

10/3/2018

I am using kops and have a cluster with 3 masters. I deleted one master together with its disks (the root disk and both etcd disks, main and events).

kops has now recreated this master and its disks, but the new master node cannot join the cluster. The error message in kube-apiserver is:

controller.go:135] Unable to perform initial IP allocation check: unable to refresh the service IP block: client: etcd cluster is unavailable or misconfigured; error #0: dial tcp 127.0.0.1:4001: getsockopt: connection refused

Any idea?

-- Danilo
coreos
etcd
kops
kubectl
kubernetes

2 Answers

10/3/2018

It looks like the etcd server is down on that host. It might not have been able to sync with the etcd servers on the other masters.

You can check like this:

$ sudo docker ps | grep etcd

If nothing is listed, etcd is not running. In that case, check the logs of the 'Exited' etcd container:

$ sudo docker ps -a | grep Exited | grep etcd
$ sudo docker logs <etcd-container-id>
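The two commands above can be chained so the container ID does not have to be copied by hand. This is only a sketch: exited_etcd_ids is a hypothetical helper name, and it assumes the default docker ps table output, where the container ID is the first column.

```shell
#!/bin/sh
# Hypothetical helper: read `docker ps -a` output on stdin and print the
# IDs (first column) of containers whose line mentions both Exited and etcd.
exited_etcd_ids() {
    awk '/Exited/ && /etcd/ { print $1 }'
}

# Example use on the master (needs docker, so it is left commented out):
#   sudo docker ps -a | exited_etcd_ids | while read -r id; do
#       sudo docker logs --tail 50 "$id"
#   done
```

The filter matches on the whole line, so any exited container with etcd in its image or name is included, not only the etcd server itself.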

Also check that the etcd options for kube-apiserver look OK in /etc/kubernetes/manifests/kube-apiserver.yaml
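A quick way to see only the etcd-related options is to grep them out of the manifest. This is a sketch: show_etcd_flags is a hypothetical helper, and it assumes the flags appear in the file in the usual --etcd-...=value form. Given the error in the question, the main endpoint would be expected to point at 127.0.0.1:4001.

```shell
#!/bin/sh
# Hypothetical helper: print every --etcd-*=value option found in the
# given kube-apiserver manifest, one per line.
show_etcd_flags() {
    grep -o -- '--etcd-[a-z-]*=[^ "]*' "$1"
}

# Example use on the master (path as given above; adjust if your file differs):
#   show_etcd_flags /etc/kubernetes/manifests/kube-apiserver.yaml
```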

-- Rico
Source: StackOverflow

10/3/2018

Issue Solved.

1 - I removed the old master from the etcd cluster using etcdctl. You will need to connect to the etcd-server container to do this.

2 - On the new master node, I stopped the kubelet and protokube services.

3 - I emptied the etcd data directories (data and data-events).

4 - I edited /etc/kubernetes/manifests/etcd.manifest and etcd-events.manifest, changing ETCD_INITIAL_CLUSTER_STATE from new to existing.

5 - I got the name and peer URL of the new master and used etcdctl to add it to the cluster (etcdctl member add "name" "PeerURL"). You will need to connect to the etcd-server container to do this.

6 - I started the kubelet and protokube services on the new master.
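For reference, steps 2 to 6 can be sketched as a script. Treat everything here as an assumption to verify on your own node: set_cluster_state_existing is a hypothetical helper, the manifest is assumed to store the variable as a YAML name/value pair on adjacent lines, kubelet and protokube are assumed to be systemd units, and the data directory paths are left as placeholders because they depend on where the etcd volumes are mounted.

```shell
#!/bin/sh
# Sketch of steps 2-6, run on the NEW master. Unit names, paths and the
# manifest layout are assumptions; verify each one on your own node.

# Step 4 helper: change ETCD_INITIAL_CLUSTER_STATE from "new" to "existing".
# Assumes the variable is stored as a name/value pair on adjacent lines:
#     - name: ETCD_INITIAL_CLUSTER_STATE
#       value: new
set_cluster_state_existing() {
    manifest=$1
    tmp_out=$(mktemp) || return 1
    # On the line after the variable name, rewrite "value: new" only.
    sed '/ETCD_INITIAL_CLUSTER_STATE/{n;s/value: new/value: existing/;}' \
        "$manifest" > "$tmp_out" &&
        cat "$tmp_out" > "$manifest"   # overwrite in place, keeping owner and mode
    status=$?
    rm -f "$tmp_out"
    return "$status"
}

# Order of operations (commented out on purpose; placeholders in <...>):
#   systemctl stop kubelet protokube                        # step 2
#   rm -rf <etcd-data-dir>/* <etcd-events-data-dir>/*       # step 3
#   set_cluster_state_existing /etc/kubernetes/manifests/etcd.manifest         # step 4
#   set_cluster_state_existing /etc/kubernetes/manifests/etcd-events.manifest  # step 4
#   etcdctl member add <name> <peer-url>   # step 5, from an etcd-server container on a healthy master
#   systemctl start kubelet protokube                       # step 6
```

The helper writes through a temporary file outside the manifests directory, so no extra copy of the manifest is left there for kubelet to pick up.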

-- Danilo
Source: StackOverflow