Restore Etcd Quorum

10/15/2018

I have a Kubernetes cluster deployed on AWS with Kops, consisting of 3 master nodes, each in a different AZ. As is well known, Kops runs Etcd on each master node as two pods, each of which mounts an EBS volume to store its state. If you lose the volumes of 2 of the 3 masters, Etcd loses quorum, because only 1 of the 3 members remains and a majority of 2 is required.
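To make the quorum arithmetic concrete, here is a minimal sketch of the majority rule that Raft-based stores such as Etcd follow. The function names are hypothetical and only illustrate the calculation; they are not part of any Etcd or Kops API.

```python
def quorum_size(members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """How many members can be lost while a majority still exists."""
    return members - quorum_size(members)


def has_quorum(members: int, healthy: int) -> bool:
    """True if the healthy members still form a majority of the cluster."""
    return healthy >= quorum_size(members)


if __name__ == "__main__":
    # A 3-member cluster needs 2 members and tolerates 1 failure.
    print(quorum_size(3), failures_tolerated(3))
    # With 2 of 3 volumes lost, the single survivor is not a majority.
    print(has_quorum(3, 1))
```

With 3 members the cluster survives the loss of one master, but not two, which is the situation described here.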

Is there a way to use the data on the only master that still has the cluster state to restore quorum among the three masters from that state? I recreated this scenario, but the cluster becomes unavailable, and I can no longer access the Etcd pods on any of the 3 masters, because those pods fail with an error. Moreover, Etcd itself becomes read-only, so it is impossible to add or remove cluster members as a manual intervention.

Any tips? Thanks to all of you.

-- falberto89
amazon-web-services
etcd
kops
kubernetes

1 Answer

10/15/2018

This is documented here. There's also another guide here

You basically have to back up your cluster and create a brand new one.
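As a rough illustration of what backing up and starting fresh can look like at the Etcd level, the sketch below only assembles `etcdctl` command lines; it does not run them. It assumes Etcd v3 with the `etcdctl snapshot save` and `etcdctl snapshot restore` subcommands (older `etcdctl` releases also need `ETCDCTL_API=3` set in the environment). The helper names, member name, URLs, and paths are hypothetical placeholders, and the Kops-specific steps from the linked guides are not covered.

```python
from typing import List


def snapshot_save_cmd(endpoint: str, snapshot_path: str) -> List[str]:
    """Build the etcdctl (v3 API) command that saves a snapshot to a file."""
    return ["etcdctl", f"--endpoints={endpoint}", "snapshot", "save", snapshot_path]


def snapshot_restore_cmd(
    snapshot_path: str,
    name: str,
    peer_url: str,
    data_dir: str,
    cluster_token: str = "restored-cluster",  # hypothetical token value
) -> List[str]:
    """Build the etcdctl command that restores a snapshot into a fresh
    single-member cluster; further members would be added afterwards."""
    return [
        "etcdctl", "snapshot", "restore", snapshot_path,
        "--name", name,
        "--initial-cluster", f"{name}={peer_url}",
        "--initial-cluster-token", cluster_token,
        "--initial-advertise-peer-urls", peer_url,
        "--data-dir", data_dir,
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(" ".join(snapshot_save_cmd("http://127.0.0.1:2379", "backup.db")))
    print(" ".join(snapshot_restore_cmd(
        "backup.db", "master-a", "http://10.0.0.1:2380", "/var/lib/etcd-restored")))
```

Restoring into a new data directory with a new cluster token gives a new logical cluster, which matches the idea of creating a brand new one rather than repairing the old membership.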

-- Rico
Source: StackOverflow