Recover kops Kubernetes cluster

10/12/2018

There's a Kubernetes cluster that was created by kops. Running kops validate cluster gives this output:

    INSTANCE GROUPS
    NAME                ROLE    MACHINETYPE  MIN  MAX  SUBNETS
    master-us-east-1a   Master  m4.xlarge    1    1    us-east-1a
    nodes               Node    c4.2xlarge   1    75   us-east-1a

    NODE STATUS
    NAME                            ROLE    READY
    ip-172-20-59-93.ec2.internal    master  False

    VALIDATION ERRORS
    KIND     NAME                            MESSAGE
    Machine  i-0a44bbdd18c86e846             machine "i-0a44bbdd18c86e846" has not yet joined cluster
    Machine  i-0d3302056f3dfeef0             machine "i-0d3302056f3dfeef0" has not yet joined cluster
    Machine  i-0d6199876b91962f4             machine "i-0d6199876b91962f4" has not yet joined cluster
    Node     ip-172-20-59-93.ec2.internal    master "ip-172-20-59-93.ec2.internal" is not ready

    Validation Failed

How can I recover this cluster? The S3 files for this cluster are available.

The etcd volumes show their status as "in-use".


-- smk
kops
kubernetes

1 Answer

10/12/2018

kops stores the state of your cluster in S3.

  1. Find the S3 bucket where the state is stored.
  2. Set export KOPS_STATE_STORE=s3://your-k8s-state-store
  3. Run kops update cluster
  4. If that fails, terminate all the instances that belong to the cluster.
  5. Run kops create cluster
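
The first three steps can be sketched as a small script. This is only an illustration, not a tested recovery procedure: the cluster name mycluster.example.com is a placeholder, the run helper is a hypothetical wrapper, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch of steps 1-3. Bucket and cluster names are placeholders (assumptions).

export KOPS_STATE_STORE="${KOPS_STATE_STORE:-s3://your-k8s-state-store}"
CLUSTER_NAME="${CLUSTER_NAME:-mycluster.example.com}"

# Hypothetical helper: print the command unless DRY_RUN=0, then execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

recover_cluster() {
  # Confirm the cluster is visible in the state store.
  run kops get clusters
  # Re-apply the stored configuration to the cloud resources.
  run kops update cluster "$CLUSTER_NAME" --yes
  # Check whether the master and nodes have joined.
  run kops validate cluster --name "$CLUSTER_NAME"
}

recover_cluster
```

Reviewing the printed commands before re-running with DRY_RUN=0 is the safer order, since kops update cluster with --yes changes real resources.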

Note that S3 only holds the kops configuration for the cluster. The internal state of your cluster is not in S3 but in etcd. The answer to "How to restore kubernetes cluster using kops?" has more details on this topic, including how to back up and restore etcd.
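
Before terminating anything, it may be worth snapshotting the etcd EBS volumes so the cluster data can be restored later. The sketch below assumes the volumes carry the KubernetesCluster tag and the k8s.io/etcd/main and k8s.io/etcd/events tag keys, which is how kops usually labels them; check the tags on your own volumes first. The volume ID and cluster name are placeholders, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: locate and snapshot the etcd volumes. Tag names are an assumption.

CLUSTER_NAME="${CLUSTER_NAME:-mycluster.example.com}"

# Hypothetical helper: print the command unless DRY_RUN=0, then execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# List the IDs of the volumes tagged as etcd volumes for this cluster.
list_etcd_volumes() {
  run aws ec2 describe-volumes \
    --filters "Name=tag:KubernetesCluster,Values=$CLUSTER_NAME" \
              "Name=tag-key,Values=k8s.io/etcd/main,k8s.io/etcd/events" \
    --query "Volumes[].VolumeId" --output text
}

# Take a snapshot of a single volume, given its ID.
snapshot_volume() {
  run aws ec2 create-snapshot --volume-id "$1" \
    --description "etcd backup for $CLUSTER_NAME"
}

list_etcd_volumes
snapshot_volume "vol-0123456789abcdef0"   # placeholder volume ID
```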

-- Rico
Source: StackOverflow