Periodic backups of an etcd-operator cluster

1/29/2018

I'm trying to deploy an etcd cluster on GKE using the etcd-operator helm chart.

I've got the cluster online and working, but now I'm trying to figure out how to back it up. If I'm understanding this issue and this issue correctly, is it true that etcd-operator doesn't actually support periodic backups?

It seems to me that including backup and restore operators is pretty useless if you can't back up your cluster on an ongoing basis.

Am I misunderstanding the documentation? How have other people solved this?

Here are the chart values I'm working with currently:

rbac:
  create: false
  apiVersion: v1beta1
  etcdOperatorServiceAccountName: vault-etcd-operator
  backupOperatorServiceAccountName: vault-etcd-backup
  restoreOperatorServiceAccountName: vault-etcd-restore

deployments:
  etcdOperator: true
  # one time deployment, delete once completed,
  # Ref: https://github.com/coreos/etcd-operator/blob/master/doc/user/walkthrough/backup-operator.md
  backupOperator: true
  # one time deployment, delete once completed
  # Ref: https://github.com/coreos/etcd-operator/blob/master/doc/user/walkthrough/restore-operator.md
  restoreOperator: false

customResources:
  createEtcdClusterCRD: true
  createBackupCRD: true
  createRestoreCRD: false

etcdOperator:
  name: etcd-operator
  replicaCount: 1
  image:
    repository: quay.io/coreos/etcd-operator
    tag: v0.7.0
    pullPolicy: Always
  resources:
    cpu: 100m
    memory: 128Mi
  ## Node labels for etcd-operator pod assignment
  ## Ref: https://kubernetes.io/docs/user-guide/node-selection/
  nodeSelector: {}
  ## additional command arguments go here; will be translated to `--key=value` form
  ## e.g., analytics: true
  commandArgs: {}

backupOperator:
  name: etcd-backup-operator
  replicaCount: 1
  image:
    repository: quay.io/coreos/etcd-operator
    tag: v0.7.0
    pullPolicy: Always
  resources:
    cpu: 100m
    memory: 128Mi
  spec:
    storageType: S3
    s3:
      s3Bucket: my-vault-backups
      awsSecret: aws
  ## Node labels for etcd pod assignment
  ## Ref: https://kubernetes.io/docs/user-guide/node-selection/
  nodeSelector: {}
  ## additional command arguments go here; will be translated to `--key=value` form
  ## e.g., analytics: true
  commandArgs: {}
-- Adam Lassek
etcd
google-kubernetes-engine
kubernetes-helm

1 Answer

10/8/2018

Not a complete answer, but these resources may point you in the right direction:
https://labs.consol.de/kubernetes/2018/05/25/kubeadm-backup.html
(It's a cronjob that automatically backs up etcd.)
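To give a rough idea of what the cronjob approach could look like for an operator-managed cluster, here is a minimal sketch that is not taken from the linked article. It assumes the etcd cluster is named `example-etcd-cluster` (etcd-operator exposes clients through a `<cluster-name>-client` service on port 2379), that client traffic is plain HTTP without TLS, and that a PVC called `etcd-snapshots` already exists. The CronJob name, schedule, image tag, and PVC name are all illustrative.

```yaml
# Sketch only: names, image tag, schedule, and PVC are assumptions, not values from the question.
apiVersion: batch/v1              # use batch/v1beta1 on clusters older than Kubernetes 1.21
kind: CronJob
metadata:
  name: etcd-snapshot             # hypothetical name
spec:
  schedule: "0 */6 * * *"         # every six hours
  concurrencyPolicy: Forbid       # do not start a run while the previous one is still active
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: snapshot
              image: quay.io/coreos/etcd:v3.2.13   # assumed tag; match your cluster's etcd version
              env:
                - name: ETCDCTL_API                # snapshot commands need the v3 API
                  value: "3"
              command:
                - etcdctl
                - --endpoints=http://example-etcd-cluster-client:2379
                - snapshot
                - save
                - /backup/snapshot.db              # fixed name, so each run overwrites the last
              volumeMounts:
                - name: backup
                  mountPath: /backup
          volumes:
            - name: backup
              persistentVolumeClaim:
                claimName: etcd-snapshots          # hypothetical, pre-created PVC
```

Because the file name is fixed, this keeps only the most recent snapshot; keeping history or shipping the file to S3 would need an extra step.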

There's also a Kubernetes disaster recovery tool called Heptio Ark: https://www.youtube.com/watch?v=qRPNuT080Hk
It can do partial and filtered backups and restores by reading from the API server; it can also back up PVs and run on a schedule.

Because Heptio Ark works via kube-apiserver, it works even in cases like AKS and other managed Kubernetes offerings where the master nodes and etcd are abstracted away. Since it backs up the cluster state held in etcd without interfacing with etcd directly, it may work for your scenario.
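For the scheduling part, an Ark schedule can be declared as a custom resource. The following is a rough sketch, assuming the `ark.heptio.com/v1` API group and the default `heptio-ark` namespace used by Ark releases of that era (the project was later renamed Velero and moved to the `velero.io/v1` group). The schedule name, namespace filter, and retention period are illustrative.

```yaml
# Sketch only: assumes Heptio Ark is installed in the heptio-ark namespace.
apiVersion: ark.heptio.com/v1
kind: Schedule
metadata:
  name: vault-etcd-daily          # hypothetical name
  namespace: heptio-ark
spec:
  schedule: "0 1 * * *"           # cron expression: daily at 01:00
  template:                       # backup spec applied on every run
    includedNamespaces:
      - default                   # assumed namespace of the etcd cluster
    ttl: 720h0m0s                 # keep each backup for 30 days
```

Note that this captures the Kubernetes objects in the selected namespaces as seen through the API server, so it is worth verifying with a test restore that it covers the data you actually care about.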

-- neokyle
Source: StackOverflow