Periodic database backup in Kubernetes?

3/29/2020

How do I set up a periodic database backup in Kubernetes? I have deployed a PostgreSQL database as a StatefulSet in Kubernetes and mounted a PersistentVolume to store the data. To take periodic backups of the mounted volume, I found three options:

  1. Set up a CronJob in Kubernetes that executes pg_dump and uploads the dump to the storage location.
  2. I am already using Celery in my project, so add a new task that backs up the Postgres data and uploads it to the storage location.
  3. VolumeSnapshots. This looks like the most Kubernetes-native way, but I couldn't find a way to automate it periodically.
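
For reference, option 1 can be sketched as a CronJob like the one below. The image tag, the `postgres` service host, the `postgres-secret` name, and the upload step are all placeholders you would adapt to your own deployment:

```yaml
# Sketch of option 1: nightly pg_dump from a CronJob.
# Hostnames, secret names, and the upload step are hypothetical.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: postgres-backup
spec:
  schedule: "0 2 * * *"            # every day at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: pg-dump
            image: postgres:12
            env:
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret   # hypothetical secret holding the DB password
                  key: password
            command:
            - /bin/sh
            - -c
            - |
              pg_dump -h postgres -U postgres mydb > /backup/dump-$(date +%F).sql
              # TODO: upload /backup/dump-*.sql to your storage location here,
              # otherwise the dump is lost when the pod (and its emptyDir) goes away
            volumeMounts:
            - name: backup
              mountPath: /backup
          volumes:
          - name: backup
            emptyDir: {}
```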

Which is the recommended way to take periodic volume backups in Kubernetes?

[EDIT] I prefer solutions without using operators.

-- Nirmal Raghavan
database
django
kubernetes
postgresql

2 Answers

3/29/2020

Velero does offer periodic volume snapshots. I would probably start there.

-- coderanger
Source: StackOverflow

3/30/2020

Which is the recommended way to take periodic volume backup in kubernetes?

I would recommend VolumeSnapshots, but keep in mind that this is not a regular backup and you won't be able to revert the data to a previous state.

Many storage systems (like Google Cloud Persistent Disks, Amazon Elastic Block Storage, and many on-premise storage systems) provide the ability to create a “snapshot” of a persistent volume. A snapshot represents a point-in-time copy of a volume. A snapshot can be used either to provision a new volume (pre-populated with the snapshot data) or to restore an existing volume to a previous state (represented by the snapshot).

It's easy to use, and as of December 2019 it was promoted to beta: Kubernetes 1.17 Feature: Kubernetes Volume Snapshot Moves to Beta.

First, you need to specify a VolumeSnapshotClass.

The following VolumeSnapshotClass, for example, tells the Kubernetes cluster that a CSI driver, testdriver.csi.k8s.io, can handle volume snapshots, and that when these snapshots are created, their deletion policy should be to delete.

apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: test-snapclass
driver: testdriver.csi.k8s.io
deletionPolicy: Delete
parameters:
  csi.storage.k8s.io/snapshotter-secret-name: mysecret
  csi.storage.k8s.io/snapshotter-secret-namespace: mysecretnamespace

The common snapshot controller reserves the parameter keys csi.storage.k8s.io/snapshotter-secret-name and csi.storage.k8s.io/snapshotter-secret-namespace. If specified, it fetches the referenced Kubernetes secret and sets it as an annotation on the volume snapshot content object. The CSI external-snapshotter sidecar retrieves it from the content annotation and passes it to the CSI driver during snapshot creation.
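
The referenced secret might look like the sketch below. The actual keys and values are driver-specific, so treat these entries as illustrative only:

```yaml
# Sketch of the secret referenced by the VolumeSnapshotClass above.
# The required keys depend on your CSI driver; these are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
  namespace: mysecretnamespace
type: Opaque
stringData:
  username: admin
  password: t0p-Secret
```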

Creation of a volume snapshot is triggered by the creation of a VolumeSnapshot API object.

The VolumeSnapshot object must specify the following source type: persistentVolumeClaimName - The name of the PVC to snapshot. Please note that the source PVC, PV, and VolumeSnapshotClass for a VolumeSnapshot object must point to the same CSI driver.

You can create a VolumeSnapshot, which in this example will take a snapshot of the PVC called test-pvc:

apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: test-snapshot
spec:
  volumeSnapshotClassName: test-snapclass
  source:
    persistentVolumeClaimName: test-pvc

When volume snapshot creation is invoked, the common snapshot controller first creates a VolumeSnapshotContent object with the volumeSnapshotRef, source volumeHandle, volumeSnapshotClassName if specified, driver, and deletionPolicy.
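
To illustrate, the VolumeSnapshotContent the controller creates for the snapshot above would look roughly like this (the generated name and volume handle are placeholders; you don't normally write this object yourself):

```yaml
# Sketch of a controller-created VolumeSnapshotContent; field values illustrative.
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotContent
metadata:
  name: snapcontent-example          # generated by the controller
spec:
  deletionPolicy: Delete
  driver: testdriver.csi.k8s.io
  volumeSnapshotClassName: test-snapclass
  source:
    volumeHandle: example-volume-handle   # CSI handle of the source volume
  volumeSnapshotRef:
    name: test-snapshot
    namespace: default
```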

You can restore PersistentVolumeClaim from a Volume Snapshot:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restore-pvc
spec:
  storageClassName: csi-hostpath-sc
  dataSource:
    name: test-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

To enable support for restoring a volume from a volume snapshot data source, enable the VolumeSnapshotDataSource feature gate on the apiserver and controller-manager.

What are the limitations?

  • Does not support reverting an existing volume to an earlier state represented by a snapshot (beta only supports provisioning a new volume from a snapshot).
  • No snapshot consistency guarantees beyond any guarantees provided by the storage system (e.g. crash consistency); these are the responsibility of higher-level APIs/controllers.

EDIT:

To automate this process you would need to set up a CronJob or write Python code using the Python client library for Kubernetes, adapting for example python/examples/custom_object.py to your needs.
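
A minimal sketch of the CronJob approach: a scheduled pod that applies a timestamped VolumeSnapshot with kubectl. The ServiceAccount name, the image, and the schedule are assumptions; the ServiceAccount needs RBAC permission to create volumesnapshots in the snapshot.storage.k8s.io API group:

```yaml
# Sketch: create a timestamped VolumeSnapshot of test-pvc every night.
# serviceAccountName and image are hypothetical; adjust to your cluster.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: pvc-snapshotter
spec:
  schedule: "0 3 * * *"                   # daily at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: snapshot-sa # needs RBAC to create volumesnapshots
          restartPolicy: OnFailure
          containers:
          - name: snapshot
            image: bitnami/kubectl:1.17   # any image with kubectl works
            command:
            - /bin/sh
            - -c
            - |
              cat <<EOF | kubectl apply -f -
              apiVersion: snapshot.storage.k8s.io/v1beta1
              kind: VolumeSnapshot
              metadata:
                name: test-snapshot-$(date +%Y%m%d%H%M)
              spec:
                volumeSnapshotClassName: test-snapclass
                source:
                  persistentVolumeClaimName: test-pvc
              EOF
```

Note that old snapshots accumulate under this scheme; you would also want a cleanup step (or a separate CronJob) that deletes snapshots past your retention window.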

You can also use existing tools like stash.run.

-- Crou
Source: StackOverflow