How to set up periodic database backups in Kubernetes?
I have deployed a Postgres database as a StatefulSet in Kubernetes and mounted a PersistentVolume to store the data. To take periodic backups of the mounted volume, I found three options, one of which is a CronJob in Kubernetes that executes pg_dump and uploads the dump to a storage location.
Which is the recommended way to take periodic volume backups in Kubernetes?
[EDIT] I prefer solutions without using operators.
Velero does offer periodic volume snapshots. I would probably start there.
Which is the recommended way to take periodic volume backup in kubernetes?
I would recommend VolumeSnapshots, but keep in mind that a snapshot is not a conventional backup: in the beta API you cannot revert an existing volume to a previous state, you can only provision a new volume from the snapshot.
Many storage systems (like Google Cloud Persistent Disks, Amazon Elastic Block Storage, and many on-premise storage systems) provide the ability to create a “snapshot” of a persistent volume. A snapshot represents a point-in-time copy of a volume. A snapshot can be used either to provision a new volume (pre-populated with the snapshot data) or to restore an existing volume to a previous state (represented by the snapshot).
It's easy to use, and it moved to beta in Kubernetes 1.17 (December 2019); see the blog post "Kubernetes 1.17 Feature: Kubernetes Volume Snapshot Moves to Beta".
First you specify a VolumeSnapshotClass.
The following VolumeSnapshotClass, for example, tells the Kubernetes cluster that a CSI driver, testdriver.csi.k8s.io, can handle volume snapshots, and that when these snapshots are created, their deletion policy should be to delete.
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: test-snapclass
driver: testdriver.csi.k8s.io
deletionPolicy: Delete
parameters:
  csi.storage.k8s.io/snapshotter-secret-name: mysecret
  csi.storage.k8s.io/snapshotter-secret-namespace: mysecretnamespace
The common snapshot controller reserves the parameter keys csi.storage.k8s.io/snapshotter-secret-name and csi.storage.k8s.io/snapshotter-secret-namespace. If specified, it fetches the referenced Kubernetes secret and sets it as an annotation on the volume snapshot content object. The CSI external-snapshotter sidecar retrieves it from the content annotation and passes it to the CSI driver during snapshot creation.
Creation of a volume snapshot is triggered by the creation of a VolumeSnapshot API object.
The VolumeSnapshot object must specify the following source type:
- persistentVolumeClaimName - The name of the PVC to snapshot. Please note that the source PVC, PV, and VolumeSnapshotClass for a VolumeSnapshot object must point to the same CSI driver.
You can then create a VolumeSnapshot, which in this example makes a snapshot of the PVC called test-pvc:
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: test-snapshot
spec:
  volumeSnapshotClassName: test-snapclass
  source:
    persistentVolumeClaimName: test-pvc
When volume snapshot creation is invoked, the common snapshot controller first creates a VolumeSnapshotContent object with the volumeSnapshotRef, source volumeHandle, volumeSnapshotClassName if specified, driver, and deletionPolicy.
You can restore a PersistentVolumeClaim from a VolumeSnapshot:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restore-pvc
spec:
  storageClassName: csi-hostpath-sc
  dataSource:
    name: test-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
To enable support for restoring a volume from a volume snapshot data source, enable the VolumeSnapshotDataSource feature gate on the apiserver and controller-manager.
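If you restore often, the PVC manifest above can be generated for whichever snapshot you want to restore from. This is only a minimal sketch in Python: the helper name build_restore_pvc is my own (hypothetical), and the values in the usage line are simply the ones from the example above.

```python
"""Generate a PVC manifest that provisions a new volume from a VolumeSnapshot."""
import json


def build_restore_pvc(pvc_name, snapshot_name, storage_class, size):
    """Return a PersistentVolumeClaim manifest whose dataSource is a snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": pvc_name},
        "spec": {
            "storageClassName": storage_class,
            # The snapshot to pre-populate the new volume from.
            "dataSource": {
                "name": snapshot_name,
                "kind": "VolumeSnapshot",
                "apiGroup": "snapshot.storage.k8s.io",
            },
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Values below mirror the example manifest in this answer.
    manifest = build_restore_pvc(
        "restore-pvc", "test-snapshot", "csi-hostpath-sc", "10Gi"
    )
    print(json.dumps(manifest, indent=2))
```

kubectl accepts JSON as well as YAML, so the printed output can be piped to kubectl apply -f -. The new PVC should request at least the size of the snapshotted volume.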
What are the limitations?
- Does not support reverting an existing volume to an earlier state represented by a snapshot (beta only supports provisioning a new volume from a snapshot).
- No snapshot consistency guarantees beyond any guarantees provided by the storage system (e.g. crash consistency). These are the responsibility of higher-level APIs/controllers.
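Since a snapshot is at best crash-consistent, a logical dump with pg_dump (the CronJob option from the question) can complement it. Below is a rough sketch of the script such a CronJob could run. The function names, the host/user/database values and the output directory are all assumptions of mine, the password is expected in the PGPASSWORD environment variable, and the upload to your storage location is left out.

```python
"""Build and run a pg_dump command for a logical PostgreSQL backup."""
import os
import subprocess
from datetime import datetime, timezone


def pg_dump_command(host, user, dbname, out_dir, now=None):
    """Return (argv, output_path) for a custom-format dump with a timestamped name."""
    now = now or datetime.now(timezone.utc)
    out_path = os.path.join(out_dir, f"{dbname}-{now:%Y%m%d-%H%M%S}.dump")
    argv = [
        "pg_dump",
        f"--host={host}",
        f"--username={user}",
        f"--dbname={dbname}",
        "--format=custom",  # archive format that pg_restore can read
        f"--file={out_path}",
    ]
    return argv, out_path


def run_backup(host, user, dbname, out_dir):
    """Run pg_dump and return the path of the dump file it wrote."""
    argv, out_path = pg_dump_command(host, user, dbname, out_dir)
    # pg_dump reads the password from PGPASSWORD, inherited from the environment.
    subprocess.run(argv, check=True)  # raises CalledProcessError on failure
    return out_path
```

For example, run_backup("postgres", "app", "mydb", "/backup") would write a file like /backup/mydb-<timestamp>.dump, which you would then upload and could later restore with pg_restore.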
EDIT:
To automate this process, you would need to set up a CronJob or write Python code using the Python client library for Kubernetes, adapting, for example, python/examples/custom_object.py to your needs.
You can also use existing tools such as stash.run.
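To give an idea of the Python route, here is a minimal sketch that creates a timestamped VolumeSnapshot through the client's CustomObjectsApi. It is not a finished tool: build_snapshot and create_snapshot are hypothetical names of mine, and I am assuming the script runs inside a Pod (for example one started by a CronJob) whose ServiceAccount is allowed to create volumesnapshots.

```python
"""Create a timestamped VolumeSnapshot with the Kubernetes Python client."""
from datetime import datetime, timezone

# API coordinates of the VolumeSnapshot custom resource used in this answer.
GROUP = "snapshot.storage.k8s.io"
VERSION = "v1beta1"  # assumption: switch to "v1" where the snapshot API is GA
PLURAL = "volumesnapshots"


def build_snapshot(pvc_name, snapshot_class, now=None):
    """Return a VolumeSnapshot manifest with a unique, timestamped name."""
    now = now or datetime.now(timezone.utc)
    return {
        "apiVersion": f"{GROUP}/{VERSION}",
        "kind": "VolumeSnapshot",
        "metadata": {"name": f"{pvc_name}-{now:%Y%m%d-%H%M%S}"},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


def create_snapshot(namespace, pvc_name, snapshot_class):
    """Submit the manifest to the API server and return the created object."""
    # Third-party dependency (pip install kubernetes), imported lazily.
    from kubernetes import client, config

    # In-cluster credentials; use config.load_kube_config() outside a Pod.
    config.load_incluster_config()
    api = client.CustomObjectsApi()
    return api.create_namespaced_custom_object(
        group=GROUP,
        version=VERSION,
        namespace=namespace,
        plural=PLURAL,
        body=build_snapshot(pvc_name, snapshot_class),
    )
```

Calling create_snapshot("default", "test-pvc", "test-snapclass") on a schedule gives you one snapshot per run. Old snapshots are not cleaned up here, so you would still need some retention logic of your own.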