I need to run pods on multiple nodes with a very large (700 GB) read-only dataset in Kubernetes. I tried using a ReadOnlyMany volume, but it fails in a multi-node setup and was generally very unstable.
Is there a way for each pod to create a new persistent disk from a snapshot, attach it, and destroy it when the pod is destroyed? That would let me update the snapshot once in a while with new data.
You can manually provision a persistent disk from an existing snapshot on GCP (note that creating from a snapshot uses --source-snapshot, not --image, and the disk must be at least as large as your dataset):
gcloud compute disks create my-data-disk --size=700GB --source-snapshot=<snapshot-name>
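For completeness, here is one way the snapshot itself could be taken from a disk holding the current dataset. The disk name, snapshot name, and zone are placeholders, not values from your setup:

```shell
# Take a snapshot of the disk that currently holds the dataset.
# "my-source-disk", "my-data-snap", and the zone are placeholder values.
gcloud compute disks snapshot my-source-disk \
    --snapshot-names=my-data-snap \
    --zone=us-central1-a
```

Re-running this with the same source disk and a new snapshot name is how you would roll out updated data.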
Then use it on your pod:
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: k8s.gcr.io/test-webserver
    name: test-container
    volumeMounts:
    - mountPath: /test-pd
      name: test-volume
  volumes:
  - name: test-volume
    # This GCE PD must already exist.
    gcePersistentDisk:
      pdName: my-data-disk
      fsType: ext4
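Since you want the disk to live and die with the pod, the whole lifecycle can be sketched as a small script. This is only a sketch under the assumption that the pod manifest above is saved as test-pd.yaml; the disk, snapshot, and zone names are placeholders:

```shell
# 1. Create a fresh disk from the snapshot (names and zone are placeholders).
gcloud compute disks create my-data-disk \
    --size=700GB --source-snapshot=my-data-snap --zone=us-central1-a

# 2. Start the pod that mounts it.
kubectl apply -f test-pd.yaml

# ...later, when the pod is no longer needed:
kubectl delete pod test-pd

# 3. Delete the disk once the pod is gone and the disk is detached.
gcloud compute disks delete my-data-disk --zone=us-central1-a --quiet
```

You could wrap steps 1–3 in whatever tooling creates and tears down your workloads, so each deployment gets its own disk cut from the latest snapshot.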
The GCE storage class doesn't support snapshots, so unfortunately you can't do this with PVCs.
Hope it helps.