I have one pod that requires a persistent disk. The pod currently runs in us-central1-a, and if that zone goes down I want to migrate it to another zone in the region (us-central1-*) without data loss.
Is it possible to migrate the pod to another zone (where I know the disk exists) and have it use the regional disk in the new zone?
Approach 1
Using the StorageClass below, my pod is never able to bind a volume and never starts. My understanding was that a regional disk configured with all of these zones would keep the disk available in every zone in case of a zone failure. I do not understand why the claim never binds.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: regionalpd-storageclass
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  replication-type: regional-pd
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: topology.kubernetes.io/zone
    values:
    - us-central1-a
    - us-central1-b
    - us-central1-c
    - us-central1-f
Error: My PVC status is always pending
Normal NotTriggerScaleUp 106s cluster-autoscaler pod didn't trigger scale-up (it wouldn't fit if a new node is added):
Warning FailedScheduling 62s (x2 over 108s) default-scheduler 0/8 nodes are available: 8 node(s) didn't find available persistent volumes to bind.
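To dig into why the claim stays Pending, the PVC's events usually point at the provisioning or binding step that failed. A couple of standard kubectl checks, using the names from the manifests in this question:

kubectl get pvc my-pvc -n mynamespace
kubectl describe pvc my-pvc -n mynamespace   # the Events section shows why binding/provisioning is stuck
kubectl get events -n mynamespace --field-selector involvedObject.name=my-pvc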
Approach 2
This storage config lets me run my pod in 2 of the 4 zones: the zone the pod started in plus one other, seemingly randomly chosen, zone. When I intentionally drain and move out of my initial pod's zone, I get the errors below unless I'm lucky enough to land in that other provisioned zone. Is this behaviour intentional because Google assumes a very low chance of two zones failing? If one of the two zones does fail, wouldn't I have to provision another disk in yet another zone just in case?
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: regionalpd-storageclass
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  replication-type: regional-pd
volumeBindingMode: WaitForFirstConsumer
Errors:
Normal NotTriggerScaleUp 4m49s cluster-autoscaler pod didn't trigger scale-up (it wouldn't fit if a new node is added):
Warning FailedScheduling 103s (x13 over 4m51s) default-scheduler 0/4 nodes are available: 2 node(s) had volume node affinity conflict, 2 node(s) were unschedulable.
Warning FailedScheduling 43s (x2 over 43s) default-scheduler 0/3 nodes are available: 1 node(s) were unschedulable, 2 node(s) had volume node affinity conflict.
Warning FailedScheduling 18s (x3 over 41s) default-scheduler 0/2 nodes are available: 2 node(s) had volume node affinity conflict.
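Since a regional disk is replicated to only two zones, it can help to check which two zones the dynamically provisioned PersistentVolume actually landed in; they are recorded in the PV's node affinity. A quick sketch (the PV name below is a placeholder for whatever volume was created for my-pvc, and the zone key may be topology.kubernetes.io/zone or the older failure-domain.beta.kubernetes.io/zone depending on cluster version):

kubectl get pv
kubectl describe pv <pv-name>   # Node Affinity lists the zone values the pod must be scheduled into
kubectl get pv <pv-name> -o jsonpath='{.spec.nodeAffinity.required.nodeSelectorTerms}'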
My PVC
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-pvc
  namespace: mynamespace
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
  storageClassName: regionalpd-storageclass
My Pod volume
volumes:
- name: console-persistent-volume
  persistentVolumeClaim:
    claimName: my-pvc
A regional persistent disk on Google Cloud is replicated across exactly two zones, so you must change your StorageClass to list only two zones.
See the example StorageClass in Using Kubernetes Engine to Deploy Apps with Regional Persistent Disks, and more details in the GKE documentation on provisioning regional persistent disks.
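For reference, a minimal sketch of such a two-zone StorageClass, based on the one in the question (the two zones listed are just an example; pick the pair you want the disk replicated between):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: regionalpd-storageclass
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  replication-type: regional-pd
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: topology.kubernetes.io/zone
    values:
    - us-central1-a
    - us-central1-b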