After learning that we should have used a StatefulSet instead of a Deployment in order to be able to attach the same persistent volume to multiple pods (and especially to pods on different nodes), I tried changing our config accordingly.
However, even when using the same name for the volume claim as before, it seems to create an entirely new volume instead of reusing our existing one, so the application loses access to the existing data when run as a StatefulSet.
Here's the volume claim part of our current Deployment config:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gitea-server-data
  labels:
    app: gitea
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
This results in a single claim with exactly the name given (gitea-server-data).
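For reference, the Deployment's pod template mounts this claim through a regular volumes entry; a minimal sketch is shown below (the container name, image, and mount path are assumptions, not taken from our actual config):
# Sketch of the relevant part of the Deployment's pod template.
# Container name, image, and mount path are assumed for illustration.
spec:
  template:
    spec:
      containers:
        - name: gitea            # assumed container name
          image: gitea/gitea     # assumed image
          volumeMounts:
            - name: gitea-server-data
              mountPath: /data   # assumed mount path
      volumes:
        - name: gitea-server-data
          persistentVolumeClaim:
            claimName: gitea-server-data   # the claim defined above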
And here's the template for the StatefulSet:
volumeClaimTemplates:
  - metadata:
      name: gitea-server-data
      labels:
        app: gitea
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 20Gi
This results in a new claim for every pod, named after the claim template plus the pod name, e.g. gitea-server-data-gitea-server-0.
The new claims use a new volume instead of the existing one. So I tried specifying the existing volume explicitly, like so:
volumeClaimTemplates:
  - metadata:
      name: gitea-server-data
      labels:
        app: gitea
    spec:
      accessModes:
        - ReadWriteOnce
      volumeName: pvc-c87ff507-fd77-11e8-9a7b-420101234567
      resources:
        requests:
          storage: 20Gi
However, this results in the pods failing to be scheduled and the new claim remaining "Pending" indefinitely:
pod has unbound immediate PersistentVolumeClaims (repeated times)
So the question is: how can we migrate the volume claim(s) in a way that lets us use the existing persistent volume and access the current application data from a new StatefulSet instead of the current Deployment?
(In case it is relevant, we are using Kubernetes on GKE.)
OK, so I spent quite some time trying out all kinds of different configs, until finally learning that GCE persistent disks simply don't support ReadWriteMany to begin with.
The GKE docs go out of their way to never explicitly mention that you cannot actually mount any normal GKE persistent volume on multiple pods/nodes.
Apparently, the only way to get shared file storage between pods is to deploy either your own NFS/Gluster/etc. or to cough up a bunch of money and use Google Cloud Filestore, for which there is a GKE storage class, and which can indeed be mounted on multiple pods.
Unfortunately, that's not an option for this app, as Filestore pricing begins with 1TB minimum capacity at a whopping $0.20/GB/month, which means that the cheapest option available costs around $205 per month. We currently pay around $60/month, so that would more than triple our bill, simply to get rolling deployments without errors.
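For completeness, if we did go the Filestore route, a ReadWriteMany claim would look roughly like the sketch below. This is only a sketch under assumptions: it presumes the Filestore CSI driver is enabled on the cluster and exposes a StorageClass named standard-rwx (the actual class name may differ), and Filestore enforces its 1 TB minimum regardless of what you request.
# Sketch only: assumes the GKE Filestore CSI driver is enabled and provides
# a StorageClass named "standard-rwx" (check with kubectl get storageclass).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gitea-shared-data          # hypothetical name for a shared claim
  labels:
    app: gitea
spec:
  accessModes:
    - ReadWriteMany                # possible because Filestore is NFS-backed
  storageClassName: standard-rwx   # assumed Filestore StorageClass name
  resources:
    requests:
      storage: 1Ti                 # Filestore's minimum capacity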
In a StatefulSet, when you want to use a PVC to store your data, you define the PVC using volumeClaimTemplates, like:
volumeClaimTemplates:
  - metadata:
      name: gitea-server-data
      labels:
        app: gitea
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 20Gi
In this scenario, the following things can happen:
- If the StatefulSet is named gitea-server and has 1 replica, then the only pod of the StatefulSet will use the PVC named gitea-server-data-gitea-server-0 (if it already exists in the cluster) or create a new one named gitea-server-data-gitea-server-0 (if it doesn't exist in the cluster).
- If the StatefulSet is named gitea-server and has 2 replicas, then the two pods of the StatefulSet will use the PVCs named gitea-server-data-gitea-server-0 and gitea-server-data-gitea-server-1 respectively (if they already exist in the cluster) or create new PVCs named gitea-server-data-gitea-server-0 and gitea-server-data-gitea-server-1 (if they don't exist in the cluster), and so on.
Generally, the PVC names generated by a StatefulSet follow the convention:
<volumeClaimTemplates name>-<StatefulSet name>-<Pod ordinal>
Now, if you create a PVC named gitea-server-data-gitea-server-0 whose spec otherwise looks like this:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gitea-server-data-gitea-server-0
  labels:
    app: gitea
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
then after creating the PVC, if you create a StatefulSet with 1 replica and the volumeClaimTemplates configuration defined above, the StatefulSet will use this existing PVC (gitea-server-data-gitea-server-0) instead of creating a new one.
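Putting that together, a minimal sketch of such a StatefulSet follows. The serviceName, container image, and mount path are assumptions for illustration; what matters is that the StatefulSet name (gitea-server) and claim template name (gitea-server-data) are chosen so the generated claim name matches the pre-created PVC above.
# Sketch: a 1-replica StatefulSet whose generated PVC name
# (gitea-server-data-gitea-server-0) matches the pre-created PVC,
# so that PVC is reused instead of a new volume being provisioned.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: gitea-server
spec:
  serviceName: gitea              # assumed headless service name
  replicas: 1
  selector:
    matchLabels:
      app: gitea
  template:
    metadata:
      labels:
        app: gitea
    spec:
      containers:
        - name: gitea             # assumed container name
          image: gitea/gitea      # assumed image
          volumeMounts:
            - name: gitea-server-data
              mountPath: /data    # assumed mount path
  volumeClaimTemplates:
    - metadata:
        name: gitea-server-data
        labels:
          app: gitea
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 20Gi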
You can also use this PVC in other workloads (like a Deployment) by specifying the field spec.accessModes as ReadWriteMany, provided the underlying storage actually supports that access mode.