We have a lot of static data that our apps need for processing different requests. The data is spread across many files of various sizes. We would like to share these files across the Kubernetes cluster.
One solution is to host the files on each node at a fixed path and use hostPath to read them from the pods.
Another solution is to create a PV and let multiple pods claim it. But this means tying every pod that needs access to the files to a particular node.
Is there a better/standard way of solving this without using a database?
Please note that we are running the cluster in an air-gapped network, hence cloud storage provisioning is not possible.
Persistent Volumes can be used by more than one Pod, even on different nodes, if the storage backend supports it. Check out storage options that support ReadOnlyMany if the data is meant to be read-only, or ReadWriteMany if it's writable. Note that in public cloud solutions, the providers' own drivers often don't support ReadWriteMany.
After you have found suitable storage for your needs and deployed its driver, you can create Persistent Volumes and populate them with your shared data; that's called Static Provisioning.
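To make static provisioning a bit more concrete, here is a minimal sketch that builds a PersistentVolume and a matching claim as plain dictionaries and prints them as JSON. It assumes an NFS backend; the server address `10.0.0.5`, the export path `/exports/static`, the object names, and the `10Gi` size are all made-up placeholders, not values from the question.

```python
import json


def nfs_persistent_volume(name, server, path, size="10Gi", read_only=True):
    """Build a statically provisioned PV backed by an NFS export."""
    mode = "ReadOnlyMany" if read_only else "ReadWriteMany"
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": [mode],
            # Retain keeps the pre-populated data if the claim is deleted.
            "persistentVolumeReclaimPolicy": "Retain",
            # Empty class name: no dynamic provisioner is involved.
            "storageClassName": "",
            "nfs": {"server": server, "path": path, "readOnly": read_only},
        },
    }


def claim_for(pv, claim_name, namespace="default"):
    """Build a PVC that binds explicitly to the given PV."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": claim_name, "namespace": namespace},
        "spec": {
            "accessModes": list(pv["spec"]["accessModes"]),
            "storageClassName": "",
            "volumeName": pv["metadata"]["name"],
            "resources": {
                "requests": {"storage": pv["spec"]["capacity"]["storage"]}
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical NFS server and export path, for illustration only.
    pv = nfs_persistent_volume("static-data", "10.0.0.5", "/exports/static")
    pvc = claim_for(pv, "static-data-claim")
    manifest = {"apiVersion": "v1", "kind": "List", "items": [pv, pvc]}
    print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON as well as YAML, the output could be piped into `kubectl apply -f -`. The files themselves still have to be copied onto the export once, outside of Kubernetes.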
You have to use a ReadWriteMany volume or PVC to store the data so that all containers can access it.
NFS is a very good option here and works well in such scenarios.
Here is an example for GCP: https://medium.com/@Sushil_Kumar/readwritemany-persistent-volumes-in-google-kubernetes-engine-a0b93e203180
You can also use third-party options if one fits your scenario: https://kubernetes-csi.github.io/docs/#:~:text=This%20site%20documents%20how%20to,Systems%20(COs)%20like%20Kubernetes.
Keep in mind that your internal resources have no access to resources or network traffic outside their internal network, so in this case your cluster cannot access Cloud Storage via its API.
I believe using persistent volumes (PV) is the best solution if you want to use only the k8s/GKE API. You can define a PV with the data that all the pods need access to. The PV needs to have its access mode set to ReadOnlyMany or ReadWriteMany for multiple pods to be able to access the same volume.
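As a rough illustration of several pods sharing one volume, the sketch below builds a Deployment whose replicas all mount the same claim read-only. It assumes a PVC already exists and is bound to a PV with a suitable access mode; the claim name `static-data-claim`, the image `registry.internal.example/worker:1.0`, and the mount path `/data/static` are hypothetical.

```python
import json


def deployment_with_shared_volume(name, image, claim_name, replicas=3,
                                  mount_path="/data/static"):
    """Build a Deployment whose pods all mount one PVC read-only."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "volumeMounts": [{
                            "name": "static-data",
                            "mountPath": mount_path,
                            "readOnly": True,
                        }],
                    }],
                    "volumes": [{
                        "name": "static-data",
                        # Every replica references the same claim.
                        "persistentVolumeClaim": {
                            "claimName": claim_name,
                            "readOnly": True,
                        },
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical image from an internal registry (air-gapped cluster).
    manifest = deployment_with_shared_volume(
        "worker", "registry.internal.example/worker:1.0", "static-data-claim"
    )
    print(json.dumps(manifest, indent=2))
```

With a backend that supports the chosen access mode, the scheduler is free to place these replicas on different nodes, which avoids the node pinning described in the question.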
Another option, which might work if you have smaller amounts of static data, is a ConfigMap. Note that ConfigMaps are not designed to hold large chunks of data. The data stored in a ConfigMap cannot exceed 1 MiB. The benefit of a ConfigMap is that it is simpler for pods to access than a PV.
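To check whether a set of files would fit, a small pre-check like the following might help. It is only a sketch: the helper name `configmap_from_bytes` and the file names are made up, and summing the raw byte lengths is an approximation of the 1 MiB limit rather than the API server's exact accounting.

```python
import base64
import json

ONE_MIB = 1024 * 1024  # the ConfigMap size limit mentioned above


def configmap_from_bytes(name, files, namespace="default"):
    """Build a ConfigMap from {file name: bytes}, refusing oversized input."""
    total = sum(len(content) for content in files.values())
    if total > ONE_MIB:
        raise ValueError(
            f"{total} bytes exceeds the 1 MiB ConfigMap limit; "
            "use a PersistentVolume instead"
        )
    data, binary_data = {}, {}
    for key, content in files.items():
        try:
            # UTF-8 text goes into `data` as-is.
            data[key] = content.decode("utf-8")
        except UnicodeDecodeError:
            # Anything else goes into `binaryData`, base64-encoded.
            binary_data[key] = base64.b64encode(content).decode("ascii")
    manifest = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
    }
    if data:
        manifest["data"] = data
    if binary_data:
        manifest["binaryData"] = binary_data
    return manifest


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    cm = configmap_from_bytes("lookup-tables", {"codes.csv": b"a,1\nb,2\n"})
    print(json.dumps(cm, indent=2))
```

Anything that trips the size check is probably better served by the PV approach.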