So I'm basically looking for anyone who can point me in the right direction for setting up Kubernetes to perform a common computation on many work items, where each work item is a separate file.
I have been reading the documentation here, and it seems to suggest that this is possible; the examples work through a queue of words and simply print them. However, I am having trouble with persistent volumes.
What I need to end up with is a Deployment that takes a large file containing data points and splits it into several smaller files. I then want a Job object to run several pods, one per file, each performing the computation before the files are passed back to the Deployment for post-processing.
I am having trouble finding out how to transfer the files: from what I have read, it seems that a PersistentVolume cannot be bound to more than one pod at once. So how do I pass a file to a single pod in a Job?
Any suggestions or general direction would be greatly appreciated.
"PersistentVolume cannot be bound to more than one pod at once."
Whether a PV can be shared among Nodes/Pods is determined by its accessMode; it is not the case that every PV is bound to just one Node/Pod.
As the chart on that page shows, there are many PV technologies that support ReadWriteMany, the best known of them being NFS.
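For a concrete picture, here is a minimal sketch of an NFS-backed PersistentVolume plus a claim that requests ReadWriteMany, so that many pods (across nodes) can mount the same data. The server address, export path, names, and sizes are placeholders, not values from any real cluster:

```yaml
# Sketch only: an NFS-backed PV and a matching claim with ReadWriteMany.
# nfs.example.internal, /exports/work, and the sizes are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-work-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany          # many pods on many nodes may mount this read-write
  nfs:
    server: nfs.example.internal   # hypothetical NFS server
    path: /exports/work
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-work-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""       # bind to the pre-created PV instead of dynamic provisioning
  volumeName: shared-work-pv
  resources:
    requests:
      storage: 10Gi
```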
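And one possible way (certainly not the only one) to hand each pod its own file is an Indexed Job (Kubernetes 1.21+): every pod mounts the same claim and works on the chunk that matches its completion index. The image, the process.sh script, and the chunk-N.dat naming scheme below are assumptions for illustration:

```yaml
# Sketch only: an Indexed Job where each pod processes one split file
# from the shared volume. Image, script, and file names are hypothetical.
apiVersion: batch/v1
kind: Job
metadata:
  name: per-file-compute
spec:
  completions: 10            # one completion per split file
  parallelism: 10
  completionMode: Indexed    # JOB_COMPLETION_INDEX is injected into each pod
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: example.com/compute:latest   # hypothetical worker image
          command: ["sh", "-c"]
          args:
            - /opt/process.sh /data/chunk-${JOB_COMPLETION_INDEX}.dat
          volumeMounts:
            - name: shared-work
              mountPath: /data
      volumes:
        - name: shared-work
          persistentVolumeClaim:
            claimName: shared-work-pvc
```

With a layout like this, the splitter writes chunk files into the shared volume, the Job's pods each pick up one chunk, and the post-processing step reads the results back from the same volume.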