On a GKE Kubernetes cluster, I have a pod that uses a persistent volume bound to an NFS server.
Sometimes there are cluster-wide restarts (I still don't know what causes them), and afterwards the pod ends up stuck in ContainerCreating because the volume attachment fails.
What puzzles me is that manually scaling the pod's deployment back and forth between 0 and 1 several times eventually fixes the issue, and I don't understand why. I also have another pod that uses the same PVC, but I can restart that one without any attachment issues.
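For reference, the manual workaround described above is roughly equivalent to the sketch below. The deployment name `my-app` and the helper name `bounce_deployment` are placeholders for illustration, not the real names, and the `KUBECTL` variable is only there so the binary can be swapped out.

```shell
# Placeholder deployment name (an assumption); substitute the real one.
DEPLOYMENT="${DEPLOYMENT:-my-app}"

# Scale the deployment down to 0 replicas and back up to 1, once.
# KUBECTL defaults to the standard kubectl binary.
bounce_deployment() {
  "${KUBECTL:-kubectl}" scale deployment "$DEPLOYMENT" --replicas=0
  "${KUBECTL:-kubectl}" scale deployment "$DEPLOYMENT" --replicas=1
}
```

Calling `bounce_deployment` once corresponds to one down/up cycle; in my case it takes several cycles before the volume attaches.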
Obviously, it isn't reasonable to have to fix this by hand, restarting the pod over and over until the volume attaches correctly.
The cluster is a GKE cluster on Kubernetes version v.18.17-gke.1901, and the NFS server uses the image itsthenetwork/nfs-server-alpine:latest.