I'm trying to run two Kubernetes pods on a single-node GKE cluster, sharing a read-only GCE persistent disk. One pod runs successfully, but the other is stuck in the ContainerCreating state.
The container is very simple:
FROM debian:jessie
CMD ["/bin/sh", "-c", "while true; do ls /mount; sleep 5; done"]
The deployment looks like this:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: dumpy
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: dump
    spec:
      containers:
      - name: dump
        image: gcr.io/myproject/dump
        volumeMounts:
        - mountPath: /mount
          name: dump
          readOnly: true
      volumes:
      - name: dump
        gcePersistentDisk:
          pdName: my-disk
          fsType: ext4
          readOnly: true
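For completeness, the underlying disk was created and populated before the pods reference it, along these lines; the size and zone below are illustrative placeholders, not the values from my actual setup:

gcloud compute disks create my-disk --size 10GB --zone us-central1-a
# then attach it read-write to an instance once, format it as ext4, and copy the dataset in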
The failed pod reports:
Warning FailedMount Unable to mount volumes for pod "xxx". Could not attach GCE PD "my-disk". Timeout waiting for mount paths to be created.
FailedSync Error syncing pod, skipping: Could not attach GCE PD "my-disk". Timeout waiting for mount paths to be created.
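(Those events are from kubectl describe against the stuck pod, i.e. something like the following, where the pod name is just a placeholder:)

kubectl describe pod dumpy-xxx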
/var/log/kubelet.log reports:
gce.go:422] GCE operation failed: googleapi: Error 400: The disk resource 'my-disk' is already being used by 'xxx-123'
gce_util.go:187] Error attaching PD "my-disk": googleapi: Error 400: The disk resource 'my-disk' is already being used by 'xxx-123'
I believe the Kubernetes documentation explicitly allows this scenario.
A feature of PD is that they can be mounted as read-only by multiple consumers simultaneously. This means that you can pre-populate a PD with your dataset and then serve it in parallel from as many pods as you need.
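As I understand it, kubelet performs the equivalent of a read-only GCE attach on the pods' behalf, which done manually would look roughly like this (instance name and zone are placeholders):

gcloud compute instances attach-disk gke-node-1 --disk my-disk --mode ro --zone us-central1-a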
What's going on and what's the fix?
Glen, you're hitting https://github.com/kubernetes/kubernetes/issues/19953: two pods on the same node can't currently share the same read-only GCE PD, because kubelet tries to attach the disk a second time for the second pod and GCE rejects the duplicate attach (that's the "already being used" error in your kubelet log).
There's no good workaround for this.
It's fixed by https://github.com/kubernetes/kubernetes/pull/26351, which will be part of the next Kubernetes release (v1.3), scheduled for the end of June (https://github.com/kubernetes/kubernetes/wiki/Release-1.3).
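Once 1.3 is available on GKE, picking up the fix should just be a matter of upgrading the cluster, roughly as follows (the cluster name and exact patch version are placeholders):

gcloud container clusters upgrade my-cluster --cluster-version 1.3.0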