Intermittent failure of container mounts in Kubernetes

3/25/2019

We are seeing intermittent volume mount failures with this error message:

Error: cannot find volume "work" to mount into container "notebook".

The issue happens on ~5% of pod launches, all of which use the same config. The volume is backed by a PVC that is created immediately before the pod.

We are running on GKE with version v1.11.7-gke.12.

The pod manifest is:

{
      apiVersion: 'v1',
      kind: 'Pod',
      metadata: {
        name: 'some pod name',
        annotations: {},
        labels: {},
      },
      spec: {
        restartPolicy: 'OnFailure',
        securityContext: {
          fsGroup: 100,
        },
        automountServiceAccountToken: false,
        volumes: [
          {
            name: 'work',
            persistentVolumeClaim: {
              claimName: pvcName,
            },
          },
        ],
        containers: [
          {
            name: 'notebook',
            image,
            workingDir: undefined, // this is defined in Dockerfile
            ports: [
              {
                name: 'notebook-port',
                containerPort: port,
              },
            ],
            args: [...command.split(' '), ...args],
            imagePullPolicy: 'IfNotPresent',
            volumeMounts: [
              {
                name: 'work',
                mountPath: '/home/jovyan/work',
              },
            ],
            resources: {
              requests: {
                memory: '256M',
              },
              limits: {
                memory: '1G',
              },
            },
          },
          {
            name: 'watcher',
            image: 'gcr.io/deepnote-200602/wacher:0.0.3',
            imagePullPolicy: 'Always',
            volumeMounts: [
              {
                name: 'work',
                mountPath: '/home/jovyan/work',
              },
            ],
          },
        ],
      },
}

Any help or ideas would be greatly appreciated! I'm also happy to try any suggestions for other logs or steps that might help isolate the issue.

-- Jan Matas
google-kubernetes-engine
kubernetes

1 Answer

3/26/2019

Most likely the volume is not bound. Can you check the status of the PVC referenced below and confirm that it is Bound?

claimName: pvcName

kubectl get pvc | grep pvcName
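
If the claim does turn out to be Pending at pod creation time, one possible workaround is to wait for it to bind before creating the pod. The following is a minimal sketch, not a confirmed fix: `getPvc` is a hypothetical async callback that returns the PVC object from whichever Kubernetes client you use, and `pvcPhase`, `nextStep`, and `waitForPvcBound` are illustrative names. It also assumes the storage class binds claims immediately; with the `WaitForFirstConsumer` binding mode a claim stays Pending until a pod uses it, so this wait would only time out.

```javascript
// Sketch only: `getPvc` is a hypothetical async callback that returns the
// PersistentVolumeClaim object from whichever Kubernetes client is in use.

// Read status.phase from a PVC object; Kubernetes reports Pending, Bound, or Lost.
function pvcPhase(pvc) {
  return (pvc && pvc.status && pvc.status.phase) || 'Unknown';
}

// Decide what to do next from the current phase and the time already waited.
function nextStep(phase, elapsedMs, timeoutMs) {
  if (phase === 'Bound') return 'create-pod';
  if (phase === 'Lost' || elapsedMs >= timeoutMs) return 'give-up';
  return 'wait';
}

// Poll until the claim is Bound. Resolves to true when the pod can be created,
// or false if the claim is Lost or the timeout expires.
async function waitForPvcBound(getPvc, { timeoutMs = 30000, intervalMs = 500 } = {}) {
  const start = Date.now();
  for (;;) {
    const phase = pvcPhase(await getPvc());
    const step = nextStep(phase, Date.now() - start, timeoutMs);
    if (step === 'create-pod') return true;
    if (step === 'give-up') return false;
    // Sleep before asking the API server again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

You would call `waitForPvcBound` right after creating the PVC and only submit the pod manifest when it resolves to `true`. If the error still appears with a Bound claim, the cause lies elsewhere.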
-- P Ekambaram
Source: StackOverflow