How do you mount volumes on Dask workers with dask-kubernetes?

1/31/2020

I used the following code to create a cluster:

from dask_kubernetes import KubeCluster
cluster = KubeCluster.from_yaml('worker.yaml')
cluster.adapt(minimum=1, maximum=10)

with the following YAML (worker.yaml):

kind: Pod
metadata:
  labels:
    foo: bar
spec:
  restartPolicy: Never
  containers:
  - image: daskdev/dask:latest
    imagePullPolicy: IfNotPresent
args: [dask-worker, --nthreads, '4', --no-bokeh, --memory-limit, 3GB, --death-timeout, '300']
    name: dask
    resources:
      limits:
        cpu: "4"
        memory: 3G
      requests:
        cpu: "2"
        memory: 2G

This worked as expected. Then I added a volume mount, as shown below:

kind: Pod
metadata:
  labels:
    foo: bar
spec:
  restartPolicy: Never
  containers:
  - image: daskdev/dask:latest
    imagePullPolicy: IfNotPresent
    args: [dask-worker, --nthreads, '4', --no-bokeh, --memory-limit, 3GB, --death-timeout, '300']
    name: dask
    resources:
      limits:
        cpu: "4"
        memory: 3G
      requests:
        cpu: "2"
        memory: 2G
    volumeMounts:
    - name: somedata
      mountPath: /opt/some/data
  volumes:
  - name: somedata
    azureFile:
      secretName: azure-secret
      shareName: somedata
      readOnly: true

The volume does not get mounted on the workers. But when I simply run

kubectl create -f worker.yaml

I can see the volume getting mounted.

Does KubeCluster support volume mounts? And if so, how do you configure them?

-- Paeng G
azure-storage-files
dask-kubernetes

1 Answer

5/26/2020

I am unable to reproduce your issue when testing with a HostPath volume.

from dask_kubernetes import KubeCluster
cluster = KubeCluster.from_yaml('worker.yaml')
cluster.adapt(minimum=1, maximum=10)
# worker.yaml
kind: Pod
metadata:
  labels:
    foo: bar
spec:
  restartPolicy: Never
  containers:
  - image: daskdev/dask:latest
    imagePullPolicy: IfNotPresent
    args: [dask-worker, --nthreads, '4', --no-bokeh, --memory-limit, 3GB, --death-timeout, '300']
    name: dask
    resources:
      limits:
        cpu: "4"
        memory: 3G
      requests:
        cpu: "2"
        memory: 2G
    volumeMounts:
    - name: somedata
      mountPath: /opt/some/data
  volumes:
  - name: somedata
    hostPath:
      path: /tmp/data
      type: Directory

If I run kubectl describe po <podname> for the worker that is created, I can see that the volume was created successfully.

Volumes:
  somedata:
    Type:          HostPath (bare host directory volume)
    Path:          /tmp/data
    HostPathType:  Directory

And it is mounted where I would expect.

    Mounts:
      /opt/some/data from somedata (rw)

Also, if I open a shell in the container with kubectl exec -it <podname> bash and run ls /opt/some/data, I can see files that I create in the host path.
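
You can also check the mount from the Dask side rather than with kubectl. The following is a minimal sketch (not from the original answer) that assumes the worker.yaml above and asks every running worker whether the mount path exists:

import os
from dask.distributed import Client
from dask_kubernetes import KubeCluster

cluster = KubeCluster.from_yaml('worker.yaml')
cluster.scale(2)          # start a couple of workers
client = Client(cluster)

def mount_exists(path='/opt/some/data'):
    # Runs inside each worker pod, so this checks the container filesystem
    return os.path.isdir(path)

# Returns a dict of {worker_address: True/False}
print(client.run(mount_exists))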

Therefore volumes do work with KubeCluster, so if you are experiencing issues with azureFile storage, the problem is most likely a configuration issue in your Kubernetes cluster.
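
As a debugging starting point (again, not part of the original answer), a minimal sketch using the official kubernetes Python client can confirm that the azure-secret referenced by the azureFile volume exists in the namespace the worker pods run in and contains the two keys azureFile expects. The 'default' namespace here is an assumption; substitute the namespace your Dask workers are launched into:

from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running inside the cluster
v1 = client.CoreV1Api()

# 'default' is an assumption; use the namespace of your Dask worker pods
secret = v1.read_namespaced_secret(name='azure-secret', namespace='default')

# azureFile volumes expect these two keys in the referenced secret
for key in ('azurestorageaccountname', 'azurestorageaccountkey'):
    print(key, 'present' if key in (secret.data or {}) else 'MISSING')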

-- Jacob Tomlinson
Source: StackOverflow