dask kubernetes import local library

5/1/2020

When working on a local project, from local_project.funcs import local_func will fail in the cluster because local_project is not installed.

This forces me to develop everything in a single file.

Is there a solution? For example, is there a way to "import" the contents of the module into the working file so that the cluster doesn't need to import it?

Installing local_project on the cluster is not development friendly, because any change to an imported function requires a cluster redeploy.

import dask
from dask.distributed import Client
from dask_kubernetes import KubeCluster, make_pod_spec
from local_project.funcs import local_func

pod_spec = make_pod_spec(
    image="daskdev/dask:latest",
    memory_limit="4G",
    memory_request="4G",
    cpu_limit=1,
    cpu_request=1,
)
cluster = KubeCluster(pod_spec)
client = Client(cluster)  # send computations to the Kubernetes workers

df = dask.datasets.timeseries()
# Fails on the workers if local_project is not installed in the cluster image
df.groupby('id').apply(local_func).compute()
-- Nuno Silva
dask
dask-kubernetes
kubernetes
python

1 Answer

5/8/2020

The usual solution is to build your own Docker image with the package installed. If you only have a single file, or an egg or zip file, you could also look into the Client.upload_file method.
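
As a rough illustration of the upload_file route, the sketch below zips a local package and sends it to the workers. The helper names zip_package and upload_package are made up for this example and are not part of Dask; the only Dask call used is Client.upload_file, which accepts .py, .egg, and .zip files. It assumes the package directory is called local_project and contains only pure-Python code. Depending on the distributed version, workers that join after the upload may need it repeated.

```python
import zipfile
from pathlib import Path


def zip_package(package_dir, zip_path):
    """Bundle every .py file under package_dir into zip_path.

    The package folder is kept as the top-level entry of the archive so
    that `import <package>` works once the zip is on sys.path.
    (Hypothetical helper, not part of Dask.)
    """
    package_dir = Path(package_dir).resolve()
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w") as zf:
        for source in sorted(package_dir.rglob("*.py")):
            # Store paths relative to the package's parent directory,
            # e.g. "local_project/funcs.py".
            zf.write(source, source.relative_to(package_dir.parent).as_posix())
    return zip_path


def upload_package(client, package_dir, zip_path="local_project.zip"):
    """Zip the package and hand it to Client.upload_file.

    `client` is expected to be a dask.distributed.Client connected to the
    cluster. (Hypothetical helper, not part of Dask.)
    """
    archive = zip_package(package_dir, zip_path)
    client.upload_file(str(archive))
    return archive


# Possible usage, assuming `cluster` is the KubeCluster from the question:
#
#     from dask.distributed import Client
#     client = Client(cluster)
#     upload_package(client, "local_project")
#     from local_project.funcs import local_func
```

Calling upload_package again after editing the package is much cheaper than rebuilding the image, although a custom image is still the more robust option when the project has compiled or third-party dependencies.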

-- MRocklin
Source: StackOverflow