When working on a local project, from local_project.funcs import local_func fails on the cluster because local_project is not installed there.
This forces me to develop everything in a single file.
Are there any solutions? Is there a way to "import" the contents of the module into the working file so that the cluster doesn't need to import it?
Installing local_project on the cluster is not development-friendly, because every change to an imported function requires a cluster redeploy.
import dask
from dask.distributed import Client
from dask_kubernetes import KubeCluster, make_pod_spec

from local_project.funcs import local_func

pod_spec = make_pod_spec(
    image="daskdev/dask:latest",
    memory_limit="4G",
    memory_request="4G",
    cpu_limit=1,
    cpu_request=1,
)
cluster = KubeCluster(pod_spec)
client = Client(cluster)  # connect so computations run on the cluster

df = dask.datasets.timeseries()
# Fails at compute time if local_project is not installed in the cluster image:
# workers cannot deserialize local_func.
df.groupby('id').apply(local_func).compute()
Typically the solution to this is to build your own Docker image. If you have only a single file, or an egg or zip file, then you might also look into the Client.upload_file method.
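As a rough sketch of the upload_file route, assuming you have bundled the package into an archive (the name local_project.zip below is hypothetical, e.g. built with zip -r local_project.zip local_project): upload_file ships the archive to every worker and puts it on the worker's sys.path, so the package import can be resolved when tasks are deserialized.

from dask.distributed import Client
from dask_kubernetes import KubeCluster, make_pod_spec

import dask
from local_project.funcs import local_func  # works locally, where the project exists

pod_spec = make_pod_spec(image="daskdev/dask:latest")
cluster = KubeCluster(pod_spec)
client = Client(cluster)

# Hypothetical archive of the local package; upload_file copies it to the
# workers and adds it to sys.path, so the workers can resolve
# `from local_project.funcs import local_func` inside tasks.
client.upload_file("local_project.zip")

df = dask.datasets.timeseries()
df.groupby('id').apply(local_func).compute()  # now runs on the cluster

For anything larger than a handful of modules, or for long-lived clusters where workers come and go, baking local_project into a custom image based on daskdev/dask is usually the more robust option.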