I have just finished setting up Dask on a Kubernetes cluster using Helm, and now that I want to work through the basic tutorials in the Jupyter notebook, I run into the following error:
I have also tried, in another notebook, to analyze a 40 GB dataset, but the following commands run very slowly (I am just importing 40 GB from GCS and then running value_counts on a binary column):
import dask.dataframe as dd
import gcsfs
fs = gcsfs.GCSFileSystem(project='tme-chrome')
fs.ls('tme-churning')
df = dd.read_csv('gs://tme-churning/*.csv')
df['churning'].value_counts().compute()
Thanks a lot for your help. I seem to be missing something here.
I tried to reproduce this issue using the Dask helm chart found here and wasn't able to. These are the steps I took:
1. helm install -n stable-dask stable/dask
2. Go to output Jupyter IP:PORT
3. Run the first few cells in the notebook.
Are you using a different helm chart?