Do I have to use jupyter notebook/lab/hub for running Dask on a Kubernetes cluster?

11/14/2019

I am testing dask.distributed for big data and machine learning work. I've watched videos, read blog posts, and tried to understand the library documentation, but I am confused: every source I found used Jupyter notebook/lab/hub. Do I have to use Jupyter notebook/lab/hub in order to run Dask on a Kubernetes cluster? Can't I build a Kubernetes cluster with 2 laptops and run Dask on them without anything Jupyter-related?

Why? Because I want to use my own server (a Kubernetes cluster) to serve my own web page to users (with Flask in the background).

-- MehmedB
dask
dask-distributed
jupyter
kubernetes
python-3.x

2 Answers

11/14/2019

No, you don't. Jupyter is just the most common setup for working with Dask, and JupyterLab has nice extensions that let you visualize task graphs as they execute. But for just orchestrating Dask workers on Kubernetes, I'd have a look at dask-kubernetes. That's the library we're using at Saturn Cloud to deploy Dask for our enterprise customers.

In the docs, these lines should be sufficient to get you started:

from dask_kubernetes import KubeCluster

cluster = KubeCluster.from_yaml('worker-spec.yml')
cluster.adapt(minimum=1, maximum=100)  # or dynamically scale based on current workload

It's important to understand that KubeCluster works by attaching a PeriodicCallback to the asyncio event loop, which means you definitely want to make sure it doesn't get garbage collected. You can pass the cluster instance directly into distributed.Client, or grab the scheduler address and communicate that way.

-- Hugo
Source: StackOverflow

11/14/2019

I see no Jupyter notebooks here. Jupyter notebooks are convenient for data science folks, but they are not a requirement for using these tools. You can still import dask.distributed into your Flask application like any other Python package, containerize it, and ship it to run in your Kubernetes cluster as a service. It's all up to you as the developer.
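As a rough sketch of that idea, here is a small Flask app that hands work to a Dask scheduler already running in the cluster. The environment variable APP_DASK_SCHEDULER, the default service name dask-scheduler, the /sum route, and the function names are all assumptions made for illustration; 8786 is Dask's default scheduler port.

```python
import os


def scheduler_address(default="tcp://dask-scheduler:8786"):
    """Read the scheduler address from the environment (hypothetical variable)."""
    return os.environ.get("APP_DASK_SCHEDULER", default)


def create_app():
    """Build a Flask app whose routes submit work to Dask."""
    # Imported lazily so the helper above works without Flask or Dask installed.
    from dask.distributed import Client
    from flask import Flask, jsonify

    app = Flask(__name__)
    # One client per app process, connected when the app is created.
    client = Client(scheduler_address())

    @app.route("/sum/<int:n>")
    def total(n):
        # The sum runs on a Dask worker; result() blocks until it is done.
        future = client.submit(sum, range(n))
        return jsonify(result=future.result())

    return app
```

The container image for the web service would then need both Flask and dask.distributed installed, with no Jupyter component involved.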

-- Oleg Butuzov
Source: StackOverflow