Accessing an internal loadBalancer (dask-scheduler) from outside GCP

1/7/2019

Our structure is as follows:

  • Project A - Defines and manages a shared VPC, which all projects are members of
  • Project B - Running Jupyterhub on GKE
  • Project C/D/E ... - Running dask on GKE

Jupyterhub and dask have been deployed on the clusters using Helm, with the majority of the modifications being applied via Helm values files.

Usually, with Jupyter and dask on the same cluster, it's very easy to utilise the dask scheduler. It is also easy to view dask's Bokeh diagnostics page by visiting the EXTERNAL-IP of the scheduler service in a web browser.

In our case, the dask cluster sits in a different project (within the same VPC), so to allow usage of the scheduler (Project C) from Jupyterhub (Project B), we needed to modify the dask-scheduler service to act as an internal load balancer with the following command:

kubectl annotate svc dask-scheduler cloud.google.com/load-balancer-type=Internal

We are now able to utilise the dask-scheduler from Project B, great! However, the problem is that we can no longer view the dask-scheduler's Bokeh dashboard, because the EXTERNAL-IP (that we would usually view Bokeh with) has been replaced by an internal IP.

I feel like I should be able to expose an external IP and forward it to the scheduler, but I've not had any luck after a couple of days' worth of experimentation. It's also worth noting that JupyterLab's (as opposed to JupyterHub's) dask Bokeh plugin does not work. This seems strange to me, since we're allowing traffic within the VPC, and usually simply supplying the scheduler IP to this plugin works. The notebook is evidently able to speak to the dask-scheduler; it just cannot see the Bokeh panel.
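
To narrow down the "scheduler works, Bokeh doesn't" symptom, a check like the one below could be run from a notebook in Project B to see which ports on the internal IP accept TCP connections at all. This is only a rough sketch: `port_is_open` and `check_scheduler` are hypothetical helper names, the IP in the usage comment is a placeholder, and 8786/8787 are dask's default scheduler and dashboard ports, which may not match the ports the Helm chart's service actually exposes.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_scheduler(host: str, ports: dict) -> dict:
    """Map each label in ports to whether its port accepts TCP connections."""
    return {label: port_is_open(host, port) for label, port in ports.items()}


# Example usage (placeholder IP; ports are assumed dask defaults):
# check_scheduler("10.0.0.1", {"scheduler": 8786, "dashboard": 8787})
```

If the scheduler port reports open but the dashboard port does not, that would suggest the dashboard port is not exposed or forwarded on the service, rather than the plugin itself being at fault.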

Current firewall rules allow communication between the GKE IP ranges on all ports and all protocols (this was key in allowing Jupyterhub to utilise the dask-scheduler).

I can provide more info if needed. I'm still learning a lot of these concepts, so let me know if anything is unclear.

Thanks!

-- Chrisjw42
dask
google-cloud-platform
jupyterhub
kubernetes
kubernetes-helm

0 Answers