Expose the Spark UI with Zeppelin on Kubernetes

4/13/2021

First of all, I'm pretty new to all of this (Kubernetes, Ingress, Spark/Zeppelin, ...), so my apologies if this is obvious. I searched here and in the documentation but couldn't find anything.

I am trying to make the Spark interpreter UI accessible from my Zeppelin notebook running on Kubernetes. Based on what I understood from http://zeppelin.apache.org/docs/0.9.0-preview1/quickstart/kubernetes.html, my Ingress YAML looks something like this:

Ingress.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress-zeppelin-server-http
spec:
  rules:
  - host: my-zeppelin.my-domain
    http:
      paths:
      - backend:
          serviceName: zeppelin-server
          servicePort: 8080
  - host: '*.my-zeppelin.my-domain'
    http:
      paths:
      - backend:
          serviceName: spark-guovyx
          servicePort: 4040
status:
  loadBalancer: {}

My issue is that the serviceName (in this case spark-guovyx) has to match the interpreter pod name for the UI to show up. However, this name changes and there are several of them (I have one interpreter per user, and interpreters are frequently restarted), so I cannot set it manually. My initial thought was to use some kind of wildcard for the serviceName, but it turns out that Ingress/Kubernetes doesn't support that. Any ideas, please?

Thanks.

-- Theo Sardin
apache-spark
apache-zeppelin
kubernetes
kubernetes-ingress

1 Answer

4/14/2021

You can create a new Service that selects on the interpreterSettingName label of the Spark master pod. When Zeppelin creates the Spark master pod, it adds this label with the value spark. I am not sure whether this will work with more than one pod in a per-user, per-interpreter setup. Below is the Service definition; do let me know how it behaves in the per-user, per-interpreter case.

kind: Service
apiVersion: v1
metadata:
  name: spark-ui  # Service names must be lowercase DNS labels; "sparkUI" would be rejected
spec:
  ports:
    - name: spark-ui
      protocol: TCP
      port: 4040
      targetPort: 4040
  selector:
    interpreterSettingName: spark
  clusterIP: None
  type: ClusterIP

You can then define your Ingress as:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress-zeppelin-server-http
spec:
  rules:
  - host: my-zeppelin.my-domain
    http:
      paths:
      - backend:
          serviceName: zeppelin-server
          servicePort: 8080
  - host: '*.my-zeppelin.my-domain'
    http:
      paths:
      - backend:
          serviceName: spark-ui
          servicePort: 4040
status:
  loadBalancer: {}
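
Note that the extensions/v1beta1 Ingress API was removed in Kubernetes 1.22. On newer clusters, the same rules would need to be written against networking.k8s.io/v1, roughly as sketched below. This is only a sketch, not something tested against this setup: the path "/" with pathType Prefix is an assumption (the original rules specify no path), and the Service name spark-ui is an assumed lowercase, DNS-compatible name for the label-selecting Service, since Kubernetes does not accept uppercase characters in Service names.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-zeppelin-server-http
spec:
  rules:
  - host: my-zeppelin.my-domain
    http:
      paths:
      - path: /              # assumed: route everything on this host
        pathType: Prefix     # required in networking.k8s.io/v1
        backend:
          service:
            name: zeppelin-server
            port:
              number: 8080
  - host: '*.my-zeppelin.my-domain'
    http:
      paths:
      - path: /              # assumed: route everything on this host
        pathType: Prefix
        backend:
          service:
            name: spark-ui   # assumed name of the label-selecting Service
            port:
              number: 4040
```

The caveat from the answer still applies: a Service selecting on interpreterSettingName: spark matches every pod carrying that label, so with several interpreter pods running at once, requests may reach any of them.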

Also, do check out this repo: https://github.com/cuebook/cuelake. It is still at an early stage of development, but I would love to hear your feedback.

-- Vikrant Dubey
Source: StackOverflow