How to properly expose Spark in Kubernetes outside the cluster

10/4/2018

I have Spark running in Kubernetes with 1 master and 3 workers. It's able to run a simple CalculatePi example if I run it within a container in the k8s cluster (so Spark is in a working state). I created a service with externalIPs set and was able to telnet <ip address> 7077 successfully, but when I run spark-submit --master spark://<ip>:7077 with my .jar from outside the cluster, I get the error

ERROR StandaloneSchedulerBackend:70 - Application has been killed. Reason: All masters are unresponsive! Giving up.

Along with some others.
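
For reference, the submission from my machine outside the cluster looks roughly like the sketch below; the class name and jar path are placeholders, only the master URL matches what I actually use:

# Submitting from outside the Kubernetes cluster
# (class name and jar path are placeholders)
spark-submit \
  --master spark://<ip>:7077 \
  --class com.example.CalculatePi \
  path/to/my-app.jar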

But if I kubectl cp the .jar to one of the Spark worker containers and run spark-submit within that container, it works fine.
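
That workaround looks roughly like this (the pod name, in-cluster master service name, class, and jar path are all placeholders):

# Copy the jar into a worker pod and submit from inside the cluster
# (pod name, master service name, class, and jar path are placeholders)
kubectl cp my-app.jar spark-worker-0:/tmp/my-app.jar
kubectl exec -it spark-worker-0 -- \
  spark-submit \
    --master spark://<master-service>:7077 \
    --class com.example.CalculatePi \
    /tmp/my-app.jar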

What am I missing here?

Edit: The Spark configuration is identical to the official Helm chart, except that I added an additional service to the spark-master-deployment.yaml file. It is identical to the existing ClusterIP service, apart from an externalIPs property that exposes it outside the Kubernetes cluster:

apiVersion: v1
kind: Service
metadata:
  name: {{ template "master-fullname" . }}-exposed
  labels:
    heritage: {{ .Release.Service | quote }}
    release: {{ .Release.Name | quote }}
    chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    component: "{{ .Release.Name }}-{{ .Values.Master.Component }}"
spec:
  ports:
    - port: {{ .Values.Master.ServicePort }}
      targetPort: {{ .Values.Master.ContainerPort }}
  selector:
    component: "{{ .Release.Name }}-{{ .Values.Master.Component }}"
  externalIPs:
  - <the IP address of one of the Kubernetes nodes here>
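
The rendered service can be checked with something like the following; the service name depends on the release, so the name here is a placeholder:

# Confirm the exposed master service lists the node IP under EXTERNAL-IP and port 7077
# (the rendered name follows the template above; "myrelease" is a placeholder)
kubectl get svc myrelease-spark-master-exposed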
-- Mike
apache-spark
kubernetes

0 Answers