Spark/k8s: How to run spark-submit on Kubernetes in client mode

1/9/2020

I am trying to use spark-submit in client mode from a Kubernetes pod to submit jobs to EMR (due to some other infra issues, we don't allow cluster mode). By default, spark-submit uses the pod's hostname as spark.driver.host, and since that hostname is not resolvable from outside the pod, the Spark executors cannot reach the driver. The spark.driver.port is likewise local to the pod (container).

I know a way to pass some confs to spark-submit so that the Spark executors can talk back to the driver:

--conf spark.driver.bindAddress=0.0.0.0
--conf spark.driver.host=$HOST_IP_OF_K8S_WORKER
--conf spark.driver.port=32000
--conf spark.driver.blockManager.port=32001

and to create a Service in Kubernetes so that the executors can reach the driver:

apiVersion: v1
kind: Service
metadata:
  name: spark-block-manager
  namespace: my-app
spec:
  selector:
    app: my-app
  type: NodePort
  ports:
    - name: port-0
      nodePort: 32000
      port: 32000
      protocol: TCP
      targetPort: 32000
    - name: port-1
      nodePort: 32001
      port: 32001
      protocol: TCP
      targetPort: 32001
    - name: port-2
      nodePort: 32002
      port: 32002
      protocol: TCP
      targetPort: 32002

But the issue is that more than one pod can run on a single k8s worker, and there can even be more than one spark-submit job inside one pod. So before launching a pod, we would need to dynamically select a few available ports on the k8s node, create a Service to do the port mapping, and then pass those ports into the pod so that spark-submit uses them. This feels a little too complex.
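The "pick available ports, then create the Service" step described above could be sketched as follows. This is only a minimal illustration, assuming the launcher runs on the k8s node (or with access to its network namespace) so that binding actually probes the node's ports; the function names are made up for this example.

```python
# Minimal sketch of dynamic port selection; names are illustrative.
import socket


def port_is_free(port: int) -> bool:
    """Try to bind the port; success means nothing is listening on it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind(("", port))
            return True
        except OSError:
            return False


def pick_free_ports(start: int, end: int, count: int) -> list[int]:
    """Scan [start, end) and return the first `count` free ports."""
    free = []
    for port in range(start, end):
        if port_is_free(port):
            free.append(port)
            if len(free) == count:
                return free
    raise RuntimeError("not enough free ports in range")


# e.g. one port for the driver and one for the block manager,
# scanned inside the default NodePort range (30000-32767)
driver_port, block_manager_port = pick_free_ports(32000, 32768, 2)
```

The chosen ports would then be templated into both the Service manifest (as nodePort/targetPort) and the pod's environment, which is exactly the coordination overhead the question is complaining about.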

Using hostNetwork: true could potentially solve this issue but it introduces lots of other issues in our infra so this is not an option.

If spark-submit supported a bindPort concept, analogous to driver.bindAddress and driver.host, or supported a proxy, the problem could be solved much more cleanly.

Has anyone run into a similar situation? Please share some insights.

Thanks.

Additional context: Spark version 2.4.

-- Ping.Goblue
apache-spark
docker
kubernetes

1 Answer

5/29/2020

spark-submit can take additional args such as --conf spark.driver.bindAddress, --conf spark.driver.host, --conf spark.driver.port, --conf spark.driver.blockManager.port, and --conf spark.port.maxRetries. The spark.driver.host and spark.driver.port values tell the Spark executors which host and port to use when connecting back to the driver.

We use hostPort and containerPort to expose the ports from inside the container, and inject the port range and the host IP into the pod as environment variables so that spark-submit knows what to use. So those additional args are:

--conf spark.driver.bindAddress=0.0.0.0 # has to be 0.0.0.0 so that it is accessible outside the pod
--conf spark.driver.host=$HOST_IP # k8s worker ip, can be easily injected to the pod
--conf spark.driver.port=$SPARK_DRIVER_PORT # defined as environment variable
--conf spark.driver.blockManager.port=$SPARK_DRIVER_PORT # defined as environment variable
--conf spark.port.maxRetries=$SPARK_PORT_MAX_RETRIES # defined as environment variable
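A pod fragment for this setup might look like the sketch below. The port values, image name, and env var wiring are assumptions for illustration, not the answer's exact manifest; the host IP is injected via the downward API (status.hostIP).

```yaml
# Hypothetical pod fragment illustrating hostPort + env injection.
apiVersion: v1
kind: Pod
metadata:
  name: spark-client
  labels:
    app: my-app
spec:
  containers:
    - name: spark
      image: my-spark-image:2.4   # assumed image name
      ports:
        - containerPort: 40000
          hostPort: 40000         # exposed directly on the k8s worker
        - containerPort: 40001
          hostPort: 40001
      env:
        - name: HOST_IP           # worker IP via the downward API
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: SPARK_DRIVER_PORT
          value: "40000"
        - name: SPARK_PORT_MAX_RETRIES
          value: "4"
```

With hostPort there is no NodePort Service to manage; the trade-off is that the pod can only be scheduled onto a node where those host ports are free.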

The hostPort is local to the Kubernetes worker, which means we don't need to worry about running out of ports; the k8s scheduler can find a host to run the pod.

We reserve ports 40000 to 49000 on each host and open 8 ports for each pod (each spark-submit requires 2 open ports). The ports are chosen based on the pod_id. Since Kubernetes recommends running fewer than 100 pods per node, port collisions will be very rare.
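The allocation described above comes down to simple arithmetic. The answer doesn't show code, so the constant and function names below are assumptions, but the scheme is the one stated: an 8-port block per pod, indexed by the pod's id.

```python
# Illustrative port-block arithmetic for the scheme described above.
PORT_RANGE_START = 40000
PORT_RANGE_END = 49000
PORTS_PER_POD = 8   # each spark-submit needs 2 ports, so 8 leaves headroom


def pod_port_base(pod_index: int) -> int:
    """Stable, collision-free base port for a given pod index on a node."""
    base = PORT_RANGE_START + pod_index * PORTS_PER_POD
    if base + PORTS_PER_POD > PORT_RANGE_END:
        raise ValueError("pod index exceeds reserved port range")
    return base


# Pod 0 gets 40000-40007, pod 1 gets 40008-40015, and so on.
```

Setting spark.port.maxRetries to PORTS_PER_POD - 1 then lets Spark walk within the pod's own block if a port inside it is taken.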

-- Ping.Goblue
Source: StackOverflow