I am trying to use `spark-submit` in client mode from a Kubernetes pod to submit jobs to EMR (due to some other infra issues, we don't allow cluster mode). By default, `spark-submit` uses the pod's hostname as `spark.driver.host`, and since that hostname is only resolvable inside the cluster's pod network, the Spark executors cannot resolve it. The `spark.driver.port` is likewise local to the pod (container).

I know a way to pass some confs to `spark-submit` so that the Spark executors can talk to the driver. Those confs are:

```
--conf spark.driver.bindAddress=0.0.0.0 --conf spark.driver.host=$HOST_IP_OF_K8S_WORKER --conf spark.driver.port=32000 --conf spark.driver.blockManager.port=32001
```

and then create a Service in Kubernetes so that the Spark executors can reach the driver:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: spark-block-manager
  namespace: my-app
spec:
  selector:
    app: my-app
  type: NodePort
  ports:
  - name: port-0
    nodePort: 32000
    port: 32000
    protocol: TCP
    targetPort: 32000
  - name: port-1
    nodePort: 32001
    port: 32001
    protocol: TCP
    targetPort: 32001
  - name: port-2
    nodePort: 32002
    port: 32002
    protocol: TCP
    targetPort: 32002
```
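For context, a full client-mode invocation carrying these confs might look like the following sketch (the master setting and application jar are placeholders for illustration, not our actual values):

```
spark-submit \
  --deploy-mode client \
  --master yarn \
  --conf spark.driver.bindAddress=0.0.0.0 \
  --conf spark.driver.host=$HOST_IP_OF_K8S_WORKER \
  --conf spark.driver.port=32000 \
  --conf spark.driver.blockManager.port=32001 \
  my-job.jar
```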
But the issue is that there can be more than one pod running on a single k8s worker, and even more than one `spark-submit` job in a single pod. So before launching a pod, we need to dynamically pick a few available ports on the k8s node, create a Service to do the port mapping, and then pass those ports into the pod at launch time so that `spark-submit` knows which ones to use. This feels a little complex.
Using `hostNetwork: true` could potentially solve this issue, but it introduces lots of other issues in our infra, so it is not an option.
If `spark-submit` supported a `bindPort` concept, analogous to `spark.driver.bindAddress` and `spark.driver.host`, or supported a proxy, this would be cleaner to solve.
Has anyone run into a similar situation? Please share some insights. Thanks.
Additional context: Spark version 2.4.
`spark-submit` can take additional args such as `--conf spark.driver.bindAddress`, `--conf spark.driver.host`, `--conf spark.driver.port`, `--conf spark.driver.blockManager.port`, and `--conf spark.port.maxRetries`. The `spark.driver.host` and `spark.driver.port` values tell the Spark executors which host and port to use to connect back to the driver.
We use `hostPort` and `containerPort` to expose the ports inside the container, and inject the port range and host IP as environment variables into the pod so that `spark-submit` knows what to use. So the additional args are:
```
--conf spark.driver.bindAddress=0.0.0.0                    # must be 0.0.0.0 so the driver is reachable from outside the pod
--conf spark.driver.host=$HOST_IP                          # k8s worker IP, easily injected into the pod
--conf spark.driver.port=$SPARK_DRIVER_PORT                # defined as an environment variable
--conf spark.driver.blockManager.port=$SPARK_DRIVER_PORT   # defined as an environment variable
--conf spark.port.maxRetries=$SPARK_PORT_MAX_RETRIES       # defined as an environment variable
```
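The pod-side setup this relies on might look like the following sketch (names, image, and port numbers are assumptions; the worker IP comes from the downward API via `status.hostIP`):

```
apiVersion: v1
kind: Pod
metadata:
  name: spark-client-pod        # hypothetical name
spec:
  containers:
  - name: spark-client
    image: my-spark-image:2.4   # hypothetical image
    ports:
    - containerPort: 40016
      hostPort: 40016           # local to this k8s worker only
      protocol: TCP
    env:
    - name: HOST_IP             # worker IP injected via the downward API
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP
    - name: SPARK_DRIVER_PORT
      value: "40016"
    - name: SPARK_PORT_MAX_RETRIES
      value: "7"
```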
The `hostPort` is local to the Kubernetes worker, which means we don't need to worry about running out of ports across the cluster, and the k8s scheduler can find a host to run the pod on.
We can reserve ports 40000 to 49000 on each host and open 8 ports per pod (each `spark-submit` requires 2 open ports). The ports are chosen based on the pod_id. Since Kubernetes recommends running fewer than 100 pods per node, port collisions will be very rare.
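The per-pod port derivation above can be sketched as follows; `POD_ID`, the 40000 base, and 8 ports per pod are assumptions from our setup, not Spark defaults:

```shell
# Hypothetical sketch: derive a per-pod port block from a numeric pod id.
POD_ID=2                                      # in practice injected at pod launch
PORTS_PER_POD=8                               # reserved block per pod
SPARK_DRIVER_PORT=$((40000 + POD_ID * PORTS_PER_POD))
SPARK_PORT_MAX_RETRIES=$((PORTS_PER_POD - 1)) # let Spark probe the rest of the block on collision
echo "driver port: $SPARK_DRIVER_PORT, max retries: $SPARK_PORT_MAX_RETRIES"
```

With `spark.port.maxRetries` set this way, a second `spark-submit` in the same pod can fall back to the next port inside the pod's own reserved block instead of colliding with another pod's range.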