I'm running Spark 2.4.1 in client mode on Kubernetes.
I'm trying to submit a job from a pod that contains Spark, which should launch 2 executor pods. The command is the following:
bin/spark-shell \
--master k8s://https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_SERVICE_PORT \
--deploy-mode client \
--conf spark.executor.instances=2 \
--conf spark.kubernetes.container.image=$SPARK_IMAGE \
--conf spark.kubernetes.driver.pod.name=$HOSTNAME \
--conf spark.kubernetes.executor.podNamePrefix=spark-exec \
--conf spark.ui.port=4040
These executor pods are created but keep failing with the error:
Caused by: java.io.IOException: Failed to connect to spark-57b8f99554-7nd45:4444
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:198)
at org.apache.spark.rpc.netty.Outbox$anon$1.call(Outbox.scala:194)
at org.apache.spark.rpc.netty.Outbox$anon$1.call(Outbox.scala:190)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: spark-57b8f99554-7nd45
at java.net.InetAddress.getAllByName0(InetAddress.java:1281)
at java.net.InetAddress.getAllByName(InetAddress.java:1193)
at java.net.InetAddress.getAllByName(InetAddress.java:1127)
at java.net.InetAddress.getByName(InetAddress.java:1077)
at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:146)
at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:143)
It seems that the executor pods can't resolve the hostname of the driver pod (spark-57b8f99554-7nd45). It should be related to THIS point, but I can't figure out how to solve it. Any ideas?
To run Spark in client mode from inside a Kubernetes pod, follow these steps:
Create a headless service like this one:
apiVersion: v1
kind: Service
metadata:
  name: yoursparkapp
spec:
  clusterIP: "None"
  selector:
    spark-app-selector: yoursparkapp
  ports:
    - name: driver-rpc-port
      protocol: TCP
      port: 7078
      targetPort: 7078
    - name: blockmanager
      protocol: TCP
      port: 7079
      targetPort: 7079
Be careful with the selector (spark-app-selector: yoursparkapp): it must match the label on the pod from which spark-submit will be run.
Install the above service in your cluster with this command: kubectl create -f yoursparkappservice.yml -n your_namespace
Run a pod that the above service selects:
kubectl run \
-n your_namespace -i --tty yoursparkapp \
--restart=Never \
--overrides='{ "apiVersion": "v1", "metadata": { "labels": { "spark-app-selector": "yoursparkapp" } } }' \
--image=your_container:latest -- /bin/bash
For the label we are using "spark-app-selector" : "yoursparkapp". This way, the service created in the first step selects this pod.
Inside the pod created in the previous step, we can run spark-submit:
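Before submitting anything, it can help to confirm from inside the pod that the service name resolves, because the executors will look up the driver by that name. The following is only a minimal sketch: it assumes getent is available in your image, and resolves is a hypothetical helper name, not part of Spark or kubectl.

```shell
# Hypothetical helper: succeeds if the given hostname resolves from this pod.
# Assumes getent exists in the container image.
resolves() {
    getent hosts "$1" > /dev/null 2>&1
}

# "yoursparkapp" is the headless service name from the manifest above.
if resolves yoursparkapp; then
    echo "service name resolves"
else
    echo "service name does not resolve; check the selector and the pod labels"
fi
```

If the name does not resolve, running kubectl get endpoints yoursparkapp -n your_namespace from outside the pod should show whether the service has selected any pod at all.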
spark-submit --master k8s://https://kubernetes_url:443 \
--deploy-mode client \
--name yoursparkapp \
--conf spark.kubernetes.container.image=your_container:latest \
--conf spark.kubernetes.pyspark.pythonVersion=3 \
--conf spark.kubernetes.namespace=your_namespace \
--conf spark.kubernetes.container.image.pullPolicy=Always \
--conf spark.driver.memory=2g \
--conf spark.executor.memory=2g \
--conf spark.submit.deployMode=client \
--conf spark.executor.cores=3 \
--conf spark.driver.cores=3 \
--conf spark.driver.host=yoursparkapp \
--conf spark.driver.port=7078 \
--conf spark.kubernetes.driver.pod.name=yoursparkapp \
/path/to/your/remote_spark_app.py
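Note that the service above also names a blockmanager port (7079), while the command only pins spark.driver.port. If you want the driver's block manager to listen on that declared port too, one option is to pin it with the standard Spark property spark.driver.blockManager.port. This is a sketch under that assumption, not something the steps above require; the variable names are made up for illustration.

```shell
# Ports copied from the Service manifest above.
DRIVER_RPC_PORT=7078
BLOCKMANAGER_PORT=7079

# Extra options that pin both driver ports to the values the Service declares.
# EXTRA_CONF is a hypothetical variable name used only in this sketch.
EXTRA_CONF="--conf spark.driver.port=${DRIVER_RPC_PORT} --conf spark.driver.blockManager.port=${BLOCKMANAGER_PORT}"

echo "$EXTRA_CONF"
```

The resulting string can be passed unquoted (so that it splits into separate arguments) in place of the single spark.driver.port line in the spark-submit command.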