Spark submit from Windows vs. Linux


I've been experimenting with Spark (2.3.0) on Kubernetes for the past few days.

I've tested the SparkPi example from both Linux and Windows machines. From Linux, spark-submit runs fine and gives me the proper result (spoiler: Pi is roughly 3.1402157010785055), while from Windows the job fails with a classpath error (Could not find or load main class org.apache.spark.examples.SparkPi).

I've noticed that when running spark-submit from Linux, the classpath looks like this:
-cp ':/opt/spark/jars/*:/var/spark-data/spark-jars/spark-examples_2.11-2.3.0.jar:/var/spark-data/spark-jars/spark-examples_2.11-2.3.0.jar'

When submitting from Windows, the logs show a slightly different version:
-cp ':/opt/spark/jars/*:/var/spark-data/spark-jars/spark-examples_2.11-2.3.0.jar;/var/spark-data/spark-jars/spark-examples_2.11-2.3.0.jar'

Note the : vs. ; between the last two entries of the classpath, which I think is the cause of this issue.
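To illustrate why I suspect the separator: a JVM running on Linux splits the classpath on `:` only, so the `;` produced on Windows would leave the two jar paths glued together as a single entry that does not exist. A rough sketch of that splitting, using the classpath from the Windows logs above (the variable names are only for illustration):

```shell
# Classpath as it appears in the logs when submitting from Windows
cp_windows=':/opt/spark/jars/*:/var/spark-data/spark-jars/spark-examples_2.11-2.3.0.jar;/var/spark-data/spark-jars/spark-examples_2.11-2.3.0.jar'

# Take everything after the last ':' -- this is what a Linux JVM would
# treat as the final classpath entry
last_entry=${cp_windows##*:}

# The entry still contains ';', so it is one bogus path rather than two jars
echo "$last_entry"
```

The printed entry is both jar paths joined by `;`, which matches no file in the container and would explain the missing main class.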

Any suggestions on how to run spark-submit from a Windows machine without hitting the classpath issue?

This is our spark-submit command:

bin/spark-submit \
  --master k8s:// \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.driver.memory=1G \
  --conf spark.driver.cores=1 \
  --conf spark.executor.instances=5 \
  --conf spark.executor.cores=1 \
  --conf spark.executor.memory=500m \
  --conf spark.kubernetes.container.image=spark:2.3.0 \


-- Y. Eliash

1 Answer


As a workaround, you can overwrite the environment variable SPARK_MOUNTED_CLASSPATH in the script $SPARK_HOME/kubernetes/dockerfiles/spark/, such that the wrong semicolon is replaced by the correct colon.
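A minimal sketch of such a replacement is shown here. It assumes the variable is already set in the container by the time the script runs; the sample value assigned on the first line is only there to make the snippet runnable on its own and should be dropped when adapting it. Where exactly the line belongs in the script depends on your Spark version, so treat the placement as something to verify.

```shell
# Sample value for illustration only (assumption): in the real container
# the variable is already populated by spark-submit
SPARK_MOUNTED_CLASSPATH='/var/spark-data/spark-jars/spark-examples_2.11-2.3.0.jar;/var/spark-data/spark-jars/spark-examples_2.11-2.3.0.jar'

# Replace every ';' (Windows path separator) with ':' (Linux path separator)
SPARK_MOUNTED_CLASSPATH=$(printf '%s' "$SPARK_MOUNTED_CLASSPATH" | tr ';' ':')
export SPARK_MOUNTED_CLASSPATH

echo "$SPARK_MOUNTED_CLASSPATH"
```

Note that this blindly rewrites every semicolon, which is fine for jar paths like the ones above but would break a path that legitimately contains `;`.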

Then you need to rebuild the docker image, e.g., with $SPARK_HOME/bin/ After that, spark-submit on Windows should work.

See also Spark issue tracker:

-- Tobias
Source: StackOverflow