Exception in thread "main" org.apache.spark.SparkException: Must specify the driver container image

4/6/2018

I am trying to run spark-submit against minikube (Kubernetes) from my local machine's CLI with the command:

spark-submit \
--master k8s://https://127.0.0.1:8001 \
--name cfe2 \
--deploy-mode cluster \
--class com.yyy.Test \
--conf spark.executor.instances=2 \
--conf spark.kubernetes.container.image=docker.io/anantpukale/spark_app:1.1 \
local://spark-0.0.1-SNAPSHOT.jar

I have a simple Spark job jar built on version 2.3.0. I have also containerized it in Docker, and minikube is up and running on VirtualBox. Below is the exception stack:

Exception in thread "main" org.apache.spark.SparkException: Must specify the driver container image
        at org.apache.spark.deploy.k8s.submit.steps.BasicDriverConfigurationStep$$anonfun$3.apply(BasicDriverConfigurationStep.scala:51)
        at org.apache.spark.deploy.k8s.submit.steps.BasicDriverConfigurationStep$$anonfun$3.apply(BasicDriverConfigurationStep.scala:51)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.deploy.k8s.submit.steps.BasicDriverConfigurationStep.<init>(BasicDriverConfigurationStep.scala:51)
        at org.apache.spark.deploy.k8s.submit.DriverConfigOrchestrator.getAllConfigurationSteps(DriverConfigOrchestrator.scala:82)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$5.apply(KubernetesClientApplication.scala:229)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication$$anonfun$run$5.apply(KubernetesClientApplication.scala:227)
        at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:2585)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:227)
        at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:192)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2018-04-06 13:33:52 INFO  ShutdownHookManager:54 - Shutdown hook called
2018-04-06 13:33:52 INFO  ShutdownHookManager:54 - Deleting directory C:\Users\anant\AppData\Local\Temp\spark-6da93408-88cb-4fc7-a2de-18ed166c3c66
-- Anant Pukale
apache-spark
docker
kubectl
kubernetes
minikube

2 Answers

4/6/2018

This looks like a bug with the default value for the parameter spark.kubernetes.driver.container.image, which is supposed to fall back to spark.kubernetes.container.image. So try specifying the driver and executor container images directly (see the sketch after this list):

  • spark.kubernetes.driver.container.image
  • spark.kubernetes.executor.container.image
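
For example, a minimal sketch of the question's own command with both images set explicitly (the image name, class, and jar path are taken verbatim from the question; adjust them to your setup):

spark-submit \
--master k8s://https://127.0.0.1:8001 \
--name cfe2 \
--deploy-mode cluster \
--class com.yyy.Test \
--conf spark.executor.instances=2 \
--conf spark.kubernetes.driver.container.image=docker.io/anantpukale/spark_app:1.1 \
--conf spark.kubernetes.executor.container.image=docker.io/anantpukale/spark_app:1.1 \
local://spark-0.0.1-SNAPSHOT.jar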
-- Grigoriev Nick
Source: StackOverflow

4/12/2018

From the source code, the only available conf options are:

spark.kubernetes.container.image
spark.kubernetes.driver.container.image
spark.kubernetes.executor.container.image

I also noticed that Spark 2.3.0 changed its Kubernetes implementation significantly compared to 2.2.0. For example, instead of specifying the driver and executor images separately, the official getting-started guide uses a single image passed via spark.kubernetes.container.image.

See if this works:

spark-submit \
--master k8s://http://127.0.0.1:8001 \
--name cfe2 \
--deploy-mode cluster \
--class com.oracle.Test \
--conf spark.executor.instances=2 \
--conf spark.kubernetes.container.image=docker.io/anantpukale/spark_app:1.1 \
--conf spark.kubernetes.authenticate.submission.oauthToken=YOUR_TOKEN \
--conf spark.kubernetes.authenticate.submission.caCertFile=PATH_TO_YOUR_CERT \
local://spark-0.0.1-SNAPSHOT.jar

The token and cert can be found on the Kubernetes dashboard. Follow the instructions in the official Running on Kubernetes guide to build Spark 2.3.0-compatible Docker images.
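If you prefer the CLI over the dashboard, here is a sketch of how to dig those values out with kubectl (the secret name default-token-xxxxx is a placeholder for whatever your cluster shows, and the minikube CA cert path is its usual default; both may differ on your machine):

# List secrets and inspect the default service account's token;
# the "token:" field is the value for
# spark.kubernetes.authenticate.submission.oauthToken.
kubectl get secrets
kubectl describe secret default-token-xxxxx

# kubectl's own config shows where the cluster CA cert lives;
# with minikube it is typically ~/.minikube/ca.crt, which can be
# passed as spark.kubernetes.authenticate.submission.caCertFile.
kubectl config view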

-- wolich22
Source: StackOverflow