Spark on k8s - Error: Missing application resource

9/15/2019

I'm trying to run the SparkPi example using spark on k8s.

Working with

  • kubectl
  • minikube
  • spark-2.4.4-bin-hadoop2.7

Running the following command:

spark-submit \
  --master k8s://https://192.168.99.100:8443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=1 \
  --conf spark.kubernetes.container.image=sparkk8s:latest \
  --conf spark.kubernetes.driver.pod.name=sparkpi \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar 10

The driver pod fails with the following error in its logs:

+ env
+ sed 's/[^=]*=\(.*\)/\1/g'
+ sort -t_ -k4 -n
+ grep SPARK_JAVA_OPT_
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' -n '' ']'
+ PYSPARK_ARGS=
+ '[' -n '' ']'
+ R_ARGS=
+ '[' -n '' ']'
+ '[' '' == 2 ']'
+ '[' '' == 3 ']'
+ case "$SPARK_K8S_CMD" in
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf 
"spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /sbin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=172.17.0.6 --deploy-mode client
Error: Missing application resource.
Usage: spark-submit [options] <app jar | python file | R file> [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Usage: spark-submit run-example [options] example-class [example args]

Initially I thought the parameters were not being passed through, since the exec command doesn't show the driver class or the path to the application jar. However, kubectl describe pod sparkpi shows the following:

Name:               sparkpi
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               minikube/10.0.2.15
Start Time:         Sun, 15 Sep 2019 13:14:37 +0300
Labels:             spark-app-selector=spark-7c0293be51924505b91e381df8de2b4f
                    spark-role=driver
Annotations:        spark-app-name: spark-pi
Status:             Failed
IP:                 172.17.0.5
Containers:
  spark-kubernetes-driver:
    Container ID:  docker://db03f9a45df283848dc3e10c5d3171454b0d47ae25192e54f266e44f58eb7bc8
    Image:         spark2:latest
    Image ID:      docker://sha256:1d574a61cb26558ec38376d045bdf39fa18168d96486b2f921ea57d3d4fb2b48
    Port:          <none>
    Host Port:     <none>
    Args:
      driver
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sun, 15 Sep 2019 13:14:37 +0300
      Finished:     Sun, 15 Sep 2019 13:14:38 +0300
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  1408Mi
    Requests:
      cpu:     1
      memory:  1Gi
    Environment:
      SPARK_DRIVER_MEMORY:        1g
      SPARK_DRIVER_CLASS:         org.apache.spark.examples.SparkPi
      SPARK_DRIVER_ARGS:          10
      SPARK_DRIVER_BIND_ADDRESS:   (v1:status.podIP)
      SPARK_MOUNTED_CLASSPATH:    /opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar:/opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar
      SPARK_JAVA_OPT_0:           -Dspark.app.name=spark-pi
      SPARK_JAVA_OPT_1:           -Dspark.app.id=spark-7c0293be51924505b91e381df8de2b4f
      SPARK_JAVA_OPT_2:           -Dspark.submit.deployMode=cluster
      SPARK_JAVA_OPT_3:           -Dspark.driver.blockManager.port=7079
      SPARK_JAVA_OPT_4:           -Dspark.driver.host=spark-pi-b8556ee3d1c33baf8d9feacc1cae7a9d-driver-svc.default.svc
      SPARK_JAVA_OPT_5:           -Dspark.kubernetes.container.image=spark2:latest
      SPARK_JAVA_OPT_6:           -Dspark.executor.instances=1
      SPARK_JAVA_OPT_7:           -Dspark.jars=/opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar,/opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar
      SPARK_JAVA_OPT_8:           -Dspark.kubernetes.executor.podNamePrefix=spark-pi-b8556ee3d1c33baf8d9feacc1cae7a9d
      SPARK_JAVA_OPT_9:           -Dspark.kubernetes.driver.pod.name=sparkpi
      SPARK_JAVA_OPT_10:          -Dspark.driver.port=7078
      SPARK_JAVA_OPT_11:          -Dspark.master=k8s://https://192.168.99.100:8443

I also ran the image with docker and verified that the jar file actually exists at the provided path - /opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar
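
(In case it's useful, a check along these lines should work; overriding the entrypoint with ls is an assumption about how the image was built:)

docker run --rm --entrypoint ls sparkk8s:latest /opt/spark/examples/jars/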

Any suggestions?

-- LiranBo
apache-spark
kubernetes
spark-submit

1 Answer

11/19/2019

I got back to this issue now, and the fix is somewhat annoying: spark-submit is expected to be run from the Spark distribution folder. So instead of using the spark-submit alias, run it as bin/spark-submit ...

bin/spark-submit \
  --master k8s://https://192.168.99.100:8443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=1 \
  --conf spark.kubernetes.container.image=sparkk8s:latest \
  --conf spark.kubernetes.driver.pod.name=sparkpi \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.4.4.jar 10
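
That is, assuming the distribution was unpacked under the home directory (the path here is illustrative):

cd ~/spark-2.4.4-bin-hadoop2.7
bin/spark-submit ...   # same arguments as above

Presumably the spark-submit alias was resolving to a different Spark installation on the PATH, whose launcher didn't forward the application jar correctly.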
-- LiranBo
Source: StackOverflow