NoClassDefFoundError on Spark Executor using Kubernetes Master

5/26/2020

I am trying to start a Spark job on Kubernetes. Our Spark jobs use MapR clients, so the Docker image deployed into the pods contains both the Spark and MapR jar files. I start the job using spark-submit in client mode, so the driver starts successfully in the same pod where spark-submit was run. It then attempts to launch executor pods. These start but fail immediately with a NoClassDefFoundError for the following class:

org/apache/hadoop/mapreduce/InputFormat
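
A minimal sketch of how the job is submitted (the master URL, namespace, image, main class, and application jar are placeholders, and only the relevant extraClassPath entry is shown):

spark-submit \
  --master k8s://https://<api-server-host>:6443 \
  --deploy-mode client \
  --name mapr-spark-job \
  --class <main-class> \
  --conf spark.kubernetes.namespace=<namespace> \
  --conf spark.kubernetes.container.image=<registry>/spark-mapr:latest \
  --conf spark.executor.instances=2 \
  --conf spark.executor.extraClassPath=/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce \
  local:///opt/app/<application>.jar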

I searched the jars in the Docker image and found this class to be defined in the following jar:

/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.0-mapr-1808.jar

The spark.executor.extraClassPath property includes the directory /opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce, so it seems this property is not being passed to the executors. Has anyone seen this before and resolved it?
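
For reference, one way to confirm whether the extra classpath actually reaches the executor JVM would be to inspect one of the failed executor pods before it is cleaned up (pod names are placeholders, and the spark-role label may vary between Spark versions):

# List the executor pods Spark created for the application
kubectl get pods -n <namespace> -l spark-role=executor

# Look for the MapR mapreduce directory in the executor container's
# args and environment, which is where the classpath would be passed
kubectl get pod <executor-pod> -n <namespace> -o yaml | grep -n mapreduce

# Inspect the executor log around the NoClassDefFoundError
kubectl logs <executor-pod> -n <namespace> | head -n 50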

Thanks in advance.

-- Conrad Mukai
apache-spark
kubernetes

0 Answers