package learn.spark

import org.apache.spark.{SparkConf, SparkContext}

object MasterLocal2 {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    conf.setAppName("spark-k8s")
    conf.setMaster("k8s://https://192.168.99.100:16443")
    conf.set("spark.driver.host", "192.168.99.1")
    conf.set("spark.executor.instances", "5")
    conf.set("spark.kubernetes.executor.request.cores", "0.1")
    conf.set("spark.kubernetes.container.image", "spark:latest")

    val sc = new SparkContext(conf)
    println(sc.parallelize(1 to 5).map(_ * 10).collect().mkString(", "))
    sc.stop()
  }
}
I am trying to speed up iterating on this Spark program by running it against the cluster straight from my IDE, but I get the exception below. I don't know what to configure so that the classes compiled on my machine reach the executors' JVM classpath.
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 8, 10.1.1.217, executor 4): java.lang.ClassNotFoundException: learn.spark.MasterLocal2$$anonfun$main$1
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
The executors fail with ClassNotFoundException because the classes IntelliJ IDEA compiled on your machine are not on their classpath. Mount the IDEA compilation output directory into the executor pods, then point spark.executor.extraClassPath at the mount:
conf.set("spark.kubernetes.executor.volumes.hostPath.anyname.options.path", "/path/to/your/project/out/production/examples")
conf.set("spark.kubernetes.executor.volumes.hostPath.anyname.mount.path", "/intellij-idea-build-out")
conf.set("spark.executor.extraClassPath", "/intellij-idea-build-out")
Make sure the compilation output directory can actually be mounted into the executor containers: a hostPath volume refers to a path on the Kubernetes node that runs the pod, not on your workstation, so getting the directory onto the node (for example with minikube mount, if you are on Minikube) is a Kubernetes-side task.
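Putting it all together, here is a minimal sketch of the driver with those settings added. The volume name classes and both path values are placeholders to adapt to your project:

package learn.spark

import org.apache.spark.{SparkConf, SparkContext}

object MasterLocal2 {
  def main(args: Array[String]): Unit = {
    // Placeholder paths: adjust to your IDEA project's build output.
    val ideaOut  = "/path/to/your/project/out/production/examples"
    val mountDir = "/intellij-idea-build-out"

    val conf = new SparkConf()
    conf.setAppName("spark-k8s")
    conf.setMaster("k8s://https://192.168.99.100:16443")
    conf.set("spark.driver.host", "192.168.99.1")
    conf.set("spark.executor.instances", "5")
    conf.set("spark.kubernetes.executor.request.cores", "0.1")
    conf.set("spark.kubernetes.container.image", "spark:latest")

    // Mount the IDEA output directory into each executor pod via a
    // hostPath volume ("classes" is an arbitrary volume name) ...
    conf.set("spark.kubernetes.executor.volumes.hostPath.classes.options.path", ideaOut)
    conf.set("spark.kubernetes.executor.volumes.hostPath.classes.mount.path", mountDir)
    // ... and put the mount point on the executor classpath, so the
    // anonymous function classes can be deserialized there.
    conf.set("spark.executor.extraClassPath", mountDir)

    val sc = new SparkContext(conf)
    println(sc.parallelize(1 to 5).map(_ * 10).collect().mkString(", "))
    sc.stop()
  }
}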