I followed the Spark on Kubernetes blog post and got to the point where the job runs, but it fails inside the worker pods with a file access error:
2018-05-22 22:20:51 WARN  TaskSetManager:66 - Lost task 0.0 in stage 0.0 (TID 0, 172.17.0.15, executor 3): java.nio.file.AccessDeniedException: ./spark-examples_2.11-2.3.0.jar
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixCopyFile.copyFile(UnixCopyFile.java:243)
at sun.nio.fs.UnixCopyFile.copy(UnixCopyFile.java:581)
at sun.nio.fs.UnixFileSystemProvider.copy(UnixFileSystemProvider.java:253)
at java.nio.file.Files.copy(Files.java:1274)
at org.apache.spark.util.Utils$.org$apache$spark$util$Utils$copyRecursive(Utils.scala:632)
at org.apache.spark.util.Utils$.copyFile(Utils.scala:603)
at org.apache.spark.util.Utils$.fetchFile(Utils.scala:478)
at org.apache.spark.executor.Executor$anonfun$org$apache$spark$executor$Executor$updateDependencies$5.apply(Executor.scala:755)
at org.apache.spark.executor.Executor$anonfun$org$apache$spark$executor$Executor$updateDependencies$5.apply(Executor.scala:747)
at scala.collection.TraversableLike$WithFilter$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.mutable.HashMap$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashMap$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$updateDependencies(Executor.scala:747)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:312)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
The command I use to run the SparkPi example is:
$DIR/$SPARKVERSION/bin/spark-submit \
--master=k8s://https://192.168.99.101:8443 \
--deploy-mode=cluster \
--conf spark.executor.instances=3 \
--name spark-pi  \
--class org.apache.spark.examples.SparkPi \
--conf spark.kubernetes.container.image=172.30.1.1:5000/myapp/spark-docker:latest \
--conf spark.kubernetes.namespace=$namespace \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.driver.pod.name=spark-pi-driver \
 local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar
Working through the code, it seems the Spark jar files are being copied to an internal location inside the container. But:
RBAC has been set up as follows (oc get rolebinding -n myapp):
NAME                     ROLE                    USERS       GROUPS                         SERVICE ACCOUNTS   SUBJECTS
admin                    /admin                  developer                                                     
spark-role               /edit                                                              spark
And the service accounts (oc get sa -n myapp):
NAME       SECRETS   AGE
builder    2         18d
default    2         18d
deployer   2         18d
pusher     2         13d
spark      2         12d
Or am I doing something silly here?
My Kubernetes system is running inside Docker Machine (via VirtualBox on OS X). I am using:
Any hints on solving this would be greatly appreciated.
I know this is a 5-month-old post, but there doesn't seem to be much information about this issue around, so I'm posting my answer in case it helps someone.
It looks like you are not running the process inside the container as root. If that's the case, take a look at this link (https://github.com/minishift/minishift/issues/2836).
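Since it looks like you are also using OpenShift, you can do the following (adjust the service account and namespace to your own; in the question they are spark and myapp):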
oc adm policy add-scc-to-user anyuid -z spark-sa -n spark
In my case I'm using Kubernetes and need to use runAsUser:XX, so I gave the group read/write/execute access to /opt/spark inside the container, and that solved the issue. Just add the following line to resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile:
RUN chmod g+rwx -R /opt/spark
Of course, you then have to rebuild the Docker images, either manually or using the provided script as shown below:
./bin/docker-image-tool.sh -r YOUR_REPO  -t YOUR_TAG build
./bin/docker-image-tool.sh -r YOUR_REPO -t YOUR_TAG  push
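After rebuilding, it may be worth checking that the group bits really ended up on /opt/spark before resubmitting the job. The following is only a minimal sketch: has_group_rwx is a hypothetical helper of my own (not part of Spark or the image tooling), and it assumes a stat that supports the -c option, as GNU stat does.

```shell
# Hypothetical helper (not part of Spark): succeeds only if the group
# permission bits of the given path are read, write and execute.
# Assumes a stat that supports -c '%A' (as GNU stat does).
has_group_rwx() {
  # %A prints the symbolic mode, e.g. drwxrwxr-x; characters 5-7 are the group bits.
  perms=$(stat -c '%A' "$1") || return 2
  group_bits=$(printf '%s' "$perms" | cut -c5-7)
  # "rws" means rwx plus the setgid bit.
  [ "$group_bits" = "rwx" ] || [ "$group_bits" = "rws" ]
}
```

Pasted into a shell inside a container started from the rebuilt image, has_group_rwx /opt/spark && echo ok should print ok if the chmod line took effect. Note that it only inspects the top-level directory you pass in, not everything underneath it.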