I have a 3-node [host a, host b, host c] Kubernetes cluster (version 1.12.2). I am trying to run the spark-pi example jar as described in the Spark-on-Kubernetes documentation.
Host a is my Kubernetes master. >> kubectl get nodes lists all three nodes.
I have built the Spark Docker image using what's provided in the Spark 2.3.0 binary folder.
>> sudo ./bin/docker-image-tool.sh -r docker.io/spark/spark -t spark230 build
I got a message that the image was built successfully.
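In case it matters: the same script also has a push subcommand for publishing the built image to the repository given with -r (I did not run it here, since docker.io/spark is not a namespace I can push to):

# Push the built image to the repository given with -r
# (requires push access to that registry namespace)
sudo ./bin/docker-image-tool.sh -r docker.io/spark/spark -t spark230 push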
>> docker images
REPOSITORY                           TAG        IMAGE ID       CREATED             SIZE
docker.io/spark/spark                spark230   6a2b645d7efe   About an hour ago   346 MB
docker.io/weaveworks/weave-npc       2.5.0      d499500e93d3   7 days ago          49.5 MB
docker.io/weaveworks/weave-kube      2.5.0      a5103f96993a   7 days ago          148 MB
docker.io/openjdk                    8-alpine   97bc1352afde   2 weeks ago         103 MB
k8s.gcr.io/kube-proxy                v1.12.2    15e9da1ca195   2 weeks ago         96.5 MB
k8s.gcr.io/kube-apiserver            v1.12.2    51a9c329b7c5   2 weeks ago         194 MB
k8s.gcr.io/kube-controller-manager   v1.12.2    15548c720a70   2 weeks ago         164 MB
k8s.gcr.io/kube-scheduler            v1.12.2    d6d57c76136c   2 weeks ago         58.3 MB
k8s.gcr.io/etcd                      3.2.24     3cab8e1b9802   7 weeks ago         220 MB
k8s.gcr.io/coredns                   1.2.2      367cdc8433a4   2 months ago        39.2 MB
k8s.gcr.io/pause                     3.1        da86e6ba6ca1   10 months ago       742 kB
>> ./bin/spark-submit \
     --master k8s://https://<api-server>:<api-server-port> \
     --deploy-mode cluster \
     --name spark-pi \
     --class org.apache.spark.examples.SparkPi \
     --conf spark.executor.instances=5 \
     --conf spark.kubernetes.container.image=spark/spark:spark230 \
     --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
     local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar
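For completeness, the spark service account referenced above needs to exist before submitting; the Spark docs suggest creating it like this (default namespace assumed):

# Create the service account used by the driver and grant it edit rights
kubectl create serviceaccount spark
kubectl create clusterrolebinding spark-role --clusterrole=edit \
  --serviceaccount=default:spark --namespace=default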
When I submit the above command, it sometimes produces the proper output. Other times it throws the error below.
> code = Unknown desc = repository docker.io/spark/spark not found: does not exist or no pull access, reason=ErrImagePull
When I debugged it further, it turned out that the error appears whenever the pod is scheduled on host b or host c. When it is scheduled on host a, it runs fine. It looks like the other nodes are unable to locate the image.
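To confirm where the pods were landing, I used the standard kubectl commands below (the driver pod name is a placeholder; use whatever spark-submit reports):

# Show which node each Spark pod was scheduled on
kubectl get pods -o wide

# Inspect the image-pull events for a failing pod
# ("spark-pi-driver" is a placeholder for the actual pod name)
kubectl describe pod spark-pi-driver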
Questions:
Do I need to build the Spark Docker image on every node? How can I make the image available to host b and host c?
Yes, you need to build the Spark image on all the nodes. You can write a wrapper script to invoke 'rebuild-image.sh' on all the nodes, as below:
for h in $hostnames; do            # $hostnames: space-separated list of the other nodes
  rsync -av /opt/spark "${h}:/opt"
  ssh "${h}" /opt/spark/rebuild-image.sh
done
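Alternatively, if all the nodes can reach a common registry (Docker Hub, for example), you can push the image once instead of rebuilding it everywhere. A rough sketch, where <your-registry-user> is a placeholder for a namespace you can push to:

# Retag the locally built image under a pushable namespace
# (<your-registry-user> is a placeholder for your Docker Hub account)
sudo docker tag spark/spark:spark230 <your-registry-user>/spark:spark230
sudo docker push <your-registry-user>/spark:spark230

Then pass --conf spark.kubernetes.container.image=<your-registry-user>/spark:spark230 to spark-submit so every node pulls the same image.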
You can always save a Docker image as a tar file and then copy that tar file to the other hosts and load the image there.
To save a Docker image as a tar file:
sudo docker save -o <path for generated tar file> <image name>
Now copy your tar file to the other hosts using scp or some other copy tool, and load the Docker image using:
sudo docker load -i <path to image tar file>
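Putting the two commands together, a minimal end-to-end sketch for this setup (host-b and host-c stand in for the actual hostnames, and passwordless ssh is assumed):

# On the node that has the image: export it to a tar file
sudo docker save -o /tmp/spark230.tar spark/spark:spark230

# Copy the tar file to the other nodes and load it there
for h in host-b host-c; do
  scp /tmp/spark230.tar "${h}:/tmp/"
  ssh "${h}" "sudo docker load -i /tmp/spark230.tar"
done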
Hope this helps