I have a Spark cluster set up on Kubernetes. To run the spark-app.py script on it, I use:
./bin/spark-submit \
  --master k8s://https://<master-ip>:<port> \
  --deploy-mode cluster \
  --name spark-app \
  --conf spark.executor.instances=3 \
  --conf spark.kubernetes.container.image=my-repo/spark-py:v2.4.3 \
  --conf spark.kubernetes.namespace=default \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.container.image.pullPolicy=Always \
  --conf spark.kubernetes.container.image.pullSecrets=<my-secret> \
  --conf spark.kubernetes.pyspark.pythonVersion=3 \
  local:///opt/spark/examples/src/main/python/spark-app.py
But this is slow: every time I edit the script, I have to rebuild and push a new image before I can resubmit.
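Concretely, my iteration loop looks roughly like this (a simplified sketch: the build context is illustrative, the tag is the one from the command above, and the push step follows from pullPolicy=Always pointing at a remote registry):

# Rebuild and push the whole image just because spark-app.py changed,
# then resubmit with the exact same spark-submit command as above.
docker build -t my-repo/spark-py:v2.4.3 .
docker push my-repo/spark-py:v2.4.3
./bin/spark-submit ...   # full command shown above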
Q1) How can I avoid rebuilding the image every time I edit just the script?
Q2) Is there a way for spark-submit to accept the script from my local machine?
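For Q2, ideally I could point spark-submit straight at the copy on my laptop, something like the following (a hypothetical invocation: file:///path/to/spark-app.py is a placeholder, and I don't know whether cluster mode on Kubernetes can ship a client-local file like this at all, which is what I'm asking):

# Hypothetical: same confs as above, but the application file lives
# on my machine instead of inside the container image.
./bin/spark-submit \
  --master k8s://https://<master-ip>:<port> \
  --deploy-mode cluster \
  --name spark-app \
  --conf spark.kubernetes.container.image=my-repo/spark-py:v2.4.3 \
  file:///path/to/spark-app.py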