I am exploring Argo Workflows for my Spark use case. Is there any example YAML that shows how to execute a Spark job on Kubernetes using an Argo workflow?
Here is an example that runs Spark's Pi example. Just substitute the correct values for the image, the main class, and the URL of the Kubernetes API server:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: wf-spark-pi
  namespace: spark
spec:
  entrypoint: sparkapp
  templates:
    - name: sparkapp
      container:
        image: Spark-Image
        imagePullPolicy: Always
        command: [sh]
        args:
          - /opt/spark/bin/spark-submit
          - --master
          - k8s://https://<K8S_API_TCP_ADDR>:<K8S_API_TCP_PORT>
          - --deploy-mode
          - cluster
          - --conf
          - spark.kubernetes.namespace=spark
          - --conf
          - spark.kubernetes.container.image=Spark-Image
          - --conf
          - spark.kubernetes.driver.pod.name=spark
          - --conf
          - spark.executor.instances=2
          - --class
          - org.apache.spark.examples.SparkPi
          - local:///opt/spark/examples/jars/spark-examples_2.11-2.4.5.jar
        resources: {}
      # Argo Workflows have no restartPolicy field; retryStrategy is the
      # equivalent mechanism for retrying the step when it fails
      retryStrategy:
        retryPolicy: OnFailure
        limit: 2
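One caveat with --deploy-mode cluster: spark-submit talks to the Kubernetes API to create the driver and executor pods, so the workflow pod needs a service account with permission to do that. Below is a minimal sketch of the RBAC objects, assuming a service account named spark in the spark namespace (both names are placeholders for whatever your cluster uses):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark        # placeholder name
  namespace: spark
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: spark-role
  namespace: spark
rules:
  # spark-submit creates and monitors the driver/executor pods and the driver service
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps"]
    verbs: ["create", "get", "list", "watch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: spark-role-binding
  namespace: spark
subjects:
  - kind: ServiceAccount
    name: spark
    namespace: spark
roleRef:
  kind: Role
  name: spark-role
  apiGroup: rbac.authorization.k8s.io

You would then add serviceAccountName: spark to the workflow spec and pass --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark to spark-submit so the driver pod runs under the same account.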
Here is another, simpler variant that runs spark-submit directly inside the workflow container. With no --master flag, Spark falls back to local mode, so everything runs in that single pod:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-spark-
spec:
  entrypoint: sparkapp
  templates:
    - name: sparkapp
      container:
        image: sparkimage
        command: [sh]
        args: [
          "-c",
          "/opt/spark/bin/spark-submit --class org.apache.spark.examples.SparkPi /opt/spark/examples/jars/spark-examples_2.11-2.4.0.jar"
        ]
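Assuming you have the Argo CLI installed, you can run either manifest and follow its progress from the command line (the file name here is a placeholder for wherever you saved the YAML):

argo submit -n spark wf-spark-pi.yaml --watch
argo logs -n spark wf-spark-pi

For the second manifest, which uses generateName, argo submit prints the generated workflow name; pass that name to argo logs instead.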
Hope this helps!