Run Spark on Kubernetes

5/31/2021

I've installed the Bitnami Spark Helm chart on my k8s cluster and I have 3 pods running (1 master and 2 executors), but I'm still unable to submit a Spark job. The "Submit an application" section of https://github.com/bitnami/charts/tree/master/bitnami/spark mentions that we could use:

  ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://<master-IP>:<master-cluster-port> --deploy-mode cluster ./examples/jars/spark-examples_2.11-2.4.3.jar 1000

But from where? From our local machine? From inside the Spark master pod? Any help, please?

-- mam667
apache-spark
bitnami
kubernetes
kubernetes-helm

2 Answers

6/1/2021

Bitnami engineer here. When you install the Spark chart, the following notes are printed. They show how to submit the job from inside a worker pod with kubectl exec:

...
2. Submit an application to the cluster:

  To submit an application to the cluster the spark-submit script must be used. That script can be obtained at https://github.com/apache/spark/tree/master/bin. Also you can use kubectl run.

  export EXAMPLE_JAR=$(kubectl exec -ti --namespace default spark-worker-0 -- find examples/jars/ -name 'spark-example*\.jar' | tr -d '\r')

  kubectl exec -ti --namespace default spark-worker-0 -- spark-submit --master spark://spark-master-svc:7077 \
    --class org.apache.spark.examples.SparkPi \
    $EXAMPLE_JAR 5
...
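
To make the pieces of that command easier to adapt, here is a minimal sketch that only prints the kubectl exec line instead of running it. The defaults (namespace default, pod spark-worker-0, master spark://spark-master-svc:7077) are the ones from the notes above; the function name build_submit_cmd and the environment variable names are made up for illustration, and the jar path is a placeholder for whatever $EXAMPLE_JAR resolves to in your pod.

```shell
#!/usr/bin/env bash
# Sketch only: prints the command rather than executing it.
# Defaults come from the chart notes; override them via the environment
# if your release name or namespace differs.
NAMESPACE="${NAMESPACE:-default}"
WORKER_POD="${WORKER_POD:-spark-worker-0}"
MASTER_URL="${MASTER_URL:-spark://spark-master-svc:7077}"

# build_submit_cmd JAR SLICES  (hypothetical helper)
# Prints the kubectl exec line that runs SparkPi from inside the worker pod.
build_submit_cmd() {
  local jar="$1" slices="$2"
  printf 'kubectl exec -ti --namespace %s %s -- spark-submit --master %s --class org.apache.spark.examples.SparkPi %s %s\n' \
    "$NAMESPACE" "$WORKER_POD" "$MASTER_URL" "$jar" "$slices"
}

# Placeholder jar path; in practice pass the value of $EXAMPLE_JAR.
build_submit_cmd "examples/jars/<spark-examples-jar>" 5
```

Once the printed line looks right for your release, paste it into a terminal on any machine where kubectl can reach the cluster.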
-- Ibone
Source: StackOverflow

6/2/2021

You can use the Spark documentation for this, since you already have a Spark cluster.

I found this command:

./bin/spark-submit \
   --master yarn \
   --deploy-mode cluster \
   wordByExample.py

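On Kubernetes it would be something like this. Note that the chart deploys a Spark standalone cluster, not YARN, so the master is spark://spark-master-svc:7077. Standalone mode also does not support --deploy-mode cluster for Python applications, so that flag is dropped:

kubectl exec -ti --namespace default spark-worker-0 -- spark-submit --master spark://spark-master-svc:7077 wordByExample.py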
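
The script has to exist inside the pod before spark-submit can read it. A rough sketch of the two steps follows; it prints the commands rather than running them. It assumes the standalone master URL from the chart notes (spark://spark-master-svc:7077), a hypothetical destination under /tmp, and that the container image ships tar, which kubectl cp needs. The helper name print_py_submit_steps is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: prints the two commands instead of executing them.
NAMESPACE="${NAMESPACE:-default}"
WORKER_POD="${WORKER_POD:-spark-worker-0}"
MASTER_URL="${MASTER_URL:-spark://spark-master-svc:7077}"

# print_py_submit_steps LOCAL_SCRIPT  (hypothetical helper)
# Step 1: copy the local script into the worker pod.
# Step 2: submit it from inside that pod.
print_py_submit_steps() {
  local script="$1"
  # Assumed destination inside the pod; any writable path would do.
  local remote="/tmp/$(basename "$script")"
  printf 'kubectl cp %s %s/%s:%s\n' \
    "$script" "$NAMESPACE" "$WORKER_POD" "$remote"
  printf 'kubectl exec -ti --namespace %s %s -- spark-submit --master %s %s\n' \
    "$NAMESPACE" "$WORKER_POD" "$MASTER_URL" "$remote"
}

print_py_submit_steps ./wordByExample.py
```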
-- Ibone
Source: StackOverflow