Zeppelin Spark Master Settings on Kubernetes

9/10/2020

I am configuring Zeppelin 0.8 to run on Kubernetes, with my Spark cluster in the same namespace.

The issue is, as soon as I set the Spark master in the Spark interpreter settings, my code stops working with this error:

java.lang.RuntimeException: SPARK_HOME is not specified in interpreter-setting for non-local mode, if you specify it in zeppelin-env.sh, please move that into interpreter setting

Does anyone actively use Zeppelin on Kubernetes to run Spark Apps?

Any leads would be appreciated!

-- unitom
apache-spark
apache-zeppelin
kubernetes

1 Answer

9/22/2020

This turned out to be pretty straightforward. All that was required was adding SPARK_HOME to the Spark interpreter settings in Zeppelin.

SPARK_HOME needs to point to the downloaded Spark release. In my case, I am using Spark 2.4, downloaded from here: https://archive.apache.org/dist/spark/spark-2.4.0/

I used the release built with Hadoop 2.7 and mounted the files into my running container.
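One way to do that mounting is a Kubernetes volume. Here is a minimal sketch assuming the unpacked release has been copied onto a PersistentVolumeClaim; the pod name, image tag, claim name `spark-dist-pvc`, and mount path `/opt/spark` are all example values, not something from the original setup:

```yaml
# Sketch: expose an unpacked Spark release inside the Zeppelin pod
apiVersion: v1
kind: Pod
metadata:
  name: zeppelin
spec:
  containers:
    - name: zeppelin
      image: apache/zeppelin:0.8.0
      volumeMounts:
        - name: spark-dist
          mountPath: /opt/spark   # this path is what SPARK_HOME should point to
  volumes:
    - name: spark-dist
      persistentVolumeClaim:
        claimName: spark-dist-pvc  # PVC pre-populated with the Spark 2.4 release
```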

To connect to a Spark master running in Kubernetes, Zeppelin requires the binaries from that release.

Also set the master URL in the interpreter settings to spark://spark-master:7077
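Putting both pieces together, the relevant properties in Zeppelin's Spark interpreter settings look roughly like this (the `/opt/spark` path is an example; `spark-master` must match the Kubernetes Service name of the master):

```
SPARK_HOME   /opt/spark                  # path where the Spark release is mounted
master       spark://spark-master:7077   # standalone master URL, not local[*]
```

Setting SPARK_HOME here, rather than in zeppelin-env.sh, is exactly what the RuntimeException above is asking for.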

With this, the setup works smoothly, though I am currently navigating some DNS issues that are hampering internal connections.

-- unitom
Source: StackOverflow