Can't create Spark session using YARN inside a Kubernetes pod

10/31/2019

I have a Kubernetes pod with the Spark client installed.

bash-4.2# spark-shell --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.1.2.6.2.0-205
      /_/

Using Scala version 2.11.8, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_144
Branch HEAD
Compiled by user jenkins on 2017-08-26T09:32:23Z
Revision a2efc34efde0fd268a9f83ea1861bd2548a8c188
Url git@github.com:hortonworks/spark2.git
Type --help for more information.
bash-4.2#

I can submit a Spark job successfully in both client and cluster mode using these commands:

${SPARK_HOME}/bin/spark-submit --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=$PYTHONPATH:/usr/local/spark/python:/usr/local/spark/python/lib/py4j-0.10.4-src.zip --master yarn --deploy-mode client --num-executors 50 --executor-cores 4 --executor-memory 3G  --driver-memory 6G my_python_script.py --config=configurations/sandbox.yaml --startdate='2019-01-01' --enddate='2019-08-01'
${SPARK_HOME}/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 ${SPARK_HOME}/lib/spark-examples*.jar 10

But whenever I try to start an interactive session using either of these:

spark-shell --master yarn
pyspark --master yarn

it hangs and eventually times out with this error:

org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.

We have another Python script that needs to create a Spark session. The code in that script is:

from pyspark import SparkConf
from pyspark.sql import SparkSession

# configs is a dict of Spark configuration key/value pairs loaded elsewhere
conf = SparkConf()
conf.setAll(configs.items())
spark = SparkSession.builder.config(conf=conf).enableHiveSupport().getOrCreate()

I'm not sure where else to check. This is the first time we are initiating a Spark connection from inside a Kubernetes cluster; getting a Spark session from a normal virtual machine works fine, and I'm not sure what the difference is in terms of network connectivity. It also puzzles me that I can submit a Spark job as shown above but cannot create a Spark session.

Any thoughts and ideas are highly appreciated. Thanks in advance.

-- mit13
apache-spark
kubernetes

1 Answer

10/31/2019

In client mode the Spark driver process runs on your machine and the executors run on YARN nodes (spark-shell and pyspark start client-mode sessions). For the driver and executor processes to communicate, they must be able to reach each other over the network in both directions.

Since submitting jobs in cluster mode works for you, you can clearly reach the YARN master from the Kubernetes Pod network, so that direction is fine. Most probably you don't have network access in the other direction, from the YARN cluster network to the Pod, which lives within the Kubernetes private network unless exposed explicitly. This is the first thing I would recommend you check, along with the YARN logs.
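
One rough way to verify that direction (a sketch, not part of your current setup; the port 40000 and the use of Python's standard socket module are illustrative assumptions): start a listener inside the Pod on the address you plan to advertise as the driver, then try to connect to it from one of the YARN NodeManager hosts (for example with nc -vz or a similar client).

# Minimal reachability check, assuming port 40000 is free inside the Pod.
# Run this inside the Pod, then attempt a connection from a YARN node.
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", 40000))   # bind on all interfaces inside the Pod
listener.listen(1)
print("Waiting for a connection on port 40000 ...")
conn, addr = listener.accept()      # blocks until something connects
print(f"Connection from {addr} - the YARN network can reach this Pod")
conn.close()
listener.close()

If the connection never arrives, the driver started by spark-shell or pyspark inside the Pod will be unreachable in exactly the same way.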

After you expose the Pod so that it is accessible from the YARN cluster network, you may want to refer to the following Spark configs to set up the bindings:

- spark.driver.host
- spark.driver.port
- spark.driver.bindAddress
- spark.blockManager.port

You can find their descriptions in the Spark configuration docs.
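
For illustration, once the Pod is reachable (say through a headless Service or NodePort), a client-mode session could pin the driver to known addresses and ports roughly like this; the hostname and port numbers below are placeholders you would replace with whatever you actually expose:

from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = SparkConf()
conf.set("spark.master", "yarn")
conf.set("spark.submit.deployMode", "client")
# Address the YARN nodes should use to reach the driver (placeholder).
conf.set("spark.driver.host", "my-spark-client.my-namespace.svc.cluster.local")
# Fixed ports so they can be exposed/opened explicitly (placeholders).
conf.set("spark.driver.port", "40000")
conf.set("spark.blockManager.port", "40001")
# Address the driver binds to inside the Pod itself.
conf.set("spark.driver.bindAddress", "0.0.0.0")

spark = SparkSession.builder.config(conf=conf).enableHiveSupport().getOrCreate()

The same key/value pairs can be passed with --conf to spark-shell or pyspark if you prefer to keep them out of the script.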

-- Aliaksandr Sasnouskikh
Source: StackOverflow