Spark with Kubernetes connecting to pod id, not address

2/13/2019

We have a k8s deployment of several services, including Apache Spark. All services seem to be operational. Our application connects to the Spark master to submit a job using the cluster's k8s DNS service, where the master is called spark-api, so we use master=spark://spark-api:7077 with spark.submit.deployMode=cluster. We submit the job through the API, not with the spark-submit script.
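For reference, here is a rough sketch of the kind of programmatic submission we do. SparkLauncher is just one way to submit "through the API", and the jar path, main class, and memory setting below are placeholders, not our real values:

```scala
import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

object SubmitJob {
  def main(args: Array[String]): Unit = {
    // Submit against the master's k8s Service name in cluster deploy mode,
    // as described above. Jar path, class name, and memory are placeholders.
    val handle: SparkAppHandle = new SparkLauncher()
      .setMaster("spark://spark-api:7077")
      .setDeployMode("cluster")
      .setAppResource("/opt/app/jobs/example-job.jar") // hypothetical jar
      .setMainClass("com.example.jobs.ExampleJob")     // hypothetical class
      .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
      .startApplication()

    // Block until Spark reports a terminal state for the submitted app.
    while (!handle.getState.isFinal) Thread.sleep(1000)
    println(s"Final state: ${handle.getState}")
  }
}
```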

This runs the "driver" and all "executors" on the cluster, and that part seems to work, but there is a callback from some Spark process to the launching code in our app. For some reason it tries to connect to harness-64d97d6d6-4r4d8, which is the pod ID, not the k8s cluster IP or DNS name.

How could this pod ID be getting into the system? Spark somehow seems to think it is the address of the service that called it. Needless to say, any connection to the k8s pod ID fails, and so does the job.
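For what it's worth, here is a minimal illustration of where a name like that can come from: inside a pod, the JVM's local hostname is the pod name, and as far as I can tell Spark falls back to the local hostname for the address it advertises unless told otherwise (via SPARK_LOCAL_HOSTNAME, SPARK_LOCAL_IP, or spark.driver.host). This is purely illustrative, not a confirmed diagnosis:

```scala
import java.net.InetAddress

object WhoAmI {
  def main(args: Array[String]): Unit = {
    // Inside a k8s pod this prints the pod name (e.g. harness-64d97d6d6-4r4d8),
    // which other pods cannot resolve unless a matching (headless) Service
    // exists. Spark's default advertised address is derived from the local
    // hostname unless overridden with SPARK_LOCAL_HOSTNAME / spark.driver.host.
    println(InetAddress.getLocalHost.getHostName)
  }
}
```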

Any idea how Spark could think the pod ID is an IP address or DNS name?

BTW, if we run a small sample job with master=local all is well, but the same job executed with the above config tries to connect to the spurious pod ID.

BTW2: the k8s DNS name for the calling pod is harness-api

-- pferrel
amazon-eks
apache-spark
kubernetes

1 Answer

5/24/2019

You can consider using a headless Service for the harness-64... Pod in order to accomplish reverse DNS discovery. It will create an endpoint for the relevant Service by matching the appropriate selector on your application Pod, and as a result an A record is expected to be added to the Kubernetes DNS configuration.
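For illustration, here is a minimal sketch of what such a headless Service could look like, built with the fabric8 Kubernetes model classes and printed as YAML (you would normally just write the YAML manifest directly and `kubectl apply -f` it). The Service name, namespace, selector labels, and port are assumptions; adjust them to match the labels on your harness Pod:

```scala
import io.fabric8.kubernetes.api.model.ServiceBuilder
import io.fabric8.kubernetes.client.utils.Serialization

object HeadlessServiceSketch {
  def main(args: Array[String]): Unit = {
    // A Service with clusterIP: None is "headless": instead of a virtual IP,
    // DNS records are created for the Pods matched by the selector.
    val headless = new ServiceBuilder()
      .withNewMetadata()
        .withName("harness")               // assumed Service name
        .withNamespace("default")          // assumed namespace
      .endMetadata()
      .withNewSpec()
        .withClusterIP("None")             // makes the Service headless
        .addToSelector("app", "harness")   // must match the harness Pod's labels
        .addNewPort()
          .withName("callback")            // assumed port used by the Spark callback
          .withPort(9090)
        .endPort()
      .endSpec()
      .build()

    // Print the manifest; save it and apply it with kubectl.
    println(Serialization.asYaml(headless))
  }
}
```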

I've also found the related GitHub issue #266, which may provide some useful information for further investigation.

-- mk_sta
Source: StackOverflow