How to submit a Spark application from Kubernetes

6/23/2017

Have a look at the image from https://spark.apache.org/docs/latest/cluster-overview.html.

(image: Spark cluster overview diagram)

The Spark cluster is running outside Kubernetes, but I am going to run the driver program inside Kubernetes. The issue is how to let the Spark cluster know where the driver program is.

My kubernetes yaml file:

kind: List
apiVersion: v1
items:
- kind: Deployment
  apiVersion: extensions/v1beta1
  metadata:
    name: counter-uat
  spec:
    replicas: 1
    selector:
      matchLabels:
        name: spark-driver
    template:
      metadata:
        labels:
          name: spark-driver
      spec:
        containers:
          - name: counter-uat
            image: counter:0.1.0
            command: ["/opt/spark/bin/spark-submit", "--class", "Counter", "--master", "spark://spark.uat:7077", "/usr/src/counter.jar"]
- kind: Service
  apiVersion: v1
  metadata:
    name: spark-driver
    labels:
      name: spark-driver
  spec:
    type: NodePort
    ports:
    - name: port
      port: 4040
      targetPort: 4040
    selector:
      name: spark-driver

The error is:

Caused by: java.io.IOException: Failed to connect to /172.17.0.8:44117
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:197)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:191)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: Host is unreachable: /172.17.0.8:44117

The Spark cluster is trying to reach the driver program, whose IP is 172.17.0.8. 172.17.0.8 may be an internal pod IP inside Kubernetes.

How can I fix the problem? How should I change my yaml file? Thanks.
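
For reference, the Service above only exposes port 4040 (the Spark UI), while the error shows the cluster connecting back to the driver on an ephemeral RPC port (44117). As far as I understand, spark.driver.port (and spark.blockManager.port) can pin those ports so that Service entries can expose them too. An illustrative fragment only; 42761 and 42762 are placeholder port numbers:

# extra flags appended to the spark-submit command in the Deployment
"--conf", "spark.driver.port=42761",
"--conf", "spark.blockManager.port=42762",

# matching entries added under the Service ports, next to the existing 4040 entry
- name: driver-rpc
  port: 42761
  targetPort: 42761
- name: block-manager
  port: 42762
  targetPort: 42762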

UPDATE

I added the following two parameters: "--conf", "spark.driver.bindAddress=192.168.42.8", "--conf", "spark.driver.host=0.0.0.0".

But from the log, the cluster is still trying to reach 172.17.0.8, which is the Kubernetes internal pod IP.
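
For reference, my understanding of these two settings is that spark.driver.bindAddress is the local address the driver listens on inside the pod, while spark.driver.host is the address it advertises to the master and executors, so the pair above may be the wrong way around. A sketch of the intended combination, assuming 192.168.42.8 is an address the Spark cluster can actually route to (e.g. the minikube node IP):

# listen on every interface inside the pod
"--conf", "spark.driver.bindAddress=0.0.0.0",
# advertise the externally reachable address instead of the pod IP
"--conf", "spark.driver.host=192.168.42.8",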

UPDATE

kind: List
apiVersion: v1
items:
- kind: Deployment
  apiVersion: extensions/v1beta1
  metadata:
    name: counter-uat
  spec:
    replicas: 1
    selector:
      matchLabels:
        name: counter-driver
    template:
      metadata:
        labels:
          name: counter-driver
      spec:
        containers:
          - name: counter-uat
            image: counter:0.1.0
            command: ["/opt/spark/bin/spark-submit", "--class", "Counter", "--master", "spark://spark.uat:7077", "--conf", "spark.driver.bindAddress=192.168.42.8","/usr/src/counter.jar"]

- kind: Service
  apiVersion: v1
  metadata:
    name: counter-driver
    labels:
      name: counter-driver
  spec:
    type: NodePort
    ports:
    - name: driverport
      port: 42761
      targetPort: 42761
      nodePort: 30002
    selector:
      name: counter-driver

Another error:

2017-06-23T20:00:07.487656154Z Exception in thread "main" java.net.BindException: Cannot assign requested address: Service 'sparkDriver' failed after 16 retries (starting from 31319)! Consider explicitly setting the appropriate port for the service 'sparkDriver' (for example spark.ui.port for SparkUI) to an available port or increasing spark.port.maxRetries.
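
My reading of this error (an assumption on my part) is that with spark.driver.bindAddress=192.168.42.8 the driver tries to bind to an address that does not exist inside the pod's network namespace, so every bind attempt fails. There is also a potential port mismatch: the master connects back on whatever port the driver advertises, while the Service only opens nodePort 30002 on the node and translates it to 42761. A sketch that avoids both issues by binding to 0.0.0.0, advertising the node address, and keeping spark.driver.port, port, targetPort and nodePort identical (30002 is taken from the Service above and is inside the default NodePort range; 192.168.42.8 is assumed to be the node IP):

kind: List
apiVersion: v1
items:
- kind: Deployment
  apiVersion: extensions/v1beta1
  metadata:
    name: counter-uat
  spec:
    replicas: 1
    selector:
      matchLabels:
        name: counter-driver
    template:
      metadata:
        labels:
          name: counter-driver
      spec:
        containers:
          - name: counter-uat
            image: counter:0.1.0
            command:
              - "/opt/spark/bin/spark-submit"
              - "--class"
              - "Counter"
              - "--master"
              - "spark://spark.uat:7077"
              # bind locally on all interfaces inside the pod
              - "--conf"
              - "spark.driver.bindAddress=0.0.0.0"
              # advertise an address routable from the Spark cluster (assumption: the node IP)
              - "--conf"
              - "spark.driver.host=192.168.42.8"
              # fixed driver port, chosen inside the default NodePort range
              - "--conf"
              - "spark.driver.port=30002"
              # (spark.blockManager.port would likely need the same pin-and-expose treatment)
              - "/usr/src/counter.jar"
- kind: Service
  apiVersion: v1
  metadata:
    name: counter-driver
    labels:
      name: counter-driver
  spec:
    type: NodePort
    ports:
    - name: driverport
      # keep port, targetPort and nodePort identical so the advertised
      # port (spark.driver.port) is the one actually reachable on the node
      port: 30002
      targetPort: 30002
      nodePort: 30002
    selector:
      name: counter-driver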
-- BAE
apache-spark
kubectl
kubernetes
minikube

1 Answer

6/23/2017

Try setting spark.driver.host or spark.driver.bindAddress to "spark.uat" or "spark-driver.uat" or the actual driver host in Spark itself. This is a common issue with this type of distributed project, where the master tells the client where to connect. If you don't specify spark.driver.host, it tries to figure out the proper host by itself and just uses the IP it sees. But in this case the IP it sees is an internal Kubernetes IP and might not work properly for the client.

You can also try setting the SPARK_PUBLIC_DNS environment variable. It actually has a more promising description:

Hostname your Spark program will advertise to other machines.
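
For example, roughly like this (a sketch only; "spark-driver.uat" is a placeholder you would replace with a name the Spark master and workers can actually resolve and route to):

        containers:
          - name: counter-uat
            image: counter:0.1.0
            env:
              # hostname the driver will advertise to other machines;
              # must be resolvable and routable from the Spark cluster
              - name: SPARK_PUBLIC_DNS
                value: "spark-driver.uat"
            command:
              - "/opt/spark/bin/spark-submit"
              - "--class"
              - "Counter"
              - "--master"
              - "spark://spark.uat:7077"
              # per-application equivalent of the advertised address
              - "--conf"
              - "spark.driver.host=spark-driver.uat"
              - "/usr/src/counter.jar"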

-- kichik
Source: StackOverflow