Specify host for application jar file

12/26/2018

I have started an HTTP server from a directory using this command:

python -c 'import BaseHTTPServer as bhs, SimpleHTTPServer as shs; bhs.HTTPServer(("0.0.0.0", 8089), shs.SimpleHTTPRequestHandler).serve_forever()'

I use the Spark-K8s operator to run Spark applications. As the main application file, I want to use a jar file stored in that directory (the one the HTTP server serves). However, I do not know which host to specify so that the Spark application running through the deployment can reach the directory. When I run it with the current host, I get this connection error:

java.net.ConnectException: Connection refused

Basically, I have an HTTP server bound to a specified host and port, and I want to run this jar file using Spark on top of K8s. How can I define this host?

For now, the application definition looks like this:

apiVersion: sparkoperator.k8s.io/v1alpha1
kind: SparkApplication
metadata:
  name: spark-example
  namespace: default
spec:
  type: Scala
  image: gcr.io/spark-operator/spark:v2.4.0
  mainClass: org.apache.spark.examples.SparkExample
  mainApplicationFile: https://0.0.0.0:8089/spark_k8s_airflow.jar
  mode: cluster
  deps: {}
  driver:
    coreLimit: 1000m
    cores: 0.1
    labels:
      version: 2.4.0
    memory: 1024m
    serviceAccount: intended-mink-spark
  executor:
    cores: 1
    instances: 1
    labels:
      version: 2.4.0
    memory: 1024m
  imagePullPolicy: Never
-- Cassie
apache-spark
kubernetes
python

1 Answer

12/26/2018

Basically, I have an HTTP server bound to a specified host and port, and I want to run this jar file using Spark on top of K8s. How can I define this host?

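The Kubernetes way of doing that is via a Service, which by default creates a DNS entry of the form service-name.service-namespace.svc.cluster.local, where service-name and service-namespace are placeholders for your own values and the other three labels are literal. However, if you just want to play around and creating a Service is too much work, you can use the current IP of the Pod in which your SimpleHTTPServer is running. Either way, 0.0.0.0 is not a useful host to put in the URL: it is a bind address meaning "all local interfaces", so from inside the Spark driver Pod it does not point at the machine running your server, which is consistent with the Connection refused error.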
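As a rough illustration of the naming scheme described above, here is a small Python sketch that assembles the in-cluster URL for the jar. The Service name "jar-server" and both helper function names are hypothetical; the namespace, port, and jar name are taken from the question. It assumes the cluster uses the default cluster.local domain.

```python
def service_dns_name(service_name, namespace, cluster_domain="cluster.local"):
    """Return the in-cluster DNS name of a Service (default cluster domain assumed)."""
    return "{}.{}.svc.{}".format(service_name, namespace, cluster_domain)


def jar_url(host, port, jar_name):
    """Build a plain-HTTP URL for a jar served from `host` on `port`."""
    return "http://{}:{}/{}".format(host, port, jar_name)


# "jar-server" is a hypothetical Service name; "default" is the namespace
# used in the SparkApplication from the question.
host = service_dns_name("jar-server", "default")
print(jar_url(host, 8089, "spark_k8s_airflow.jar"))
# prints http://jar-server.default.svc.cluster.local:8089/spark_k8s_airflow.jar
```

The printed URL is the kind of value that would go into mainApplicationFile. If you take the Pod IP route instead, pass that IP as `host` in place of the Service DNS name.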

mainApplicationFile: https://0.0.0.0:8089/spark_k8s_airflow.jar

Be aware that, at least as you have written the Python example above, you cannot use https:, since SimpleHTTPServer serves only plain HTTP; the URL has to start with http:. It's possible to convince one of the built-in packages to serve HTTPS, but it will take a lot more typing and is arguably not worth the effort.
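For reference, the one-liner in the question uses Python 2 module names. A minimal sketch of the same plain-HTTP server using the Python 3 standard library might look like the following, assuming Python 3.7 or later. The helper name `start_http_server` is hypothetical; the port comes from the question. Like the original, it serves HTTP only.

```python
import functools
import http.server
import threading


def start_http_server(directory, host="0.0.0.0", port=8089):
    """Serve `directory` over plain HTTP in a background thread.

    Returns the server object so the caller can shut it down later.
    """
    # Bind the handler to the directory instead of relying on the current
    # working directory (the `directory` argument needs Python 3.7+).
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer((host, port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


# Example usage (the thread is a daemon, so keep the process alive yourself):
#   server = start_http_server("/path/to/jars")
#   ...
#   server.shutdown()
#   server.server_close()
```

Binding to 0.0.0.0 here is fine, because that is the listening side; clients still have to use a reachable name or IP with an http: URL.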

-- mdaniel
Source: StackOverflow