How to inject environment variables into the driver pod when using spark-on-k8s?

7/16/2020

I am writing a Kubernetes Spark application using the GCP Spark-on-k8s operator.

Currently, I am unable to inject environment variables into my container.

I am following the doc here

Manifest:

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-search-indexer
  namespace: spark-operator
spec:
  type: Scala
  mode: cluster
  image: "gcr.io/spark-operator/spark:v2.4.5"
  imagePullPolicy: Always
  mainClass: com.quid.indexer.news.jobs.ESIndexingJob
  mainApplicationFile: "https://lala.com/baba-0.0.43.jar"
  arguments:
    - "--esSink"
    - "http://something:9200/mo-sn-{yyyy-MM}-v0.0.43/searchable-article"
    - "-streaming"
    - "--kafkaTopics"
    - "annotated_blogs,annotated_ln_news,annotated_news"
    - "--kafkaBrokers"
    - "10.1.1.1:9092"
  sparkVersion: "2.4.5"
  restartPolicy:
    type: Never
  volumes:
    - name: "test-volume"
      hostPath:
        path: "/tmp"
        type: Directory
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    env:
      - name: "DEMOGRAPHICS_ES_URI"
        value: "somevalue"
    labels:
      version: 2.4.5
    volumeMounts:
      - name: "test-volume"
        mountPath: "/tmp"
  executor:
    cores: 1
    instances: 1
    memory: "512m"
    env:
      - name: "DEMOGRAPHICS_ES_URI"
        value: "somevalue"
    labels:
      version: 2.4.5
    volumeMounts:
      - name: "test-volume"
        mountPath: "/tmp"

Environment variables actually set on the pod (DEMOGRAPHICS_ES_URI is missing):

Environment:
      SPARK_DRIVER_BIND_ADDRESS:   (v1:status.podIP)
      SPARK_LOCAL_DIRS:           /var/data/spark-1ed8539d-b157-4fab-9aa6-daff5789bfb5
      SPARK_CONF_DIR:             /opt/spark/conf
-- codersofthedark
apache-spark
kubernetes
spark-operator

1 Answer

7/16/2020

It turns out that to use the env field, one must enable the operator's webhook (the quick-start guide explains how to set it up).

Another approach is to use envVars, which takes a map of variable names to values.

Example:

spec:
  executor:
    envVars:
      DEMOGRAPHICS_ES_URI: "somevalue"
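
Since the question is about the driver pod, here is a minimal sketch of how the same envVars map might look when applied to both sections of the manifest from the question. It assumes the driver section accepts envVars in the same form as the executor section shown above; the remaining fields are copied from the question, and everything else in the manifest is omitted. Check the API reference for your operator version before relying on it.

```yaml
# Sketch only: envVars replaces the env lists from the question's manifest.
# Assumption: driver takes envVars in the same map form as executor.
spec:
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    envVars:
      DEMOGRAPHICS_ES_URI: "somevalue"
  executor:
    cores: 1
    instances: 1
    memory: "512m"
    envVars:
      DEMOGRAPHICS_ES_URI: "somevalue"
```

After resubmitting the application, describing the driver pod again should show whether DEMOGRAPHICS_ES_URI now appears in its Environment list.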

Ref: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/978

-- codersofthedark
Source: StackOverflow