Helm /bitnami/spark - how to load files to extraVolumeMounts?

2/18/2020

I'm using the Helm chart to deploy Spark to Kubernetes in GCE. I've configured extraVolumes and extraVolumeMounts in values.yaml and they were created successfully during deployment.

What is the right way to add files to these volumes during the chart's deployment?

  ## Array to add extra volumes
  extraVolumes:
    - name: spark-volume

  ## Array to add extra mounts (normally used with extraVolumes)
  extraVolumeMounts:
    - name: spark-volume
      mountPath: /tmp/new-data
-- samba
apache-spark
configuration
google-cloud-platform
kubernetes-helm

1 Answer

2/18/2020

It depends on where the files are. If they are in a Git repository, I would use an initContainer to clone it.

This might look like this:

    initContainers:
      - name: git-clone-spark-volumes
        image: alpine/git
        args:
          - clone
          - --single-branch
          - --branch=master
          - --depth=1
          - --
          - https://github.com/your/repo.git
          - /tmp/new-data
        securityContext:
          runAsUser: 0
        volumeMounts:
          - name: spark-volume
            mountPath: /tmp/new-data
    extraVolumes:
      - name: spark-volume
        emptyDir: {}
    extraVolumeMounts:
      - name: spark-volume
        mountPath: /tmp/new-data

This clones the repo (https://github.com/your/repo.git) into the /tmp/new-data folder, where the spark-volume is mounted.
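The init container above just runs a shallow `git clone` with those args. A quick local sketch of the same flags, using a throwaway repo instead of a real URL (all paths here are hypothetical, for illustration only):

```shell
# Build a tiny local repo to clone from (stand-in for your real repo URL).
rm -rf /tmp/demo-repo /tmp/new-data
git init -q /tmp/demo-repo
echo "hello" > /tmp/demo-repo/data.txt
git -C /tmp/demo-repo add data.txt
git -C /tmp/demo-repo -c user.email=demo@example.com -c user.name=demo \
    commit -q -m "add data"

# Same shallow, single-branch clone the alpine/git init container performs:
git clone -q --single-branch --depth=1 -- file:///tmp/demo-repo /tmp/new-data

ls /tmp/new-data
```

After the init container finishes, the Spark containers see the repo's files under /tmp/new-data because they mount the same spark-volume.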

If your files are key=value based, you could use a ConfigMap:

    $ kubectl create configmap spark-volume \
        --from-file=configure-pod-container/configmap/game.properties \
        --from-file=configure-pod-container/configmap/ui.properties
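For reference, each file passed via `--from-file` becomes one key in the ConfigMap, named after the file, and is later mounted as a file of the same name. Hypothetical contents for the two properties files referenced above might look like:

```shell
# Illustrative key=value properties files; the names match the
# --from-file paths used in the kubectl command, the values are made up.
mkdir -p configure-pod-container/configmap
printf 'enemies=aliens\nlives=3\n' \
    > configure-pod-container/configmap/game.properties
printf 'color.good=purple\n' \
    > configure-pod-container/configmap/ui.properties

cat configure-pod-container/configmap/game.properties
```

Once mounted, these would appear in the pod as /tmp/new-data/game.properties and /tmp/new-data/ui.properties.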

The ConfigMap can then be mounted as a volume:

    extraVolumes:
      - name: spark-volume
        configMap:
          name: spark-volume
    extraVolumeMounts:
      - name: spark-volume
        mountPath: /tmp/new-data

This is nicely described in Configure a Pod to Use a ConfigMap.

-- Crou
Source: StackOverflow