My goal is to set up a ConfigMap and then use the config file in the Spark application. Here are the details:
I have a config file (test_config.cfg) that looks like this:
[test_tracker]
url = http://localhost:8080/testsomething/
username = TEST
password = SECRET
I created the ConfigMap by running the following command:
kubectl create configmap testcfg1 --from-file test_config.cfg
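As a sanity check, the ConfigMap contents can be verified before the SparkApplication references it; test_config.cfg should show up as a key under data:
kubectl get configmap testcfg1 -o yaml
kubectl describe configmap testcfg1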
Now, I have a YAML file (testprog.yaml) with the SparkApplication spec, which looks like this:
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
name: testprog
namespace: default
spec:
type: Python
pythonVersion: "3"
mode: cluster
image: "<ip-url>:5000/schemamatcher/schemamatcher-spark-py:latest"
imagePullPolicy: Always
mainApplicationFile: local:///opt/spark/dependencies/testprog.py
arguments: ['s3a://f1.parquet', 's3a://f2.parquet', '--tokenizer-type', 'param']
sparkVersion: "3.0.0"
restartPolicy:
type: OnFailure
onFailureRetries: 3
onFailureRetryInterval: 10
onSubmissionFailureRetries: 5
onSubmissionFailureRetryInterval: 20
driver:
cores: 1
coreLimit: "1200m"
memory: "16g"
labels:
version: 3.0.0
serviceAccount: default
configMaps:
- name: testcfg1
path: /mnt/config-maps
executor:
cores: 1
instances: 2
memory: "20g"
labels:
version: 3.0.0
hadoopConf:
"fs.s3a.access.key": minio
"fs.s3a.secret.key": minio123
"fs.s3a.endpoint": http://<ip-url>:9000
Now, I am able to run the program using:
kubectl apply -f testprog.yaml
The pod runs fine and doesn't throw any errors, but I am unable to see my config file at the given path and I don't understand why. While the pod is running, I do:
kubectl exec --stdin --tty test-driver -- /bin/bash
and look for the config file under /mnt/config-maps, but I don't see anything there. I have tried a couple of things with no luck. Also, some of the documentation says a mutating admission webhook should be set up; I think the previous engineer did that, but I am not sure how to check (I believe it is there).
Any help would be great, as I am new to k8s and still learning.
Update: I have also tried updating the spec like this and running it again, still with no luck.
volumes:
  - name: config
    configMap:
      name: testcfg1
driver:
  cores: 1
  coreLimit: "1200m"
  memory: "16g"
  labels:
    version: 3.0.0
  serviceAccount: default
  volumeMounts:
    - name: config
      mountPath: /opt/spark
executor:
  cores: 1
  instances: 2
  memory: "20g"
  labels:
    version: 3.0.0
  volumeMounts:
    - name: config
      mountPath: /opt/spark
Try this:
kubectl apply -f manifest/spark-operator-with-webhook.yaml
This enables the mutating admission webhook. It creates a deployment named sparkoperator and a service named spark-webhook for the webhook in the spark-operator namespace.
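Once that is applied, you can verify the webhook is actually in place (assuming the operator was installed into the spark-operator namespace, as in the default manifests):
kubectl get pods -n spark-operator
kubectl get svc spark-webhook -n spark-operator
kubectl get mutatingwebhookconfigurations
If no MutatingWebhookConfiguration is listed, the configMaps field in the driver/executor spec is silently ignored, which would match the empty /mnt/config-maps you are seeing. You can also check whether the operator injected the volume into the driver pod at all:
kubectl get pod test-driver -o jsonpath='{.spec.volumes}'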
Not sure if this issue was solved in Spark v3.0.0 (which you seem to be using), but there was a bug in Spark on Kubernetes that prevented ConfigMaps from mounting properly. See this discussion: https://stackoverflow.com/a/58508313/8570169
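One more note on the volumes/volumeMounts attempt from your update: mounting the ConfigMap at /opt/spark shadows whatever the image already ships at that path (the Spark distribution usually lives there), so even when the mount works the driver can misbehave. A safer sketch, reusing the /mnt/config-maps path from your original spec (the rest of the driver/executor fields stay as they are):
volumes:
  - name: config
    configMap:
      name: testcfg1
driver:
  volumeMounts:
    - name: config
      mountPath: /mnt/config-maps
executor:
  volumeMounts:
    - name: config
      mountPath: /mnt/config-maps
Once the file is actually mounted, the application can read it with configparser, roughly like this (a minimal sketch; the path assumes the mount above):
import configparser

# Read the mounted config file; the path matches the volumeMount above
config = configparser.ConfigParser()
config.read('/mnt/config-maps/test_config.cfg')

url = config['test_tracker']['url']
username = config['test_tracker']['username']
password = config['test_tracker']['password']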