I would like to provide DAGs to all Kubernetes airflow pods (web, scheduler, workers) via a persistent volume. I create the claim with:
kubectl create -f pv-claim.yaml
where pv-claim.yaml contains:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: airflow-pv-claim
  annotations:
    pv.beta.kubernetes.io/gid: "1000"
    pv.beta.kubernetes.io/uid: "1000"
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
The deployment command is then:
helm install --namespace my_name --name "airflow" stable/airflow --values ~my_name/airflow/charts/airflow/values.yaml
In the chart stable/airflow, values.yaml also allows for specification of persistence:
persistence:
  enabled: true
  existingClaim: airflow-pv-claim
  accessMode: ReadWriteMany
  size: 1Gi
But if I do
kubectl exec -it airflow-worker-0 -- /bin/bash
touch dags/hello.txt
I get a permission denied error.
I have tried hacking the airflow chart to set up an initContainer that chowns dags/, as suggested at https://serverfault.com/a/907160/464205:
command: ["sh", "-c", "chown -R 1000:1000 /dags"]
This works for all pods except the workers (because they are created by flower?).
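For context, the initContainer I mean looks roughly like the sketch below, placed under spec.template.spec of a pod template. The container name, image, and volume name here are illustrative assumptions, not values taken from the chart; the volume name has to match whatever the template already uses for the DAGs volume.

```yaml
# Sketch of an initContainer that fixes ownership of the DAGs volume.
# Name, image, and volume name are hypothetical placeholders.
initContainers:
  - name: fix-dags-permissions
    image: busybox
    command: ["sh", "-c", "chown -R 1000:1000 /dags"]
    securityContext:
      runAsUser: 0          # chown must run as root inside the init container
    volumeMounts:
      - name: dags-data     # assumed name of the DAGs volume in the template
        mountPath: /dags
```

Each pod template that mounts the volume would need its own copy of this block, which is why the approach touches several chart files.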
I have also seen talk of fsGroup etc. - see e.g. Kubernetes NFS persistent volumes permission denied
I am trying to avoid editing the airflow charts (which seems to require hacks to at least two deployments-*.yaml files, plus one other), but perhaps this is unavoidable.
Punchline:
What is the easiest way to provision DAGs through a persistent volume to all airflow pods running on Kubernetes, with the correct permissions?
See also:
Persistent volume atached to k8s pod group
Kubernetes NFS persistent volumes permission denied [not clear to me how to integrate this with the airflow helm charts]
Kubernetes - setting custom permissions/file ownership per volume (and not per pod) [non-detailed, non-airflow-specific]
It turns out you do, I think, have to edit the airflow charts, by adding the following block to deployments-web.yaml and deployments-scheduler.yaml under spec.template.spec:
kind: Deployment
spec:
  template:
    spec:
      securityContext:
        runAsUser: 1000    # run container processes as UID 1000
        runAsGroup: 1000   # with primary GID 1000
        fsGroup: 1000      # mounted volumes are made accessible to GID 1000
This allows one to get dags into airflow using e.g.
kubectl cp my_dag.py my_namespace/airflow-worker-0:/usr/local/airflow/dags/
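To check the securityContext settings independently of the chart, a throwaway pod along the following lines could mount the same claim and attempt a write. This is only a sketch: the pod name, container name, image, and mount path are assumptions chosen for illustration, and only the claim name airflow-pv-claim comes from the question. Note also that fsGroup is not applied by every volume type, so the result depends on the storage backend.

```yaml
# Throwaway pod that mounts the existing claim and tries to write to it.
# Pod name, container name, image, and mount path are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: dags-permission-check
spec:
  restartPolicy: Never
  securityContext:
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
  containers:
    - name: check
      image: busybox
      # Print the effective IDs, attempt a write, then list numeric ownership.
      command: ["sh", "-c", "id && touch /dags/hello.txt && ls -ln /dags"]
      volumeMounts:
        - name: dags
          mountPath: /dags
  volumes:
    - name: dags
      persistentVolumeClaim:
        claimName: airflow-pv-claim
```

If the pod completes successfully and its logs show hello.txt in the listing, the same settings should allow the airflow pods to write to the volume; a permission denied error in the logs would suggest the volume type ignores fsGroup.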