I am running Airflow with the Kubernetes executor. Everything is configured under the [kubernetes] section of airflow.cfg and works fine. However, I would prefer to use a pod template file for the workers, so I generated a pod.yaml from one of the worker containers that spin up. I placed this file in a location accessible to the scheduler pod, something like /opt/airflow/yamls/workerpod.yaml. But when I specify this file in the pod_template_file parameter, I get these errors:
[2020-03-02 22:12:24,115] {pod_launcher.py:84} ERROR - Exception when attempting to create Namespaced Pod.
Traceback (most recent call last):
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/airflow/contrib/kubernetes/pod_launcher.py", line 81, in run_pod_async
    resp = self._client.create_namespaced_pod(body=req, namespace=pod.namespace, **kwargs)
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py", line 6115, in create_namespaced_pod
    (data) = self.create_namespaced_pod_with_http_info(namespace, body, **kwargs)
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py", line 6206, in create_namespaced_pod_with_http_info
    collection_formats=collection_formats)
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 334, in call_api
    _return_http_data_only, collection_formats, _preload_content, _request_timeout)
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 168, in __call_api
    _request_timeout=_request_timeout)
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 377, in request
    body=body)
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/kubernetes/client/rest.py", line 266, in POST
    body=body)
  File "/opt/rh/rh-python36/root/usr/lib/python3.6/site-packages/kubernetes/client/rest.py", line 222, in request
    raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'ab2bc6dc-96f9-4014-8a08-7dae6e008aad', 'Cache-Control': 'no-store', 'Content-Type': 'application/json', 'Date': 'Mon, 02 Mar 2020 22:12:24 GMT', 'Content-Length': '660'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"examplebashoperatorrunme0-c9ca5d619bc54bf2a456e133ad79dd00\" is forbidden: unable to validate against any security context constraint: [fsGroup: Invalid value: []int64{0}: 0 is not an allowed group spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000040000, 1000049999] spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 0: running with the root UID is forbidden]","reason":"Forbidden","details":{"name":"examplebashoperatorrunme0-c9ca5d619bc54bf2a456e133ad79dd00","kind":"pods"},"code":403}
[2020-03-02 22:12:24,141] {kubernetes_executor.py:863} WARNING - ApiException when attempting to run task, re-queueing. Message: pods "examplebashoperatorrunme0-c9ca5d619bc54bf2a456e133ad79dd00" is forbidden: unable to validate against any security context constraint: [fsGroup: Invalid value: []int64{0}: 0 is not an allowed group spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000040000, 1000049999] spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 0: running with the root UID is forbidden]
Just to clarify, the pod.yaml file was generated from a running worker container, which itself was created from the [kubernetes] section of airflow.cfg, and that configuration works just fine. The run-as user is correct and the service account is correct, but I still get this error.
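To see exactly which values the API server rejected, the message from the log can be picked apart with a small script. This is only a rough sketch using the standard library: `rejected_values` and the regular expression are names I made up for illustration, not part of Airflow or the Kubernetes client, and the pattern is fitted to this one message only.

```python
import re

# Message copied from the WARNING line in the scheduler log above.
MESSAGE = (
    'pods "examplebashoperatorrunme0-c9ca5d619bc54bf2a456e133ad79dd00" is forbidden: '
    "unable to validate against any security context constraint: "
    "[fsGroup: Invalid value: []int64{0}: 0 is not an allowed group "
    "spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 0: "
    "must be in the ranges: [1000040000, 1000049999] "
    "spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 0: "
    "running with the root UID is forbidden]"
)

# Hypothetical pattern: a field path, then ": Invalid value: <value>:".
VIOLATION = re.compile(r"([A-Za-z][\w.\[\]]*): Invalid value: (\S+):")


def rejected_values(message):
    """Return (field, rejected value) pairs found in an SCC validation message."""
    return VIOLATION.findall(message)


for field, value in rejected_values(MESSAGE):
    print(field, "->", value)
```

Every rejected value here is 0 (for both fsGroup and runAsUser), whereas my template sets runAsUser: 1001 and has fsGroup commented out. That would be consistent with the submitted pod not being built from the template at all, although the message alone does not prove it.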
I am unsure whether this file should be placed relative to where I run kubectl apply. Since it is referenced in airflow.cfg, I assumed not, and that it only needs to be accessible from within the scheduler container.
One strange thing I noticed: even though I have specified, and seem to be using, KubernetesExecutor, the individual worker pods report LocalExecutor when they come up. I changed that value to KubernetesExecutor in the workerpod.yaml file.
Here is the pod YAML file:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    openshift.io/scc: nonroot
  labels:
    app: airflow-worker
    kubernetes_executor: "True"
  name: airflow-worker
  # namespace: airflow
spec:
  affinity: {}
  containers:
  - env:
    - name: AIRFLOW_HOME
      value: /opt/airflow
    - name: AIRFLOW__CORE__EXECUTOR
      value: KubernetesExecutor
      #value: LocalExecutor
    - name: AIRFLOW__CORE__DAGS_FOLDER
      value: /opt/airflow/dags
    - name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
      valueFrom:
        secretKeyRef:
          key: MYSQL_CONN_STRING
          name: db-secret
    image: ourrepo.example.com/airflow-lab:latest
    imagePullPolicy: IfNotPresent
    name: base
    # resources:
    #   limits:
    #     cpu: "1"
    #     memory: 1Gi
    #   requests:
    #     cpu: 400m
    #     memory: 1Gi
    securityContext:
      capabilities:
        drop:
        - KILL
        - MKNOD
        - SETGID
        - SETUID
    volumeMounts:
    - mountPath: /opt/airflow/dags
      name: airflow-dags
      readOnly: true
      subPath: airflow/dags
    - mountPath: /opt/airflow/logs
      name: airflow-logs
    - mountPath: /opt/airflow/airflow.cfg
      name: airflow-config
      readOnly: true
      subPath: airflow.cfg
    # - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    #   name: airflow-cluster-access-token-5228g
    #   readOnly: true
  dnsPolicy: ClusterFirst
  # imagePullSecrets:
  # - name: airflow-cluster-access-dockercfg-85twh
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext:
    # fsGroup: 0
    runAsUser: 1001
    seLinuxOptions:
      level: s0:c38,c12
  serviceAccount: airflow-cluster-access
  serviceAccountName: airflow-cluster-access
  # tolerations:
  # - effect: NoSchedule
  #   key: node.kubernetes.io/memory-pressure
  #   operator: Exists
  volumes:
  - name: airflow-dags
    persistentVolumeClaim:
      claimName: ucdagent
  - emptyDir: {}
    name: airflow-logs
  - configMap:
      defaultMode: 420
      name: airflow-config
    name: airflow-config
  # - name: airflow-cluster-access-token-5228g
  #   secret:
  #     defaultMode: 420
  #     secretName: airflow-cluster-access-token-5228g
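As a sanity check on what this manifest actually asks for, here is a minimal sketch that works out the effective runAsUser per container. It assumes the usual Kubernetes rule that a container-level securityContext value overrides the pod-level one. The dict below is hand-copied from the security-related fields of the manifest above, and `effective_run_as_user` is a hypothetical helper name, not an Airflow or Kubernetes API.

```python
# Security-related fields of the manifest above, copied by hand into a dict.
pod = {
    "spec": {
        # fsGroup is commented out in the manifest, so it is absent here.
        "securityContext": {"runAsUser": 1001},
        "containers": [
            {
                "name": "base",
                "securityContext": {
                    "capabilities": {"drop": ["KILL", "MKNOD", "SETGID", "SETUID"]}
                },
            }
        ],
    }
}


def effective_run_as_user(pod):
    """Return (container name, runAsUser) pairs, container level overriding pod level."""
    pod_ctx = pod["spec"].get("securityContext") or {}
    result = []
    for container in pod["spec"].get("containers", []):
        ctx = container.get("securityContext") or {}
        uid = ctx.get("runAsUser", pod_ctx.get("runAsUser"))
        result.append((container["name"], uid))
    return result


print(effective_run_as_user(pod))                     # [('base', 1001)]
print(pod["spec"]["securityContext"].get("fsGroup"))  # None
```

So as far as I can tell, nothing in this file requests UID 0 or fsGroup 0, which are the values named in the 403 error.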
Here is the working [kubernetes] config from airflow.cfg:
[kubernetes]
#pod_template_file = /opt/airflow/yamls/workerpod.yaml
dags_in_image = False
worker_container_repository = ${AIRFLOW_IMAGE_NAME}
worker_container_tag = ${AIRFLOW_IMAGE_TAG}
worker_container_image_pull_policy = IfNotPresent
delete_worker_pods = False
in_cluster = true
namespace = ${AIRFLOW_NAMESPACE}
airflow_configmap = airflow-config
run_as_user = 1001
dags_volume_subpath = airflow/dags
dags_volume_claim = ucdagent
worker_service_account_name = airflow-cluster-access
[kubernetes_secrets]
AIRFLOW__CORE__SQL_ALCHEMY_CONN = db-secret=MYSQL_CONN_STRING
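To double-check which options are really active in this working config, here is a small sketch using Python's standard configparser. Note that this is not Airflow's own configuration loader, so it ignores any AIRFLOW__KUBERNETES__... environment variable overrides, and the embedded text is a shortened copy of the section above.

```python
import configparser

# Shortened copy of the [kubernetes] section shown above.
AIRFLOW_CFG = """\
[kubernetes]
#pod_template_file = /opt/airflow/yamls/workerpod.yaml
dags_in_image = False
in_cluster = true
run_as_user = 1001
worker_service_account_name = airflow-cluster-access
"""

# interpolation=None keeps values such as ${AIRFLOW_NAMESPACE} literal
# if the full file is parsed instead of this excerpt.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(AIRFLOW_CFG)
k8s = parser["kubernetes"]

# Lines starting with "#" are comments, so the template option is not set.
print("pod_template_file" in k8s)    # False
print(k8s.getint("run_as_user"))     # 1001
print(k8s.getboolean("in_cluster"))  # True
```

In the working setup the template option is commented out, so the worker pod is built only from the individual keys such as run_as_user and worker_service_account_name.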
UPDATE: My Airflow version is 1.10.7. I am guessing this is a newer parameter. I am trying to find out whether it is currently just an empty config reference or whether it has been implemented in the latest release, which right now is 1.10.9.
UPDATE: This parameter has not been implemented as of 1.10.9.