My Airflow service runs as a Kubernetes deployment and has two containers, one for the webserver and one for the scheduler. I'm running a task using a KubernetesPodOperator with the in_cluster=True parameter, and it runs fine; I can even kubectl logs pod-name and all of the logs show up.
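For context, the task definition looks roughly like this (a minimal sketch assuming Airflow 1.10; the DAG name, task_id, namespace, image, and command below are placeholders rather than the real values):

# Minimal sketch of a KubernetesPodOperator task with in_cluster=True.
# DAG id, task_id, namespace, image, and command are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

dag = DAG(
    dag_id="dag_name",
    start_date=datetime(2020, 5, 1),
    schedule_interval=None,
)

task = KubernetesPodOperator(
    task_id="task_name",
    name="task-pod",
    namespace="airflow",
    image="registry.personal.io:5000/image/path",
    cmds=["python", "-c", "print('hello')"],
    in_cluster=True,   # use the in-cluster service account instead of a kube config file
    get_logs=True,     # stream the pod's stdout back into the Airflow task log
    dag=dag,
)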
However, the Airflow webserver is unable to fetch the logs:
*** Log file does not exist: /tmp/logs/dag_name/task_name/2020-05-19T23:17:33.455051+00:00/1.log
*** Fetching from: http://pod-name-7dffbdf877-6mhrn:8793/log/dag_name/task_name/2020-05-19T23:17:33.455051+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='pod-name-7dffbdf877-6mhrn', port=8793): Max retries exceeded with url: /log/dag_name/task_name/2020-05-19T23:17:33.455051+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fef6e00df10>: Failed to establish a new connection: [Errno 111] Connection refused'))
It seems as if the pod is unable to connect to the Airflow logging service on port 8793. If I kubectl exec a bash shell into the container, I can curl localhost on port 8080, but not on ports 80 or 8793.
Kubernetes deployment:
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pod-name
  namespace: airflow
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pod-name
  template:
    metadata:
      labels:
        app: pod-name
    spec:
      restartPolicy: Always
      volumes:
        - name: airflow-cfg
          configMap:
            name: airflow.cfg
        - name: dags
          emptyDir: {}
      containers:
        - name: airflow-scheduler
          args:
            - airflow
            - scheduler
          image: registry.personal.io:5000/image/path
          imagePullPolicy: Always
          volumeMounts:
            - name: dags
              mountPath: /airflow_dags
            - name: airflow-cfg
              mountPath: /home/airflow/airflow.cfg
              subPath: airflow.cfg
          env:
            - name: EXECUTOR
              value: Local
            - name: LOAD_EX
              value: "n"
            - name: FORWARDED_ALLOW_IPS
              value: "*"
          ports:
            - containerPort: 8793
            - containerPort: 8080
        - name: airflow-webserver
          args:
            - airflow
            - webserver
            - --pid
            - /tmp/airflow-webserver.pid
          image: registry.personal.io:5000/image/path
          imagePullPolicy: Always
          volumeMounts:
            - name: dags
              mountPath: /airflow_dags
            - name: airflow-cfg
              mountPath: /home/airflow/airflow.cfg
              subPath: airflow.cfg
          ports:
            - containerPort: 8793
            - containerPort: 8080
          env:
            - name: EXECUTOR
              value: Local
            - name: LOAD_EX
              value: "n"
            - name: FORWARDED_ALLOW_IPS
              value: "*"
Note: if Airflow is run in the dev environment (locally instead of on Kubernetes), everything works perfectly.
Airflow deletes the pods after task completion; could it be that the pods are simply gone, so the webserver can't reach their logs anymore?
To check whether that's the case, try setting AIRFLOW__KUBERNETES__DELETE_WORKER_PODS=False.
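For example, as a sketch, the flag could be added to the scheduler container's env list in the deployment above (Kubernetes requires the value as a string):

# Sketch: keep worker pods around after task completion so their logs stay reachable.
env:
  - name: AIRFLOW__KUBERNETES__DELETE_WORKER_PODS
    value: "False"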
When running Airflow on Kubernetes, I suggest using remote logging (e.g. to S3); that way the logs are kept even after the pods are deleted.
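As a sketch, with Airflow 1.10-style configuration environment variables this could be set on both Airflow containers (the bucket path and connection id below are placeholders):

# Sketch: enable remote logging to S3 via env vars on both Airflow containers.
# Bucket path and connection id are placeholders.
env:
  - name: AIRFLOW__CORE__REMOTE_LOGGING
    value: "True"
  - name: AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER
    value: "s3://my-airflow-logs/dag-logs"
  - name: AIRFLOW__CORE__REMOTE_LOG_CONN_ID
    value: "aws_default"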