Pods on our k8s cluster are scheduled with Airflow's KubernetesExecutor, which runs every Task in a new pod.

I have such a Task whose pod crashes almost instantly (after 1 or 2 seconds), and I of course want to see its logs.

This seems hard. As soon as the pod crashes, it gets deleted, along with the ability to retrieve the crash logs. I already tried all of the following:
- kubectl logs -f <pod> -p: cannot be used, since these pods are named uniquely (courtesy of KubernetesExecutor).
- kubectl logs -l label_name=label_value: I struggle to apply labels to the pod (if this is a known/used way of working, I'm happy to try further; see the sketch after this list).
- nfs is mounted on all pods at a fixed log directory. The failing pod, however, does not log to this folder.
- kubectl logs -f -l dag_id=sample_dag --all-containers (the dag_id label is added by Airflow), run between the pod starting and crashing, only returns Error from server (BadRequest): container "base" in pod "my_pod" is waiting to start: ContainerCreating. This might give me some clue, but it never gets me to the actual crash logs.

I'm basically looking for the canonical way of retrieving logs from transient pods.
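For the label route specifically, this is roughly what I understand it would have to look like (untested on my side, and assuming Airflow 2.x, where executor_config accepts a pod_override built from the Kubernetes client models; the DAG/task names and the log_group label below are made up purely for illustration):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from kubernetes.client import models as k8s

    # Hypothetical DAG and task; the "log_group" label exists only so kubectl
    # has something stable to select on across the uniquely named worker pods.
    with DAG("sample_dag", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
        flaky = BashOperator(
            task_id="flaky_task",
            bash_command="exit 1",  # stands in for whatever makes the real pod crash
            executor_config={
                "pod_override": k8s.V1Pod(
                    metadata=k8s.V1ObjectMeta(labels={"log_group": "flaky-task"})
                )
            },
        )

With that in place, kubectl logs -l log_group=flaky-task --all-containers should select the pod regardless of its generated name, though it obviously only helps while the pod (or its terminated container) still exists.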
You need to enable remote logging. The sample below uses S3. In airflow.cfg, set the following:
remote_logging = True
remote_log_conn_id = my_s3_conn
remote_base_log_folder = s3://airflow/logs
The my_s3_conn connection can be created in the Airflow UI under Admin > Connections. In the Conn Type dropdown, select S3.
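If your Airflow configuration on the cluster is driven by environment variables rather than a mounted airflow.cfg, the same three settings can be passed that way. Note that the section these keys live in depends on the Airflow version: [logging] in Airflow 2.x, [core] in 1.10, so the variable prefix differs accordingly. For 2.x that would be:

    AIRFLOW__LOGGING__REMOTE_LOGGING=True
    AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID=my_s3_conn
    AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER=s3://airflow/logs

The connection itself can also be created from the CLI with airflow connections add instead of the UI, if that fits your deployment better.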