Pods on our k8s cluster are scheduled with Airflow's KubernetesExecutor, which runs every Task in its own new pod.
I have such a Task for which the pod crashes almost instantly (after 1 or 2 seconds), and for which I of course want to see the logs.
This seems hard. As soon as the pod crashes, it gets deleted, along with the ability to retrieve the crash logs. I have already tried all of the following:
- kubectl logs -f <pod> -p: cannot be used, since these pods are named uniquely (courtesy of KubernetesExecutor).
- kubectl logs -l label_name=label_value: I struggle to apply the labels to the pod (if this is a known/used way of working, I'm happy to try further; see the sketch after this list).
- nfs is mounted on all pods at a fixed log directory. The failing pod, however, does not log to this folder.
- kubectl logs -f -l dag_id=sample_dag --all-containers (the dag_id label is added by Airflow), run between pod creation and crash, gives Error from server (BadRequest): container "base" in pod "my_pod" is waiting to start: ContainerCreating. This might give me some clue but:

I'm basically looking for the canonical way of retrieving logs from transient pods.
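For reference, if attaching an extra label via executor_config is the intended way of working, this is roughly what I would try. It is only a sketch, assuming Airflow 2.x where the KubernetesExecutor accepts a pod_override in executor_config; the label name log_target is my own invention:

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from kubernetes.client import models as k8s

with DAG("sample_dag", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
    crashing_task = PythonOperator(
        task_id="crashing_task",
        python_callable=lambda: None,  # placeholder for the real callable
        # Extra label on the worker pod so its logs can be followed with
        #   kubectl logs -f -l log_target=debug
        executor_config={
            "pod_override": k8s.V1Pod(
                metadata=k8s.V1ObjectMeta(labels={"log_target": "debug"})
            )
        },
    )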
You need to enable remote logging. The code sample below is for S3. In airflow.cfg, set the following:
remote_logging = True
remote_log_conn_id = my_s3_conn
remote_base_log_folder = s3://airflow/logs

The my_s3_conn connection can be set in the Airflow UI under Admin > Connections. In the Conn Type dropdown, select S3.
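If you prefer to create the connection outside the UI, a minimal sketch using Airflow's Connection model is shown below; the extra-field key names follow the AWS hook convention, and the credential values are placeholders to replace with however your cluster stores AWS credentials:

import json

from airflow.models import Connection
from airflow.settings import Session

# Build the same connection that the UI form would create.
conn = Connection(
    conn_id="my_s3_conn",
    conn_type="s3",
    extra=json.dumps({
        "aws_access_key_id": "YOUR_ACCESS_KEY",
        "aws_secret_access_key": "YOUR_SECRET_KEY",
    }),
)

# Persist it to the Airflow metadata database.
session = Session()
session.add(conn)
session.commit()
session.close()

Once the connection exists and the airflow.cfg settings above are in place, task logs are uploaded to the remote_base_log_folder when the task finishes (or crashes), so they survive the deletion of the worker pod.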