Kubernetes: view logs of crashed Airflow worker pod

2/12/2020

Pods on our k8s cluster are scheduled with Airflow's KubernetesExecutor, which runs all Tasks in a new pod.

I have such a Task whose pod crashes almost instantly (after 1 or 2 seconds), and for which I of course want to see the logs.

This seems hard. As soon as the pod crashes, it gets deleted, along with the ability to retrieve its crash logs. I have already tried all of:

  • kubectl logs -f <pod> -p: cannot be used since these pods are named uniquely (courtesy of KubernetesExecutor).
  • kubectl logs -l label_name=label_value: I struggle to apply the labels to the pod (if this is a known/used way of working, I'm happy to try further).
  • A shared NFS volume is mounted on all pods at a fixed log directory. The failing pod, however, does not log to this folder.
  • When I am really quick, I run kubectl logs -f -l dag_id=sample_dag --all-containers (the dag_id label is added by Airflow) between scheduling and crashing, and see Error from server (BadRequest): container "base" in pod "my_pod" is waiting to start: ContainerCreating. This might give me some clue, but:
    • these are only the last log lines
    • this is really backwards
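A sketch (not canonical, and assuming the dag_id=sample_dag label from above) of how the "be really quick" approach can be automated: watch for matching pods and start streaming each one's logs the moment it appears, so output survives the pod's deletion.

```shell
#!/usr/bin/env bash
# Stream logs from every pod matching the Airflow-added dag_id label
# as soon as it is created, saving them to local files so they outlive
# the pod. Label value and file naming are examples, not fixed names.
kubectl get pods -l dag_id=sample_dag --watch --output name |
while read -r pod; do
  # Strip the "pod/" prefix for the filename; follow all containers.
  kubectl logs -f "$pod" --all-containers > "logs-${pod#pod/}.txt" 2>&1 &
done
```

This is still a workaround (it races container startup and requires a cluster connection), which is why remote logging, as in the answer below, is the more robust route.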

I'm basically looking for the canonical way of retrieving logs from transient pods.

-- Raf
airflow
kubernetes

1 Answer

3/9/2020

You need to enable remote logging. The sample below uses S3. In airflow.cfg, set the following:

remote_logging = True
remote_log_conn_id = my_s3_conn
remote_base_log_folder = s3://airflow/logs

The my_s3_conn connection can be created in the Airflow UI under Admin > Connections. In the Conn Type dropdown, select S3.
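If you prefer not to click through the UI, Airflow also resolves connections from environment variables named AIRFLOW_CONN_<CONN_ID>. A sketch for my_s3_conn follows; the access key, secret, and region are placeholders, and the exact URI scheme may vary by Airflow version:

```shell
# Equivalent to creating my_s3_conn under Admin > Connections.
# Credentials below are placeholders, not real values.
export AIRFLOW_CONN_MY_S3_CONN='aws://AKIAEXAMPLE:example-secret-key@/?region_name=eu-west-1'
```

With this in place, the worker pods upload task logs to s3://airflow/logs, and the Airflow webserver reads them back from there, so the logs remain available even after the pod is deleted.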

-- alltej
Source: StackOverflow