pod_mutation_hook function not working on Airflow running in Kubernetes using KubernetesExecutor

4/6/2020

I am trying to migrate an Airflow deployment running in Kubernetes from the CeleryExecutor to the KubernetesExecutor. Everything went smoothly in my local development environment (running on minikube). In production, however, I need to load a sidecar container that runs a proxy so I can connect to my SQL database. After some googling, it appears that the way to do this is to define a pod_mutation_hook function in an airflow_local_settings.py file somewhere on the $PYTHONPATH.

First, I tried defining this in a ConfigMap, following this example:

apiVersion: v1
kind: ConfigMap
metadata:
  name: airflow-config
  namespace: dev
data:
  ...

  AIRFLOW__KUBERNETES__LOGS_VOLUME_CLAIM: "airflow-logs"

  AIRFLOW__KUBERNETES__AIRFLOW_LOCAL_SETTINGS_CONFIGMAP: "airflow-config"
  ...

  airflow_local_settings.py: |
    from airflow.contrib.kubernetes.pod import Pod

    def pod_mutation_hook(pod: Pod):
        extra_labels = {
            "test-label": "True",
        }
        pod.labels.update(extra_labels)

I specified this ConfigMap in the airflow.cfg file, and it gets picked up and mounted fine. All the other environment variables work correctly, but pod_mutation_hook does not appear to run: no labels are added to the pod launched by the KubernetesExecutor. (Note that the logs volume claim is also specified here, and it works correctly.)
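One way to narrow this down is to confirm that the hook body behaves as expected when it is called at all. The sketch below runs the same function against a stand-in object. `fake_pod` and its use of `SimpleNamespace` are assumptions for illustration, not Airflow's `Pod` class, and the Airflow import is dropped so the snippet runs anywhere.

```python
from types import SimpleNamespace


def pod_mutation_hook(pod):
    """Same body as the hook in the ConfigMap, minus the Airflow import."""
    extra_labels = {
        "test-label": "True",
    }
    pod.labels.update(extra_labels)


# Hypothetical stand-in: the only attribute this hook touches is `labels`.
fake_pod = SimpleNamespace(labels={"existing": "label"})
pod_mutation_hook(fake_pod)
print(fake_pod.labels)  # {'existing': 'label', 'test-label': 'True'}
```

If this prints the merged labels, the function itself is fine, and the more likely problem is that Airflow never loads the file.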

Next, I tried placing the airflow_local_settings.py file in the image that Airflow launches for the job, under $AIRFLOW_HOME/configs/airflow_local_settings.py, as suggested in a comment here. I also removed the relevant sections from the airflow-config ConfigMap above. This had no effect either: the resulting pod again lacked the specified labels.

So I am unsure how to proceed: I don't understand how to provide the airflow_local_settings.py file and the pod_mutation_hook function so that they actually mutate the pod before it runs. Any help would be greatly appreciated. Thank you.

-- WillySchu
airflow
kubernetes
kubernetesexecutor
minikube

2 Answers

5/1/2020

Are you setting "airflow_local_settings_configmap = airflow-configmap" in the [kubernetes] section of airflow.cfg?

-- 1220122
Source: StackOverflow

4/15/2020

I had the same issue. Please make sure that airflow_local_settings can be imported by the scheduler. You will have to bake these changes into the images.

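WORKDIR ${AIRFLOW_USER_HOME}
# Put the config directory on PYTHONPATH so airflow_local_settings can be imported
ENV PYTHONPATH=$PYTHONPATH:$AIRFLOW_HOME/config/
# Bake the settings file into the image
COPY airflow_local_settings.py $AIRFLOW_HOME/config/airflow_local_settings.py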
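A quick way to check the import condition above is to run a small script with the same interpreter and environment as the scheduler. This is a minimal sketch using only the standard library; the helper name `check_local_settings` is hypothetical and not part of Airflow.

```python
import importlib
import sys


def check_local_settings(module_name="airflow_local_settings"):
    """Return (ok, message) saying whether the module and its hook are importable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # Include sys.path so it is clear where Python actually looked.
        return False, f"cannot import {module_name}: {exc}; sys.path={sys.path}"
    location = getattr(module, "__file__", None)
    hook = getattr(module, "pod_mutation_hook", None)
    if not callable(hook):
        return False, f"{module_name} imported from {location} but defines no callable pod_mutation_hook"
    return True, f"pod_mutation_hook found in {location}"


if __name__ == "__main__":
    ok, message = check_local_settings()
    print(message)
```

Run it inside the scheduler container. A "cannot import" message suggests that the directory holding the file is not on PYTHONPATH for that process.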

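Using the ConfigMap you highlighted above will get the file into the executor pods, but it is not needed there: the hook is applied by the scheduler when it builds the pod, so mounting the file into the already-launched pod comes too late. That makes it a fairly useless setting for this purpose. Feel free to read the source code: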

https://github.com/apache/airflow/blob/8465d66f05baeb73dd4479b019515c069444616e/airflow/settings.py

-- Luis Magana
Source: StackOverflow