Running Airflow tasks on Kubernetes

8/1/2019

I am interested in running specific Airflow tasks on Kubernetes. The Airflow workers themselves need not run on Kubernetes. After a bit of research I came across the KubernetesPodOperator. However, I found no example of how to configure the operator to run against a cluster. Is it possible to configure the KubernetesPodOperator to run tasks on a remote cluster? The behaviour should be similar to the ECSOperator.

-- Yeezus
airflow
kubernetes

1 Answer

11/27/2019

Airflow can run both inside a Kubernetes cluster and outside of it.

In your case, where Airflow runs outside of the Kubernetes cluster, you need to tell Airflow where to find the cluster.

The KubernetesPodOperator source code creates its Kubernetes client like this:

client = kube_client.get_kube_client(in_cluster=self.in_cluster,
                                     cluster_context=self.cluster_context,
                                     config_file=self.config_file)

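You need to set these three parameters when instantiating the KubernetesPodOperator. For a cluster that Airflow itself does not run in, in_cluster should be False, so that the client reads a kubeconfig file instead of the in-cluster configuration.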

Detailed explanations of those parameters are in the source code as well:

:param in_cluster: run kubernetes client with in_cluster configuration.
:param cluster_context: context that points to kubernetes cluster. Ignored when in_cluster is True. If None, current-context is used.
:param config_file: The path to the Kubernetes config file. (templated) If not specified, default value is ``~/.kube/config``
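
As a rough illustration of how those three parameters might be passed, here is a minimal sketch. The helper names (remote_cluster_kwargs, build_task), the context name "my-remote-context", the image, the namespace and the task_id are all made up for the example. The import path shown is the Airflow 1.10 contrib location; later releases moved the operator into a provider package, so adjust it to your version.

```python
import os


def remote_cluster_kwargs(cluster_context=None, config_file="~/.kube/config"):
    """Build the connection arguments for a cluster Airflow does not run in.

    Hypothetical helper, not part of Airflow.
    """
    return {
        # False: read a kubeconfig file instead of the in-cluster configuration.
        "in_cluster": False,
        # None means the kubeconfig's current-context is used.
        "cluster_context": cluster_context,
        # Expand "~" so the path is absolute on the Airflow worker.
        "config_file": os.path.expanduser(config_file),
    }


def build_task(dag=None):
    """Create an operator that launches a pod on the remote cluster."""
    # Imported lazily so the helper above works without Airflow installed.
    # Airflow 1.10 path; newer versions ship the operator in a provider package.
    from airflow.contrib.operators.kubernetes_pod_operator import (
        KubernetesPodOperator,
    )

    return KubernetesPodOperator(
        task_id="run_on_remote_cluster",  # hypothetical task id
        name="remote-task",               # pod name, hypothetical
        namespace="default",              # assumed namespace
        image="python:3.7-slim",          # assumed image
        cmds=["python", "-c"],
        arguments=["print('hello from the remote cluster')"],
        dag=dag,
        # "my-remote-context" is a placeholder for a context in your kubeconfig.
        **remote_cluster_kwargs(cluster_context="my-remote-context"),
    )
```

The kubeconfig file must be readable on whichever machine executes the task (the Airflow worker), and the chosen context must carry credentials that are allowed to create pods in the target namespace.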
-- fuyi
Source: StackOverflow