I want to have my Cloud Composer environment (Google Cloud's managed Apache Airflow service) start pods on a different Kubernetes cluster. How should I do this?

Note that Cloud Composer runs Airflow on a Kubernetes cluster of its own; that cluster is considered to be the Composer "environment". With the default values for the KubernetesPodOperator, Composer schedules pods on its own cluster. In this case, however, I have a different Kubernetes cluster on which I want to run the pods.

I can connect to the worker pods and run gcloud container clusters get-credentials CLUSTERNAME there, but the workers get recycled every now and then, so this is not a durable solution.
I noticed that the KubernetesPodOperator has both an in_cluster and a cluster_context argument, which seem useful. I would expect this to work:
from airflow.contrib.operators import kubernetes_pod_operator

pod = kubernetes_pod_operator.KubernetesPodOperator(
    task_id='my-task',
    name='name',
    in_cluster=False,
    cluster_context='my_cluster_context',
    image='gcr.io/my/image:version',
)
But this results in:

kubernetes.config.config_exception.ConfigException: Invalid kube-config file. Expected object with name CONTEXTNAME in kube-config/contexts list

Yet if I run kubectl config get-contexts in the worker pods, I can see the cluster context listed there.
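For what it's worth, the KubernetesPodOperator also appears to take a config_file argument, so a kubeconfig written to a known path could in principle be passed in explicitly. A sketch of what I would expect to work, assuming such a file exists on the workers (the path below is a hypothetical placeholder):

from airflow.contrib.operators import kubernetes_pod_operator

# Sketch only: point the operator at an explicit kubeconfig rather than
# whatever ~/.kube/config happens to hold on the (recyclable) worker.
pod = kubernetes_pod_operator.KubernetesPodOperator(
    task_id='my-task',
    name='name',
    in_cluster=False,
    config_file='/path/to/kube_config',    # hypothetical kubeconfig location
    cluster_context='my_cluster_context',  # must name a context in that file
    image='gcr.io/my/image:version',
)

That still leaves the problem of getting the kubeconfig onto the workers durably, hence my question.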
So what I fail to figure out is: how do I make the credentials for the other cluster available on the Airflow workers in a durable way, so that the KubernetesPodOperator can schedule pods there?
Check out the GKEPodOperator for this.

Example usage from the docs:
from airflow.contrib.operators.gcp_container_operator import GKEPodOperator

operator = GKEPodOperator(
    task_id='pod_op',
    project_id='my-project',
    location='us-central1-a',
    cluster_name='my-cluster-name',
    name='task-name',
    namespace='default',
    image='perl',
)
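If I read the implementation correctly, GKEPodOperator subclasses KubernetesPodOperator and runs gcloud container clusters get-credentials against a temporary kubeconfig at execution time, so no credentials need to survive worker recycling. A minimal sketch of how it could be wired into a DAG (the dag_id, start date, and schedule are placeholders, not from the docs):

from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.gcp_container_operator import GKEPodOperator

# Illustrative DAG wiring; dag_id, start_date and schedule_interval are
# placeholders.
with DAG(
    dag_id='run_pod_on_other_cluster',
    start_date=datetime(2019, 1, 1),
    schedule_interval=None,
) as dag:
    # The operator fetches credentials for the target cluster itself, so
    # nothing has to be pre-installed on the Composer workers.
    run_pod = GKEPodOperator(
        task_id='pod_op',
        project_id='my-project',
        location='us-central1-a',
        cluster_name='my-cluster-name',
        name='task-name',
        namespace='default',
        image='perl',
    )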