I'm using Cloud Composer to orchestrate my Airflow instance, but I'm not sure how to install system packages for the Airflow workers' bash environment.
Previously, I was running Airflow on a Google Compute Engine instance using Docker, and it was easy to specify requirements via the Dockerfile.
As someone who is new to Kubernetes and Cloud Composer, I was wondering whether there is something similar I could do for Kubernetes/Cloud Composer.
I'm looking to install lzop for Unix, and I would also need to update the gsutil Boto config file with S3 credentials.
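For reference, here is roughly what my old Dockerfile did (a sketch; the base image, paths, and user are placeholders, not my exact setup):

```dockerfile
# Sketch of the previous Docker setup; base image and paths are illustrative
FROM apache/airflow:1.10.15

USER root
# Install the lzop system package
RUN apt-get update \
    && apt-get install -y --no-install-recommends lzop \
    && rm -rf /var/lib/apt/lists/*

# Bake in a Boto config carrying the S3 credentials for gsutil
COPY .boto /home/airflow/.boto

USER airflow
```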
How do you pull a new Docker image into Kubernetes and recreate the pods from it?
Sorry if my lingo is incorrect; this is all new to me.
At the moment, if I read the documentation correctly, you cannot modify the images used by Composer. Unless you deploy your own custom solution on a Kubernetes cluster, I don't think you can extend it beyond Python libraries and Airflow plugins.
You can SSH into each worker Compute Engine instance and install the package manually on each machine, along these lines:
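A sketch of that workflow (the instance name and zone are placeholders; look up your environment's actual instances first):

```bash
# List the GKE nodes backing the Composer environment
gcloud compute instances list --filter="name~'gke-'"

# SSH into one of them (name and zone are placeholders)
gcloud compute ssh gke-mycomposer-default-pool-abc123 --zone us-central1-a

# On the machine, install the package
sudo apt-get update && sudo apt-get install -y lzop
```

Keep in mind that anything installed this way can disappear when the instances are recreated, so you may have to repeat the process.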
You may try to run apt-get via a BashOperator (a sketch is below), but I doubt you will succeed; unfortunately, Composer is still a Beta product, and many features are still in the making.
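A BashOperator attempt would look something like this (DAG id and dates are placeholders); it will most likely fail with a permissions error, and even if it works, the change will not survive a worker pod restart:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

dag = DAG(
    dag_id="install_lzop",
    start_date=datetime(2018, 1, 1),
    schedule_interval=None,
)

# Attempt to install lzop on whichever worker picks up the task;
# this usually fails because the worker user has no root privileges.
install_lzop = BashOperator(
    task_id="apt_get_install_lzop",
    bash_command="apt-get update && apt-get install -y lzop",
    dag=dag,
)
```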
If I understood your question correctly, then here is my answer.
Kubernetes is an orchestrator for Docker containers, so in a Kubernetes cluster you can use the same Airflow images you built for Docker.
Just build your own image with docker build, push it to your registry, and then reference it via image: <your_registry>/<your_image> in the Airflow deployment YAML file for Kubernetes. Here you can find an example of Setting Up Apache Airflow on Kubernetes.
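For example (the project, image name, and container name are placeholders):

```bash
# Build the custom image and push it to your registry
docker build -t gcr.io/my-project/my-airflow:v2 .
docker push gcr.io/my-project/my-airflow:v2
```

```yaml
# Excerpt of the container spec in the Airflow Deployment manifest
spec:
  template:
    spec:
      containers:
        - name: airflow-worker
          image: gcr.io/my-project/my-airflow:v2
```

Updating the image: field (or running kubectl set image deployment/airflow-worker airflow-worker=gcr.io/my-project/my-airflow:v2) makes Kubernetes perform a rolling update, recreating the pods with the new image, which covers the "pull a new image and recreate the pods" part of your question.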
As for Cloud Composer, it is a fully managed workflow orchestration service built on Apache Airflow; it is not a tool for managing an Airflow deployment you run yourself. For more information, you can look through the following link, which includes a short video with a brief explanation: Cloud Composer