How can Apache Airflow's KubernetesPodOperator
pull docker images from a private repository?
The KubernetesPodOperator
has an image_pull_secrets
which you can pass a Secrets
object to authenticate with the private repository. But the secrets object can only represent an environment variable, or a volume - neither of which fit my understanding of how Kubernetes uses secrets to authenticate with private repos.
Using kubectl
you can create the required secret with something like
$ kubectl create secret docker-registry $SECRET_NAME \
--docker-server=https://${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com \
--docker-username=AWS \
--docker-password="${TOKEN}" \
--docker-email="${EMAIL}"
But how can you create the authentication secret in Airflow?
There is secret
object with docker-registry
type according to kubernetes documentation which can be used to authenticate to private repository.
As You mentioned in Your question; You can use kubectl
to create secret of docker-registry
type that you can then try to pass with image_pull_secrets
.
However depending on platform You are using this might have limited or no use at all according to kubernetes documentation:
Configuring Nodes to Authenticate to a Private Registry
Note: If you are running on Google Kubernetes Engine, there will already be a
.dockercfg
on each node with credentials for Google Container Registry. You cannot use this approach.Note: If you are running on AWS EC2 and are using the EC2 Container Registry (ECR), the kubelet on each node will manage and update the ECR login credentials. You cannot use this approach.
Note: This approach is suitable if you can control node configuration. It will not work reliably on GCE, and any other cloud provider that does automatic node replacement.
Note: Kubernetes as of now only supports the
auths
andHttpHeaders
section of docker config. This means credential helpers (credHelpers
orcredsStore
) are not supported.
Making this work on mentioned platforms is possible but it would require automated scripts and third party tools.
Like in Amazon ECR example: Amazon ECR Docker Credential Helper would be needed to periodically pull AWS credentials to docker registry configuration and then have another script to update kubernetes docker-registry secrets.
As for Airflow itself I don't think it has functionality to create its own docker-repository secrets. You can request functionality like that in Apache Airflow JIRA.
P.S.
If You still have issues with Your K8s cluster you might want to create new question on stack addressing them.