Airflow KubernetesPodOperator: pull image from private repository

10/8/2019

How can Apache Airflow's KubernetesPodOperator pull docker images from a private repository?

The KubernetesPodOperator has an image_pull_secrets parameter, to which you can pass a Secrets object to authenticate with the private repository. But the Secrets object can only represent an environment variable or a volume, neither of which fits my understanding of how Kubernetes uses secrets to authenticate with private repos.

Using kubectl, you can create the required secret with something like:

$ kubectl create secret docker-registry $SECRET_NAME \
              --docker-server=https://${ACCOUNT}.dkr.ecr.${REGION}.amazonaws.com \
              --docker-username=AWS \
              --docker-password="${TOKEN}" \
              --docker-email="${EMAIL}"

But how can you create the authentication secret in Airflow?

-- danodonovan
airflow
kubernetes

1 Answer

10/14/2019

According to the Kubernetes documentation, there is a secret of type docker-registry that can be used to authenticate to a private repository.
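To make the shape of such a secret concrete, here is a minimal Python sketch of the manifest that a docker-registry secret corresponds to. The function name build_docker_registry_secret and the example values are hypothetical; the secret type kubernetes.io/dockerconfigjson and the .dockerconfigjson data key are the ones Kubernetes uses for this kind of secret. Treat it as an illustration, not as something Airflow does for you.

```python
import base64
import json


def build_docker_registry_secret(name, server, username, password, email=""):
    """Return a Secret manifest (as a dict) of type kubernetes.io/dockerconfigjson.

    Hypothetical helper: it mirrors the structure of a docker-registry secret
    so the contents can be inspected or applied with your own tooling.
    """
    # Docker stores "username:password" base64-encoded in the "auth" field.
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    docker_config = {
        "auths": {
            server: {
                "username": username,
                "password": password,
                "email": email,
                "auth": auth,
            }
        }
    }
    # Values under "data" in a Secret must themselves be base64-encoded.
    encoded = base64.b64encode(json.dumps(docker_config).encode()).decode()
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": encoded},
    }
```

The returned dict can be dumped to JSON and applied with kubectl, or handed to a Kubernetes client library, in the same namespace where the pod will run.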

As you mentioned in your question, you can use kubectl to create a secret of the docker-registry type, which you can then try to pass with image_pull_secrets.
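A rough sketch of what that could look like in a DAG file follows. It assumes Airflow 1.10.x, where the operator lives under airflow.contrib and image_pull_secrets takes the name of an existing secret as a string; later versions changed both the import path and the expected type, so check the documentation for your version. The secret name regcred, the image, and the task names are hypothetical placeholders.

```python
# Keyword arguments for the operator. All values here are placeholders:
# "regcred" is assumed to be a docker-registry secret that already exists
# in the same namespace the pod is launched in.
POD_KWARGS = {
    "task_id": "private_image_task",
    "name": "private-image-task",
    "namespace": "default",
    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-image:latest",
    # Assumption: Airflow 1.10.x accepts the secret name as a string here.
    "image_pull_secrets": "regcred",
}


def make_pod_task(dag):
    """Create the task; the import is local so this file loads without Airflow."""
    # Import path as of Airflow 1.10.x (an assumption about your version).
    from airflow.contrib.operators.kubernetes_pod_operator import (
        KubernetesPodOperator,
    )

    return KubernetesPodOperator(dag=dag, **POD_KWARGS)
```

Note that Airflow only references the secret by name here; the secret itself still has to be created in the cluster beforehand.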

However, depending on the platform you are using, this might have limited or no use at all, according to the Kubernetes documentation:

Configuring Nodes to Authenticate to a Private Registry

Note: If you are running on Google Kubernetes Engine, there will already be a .dockercfg on each node with credentials for Google Container Registry. You cannot use this approach.

Note: If you are running on AWS EC2 and are using the EC2 Container Registry (ECR), the kubelet on each node will manage and update the ECR login credentials. You cannot use this approach.

Note: This approach is suitable if you can control node configuration. It will not work reliably on GCE, and any other cloud provider that does automatic node replacement.

Note: Kubernetes as of now only supports the auths and HttpHeaders section of docker config. This means credential helpers (credHelpers or credsStore) are not supported.

Making this work on the platforms mentioned above is possible, but it would require automated scripts and third-party tools.

For example, with Amazon ECR, the Amazon ECR Docker Credential Helper would be needed to periodically pull AWS credentials into the Docker registry configuration, and another script would then have to update the Kubernetes docker-registry secrets.
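As a hedged illustration of the refresh step, the sketch below uses boto3 rather than the credential helper to fetch an ECR authorization token and split it into the username and password that a docker-registry secret needs. The function names are hypothetical, and the scheduling and the actual secret update are left out. ECR tokens expire (after 12 hours, per the AWS documentation), so something like this would have to run periodically.

```python
import base64


def split_ecr_token(authorization_token):
    """Decode an ECR authorization token into (username, password).

    The token is base64 of "AWS:<password>"; only the first colon separates
    the two parts, so the password may itself contain colons.
    """
    decoded = base64.b64decode(authorization_token).decode()
    username, _, password = decoded.partition(":")
    return username, password


def fetch_ecr_credentials(region):
    """Fetch fresh ECR credentials; requires boto3 and valid AWS credentials."""
    import boto3  # local import so the decoding helper works without boto3

    ecr = boto3.client("ecr", region_name=region)
    data = ecr.get_authorization_token()["authorizationData"][0]
    username, password = split_ecr_token(data["authorizationToken"])
    # proxyEndpoint is the registry URL to use as the docker server.
    return data["proxyEndpoint"], username, password
```

The returned values correspond to the --docker-server, --docker-username and --docker-password arguments of the kubectl command in the question.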

As for Airflow itself, I don't think it has functionality to create its own docker-registry secrets. You can request such functionality in the Apache Airflow JIRA.

P.S.

If you still have issues with your K8s cluster, you might want to create a new question on Stack Overflow addressing them.

-- Piotr Malec
Source: StackOverflow