Google Cloud Composer pulls stale image from Google Container Registry

10/8/2020

I'm trying to run an Airflow task through Google Cloud Composer with the KubernetesPodOperator in an environment built from an image in a private Google Container Registry. The Container Registry and Cloud Composer instances are under the same project, and everything worked fine until I updated the image the DAG refers to.

When I update the image in the Container Registry, Cloud Composer keeps using a stale image.

Concretely, in the code below

import datetime
import airflow
from airflow.contrib.operators import kubernetes_pod_operator


YESTERDAY = datetime.datetime.now() - datetime.timedelta(days=1)


# Create the Airflow DAG for the pipeline
with airflow.DAG(
        'my_dag',
        schedule_interval=datetime.timedelta(days=1),
        start_date=YESTERDAY) as dag:

    my_task = kubernetes_pod_operator.KubernetesPodOperator(
        task_id='my_task',
        name='my_task',
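        # Each argv element is a separate list item; 'echo 0' as a single
        # string would be treated as the name of the executable.
        cmds=['echo', '0'],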
        namespace='default',
        image='gcr.io/<my_private_repository>/<my_image>:latest')

if I update the image gcr.io/<my_private_repository>/<my_image>:latest in the Container Registry, Cloud Composer keeps using the stale image, which is no longer present in the Container Registry, and throws an error.

Is this a bug?

Thanks a lot!

-- Théophile Gervet
airflow
google-cloud-composer
kubernetes

1 Answer

10/9/2020

As mentioned in the documentation for KubernetesPodOperator, the default value for image_pull_policy is 'IfNotPresent', so a node that already has an image cached under that tag will not pull it again. You need to configure your Pod spec to always pull the image.

The simplest way to do it is to set image_pull_policy to 'Always'.
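Applied to the DAG from the question, that could look roughly like the sketch below. It assumes the same Airflow 1.10 contrib import path used in the question. The names POD_KWARGS and build_dag are made up for this illustration and are not part of Airflow, and the cmds list is split into separate arguments as an extra adjustment.

```python
import datetime

# Operator arguments from the question, plus the pull policy.
# POD_KWARGS is a hypothetical name used only in this sketch.
POD_KWARGS = {
    'task_id': 'my_task',
    'name': 'my_task',
    'cmds': ['echo', '0'],
    'namespace': 'default',
    'image': 'gcr.io/<my_private_repository>/<my_image>:latest',
    # Pull the image on every pod start instead of reusing a copy
    # cached on the node under the same tag.
    'image_pull_policy': 'Always',
}


def build_dag():
    """Build the DAG. Airflow is imported here so that this module can
    be loaded even where Airflow is not installed."""
    import airflow
    from airflow.contrib.operators import kubernetes_pod_operator

    dag = airflow.DAG(
        'my_dag',
        schedule_interval=datetime.timedelta(days=1),
        start_date=datetime.datetime.now() - datetime.timedelta(days=1))
    with dag:
        kubernetes_pod_operator.KubernetesPodOperator(**POD_KWARGS)
    return dag
```

In an actual DAG file you would assign the result at module level (dag = build_dag()) so the scheduler can discover it, or simply pass image_pull_policy='Always' directly to the operator call you already have.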

A few more ways are mentioned in the Kubernetes Container Images documentation.

-- arunvelsriram
Source: StackOverflow