Unable to launch pods on Kubernetes cluster (from Airflow DAG)

6/19/2019

I have an Airflow DAG whose tasks I'm attempting to run on an AWS EKS cluster. The DAG tasks are built as Docker images and uploaded to AWS ECR.

The DAG task:

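# Import path for Airflow 1.10.x (contrib package)
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

task_1 = KubernetesPodOperator(namespace='default',
                               image="XXXXX.dkr.ecr.us-west-2.amazonaws.com/com.YYYY/math-demo:v1",
                               labels={"foo": "bar"},
                               name="math-test",
                               task_id="math-task",
                               get_logs=True,
                               dag=dag)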

The Docker image on ECR is of the form "XXXXX.dkr.ecr.us-west-2.amazonaws.com/com.YYYY/math-demo:v1", and the local Docker image is math-demo:v1.
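To make the relation between the local tag and the ECR reference explicit, here is a minimal sketch of how the full image string is composed. The helper name `ecr_image_uri` is purely illustrative (it is not part of Airflow or any AWS library), and the registry ID and repository values are the redacted placeholders from above.

```python
def ecr_image_uri(registry_id, region, repository, tag):
    """Compose an ECR image reference: <registry>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>."""
    return "{}.dkr.ecr.{}.amazonaws.com/{}:{}".format(
        registry_id, region, repository, tag
    )


# Placeholder values taken from the redacted image name in this question.
image = ecr_image_uri("XXXXX", "us-west-2", "com.YYYY/math-demo", "v1")
print(image)
```

The resulting string is what gets passed as `image=` to the operator; the bare local name math-demo:v1 would not resolve on the cluster nodes.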

When I run this task, the pod always stays in the Pending state and never executes. I ran kubectl describe pods and got the following error:

Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "620ee0494e7aaf1776120df10351606c2203c194ca86079fd7198d56fabbc79b" network for pod "math-test-fbdab794": NetworkPlugin cni failed to set up pod "math-test-fbdab794_default" network: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused", failed to clean up sandbox container "620ee0494e7aaf1776120df10351606c2203c194ca86079fd7198d56fabbc79b" network for pod "math-test-fbdab794": NetworkPlugin cni failed to teardown pod "math-test-fbdab794_default" network: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"]
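Since the message is long and repeats itself, here is a rough sketch that boils it down to the failing phases, the endpoint being dialed, and the reason. The function name `summarize_cni_error` and the regular expressions are my own and only fit this particular message shape; the `sample` string is a shortened excerpt of the error above.

```python
import re


def summarize_cni_error(message):
    """Extract failing CNI phases, dialed endpoints, and connection errors."""
    phases = re.findall(r"failed to (set up|teardown) pod", message)
    endpoints = sorted(set(re.findall(r"dial tcp ([\d.]+:\d+)", message)))
    reasons = sorted(set(re.findall(r'connect: ([a-z ]+)"', message)))
    return {"phases": phases, "endpoints": endpoints, "reasons": reasons}


# Shortened excerpt of the kubectl describe output quoted above.
sample = (
    'NetworkPlugin cni failed to set up pod "math-test-fbdab794_default" network: '
    'rpc error: code = Unavailable desc = all SubConns are in TransientFailure, '
    'latest connection error: connection error: desc = "transport: Error while '
    'dialing dial tcp 127.0.0.1:50051: connect: connection refused", '
    'NetworkPlugin cni failed to teardown pod "math-test-fbdab794_default" network: '
    'rpc error: code = Unavailable desc = all SubConns are in TransientFailure, '
    'latest connection error: connection error: desc = "transport: Error while '
    'dialing dial tcp 127.0.0.1:50051: connect: connection refused"'
)

summary = summarize_cni_error(sample)
print(summary)
```

Both the network setup and the teardown fail for the same reason: the CNI plugin on the node cannot reach a local service at 127.0.0.1:50051.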

Any idea on how to solve this?

-- user5347996
airflow
docker
kubernetes

0 Answers