Airflow tasks stuck in queued state

1/21/2021

We're running Airflow 1.10.12 with the KubernetesExecutor and KubernetesPodOperator. Over the past few days, we've seen tasks get stuck in the queued state for a long time. In practice, a stuck task stays queued until we restart the scheduler. New tasks of the same DAG are still scheduled properly.

The only things that help are clearing the stuck task manually or restarting the scheduler service.

We usually see it happen when we run our E2E tests, which spawn ~20 DAG runs for each of our 3 DAGs. Due to limited parallelism, some of them will be queued (which is fine by us).

These are our parallelism params in airflow.cfg:

parallelism = 32
dag_concurrency = 16
max_active_runs_per_dag = 16

2 of our DAGs override max_active_runs and set it to 10.
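
To make these limits concrete, here is a rough sketch of the arithmetic they imply for the E2E scenario. It only uses the Python standard library and does not touch Airflow itself. The DAG names (dag_a, dag_b, dag_c) are hypothetical placeholders, 20 is the approximate run count from the E2E tests, and the sketch assumes the three settings live under [core] and that no pool limits apply.

```python
import configparser

# The values quoted above, assumed to sit under [core] in airflow.cfg.
CFG = """
[core]
parallelism = 32
dag_concurrency = 16
max_active_runs_per_dag = 16
"""

parser = configparser.ConfigParser()
parser.read_string(CFG)
core = parser["core"]

parallelism = core.getint("parallelism")
dag_concurrency = core.getint("dag_concurrency")
default_max_active_runs = core.getint("max_active_runs_per_dag")

# Hypothetical DAG names: two DAGs override max_active_runs to 10,
# the third falls back to max_active_runs_per_dag.
max_active_runs = {
    "dag_a": 10,
    "dag_b": 10,
    "dag_c": default_max_active_runs,
}

requested_runs = 20  # approximate number of runs the E2E tests trigger per DAG

# DAG runs that can be active at once, and runs that have to wait their turn.
active_runs = {dag: min(requested_runs, cap) for dag, cap in max_active_runs.items()}
waiting_runs = {dag: requested_runs - n for dag, n in active_runs.items()}

# Each DAG may have at most dag_concurrency task instances running at once,
# and the whole installation at most parallelism.
per_dag_task_cap = dag_concurrency * len(max_active_runs)
running_task_cap = min(parallelism, per_dag_task_cap)

print("active runs per DAG:", active_runs)
print("waiting runs per DAG:", waiting_runs)
print("upper bound on running tasks:", running_task_cap)
```

Under these assumptions, up to 36 DAG runs can be active while at most 32 task instances run at the same time, so some tasks sitting in queued for a while is expected. What is not expected is a task staying queued after slots free up.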

Any idea what could be causing it?

-- Meny Issakov
airflow
airflow-scheduler
kubernetes

0 Answers