Apache Airflow or Argoproj for long-running DAG tasks on Kubernetes

7/15/2019

We have a lot of long-running, memory/CPU-intensive jobs that run with Celery on Kubernetes on Google Cloud Platform. However, we have big problems with scaling, retrying, monitoring, alerting, and delivery guarantees. We want to move from Celery to a more advanced framework.

There is a comparison at https://github.com/argoproj/argo/issues/849, but it is not enough for us to decide.

Airflow:

  • better community support: ~400 vs. ~12 tagged questions on Stack Overflow, and ~13k vs. ~3.5k GitHub stars
  • defining flows in Python feels better than using plain YAML (see the minimal DAG sketch after this list)
  • available on GCP as a managed product: Cloud Composer
  • better dashboard
  • some nice operators, such as the email operator
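
For illustration, this is roughly what a flow looks like when defined in Python with Airflow (the task ids, partitions, and callable below are made up); generating fan-out with a plain loop like this is awkward to express in static YAML:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator


def process(partition):
    # Placeholder for real per-partition work.
    print("processing partition %s" % partition)


with DAG("flows_in_python", start_date=datetime(2019, 7, 1),
         schedule_interval="@daily") as dag:
    start = DummyOperator(task_id="start")
    done = DummyOperator(task_id="done")
    # The fan-out is generated with ordinary Python instead of static YAML.
    for p in ["eu", "us", "asia"]:
        task = PythonOperator(
            task_id="process_%s" % p,
            python_callable=process,
            op_kwargs={"partition": p},
        )
        start >> task >> done
```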

Argoproj:

  • native to Kubernetes (which I suppose is an advantage)
  • supports CI/CD and events, which could be useful in the future
  • (probably) better support for passing results from one job to another (in Airflow this is the XCom mechanism; a minimal sketch follows below)
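
As I understand it, the XCom mechanism looks roughly like this (the task ids and payload are hypothetical; XCom is meant for small metadata such as paths or ids, not bulk data):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator


def produce(**context):
    # The return value is pushed to XCom automatically.
    return {"output_path": "gs://my-bucket/output/part-0"}


def consume(**context):
    result = context["ti"].xcom_pull(task_ids="produce")
    print("reading from %s" % result["output_path"])


with DAG("xcom_example", start_date=datetime(2019, 7, 1),
         schedule_interval=None) as dag:
    t1 = PythonOperator(task_id="produce", python_callable=produce,
                        provide_context=True)
    t2 = PythonOperator(task_id="consume", python_callable=consume,
                        provide_context=True)
    t1 >> t2
```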

Our DAGs are not that complicated. Which of these frameworks should we choose?

-- sacherus
airflow
argoproj
celery
google-cloud-platform
kubernetes

1 Answer

7/16/2019

Idiomatic Airflow isn't really designed to execute long-running jobs by itself. Rather, Airflow is meant to serve as the facilitator for kicking off compute jobs within another service (this is done with Operators) while monitoring the status of those jobs (this is done with Sensors).
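
A minimal sketch of that split (the bucket, object path, and submit command below are placeholders, not anything specific to your setup): an Operator fires the job off asynchronously, and a Sensor blocks downstream tasks until a completion marker appears:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.contrib.sensors.gcs_sensor import GoogleCloudStorageObjectSensor

with DAG("submit_and_wait", start_date=datetime(2019, 7, 1),
         schedule_interval="@daily", catchup=False) as dag:
    # Operator: kick off the heavy job in another service, then return.
    submit = BashOperator(
        task_id="submit_job",
        bash_command=(
            "gcloud dataproc jobs submit pyspark gs://my-bucket/jobs/etl.py "
            "--cluster=etl-cluster --async"
        ),
    )

    # Sensor: poll GCS until the job writes its success marker, keeping
    # downstream tasks blocked until the compute actually finishes.
    wait = GoogleCloudStorageObjectSensor(
        task_id="wait_for_output",
        bucket="my-bucket",
        object="output/_SUCCESS",
        poke_interval=60,
        timeout=6 * 60 * 60,
    )

    submit >> wait
```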

Given your setup, any compute task needed within Airflow would be initiated with the appropriate Operator for the service being used (Airflow has GCP hooks and operators that simplify this), and the appropriate Sensor would determine when the task had completed and no longer block the downstream tasks that depend on it.
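
Since your jobs already run on Kubernetes, the KubernetesPodOperator is a common middle ground: Airflow only schedules, retries, and alerts, while the pod does the heavy compute on your cluster. A sketch, with the image, namespace, and resource figures as made-up placeholders:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import (
    KubernetesPodOperator,
)

with DAG("heavy_jobs_on_k8s", start_date=datetime(2019, 7, 1),
         schedule_interval="@daily", catchup=False) as dag:
    train = KubernetesPodOperator(
        task_id="train_model",
        name="train-model",
        namespace="jobs",
        image="gcr.io/my-project/trainer:latest",
        cmds=["python", "train.py"],
        # Resource requests so the k8s scheduler places the pod appropriately.
        resources={"request_memory": "8Gi", "request_cpu": "4"},
        # Retrying and alerting come from Airflow itself.
        retries=3,
        retry_delay=timedelta(minutes=5),
        get_logs=True,
        is_delete_operator_pod=True,
    )
```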

While I'm not intimately familiar with the details of Argoproj, it appears to be less of a "scheduling system" like Airflow and more of a system for orchestrating and actually executing much of the compute itself.

-- Jacob Turpin
Source: StackOverflow