Why doesn't Argo finish when the spark-submit of an Apache Spark app on k8s ends?

10/27/2021

I have an issue with the Argo-Spark integration. When the spark-submit application takes a long time to process on k8s, Spark finishes and the pod is marked as completed, but on the Argo side the DAG keeps running.

The image shows the Spark app already finished while the DAG is still running.

The log shows the following trace, which repeats indefinitely until it is stopped manually.

21/10/27 04:09:41 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Pending)
21/10/27 04:09:41 INFO LoggingPodStatusWatcherImpl: State changed, new state: 
	 pod name: spark-compaction-driver-379250e309d84131960098d6d65bb355
	 namespace: argo-workflows
	 labels: spark-app-selector -> spark-5a47913d3a574b329c5a575397c0c5fe, spark-role -> driver
	 pod uid: ed147197-4f89-41bb-9fb5-5d6dbb0351d7
	 creation time: 2021-10-27T04:09:39Z
	 service account name: argo-workflow
	 volumes: spark-local-dir-1, spark-conf-volume-driver, argo-workflow-token-lrlv9
	 node name: aks-intensive-11775020-vmss000009
	 start time: 2021-10-27T04:09:39Z
	 phase: Running
	 container status: 
		 container name: spark-kubernetes-driver
		 container image: 744752950324.dkr.ecr.us-east-1.amazonaws.com/spark-compaction:0.0.9.9
		 container state: running
		 container started at: 2021-10-27T04:09:41Z
21/10/27 04:09:42 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:43 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:44 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:45 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:46 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:47 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:48 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:49 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:50 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:51 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:52 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:53 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:54 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:55 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:56 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:57 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:58 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:59 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:00 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:01 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:02 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:03 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:04 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:05 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:06 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:07 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:08 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:09 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:10 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:11 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:12 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:13 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:14 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:15 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:16 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:17 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:18 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:19 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:20 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:21 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:22 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:23 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:24 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:25 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:26 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:27 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:28 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:10:29 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
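
A trace this long is easier to inspect when collapsed to its phase changes. The following is only a minimal sketch, assuming the log line format shown above; the function name `phase_transitions` and the `sample` variable are hypothetical and not part of Spark or Argo. If the last phase in the result is still Running, spark-submit has not observed the driver pod completing, which would suggest the hang is on the spark-submit side rather than in Argo.

```python
import re

# Matches spark-submit status lines in the format shown in the log above, e.g.
# 21/10/27 04:09:42 INFO LoggingPodStatusWatcherImpl: Application status for spark-... (phase: Running)
STATUS_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) INFO LoggingPodStatusWatcherImpl: "
    r"Application status for (?P<app>\S+) \(phase: (?P<phase>\w+)\)"
)


def phase_transitions(log_text):
    """Return [(timestamp, phase), ...], keeping only lines where the phase changes."""
    transitions = []
    for line in log_text.splitlines():
        match = STATUS_RE.match(line.strip())
        if match is None:
            continue  # skip pod detail lines and anything else that is not a status line
        phase = match.group("phase")
        if not transitions or transitions[-1][1] != phase:
            transitions.append((match.group("ts"), phase))
    return transitions


# Three status lines copied from the log above.
sample = """\
21/10/27 04:09:41 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Pending)
21/10/27 04:09:42 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
21/10/27 04:09:43 INFO LoggingPodStatusWatcherImpl: Application status for spark-5a47913d3a574b329c5a575397c0c5fe (phase: Running)
"""

print(phase_transitions(sample))
```

Running it over the full trace gives one entry for Pending and one for Running, with no terminal phase after that.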

The spark-submit command:

spark-submit:
    ''
    ./bin/spark-submit \
    --master k8s://https://xxxxxxxxxxxxxxxx \
    --deploy-mode cluster \
    --name spark-compaction-exec-{{workflow.outputs.parameters.golbal-generate-uuid-output}} \
    --class xxxxxx.Compaction \
    --conf spark.kubernetes.container.image=${imageSparkCompaction} \
    --conf spark.kubernetes.driver.pod.name=spark-compaction-driver-{{workflow.outputs.parameters.golbal-generate-uuid-output}} \
    --conf spark.kubernetes.namespace=argo \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=argo \
    --conf spark.executor.instances=5 \
    --verbose \
    local:///opt/spark/jars/spark-compaction-0.9.0.jar {{inputs.parameters.database-comp}} {{inputs.parameters.table-name-comp}} {{inputs.parameters.raw-zone-container-comp}} {{inputs.parameters.clean-zone-container-comp}} {{inputs.parameters.back-zone-container-comp}} {{inputs.parameters.data-source-comp}} {{inputs.parameters.file-format-comp}} {{inputs.parameters.erp-env-comp}} {{inputs.parameters.erp-database-comp}} {{inputs.parameters.object-format-comp}} {{inputs.parameters.date-col-ref-comp}} {{inputs.parameters.compaction-key-comp}} {{inputs.parameters.stg-zone-container-comp}}
    '' 
-- javier_orta
apache-spark
argo
kubernetes

0 Answers