I'm running some jobs using Spark on K8S and sometimes my executors will die mid-job. Whenever that happens, the driver immediately deletes the failed pod and spawns a new one.
Is there a way to stop Spark from deleting terminated executor pods? It would make debugging the failure a lot easier.
Right now I'm already shipping the logs of all pods to separate storage so I can read them after the fact, but it's a hassle to search through the logs for every pod, and I can't see the K8S metadata for them that way.
A setting for this was added in SPARK-25515. Unfortunately it isn't available in the currently released version, but it should be available in Spark 3.0.0.
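Once you're on 3.0.0, a minimal sketch of how you'd use it, assuming the config name from the Spark 3.0 docs (spark.kubernetes.executor.deleteOnTermination, which defaults to true); the master URL, container image, and example jar below are placeholders:

# Keep executor pods around after they terminate (failure or success)
# so you can inspect them with kubectl. Requires Spark 3.0.0+.
spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.container.image=<your-spark-image> \
  --conf spark.kubernetes.executor.deleteOnTermination=false \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.0.0.jar

With that set to false, failed executor pods should stay behind in their terminal state, so kubectl describe pod and kubectl get events can still show you their K8S metadata.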
As an alternative, you can use job.spec.ttlSecondsAfterFinished to control how long the pods will exist after the Job is completed or failed. For example:
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl
spec:
  ttlSecondsAfterFinished: 100
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
The Job pi-with-ttl will be eligible to be automatically deleted 100 seconds after it finishes.
If the field is set to 0, the Job will be eligible to be automatically deleted immediately after it finishes. If the field is unset, this Job won’t be cleaned up by the TTL controller after it finishes.
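To see the cleanup in action, something like this should work (pi-with-ttl.yaml is just a hypothetical filename for the manifest above):

kubectl apply -f pi-with-ttl.yaml
kubectl get pods -w            # the pi pod runs, completes, then is deleted ~100s later
kubectl get job pi-with-ttl    # NotFound once the TTL controller has cleaned it up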
Note that this TTL mechanism is alpha, behind the feature gate TTLAfterFinished. For more information, see the documentation for the TTL controller for finished resources.
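Since it's alpha, the gate is off by default, so you'd need to enable it on the control plane first. A sketch, assuming a cluster where you control the control-plane flags:

# Enable the gate on both components while the feature is alpha:
kube-apiserver --feature-gates=TTLAfterFinished=true ...
kube-controller-manager --feature-gates=TTLAfterFinished=true ...

# Or, on a local minikube cluster:
minikube start --feature-gates=TTLAfterFinished=true

On a managed cluster (EKS, GKE, AKS, etc.) you generally can't set these flags yourself, so check whether your provider exposes the gate before relying on this.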