Spark on Kubernetes executor cleanup

10/16/2019

I'm running some jobs using Spark on K8s, and sometimes my executors die mid-job. Whenever that happens, the driver immediately deletes the failed pod and spawns a new one.

Is there a way to stop Spark from deleting terminated executor pods? It would make debugging the failure a lot easier.

Right now I'm already shipping the logs of all pods to external storage so I can see them. But it's quite a hassle to query through the logs for every pod, and I won't be able to see the K8s metadata for them.

-- Jochen Niebuhr
apache-spark
kubernetes

2 Answers

10/16/2019

This setting was added in SPARK-25515. Sadly it isn't available in the currently released version, but it should be available in Spark 3.0.0.
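Once a release with that change ships, it should just be another spark-submit conf. A minimal sketch, assuming the spark.kubernetes.executor.deleteOnTermination property that SPARK-25515 introduces; the API server address, image, and jar path are placeholders:

# spark.kubernetes.executor.deleteOnTermination=false keeps executor pods
# around after they terminate, for post-mortem debugging.
spark-submit \
  --master k8s://https://<api-server-host>:<port> \
  --deploy-mode cluster \
  --name debug-run \
  --conf spark.kubernetes.container.image=<your-spark-image> \
  --conf spark.kubernetes.executor.deleteOnTermination=false \
  local:///path/to/your-application.jar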

-- Jochen Niebuhr
Source: StackOverflow

10/16/2019

Use job.spec.ttlSecondsAfterFinished to control how long the Job (and its pods) remains after the Job completes or fails.

For example:

apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl
spec:
  ttlSecondsAfterFinished: 100
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never

The Job pi-with-ttl will be eligible for automatic deletion 100 seconds after it finishes.

If the field is set to 0, the Job will be eligible to be automatically deleted immediately after it finishes. If the field is unset, this Job won’t be cleaned up by the TTL controller after it finishes.

Note that this TTL mechanism is alpha, behind the TTLAfterFinished feature gate. For more information, see the Kubernetes documentation on the TTL controller for finished resources.
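Before ttlSecondsAfterFinished takes effect, that alpha feature gate has to be enabled on the control plane. A rough sketch, assuming a self-managed cluster where you can pass flags to kube-controller-manager and kube-apiserver; the manifest filename refers to the example above and is otherwise a placeholder:

# Enable the gate on kube-controller-manager (where the TTL controller runs)
# and on kube-apiserver (so the field isn't dropped on create):
#   --feature-gates=TTLAfterFinished=true

# Then create the Job and confirm the TTL field was persisted:
kubectl apply -f pi-with-ttl.yaml
kubectl get job pi-with-ttl -o jsonpath='{.spec.ttlSecondsAfterFinished}'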

-- Efrat Levitan
Source: StackOverflow