Is there a way to automatically remove completed Jobs besides making a CronJob to clean up completed Jobs?
The K8s Job documentation states that the intended behavior of completed Jobs is for them to remain in a completed state until manually deleted. I am running thousands of Jobs a day via CronJobs, and I don't want to keep completed Jobs around.
You can now set history limits, or disable history altogether, so that failed or successful Jobs are not kept around indefinitely. See the CronJob documentation on jobs history limits.
To set the history limits:
The .spec.successfulJobsHistoryLimit and .spec.failedJobsHistoryLimit fields are optional. These fields specify how many completed and failed jobs should be kept. By default, they are set to 3 and 1 respectively. Setting a limit to 0 corresponds to keeping none of the corresponding kind of jobs after they finish.
The config with 0 limits would look like:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  successfulJobsHistoryLimit: 0
  failedJobsHistoryLimit: 0
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
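If the CronJob already exists, the same limits can also be applied in place with a patch; a minimal sketch, assuming the CronJob is named hello in the current namespace:
kubectl patch cronjob hello -p '{"spec":{"successfulJobsHistoryLimit":0,"failedJobsHistoryLimit":0}}'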
This is possible from Kubernetes 1.12 (alpha) with ttlSecondsAfterFinished. An example from Clean Up Finished Jobs Automatically:
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl
spec:
  ttlSecondsAfterFinished: 100
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
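To try it out, apply the Job and check back after the TTL has expired; a quick sketch, assuming the manifest above is saved as pi-with-ttl.yaml and the TTLAfterFinished feature gate is enabled on your cluster:
kubectl apply -f pi-with-ttl.yaml
# wait for the Job to complete, then roughly 100 more seconds
kubectl get job pi-with-ttl   # should eventually return NotFound once the TTL controller has deleted it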
I've found the following to work.
To remove failed jobs:
kubectl delete job $(kubectl get jobs | awk '$3 ~ 0' | awk '{print $1}')
To remove completed jobs:
kubectl delete job $(kubectl get jobs | awk '$3 ~ 1' | awk '{print $1}')
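Note that these column positions assume the older kubectl get jobs output (NAME, DESIRED, SUCCESSFUL, AGE). On newer kubectl versions the columns are NAME, COMPLETIONS, DURATION, AGE, so a rough equivalent for removing completed jobs would be:
kubectl delete job $(kubectl get jobs | awk 'NR>1 {split($2, c, "/"); if (c[1] == c[2]) print $1}')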
I'm using wernight/kubectl's kubectl image and scheduled a cron deleting anything that is:
- completed
- 2 to 9 days old (so I have 2 days to review any failed jobs)
It runs every 30 minutes, so I'm not accounting for jobs that are 10+ days old.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: cleanup
spec:
  schedule: "*/30 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: kubectl-runner
            image: wernight/kubectl
            command: ["sh", "-c", "kubectl get jobs | awk '$4 ~ /[2-9]d$/ || $3 ~ 1' | awk '{print $1}' | xargs kubectl delete job"]
          restartPolicy: Never
Using jsonpath:
kubectl delete job $(kubectl get job -o=jsonpath='{.items[?(@.status.succeeded==1)].metadata.name}')
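A similar filter can be used for failed Jobs; a rough sketch, assuming each failed Job reports a status.failed count of at least 1:
kubectl delete job $(kubectl get job -o=jsonpath='{.items[?(@.status.failed>=1)].metadata.name}')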
I recently built a kubernetes-operator to do this task.
After deployment, it monitors the selected namespace and deletes completed jobs/pods if they completed without errors/restarts.
As stated in the documentation, "It is up to the user to delete old jobs"; see http://kubernetes.io/docs/user-guide/jobs/#job-termination-and-cleanup
I would run a pod to do this cleanup based on job name and certain conditions, thus letting Kubernetes at least take care of the availability of your process here. You could run a recurring job for this (assuming you run Kubernetes 1.5).
A simple way to delete them is by running a cron job:
kubectl get jobs --all-namespaces | sed '1d' | awk '{ print $2, "--namespace", $1 }' | while read line; do kubectl delete jobs $line; done
Another way using a field-selector:
kubectl delete jobs --field-selector status.successful=1
Another way to run that in a CronJob, similar to another answer: create a service account with list and delete permissions on batch/jobs (see the RBAC sketch after the manifest below). This uses bitnami/kubectl, because the previously suggested kubectl image does not support the --field-selector option.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: jobs-cleanup
spec:
  schedule: "*/30 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: my-sa-name
          containers:
          - name: kubectl-container
            image: bitnami/kubectl:latest
            command: ["sh", "-c", "kubectl delete jobs --field-selector status.successful=1"]
          restartPolicy: Never
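For the service account, one way to grant just the needed permissions is with kubectl's built-in role helpers; a rough sketch, assuming the name my-sa-name, a role name of jobs-cleanup-role, and the default namespace (adjust to your setup):
kubectl create serviceaccount my-sa-name
kubectl create role jobs-cleanup-role --verb=list,delete --resource=jobs
kubectl create rolebinding jobs-cleanup-binding --role=jobs-cleanup-role --serviceaccount=default:my-sa-name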