How to ensure kubernetes cronjob does not restart on failure

8/2/2018

I have a cronjob that sends out emails to customers. It occasionally fails for various reasons. I do not want it to restart, but it still does.

I am running Kubernetes on GKE. To get it to stop, I have to delete the CronJob and then kill all the pods it creates manually.

This is bad, for obvious reasons.

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  creationTimestamp: 2018-06-21T14:48:46Z
  name: dailytasks
  namespace: default
  resourceVersion: "20390223"
  selfLink: [redacted]
  uid: [redacted]
spec:
  concurrencyPolicy: Forbid
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      template:
        metadata:
          creationTimestamp: null
        spec:
          containers:
          - command:
            - kubernetes/daily_tasks.sh
            env:
            - name: DB_HOST
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.hostIP
            envFrom:
            - secretRef:
                name: my-secrets
            image: [redacted]
            imagePullPolicy: IfNotPresent
            name: dailytasks
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          restartPolicy: Never
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
  schedule: 0 14 * * *
  successfulJobsHistoryLimit: 3
  suspend: true
status:
  active:
  - apiVersion: batch
    kind: Job
    name: dailytasks-1533218400
    namespace: default
    resourceVersion: "20383182"
    uid: [redacted]
  lastScheduleTime: 2018-08-02T14:00:00Z
-- Doug
kubernetes

2 Answers

8/4/2018

It turns out that you have to set a backoffLimit: 0 in combination with restartPolicy: Never in combination with concurrencyPolicy: Forbid.

backoffLimit means the number of times it will retry before it is considered failed. The default is 6.

concurrencyPolicy set to Forbid means it will run 0 or 1 times, but not more.

restartPolicy set to Never means it won't restart on failure.

You need to do all 3 of these things, or your cronjob may run more than once.

spec:
  concurrencyPolicy: Forbid
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      [ADD THIS -->]backoffLimit: 0
      template: 
      ... MORE STUFF ...
-- Doug
Source: StackOverflow

8/2/2018

The kubernetes cronjob resources has a field, suspend in its spec.

You can't do it by default, but if you want to ensure it doesn't run, you could update the script that sends emails and have it patch the cronjob resource to add suspend: true if it fails

Something like this

kubectl patch cronjob <name> -p '{"spec": { "suspend": true }}'
-- jaxxstorm
Source: StackOverflow