Helm failed test job keeps restarting

2/14/2020

I run my Django unit tests in my cluster with a Helm command. They launch correctly, but when the unit tests fail, Kubernetes keeps running them again and again, despite "restartPolicy: Never".

How can I force Kubernetes/Helm to run my unit tests only once, whether they fail or not?

django-test.yml:

apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ .Release.Name }}-django-test"
  labels:
    app: {{ template "myproject.fullname" . }}
    chart: "{{ .Chart.Name }}-{{ .Chart.Version }}"
    release: "{{ .Release.Name }}"
  annotations:
    "helm.sh/hook": test
spec:
  template:
    metadata:
      name: "{{ .Release.Name }}-django-test"
    spec:
      containers:
        - name: main
          image: {{ template "myproject.django.image" . }}
          args:
            - test
      restartPolicy: Never

➜  ~ kubectl get pods --namespace mynamespace
NAME                         READY   STATUS      RESTARTS   AGE
myrel-django-install-jhspp   0/1     Completed   0          78m
myrel-django-test-6dzrq      0/1     Error       0          47m
myrel-django-test-7hpwm      0/1     Error       0          67m
myrel-django-test-nd74f      0/1     Error       0          73m
myrel-django-test-np5qd      0/1     Error       0          39m
myrel-django-test-nqjdw      0/1     Error       0          60m
myrel-django-test-tjbhc      0/1     Error       0          53m
...
-- Vincent J
kubernetes
kubernetes-helm

1 Answer

2/14/2020

Have a look at the backoffLimit option.

Pod backoff failure policy

There are situations where you want to fail a Job after some amount of retries due to a logical error in configuration etc. To do so, set .spec.backoffLimit to specify the number of retries before considering a Job as failed. The back-off limit is set by default to 6.

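I have not tried this, but I believe it will get you the behaviour you want. Note that backoffLimit is a field of the Job spec, next to template, not of the Pod spec where restartPolicy lives. With restartPolicy: Never the failed container is not restarted in place; instead the Job controller creates a new Pod for each retry, which is why you see several Pods with 0 restarts each. Setting backoffLimit: 0 allows no retries, so the tests run only once:

    spec:
      backoffLimit: 0
      template:
        metadata:
          name: "{{ .Release.Name }}-django-test"
        spec:
          containers:
            - name: main
              image: {{ template "myproject.django.image" . }}
              args:
                - test
          restartPolicy: Never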
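
To make the retry count concrete, here is a toy model of how many Pods a Job with restartPolicy: Never ends up creating. It is only a sketch of the documented rule (the Job is marked failed once the number of failures exceeds backoffLimit), not the Kubernetes controller's actual code. The function name pods_created and its parameters are hypothetical, and the real controller also waits with an increasing back-off delay between Pods, which this ignores.

```python
def pods_created(backoff_limit: int, pod_succeeds=lambda attempt: False) -> int:
    """Toy model: count the Pods a Job starts before it succeeds or is marked failed.

    backoff_limit mirrors .spec.backoffLimit.
    pod_succeeds(attempt) is a hypothetical callback that says whether the
    Pod for the given attempt (1-based) exits successfully; by default
    every Pod fails, as in the question.
    """
    failures = 0
    created = 0
    # The Job keeps creating Pods while failures have not exceeded the limit.
    while failures <= backoff_limit:
        created += 1
        if pod_succeeds(created):
            break
        failures += 1
    return created


if __name__ == "__main__":
    for limit in (0, 1, 6):
        print(f"backoffLimit={limit}: {pods_created(limit)} Pod(s) when every run fails")
```

Under this model, backoffLimit: 0 gives exactly one Pod, backoffLimit: 1 gives two, and the default of 6 gives seven, so 0 is the value that matches "run only once".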

You may also find the following post useful: Understanding backoffLimit in Kubernetes Job

-- Filip Nikolov
Source: StackOverflow