How to limit the retry count of ImagePullBackOff when the Docker image is not accessible?

4/6/2020

In my Kubernetes Job, I specified a wrong Docker image name, and the Job stays in the ImagePullBackOff state. Since the image name is wrong, I want the Job to release its resources and stop trying to pull the image. I noticed there is an imagePullPolicy setting, but it does not seem to meet my needs. I want the Job to always pull the Docker image, but with a limit on the number of pull retries. Does Kubernetes support this?

-- SparkSnail
kubernetes

1 Answer

4/6/2020

You should use backoffLimit and activeDeadlineSeconds, the generic Job controls, to stop the Job after a specified number of failures or once a time limit has passed.

Example

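 apiVersion: batch/v1
 kind: Job
 metadata:
   name: myjob
   namespace: default
 spec:
   backoffLimit: 5              # give up after 5 failed Pod retries
   activeDeadlineSeconds: 100   # fail the Job once it has been active for 100 seconds
   template:
     metadata:
       name: myjob
     spec:
       containers:
       - name: nginx
         image: wrong_image_name   # nonexistent image, reproduces ImagePullBackOff
       restartPolicy: OnFailure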

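Another way to stop a Job is to set its activeDeadlineSeconds field to a number of seconds. activeDeadlineSeconds applies to the duration of the Job, no matter how many Pods are created. This matters for the case in the question: a Pod stuck in ImagePullBackOff typically stays Pending instead of failing, so backoffLimit on its own may never be reached, while the deadline still ends the Job.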

Once a Job reaches activeDeadlineSeconds, all of its running Pods are terminated and the Job status becomes type: Failed with reason: DeadlineExceeded.
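
As a rough illustration of the outcome described above, the status of a Job that hit its deadline should contain a condition along these lines. This is a trimmed sketch, not captured output: only the type and reason values come from the text above, and the other fields a real cluster reports (timestamps, counters, message) are left out. Assuming the Job from the example, it could be inspected with kubectl get job myjob -o yaml.

```yaml
# Illustrative excerpt of .status for a Job that exceeded activeDeadlineSeconds
status:
  conditions:
  - type: Failed              # the Job is marked as failed
    status: "True"
    reason: DeadlineExceeded  # ended by the deadline, not by backoffLimit
```

Checking the reason field is a simple way to tell a deadline timeout apart from a Job that failed because it used up its retries.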

-- DT.
Source: StackOverflow