What is the right way to free up kubernetes resources for a kubernetes job that fails pulling the image?

10/24/2019

Context

We have long-running Kubernetes jobs based on Docker containers. The containers need substantial resources (e.g. 15 GB memory, 2 CPUs), and we use the cluster autoscaler to scale up new worker nodes on demand.

Scenario

Users can select the version of the Docker image to be used for a job, e.g. 1.0.0, 1.1.0, or, in the test environment, even a commit hash of the code the image was built from.

Because we leave the Docker tag as free text, a user can type a non-existing tag. When that happens, the job's pod enters the ImagePullBackOff state. The pod stays in this state and keeps its resources reserved, so they cannot be reused by any other job.

Question

What is the right solution, applied within Kubernetes itself, for failing the pod immediately (or at least quickly) when a pull fails because the image:tag does not exist?

Possibilities

I looked into backoffLimit and set it to 0, but this neither fails nor removes the job, and of course the resources stay allocated.

Maybe the pods could be killed by a cron job, but I'm not sure how to do that.
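One way to sketch the cron-job idea is a CronJob that runs kubectl and deletes pods stuck in ImagePullBackOff. This is only a sketch under assumptions: the bitnami/kubectl image, the cleanup service account, and the jsonpath filter are all hypothetical, and the service account would need RBAC permission to list and delete pods (not shown).

```yaml
# Hypothetical cleanup CronJob: deletes pods whose containers are
# waiting with reason ImagePullBackOff. Assumes a "cleanup" service
# account that is allowed to get, list, and delete pods.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: imagepull-cleanup
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: cleanup     # assumed RBAC, not shown
          containers:
          - name: cleanup
            image: bitnami/kubectl        # any image with kubectl and a shell
            command:
            - /bin/sh
            - -c
            # Print "<pod-name> <waiting-reason>" per pod, keep the ones
            # backing off on image pulls, and delete them.
            - >
              kubectl get pods -o
              jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.containerStatuses[*].state.waiting.reason}{"\n"}{end}'
              | grep ImagePullBackOff | cut -d" " -f1
              | xargs -r kubectl delete pod
          restartPolicy: Never
```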

Ideally, resources should not even be allocated for a job that references a non-existent Docker image, but I'm not sure whether this can easily be achieved.

Any other?

-- timv
kubernetes
kubernetes-jobs

3 Answers

10/25/2019

When a Job completes, no more Pods are created, but the Pods are not deleted either.

By default, a Job will run uninterrupted unless a Pod fails (restartPolicy=Never) or a Container exits in error (restartPolicy=OnFailure), at which point the Job defers to the .spec.backoffLimit described above. Once .spec.backoffLimit has been reached the Job will be marked as failed and any running Pods will be terminated.

Another way to terminate a Job is by setting an active deadline. Do this by setting the .spec.activeDeadlineSeconds field of the Job to a number of seconds. The activeDeadlineSeconds applies to the duration of the job, no matter how many Pods are created. Once a Job reaches activeDeadlineSeconds, all of its running Pods are terminated and the Job status will become type: Failed with reason: DeadlineExceeded.

Note that a Job’s .spec.activeDeadlineSeconds takes precedence over its .spec.backoffLimit. Therefore, a Job that is retrying one or more failed Pods will not deploy additional Pods once it reaches the time limit specified by activeDeadlineSeconds, even if the backoffLimit is not yet reached.

Here is more information: jobs.
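The two fields described above can be combined in a Job spec. A minimal sketch (the image name and tag stand in for the user-supplied free text; the deadline value is just an example):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: user-job
spec:
  backoffLimit: 0              # do not retry failed pods
  activeDeadlineSeconds: 600   # fail the whole Job after 10 minutes,
                               # even if the pod is stuck in ImagePullBackOff
  template:
    spec:
      containers:
      - name: main
        image: myrepo/worker:1.0.0   # hypothetical user-chosen image:tag
      restartPolicy: Never
```

Once the deadline is reached, the running pod is terminated and its resources are freed, which addresses the ImagePullBackOff case where backoffLimit alone does not help.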

You can also set the concurrencyPolicy of a CronJob to Replace, which replaces the currently running job with a new one.

Here is an example:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/2 * * * *"
  concurrencyPolicy: Replace
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster && sleep 420
          restartPolicy: Never

Setting concurrencyPolicy to Replace means that if it is time for a new job run and the previous run hasn't finished yet, the cron job replaces the currently running job with a new one.

Regardless of these solutions, the root of your problem is the wrong image references, so automatically deleting pods or jobs doesn't fix it: if nothing changes in the job definition or the image, the pods will fail again the next time the job is created.

Here is an example of troubleshooting the Error: ImagePullBackOff / Normal BackOff events: ImagePullBackOff.

-- MaggieO
Source: StackOverflow

10/24/2019

You can use failedJobsHistoryLimit for failed jobs and successfulJobsHistoryLimit for successful jobs.

With these two parameters, you can keep your job history clean.

Use .spec.backoffLimit to specify the number of retries before a Job is considered failed.
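Note that the history limits live on the CronJob spec, while backoffLimit belongs to the Job spec inside the jobTemplate. A minimal sketch (the limit values are examples, not recommendations):

```yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/2 * * * *"
  successfulJobsHistoryLimit: 3   # keep at most 3 finished jobs
  failedJobsHistoryLimit: 1       # keep at most 1 failed job
  jobTemplate:
    spec:
      backoffLimit: 2             # retry a failing pod at most twice
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args: ["/bin/sh", "-c", "date"]
          restartPolicy: Never
```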

-- c4f4t0r
Source: StackOverflow

11/1/2019

Looking at your design, I would recommend adding an initContainer to the Job specification to check whether a Docker image with the given tag exists in the registry.

If the image with the tag doesn't exist in the registry, the initContainer can report an error and fail the Job's pod by exiting with a non-zero exit code.

After that, the Job's pod will be restarted. After a certain number of attempts, the Job will reach the Failed state. By configuring the .spec.ttlSecondsAfterFinished option, failed jobs can be cleaned up automatically.

If a Pod’s init container fails, Kubernetes repeatedly restarts the Pod until the init container succeeds. However, if the Pod has a restartPolicy of Never, Kubernetes does not restart the Pod.

If the image exists, the initContainer script exits with a zero exit code, and the main Job container's image is pulled and the container starts.
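A sketch of this approach, assuming an unauthenticated registry reachable at registry.example.com (real registries usually require a bearer token, which is not handled here; the image name, tag, and curlimages/curl helper image are all placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: user-job
spec:
  backoffLimit: 1
  ttlSecondsAfterFinished: 300   # remove the Job 5 minutes after it finishes
  template:
    spec:
      initContainers:
      - name: check-image
        image: curlimages/curl   # small image that ships curl
        command:
        - /bin/sh
        - -c
        # HEAD the manifest for the requested tag via the registry v2 API;
        # -f makes curl exit non-zero on HTTP errors such as 404 (no such tag),
        # which fails the pod before the main image is ever pulled.
        - curl -sfI https://registry.example.com/v2/worker/manifests/1.0.0
      containers:
      - name: main
        image: registry.example.com/worker:1.0.0   # user-chosen tag
      restartPolicy: Never
```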

-- VAS
Source: StackOverflow