Ansible + Kubernetes: how to wait for a Job completion

8/9/2019

Thanks in advance for taking the time to read this.

I'm playing with Kubernetes and use Ansible for all interactions with my cluster. I have some playbooks that successfully deploy applications.

The main Ansible module I use for deployment is k8s, which allows me to apply my YAML configs.

I can successfully wait until deployment completes using

k8s:
    state: present
    src: config.yaml    # src reads the manifest from a file; definition expects inline YAML
    wait: yes
    wait_timeout: 10

Unfortunately, the same trick doesn't work by default with Kubernetes Jobs. The module simply returns immediately, which is clearly described in the module documentation:

For resource kinds without an implementation, wait returns immediately unless wait_condition is set.

To cover such a case, the module spec suggests specifying

wait_condition:
  reason: REASON
  type: TYPE
  status: STATUS

The doc also says:

The possible types for a condition are specific to each resource type in Kubernetes. See the API documentation of the status field for a given resource to see possible choices.

I checked API specification and found the same as stated in the following answer:

the only type values are "Complete" and "Failed", and that they may have a "True" or "False" status

So, my QUESTION is simple: does anyone know how to use this wait_condition properly? Have you tried it already (it's still a relatively new feature)?

Any ideas where to look are also appreciated.

UPDATE:

Here is the workaround I'm using for now:

- name: Run Job
  k8s:
    state: present
    src: job_definition.yml    # src reads the manifest from a file

- name: Wait Until Job Is Done
  k8s_facts:
    name: job_name
    kind: Job
  register: job_status
  # status.active is absent once no pods are running, so default it to 0
  until: (job_status.resources[0].status.active | default(0)) == 0
  retries: 10
  delay: 10
  ignore_errors: yes

- name: Get Final Job Status
  k8s_facts:
    name: job_name
    kind: Job
  register: job_status

- fail:
    msg: "Job has failed!"
  # status.failed is only present when at least one pod has failed
  when: (job_status.resources[0].status.failed | default(0)) > 0

But it would be better to use the proper module feature directly.

-- Konstantin Dobroliubov
ansible
kubernetes

2 Answers

9/17/2019

wait_condition works for me with Jobs, as long as timeout/type/status are set appropriately; choose the timeout based on how long your job usually takes to run:

        wait: yes
        wait_timeout: 300
        wait_condition:
          type: Complete
          status: True
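
For context, a complete task might look roughly like the sketch below. The Job name example-job, the default namespace, and the busybox container are placeholders made up for illustration; only the wait, wait_timeout, and wait_condition lines come from the snippet above.

```yaml
- name: Run Job and wait for it to complete
  k8s:
    state: present
    wait: yes
    wait_timeout: 300          # seconds; should exceed the Job's usual run time
    wait_condition:
      type: Complete
      status: True
    definition:
      apiVersion: batch/v1
      kind: Job
      metadata:
        name: example-job      # hypothetical name
        namespace: default     # hypothetical namespace
      spec:
        backoffLimit: 0        # do not retry failed pods
        template:
          spec:
            restartPolicy: Never
            containers:
              - name: main
                image: busybox # placeholder image
                command: ["sh", "-c", "echo done"]
```
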
-- flabatut
Source: StackOverflow

8/19/2019

Kubernetes documentation specifies that:

As pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the task (ie, Job) is complete.

Based on this and the API specification you already linked, we can assume that a Job will have the condition type Complete set to True once it has succeeded as many times as you requested.

Hence:

wait_condition:
  type: Complete
  status: True

Should do the "job".

As stated in the k8s plugin code, reason is ignored when it is not specified.

I haven't tested this; it's based only on the code and documentation, so it would be nice if you could confirm whether it works.
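
One caveat that follows from waiting on Complete: as far as I can tell, a Job that fails never gets that condition, so the task would simply run into wait_timeout. A rough, untested sketch for telling the two cases apart, reusing the k8s and k8s_facts modules and the job_definition.yml / job_name placeholders from the question. The block/rescue layout is my own suggestion, and it assumes the k8s task fails when the wait times out:

```yaml
- block:
    - name: Run Job and wait for completion
      k8s:
        state: present
        src: job_definition.yml    # manifest file from the question
        wait: yes
        wait_timeout: 300
        wait_condition:
          type: Complete
          status: True
  rescue:
    # Reached only if the task above failed (assumed: e.g. on wait timeout)
    - name: Get Job status after the wait failed
      k8s_facts:
        name: job_name
        kind: Job
      register: job_status

    - name: Report why the wait did not succeed
      fail:
        msg: >-
          Job did not reach Complete within the timeout;
          failed pods: {{ job_status.resources[0].status.failed | default(0) }}
```

A failed-pod count of 0 here would suggest the Job is just slow and wait_timeout needs raising, rather than the Job being broken.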

-- Daniel Szot
Source: StackOverflow