Thanks in advance for taking the time to read this.
I'm playing with Kubernetes and use Ansible for all interactions with my cluster. I have some playbooks that successfully deploy applications.
The main Ansible module I use for deployment is k8s, which allows me to apply my YAML configs.
I can successfully wait until a deployment completes using:
k8s:
  state: present
  definition: config.yaml
  wait: yes
  wait_timeout: 10
Unfortunately, the same trick doesn't work by default with Kubernetes Jobs. The module simply returns immediately, which is clearly described in the module documentation:
For resource kinds without an implementation, wait returns immediately unless wait_condition is set.
To cover such a case, the module spec suggests specifying:
wait_condition:
  reason: REASON
  type: TYPE
  status: STATUS
The doc also says:
The possible types for a condition are specific to each resource type in Kubernetes. See the API documentation of the status field for a given resource to see possible choices.
I checked the API specification and found the same as stated in the following answer:
the only type values are “Complete” and “Failed”, and that they may have a “True” or “False” status
So, my QUESTION is simple: does anyone know how to use this wait_condition properly? Have you tried it already (as of now, it's a relatively new feature)?
Any ideas where to look are also appreciated.
UPDATE:
This is the workaround I use for now:
- name: Run Job
  k8s:
    state: present
    definition: job_definition.yml

- name: Wait Until Job Is Done
  k8s_facts:
    name: job_name
    kind: Job
  register: job_status
  until: job_status.resources[0].status.active != 1
  retries: 10
  delay: 10
  ignore_errors: yes

- name: Get Final Job Status
  k8s_facts:
    name: job_name
    kind: Job
  register: job_status

- fail:
    msg: "Job Has Been Failed!"
  when: job_status.resources[0].status.failed == 1
But it would be better to use the proper module feature directly.
wait_condition works for me with Jobs, as long as the timeout, type, and status are set appropriately, with the timeout based on how long your job usually takes to run:
wait: yes
wait_timeout: 300
wait_condition:
  type: Complete
  status: True
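For context, here is a minimal sketch of how those options might sit inside a full task. It reuses the job_definition.yml name from the question and assumes that this is a local file containing the Job manifest, loaded through the module's src option; the 300-second timeout is only an illustrative value and should be adjusted to the job.

```yaml
# Sketch only: assumes job_definition.yml is a local file holding the Job manifest.
- name: Run Job and wait for it to complete
  k8s:
    state: present
    src: job_definition.yml     # path to the manifest file (assumption)
    wait: yes
    wait_timeout: 300           # seconds; illustrative, tune to the job's usual duration
    wait_condition:
      type: Complete            # Job condition type to wait for
      status: "True"            # quoted so YAML keeps it as a string
```

If the condition is not reached within wait_timeout, the task should fail, which would replace the separate polling and fail tasks from the workaround.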
Kubernetes documentation specifies that:
As pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the task (ie, Job) is complete.
Based on this and the API specification you already linked, we can assume that the Job will have a condition of type Complete with status True once it has completed successfully as many times as you requested.
Hence:
wait_condition:
  type: Complete
  status: True
Should do the "job".
As stated in the k8s plugin code, reason is ignored when it is not specified.
I didn't test it. This is based only on the code and documentation, so it would be nice if you could confirm whether it works or not.
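One way to confirm it might be to read the Job back after the wait and inspect its conditions directly. The following is an untested sketch under the same assumptions as the question (a Job named job_name, queried with k8s_facts); the filter chain simply looks for a condition with type Complete and status "True" in status.conditions.

```yaml
# Sketch only: job_name is the placeholder name used in the question.
- name: Read back the Job
  k8s_facts:
    kind: Job
    name: job_name
  register: job_status

- name: Fail unless the Job reports Complete=True
  fail:
    msg: "Job did not reach the Complete condition"
  # default([]) guards against a Job that has no conditions yet
  when: >-
    job_status.resources[0].status.conditions | default([])
    | selectattr('type', 'equalto', 'Complete')
    | selectattr('status', 'equalto', 'True')
    | list | length == 0
```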