How to Properly Update the Status of a Job

8/19/2019

As far as I know, when most people want to know whether a Kubernetes (or even Spark) Job is done, they run some sort of loop that periodically checks the respective API to see if the Job has finished.

Right now, I'm doing that with Kubernetes by running the wait in the background with the shell's background (&) operator (bash inside Python below):

import subprocess

cmd = f'''
kubectl wait \\
    --for=condition=complete \\
    --timeout=-1s \\
    job/job_name \\
    > logs/kube_wait_log.txt \\
    &
'''

kube_listen = subprocess.run(
    cmd,
    shell = True,
    stdout = subprocess.PIPE
)

So... I actually have two (correlated) questions:

  1. Is there a better way of doing this in the background with shell other than with the & operator?
  2. The option that I think would be best is actually to use cURL from inside the Job to update my Local Server API that interacts with Kubernetes.
    • However, I don't know how I can perform a cURL from a Job. Is it possible?
    • I imagine you would have to expose ports somewhere but where? And is it really supported? Could you create a Kubernetes Service to manage the ports and connections?
-- Philippe Fanaro
kubernetes
kubernetes-jobs

1 Answer

8/19/2019

If you don't want to block on a process running to completion, you can create a subprocess.Popen instance instead. Once you have this, you can poll() it to see if it's completed. (You should try really really really hard to avoid using shell=True if at all possible.) So one variation of this could look like (untested):

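import subprocess
import time

with open('logs/kube_wait_log.txt', 'w') as f:
  with subprocess.Popen(['kubectl', 'wait',
                         '--for=condition=complete',
                         '--timeout=-1s',
                         'job/job_name'],
                        stdin=subprocess.DEVNULL,
                        stdout=f,
                        stderr=subprocess.STDOUT) as p:
    while True:
      # poll() returns None while kubectl is still running and its
      # exit code (0 on success) once it has finished.
      returncode = p.poll()
      if returncode is not None:
        if returncode == 0:
          job_is_complete()
        break
      time.sleep(1)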
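
One detail worth spelling out is that poll() returns None while the process is running and the exit code afterwards, and a successful exit code of 0 is falsy in Python, so the check has to compare against None rather than rely on truthiness. A minimal sketch of that, using a short-lived Python child process as a stand-in for kubectl wait (the helper name wait_for_exit and the stand-in command are my own assumptions for illustration):

```python
import subprocess
import sys
import time


def wait_for_exit(proc, interval=0.05, timeout=10.0):
    """Poll a Popen object until it exits and return its exit code.

    Hypothetical helper for illustration. Raises TimeoutError if the
    process is still running after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        returncode = proc.poll()  # None while the child is still running
        if returncode is not None:
            return returncode
        if time.monotonic() >= deadline:
            raise TimeoutError("process did not exit in time")
        time.sleep(interval)


# Stand-in for `kubectl wait`: a child that sleeps briefly, then exits with 0.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(0.2)"],
    stdin=subprocess.DEVNULL,
)
exit_code = wait_for_exit(child)  # 0 on success, which is falsy in Python
print("exit code:", exit_code)
```

Between polls your program is free to do other work, which is the point of using Popen instead of a blocking call.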

Better than shelling out to kubectl, though, is using the official Kubernetes Python client library. Rather than using this "wait" operation, you would watch the job object in question and see if its status changes to "completed". This could look roughly like (untested):

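from kubernetes import client, config, watch

config.load_kube_config()  # use config.load_incluster_config() when running inside the cluster
jobsv1 = client.BatchV1Api()
w = watch.Watch()
# A watch works on list calls, so select the single Job by name.
for event in w.stream(jobsv1.list_namespaced_job,
                      namespace='default',
                      field_selector='metadata.name=job_name'):
  job = event['object']
  if job.status.completion_time is not None:
    job_is_complete()
    w.stop()
    break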

The Job's Pod doesn't need to update its own status with the Kubernetes server. It just needs to exit with a successful status code (0) when it's done, and that will get reflected in the Job's status field.
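
As a rough sketch of what "reflected in the Job's status field" looks like: a finished Job carries a condition of type Complete (or Failed) whose status is "True". The helper below, job_state, is a hypothetical name of mine, and it works on a plain dict assumed to be shaped like the status section of `kubectl get job job_name -o json`; the sample dicts are made up for illustration.

```python
def job_state(status):
    """Classify a Job from its status dict: 'complete', 'failed', or 'unfinished'.

    Hypothetical helper; `status` is assumed to look like the `status`
    section of `kubectl get job <name> -o json`.
    """
    for condition in status.get("conditions") or []:
        if condition.get("status") != "True":
            continue
        if condition.get("type") == "Complete":
            return "complete"
        if condition.get("type") == "Failed":
            return "failed"
    return "unfinished"


# Made-up sample statuses for illustration.
finished = {
    "succeeded": 1,
    "completionTime": "2019-08-19T12:00:00Z",
    "conditions": [{"type": "Complete", "status": "True"}],
}
broken = {
    "failed": 1,
    "conditions": [{"type": "Failed", "status": "True"}],
}
still_running = {"active": 1}

print(job_state(finished))       # complete
print(job_state(broken))         # failed
print(job_state(still_running))  # unfinished
```

Checking for the Failed condition as well keeps a watcher from waiting forever on a Job that will never complete.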

-- David Maze
Source: StackOverflow