How to build an image for a Kubeflow pipeline?

2/3/2020

I recently found out about Kubeflow and Kubeflow Pipelines, but it is not clear to me how to build an image from my Python program.

Let's assume that I have a simple Python class with a method that crops images:

class Image_Proc:
    def crop_image(self, image, start_pixel, end_pixel):
        # crop
        return cropped_image

How should I containerize this and use it in a Kubeflow pipeline? Do I need to wrap it in an API (with Flask, for example), or do I need to connect to some media/data broker?

How does a Kubeflow pipeline send input to this code and transfer its output to the next step?

-- AVarf
kubeflow
kubeflow-pipelines
kubernetes
machine-learning

1 Answer

2/6/2020

Basically, you can follow the steps provided by Docker to create a Docker image and publish it to Docker Hub (you could also set up your own private Docker registry, but I think that may be too much work for a beginner). Roughly, the steps are:

  1. Create a Dockerfile. In it, specify a few things: the base image (in your case, just use the python image from Docker Hub), the working directory, the files to copy in (your Python program), and the command to execute when the image is run.
  2. Run your image locally to make sure it works as expected (install Docker first if you haven't), then push it to Docker Hub.
  3. Once it is published, Docker Hub gives you an image URL; use that URL when you create pipelines in Kubeflow.
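On the question of whether a Flask API is needed: the command that the image runs can simply be a script that reads its input from a file path and writes its output to another. Below is a minimal sketch of such an entry point, not a definitive implementation. The flag names (--input, --output, --start, --end) and the main function are hypothetical, and the image is assumed to be a JSON file holding a list of pixel rows so that the example needs only the standard library; a real component would more likely use an imaging library.

```python
import argparse
import json


class Image_Proc:
    def crop_image(self, image, start_pixel, end_pixel):
        # Assumption: `image` is a list of pixel rows, and the two pixels are
        # (x, y) corners of the crop rectangle, with the end corner exclusive.
        (x0, y0), (x1, y1) = start_pixel, end_pixel
        return [row[x0:x1] for row in image[y0:y1]]


def main(argv=None):
    # Hypothetical command-line interface for the container's command.
    parser = argparse.ArgumentParser(description="Crop an image stored as JSON rows.")
    parser.add_argument("--input", required=True, help="path of the input file")
    parser.add_argument("--output", required=True, help="path to write the result to")
    parser.add_argument("--start", type=int, nargs=2, required=True, metavar=("X", "Y"))
    parser.add_argument("--end", type=int, nargs=2, required=True, metavar=("X", "Y"))
    args = parser.parse_args(argv)

    # Read the input from a file instead of receiving it over HTTP.
    with open(args.input) as f:
        image = json.load(f)

    cropped = Image_Proc().crop_image(image, tuple(args.start), tuple(args.end))

    # Write the result to a file so that a later step can pick it up.
    with open(args.output, "w") as f:
        json.dump(cropped, f)
```

With the usual `if __name__ == "__main__": main()` guard added at the bottom and the file saved as, say, crop.py, the command in the Dockerfile could be `python crop.py` followed by the flags.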

Also, you can read the documentation to learn how to create pipelines (under the hood, a Kubeflow pipeline is essentially an Argo workflow). In your case, just fill in the inputs and/or outputs sections of the relevant step in the pipeline YAML file.
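To make the inputs and outputs sections more concrete, here is a rough sketch that builds one Argo-style step as a Python dictionary and prints it as JSON (which is also valid YAML). Everything specific in it is an assumption for illustration: the image URL, the script name crop.py, its flags, the parameter names, and the file paths. The YAML that Kubeflow actually generates when you compile a pipeline with its Python SDK will contain more fields than this.

```python
import json

# Hypothetical image reference; replace it with the URL of your published image.
IMAGE = "docker.io/example-user/crop-image:latest"

# Hypothetical paths inside the container.
IN_PATH = "/tmp/in.json"
OUT_PATH = "/tmp/out.json"

# Hypothetical parameter names for the crop rectangle.
PARAMS = ["start-x", "start-y", "end-x", "end-y"]


def ref(name):
    # Argo substitutes {{inputs.parameters.<name>}} with the parameter's value.
    return "{{inputs.parameters.%s}}" % name


crop_step = {
    "name": "crop",
    "inputs": {
        # Small values are passed as parameters and end up on the command line.
        "parameters": [{"name": p} for p in PARAMS],
        # Files are passed as artifacts and placed at `path` before the step runs.
        "artifacts": [{"name": "image", "path": IN_PATH}],
    },
    "container": {
        "image": IMAGE,
        "command": ["python", "crop.py"],
        "args": [
            "--input", IN_PATH,
            "--output", OUT_PATH,
            "--start", ref("start-x"), ref("start-y"),
            "--end", ref("end-x"), ref("end-y"),
        ],
    },
    "outputs": {
        # The file at `path` is collected after the step and offered to later steps.
        "artifacts": [{"name": "cropped", "path": OUT_PATH}],
    },
}

print(json.dumps(crop_step, indent=2))
```

The point of the sketch is that the step's code never talks to the pipeline directly: it receives command-line arguments and input files, and it leaves its result in a file that the next step can declare as its own input.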

-- Dd__Mad
Source: StackOverflow