How to create an ML pipeline for model retraining and inference [Kubernetes]

7/30/2021

I am working on a task where I need to design an ML pipeline for model retraining and inference on Kubernetes.

I read some articles and watched some tutorials, with the help of which I have created two apps as described below:

  • For model retraining, I have scheduled a CronJob (Flask App #1)
  • For inference, I have created a separate Flask app (Flask App #2)

I don't know how we can transfer the latest trained model from the CronJob to the inference Flask app.

I am a newbie in Kubernetes; any suggestion would be of great help.

-- arush1836
flask
google-cloud-platform
kubernetes
kubernetes-cronjob
python

1 Answer

8/2/2021

We can make use of a Google Persistent Disk, a Kubernetes Persistent Volume, and a Kubernetes Persistent Volume Claim to do so.

I tried replicating a scenario where a CronJob appends the current date and time to a text file each time it creates a Pod. I then created a separate Pod outside the CronJob to access this text file and was successful. Below are the steps I followed:

  1. Create a Standard Persistent Disk on GCP using the following gcloud command,

    gcloud compute disks create pd-name --size 500G --type pd-standard --zone us-central1-c

  2. Then create a Kubernetes Persistent Volume using the above PD, and a Persistent Volume Claim so that Pods can request storage on the Persistent Volume, using the following configuration,

config.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv
spec:
  storageClassName: "test"
  capacity:
    storage: 10G
  accessModes:
    - ReadWriteOnce
  claimRef:
    namespace: default
    name: pv-claim
  gcePersistentDisk:
    pdName: pd-name
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-claim
spec:
  storageClassName: "test"
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10G

  3. Deploy a CronJob that mounts the PVC and writes the current date and time into a text file stored on the PV, using the following configuration (on Kubernetes 1.21+ you can use apiVersion: batch/v1 instead of batch/v1beta1),

Cronjob.yaml:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: cron
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          volumes:
            - name: pv-storage
              persistentVolumeClaim:
                claimName: pv-claim
          containers:
            - name: container
              image: nginx
              volumeMounts:
                - mountPath: "/usr/data"
                  name: pv-storage
              command:
              - /bin/sh
              - -c
              - date >> /usr/data/msg.txt
          restartPolicy: OnFailure

See Configure a Pod to Use a PersistentVolume for Storage for more information.

  4. Deploy a Pod with the same PVC configuration to check whether data added by the CronJob's Pods is visible through this Pod, using the following configuration,

Readpod.yaml:

kind: Pod
apiVersion: v1
metadata:
  name: readpod
spec:
  volumes:
    - name: pv-storage
      persistentVolumeClaim:
        claimName: pv-claim
  containers:
    - name: read-container
      image: nginx
      volumeMounts:
        - mountPath: "/usr/data"
          name: pv-storage
  5. Then use the kubectl exec command to get a shell to a running container in the above Pod. Using the following commands, we should be able to view the text file that the CronJob was updating with the date and time.

    $ kubectl exec -it readpod -- /bin/bash
    $ cd /usr/data
    $ cat msg.txt
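On the inference side, Flask App #2 can mount the same PVC and reload the model whenever the file on the shared volume changes. A rough sketch, assuming the same illustrative /usr/data/model.pkl path (the ModelReloader class is my own naming, not an existing API):

```python
import os
import pickle

class ModelReloader:
    """Reloads a pickled model whenever the file's mtime changes."""

    def __init__(self, path):
        self.path = path
        self._mtime_ns = None
        self._model = None

    def get(self):
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns != self._mtime_ns:  # a new model was dropped by the CronJob
            with open(self.path, "rb") as f:
                self._model = pickle.load(f)
            self._mtime_ns = mtime_ns
        return self._model

# In the Flask app: reloader = ModelReloader("/usr/data/model.pkl")
# and call reloader.get() inside the /predict request handler.
```

Checking the mtime on each request is cheap (a single stat call) and avoids restarting the inference Pod after every retraining run.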

You can make use of the above concepts and modify the configuration according to your use case. Note that a gcePersistentDisk volume with ReadWriteOnce access mode can only be mounted by one node at a time, so the CronJob Pods and the inference Pod must land on the same node; if they may run on different nodes, consider a ReadWriteMany option such as Filestore (NFS).

-- Ashish
Source: StackOverflow