Share a volume in KubernetesPodOperator?

2/22/2019

I'm using an ffmpeg Docker image from a KubernetesPodOperator() inside Airflow to extract frames from a video.

It works fine, but I am not able to retrieve the stored frames: how can I store the frames generated by the Pod directly in my file system (host machine)?

Update:

From https://airflow.apache.org/kubernetes.html# I think I figured out that I need to use the volume_mount, volume_config and volume parameters, but I have had no luck so far.

Error message:

"message":"Not found: \"test-volume\"","field":"spec.containers[0].volumeMounts[0].name"

PV and PVC:

The command kubectl get pv,pvc test-volume gives:

NAME                           CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS   REASON   AGE
persistentvolume/test-volume   10Gi       RWO            Retain           Bound    default/test-volume   manual                  3m

NAME                                STATUS   VOLUME        CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/test-volume   Bound    test-volume   10Gi       RWO            manual         3m

Code:

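# Import paths assume the Airflow 1.10 contrib layout.
from airflow import DAG
from airflow.contrib.kubernetes.volume import Volume
from airflow.contrib.kubernetes.volume_mount import VolumeMount
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

volume_mount = VolumeMount('test-volume',
                           mount_path='/',
                           sub_path=None,
                           read_only=False)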

volume_config= {
    'persistentVolumeClaim':
    {
        'claimName': 'test-volume' # uses the persistentVolumeClaim given in the Kube yaml
    }
}

volume = Volume(name="test-volume", configs=volume_config)



with DAG('test_kubernetes',
         default_args=default_args,
         schedule_interval=schedule_interval,
         ) as dag:

    extract_frames = KubernetesPodOperator(
        namespace='default',
        image="jrottenberg/ffmpeg:3.4-scratch",
        arguments=[
            "-i", "http://www.jell.yfish.us/media/jellyfish-20-mbps-hd-hevc-10bit.mkv",
            "test_%04d.jpg"
        ],
        name="extract-frames",
        task_id="extract_frames",
        volume=[volume],
        volume_mounts=[volume_mount],
        get_logs=True
    )
-- Rob
airflow
kubernetes
kubernetes-pod
minikube

2 Answers

3/6/2019

Here's some speculation as to what may be wrong:

  1. (Where your error is most likely coming from) KubernetesPodOperator expects the parameter "volumes", not "volume". Without it, the pod has a volume mount that refers to a volume the pod never declares.

  2. In general, it's bad practice to mount onto "/", since the volume then hides everything that ships in the image you're running. You should probably change "mount_path" in your VolumeMount object to something else, like "/stored_frames".
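
To make point 1 concrete, here is a minimal sketch of the kind of check that produces the "Not found" error. It is plain Python, not Airflow or Kubernetes code: missing_volume_mounts is a hypothetical helper, and the two dictionaries are my assumption of what the pod spec looks like with and without the volume list reaching the pod.

```python
def missing_volume_mounts(pod_spec):
    """Return (field, name) pairs for mounts that reference undeclared volumes."""
    declared = {volume["name"] for volume in pod_spec.get("volumes", [])}
    problems = []
    for c_idx, container in enumerate(pod_spec.get("containers", [])):
        for m_idx, mount in enumerate(container.get("volumeMounts", [])):
            if mount["name"] not in declared:
                field = "spec.containers[%d].volumeMounts[%d].name" % (c_idx, m_idx)
                problems.append((field, mount["name"]))
    return problems


container = {
    "name": "extract-frames",
    "image": "jrottenberg/ffmpeg:3.4-scratch",
    "volumeMounts": [{"name": "test-volume", "mountPath": "/stored_frames"}],
}
claim = {"name": "test-volume", "persistentVolumeClaim": {"claimName": "test-volume"}}

# Assumption: with the misspelled keyword, no volume list reaches the pod spec.
spec_without_volumes = {"containers": [container]}
# With volumes=[volume], the claim is declared alongside the mount.
spec_with_volumes = {"containers": [container], "volumes": [claim]}

print(missing_volume_mounts(spec_without_volumes))
# [('spec.containers[0].volumeMounts[0].name', 'test-volume')]
print(missing_volume_mounts(spec_with_volumes))
# []
```

The field path in the first result matches the one in the error message from the question, which is why the missing "volumes" argument is the most likely cause.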

-- bricca
Source: StackOverflow

3/25/2020

You should create a test pod to verify your k8s objects (volumes, pod, configmap, secrets, etc.) before wrapping that pod creation in the DAG with KubernetesPodOperator. Based on your code above, it can look like this:

apiVersion: v1
kind: Pod
metadata:
 name: "extract-frames-pod"
 namespace: "default"
spec:
 containers:
  - name: "extract-frames"
    image: "jrottenberg/ffmpeg:3.4-scratch"
    command:
    args: ["-i", "http://www.jell.yfish.us/media/jellyfish-20-mbps-hd-hevc-10bit.mkv",  "test_%04d.jpg"]
    imagePullPolicy: IfNotPresent
    volumeMounts:
      - name: "test-volume"
#       do not use "/" for mountPath.
        mountPath: "/images"
 restartPolicy: Never
 volumes:
  - name: "test-volume"
    persistentVolumeClaim:
      claimName: "test-volume"
 serviceAccountName: default

I expect you will get the same error that you had: "message":"Not found: \"test-volume\"","field":"spec.containers[0].volumeMounts[0].name"

If so, I think the issue is in your PersistentVolume manifest file. Did you set the path to test-volume? Something like:

    path: /test-volume

Does that path exist on the target volume? If not, create the directory. That might solve your problem.
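
One more thing worth checking once the mount works: the ffmpeg arguments use the relative output name "test_%04d.jpg", and ffmpeg resolves a relative output against the container's working directory. Unless that directory happens to be the mount path, the frames will not land on the volume. Below is a small sketch of building the arguments with the output inside the mount; ffmpeg_arguments is a hypothetical helper, and "/images" is just the mountPath from the test pod above.

```python
import posixpath


def ffmpeg_arguments(source_url, mount_path, pattern="test_%04d.jpg"):
    """Build ffmpeg arguments that write frames inside the mounted volume."""
    if not mount_path.startswith("/") or mount_path == "/":
        # Refuse relative paths and the root mount discussed above.
        raise ValueError("mount_path must be an absolute path other than '/'")
    # Container paths are POSIX paths regardless of where the DAG is parsed.
    return ["-i", source_url, posixpath.join(mount_path, pattern)]


source = "http://www.jell.yfish.us/media/jellyfish-20-mbps-hd-hevc-10bit.mkv"
args = ffmpeg_arguments(source, "/images")
print(args[-1])
# /images/test_%04d.jpg
```

The resulting list can be passed as "arguments" to KubernetesPodOperator or as "args" in the test pod manifest.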

-- alltej
Source: StackOverflow