gcsfuse to mount a bucket in GKE and/or python3 boto to stream write?

1/23/2019

I am looking for a way to stream-write .mp4 video files to a Google Cloud Storage bucket as they are generated by a Python app. The app is containerised, deployed in GKE, and currently runs fine as a web service. The problem is that all the video files are generated and stored locally in a path (tmp/processed) inside the pod.

However, I want the video files to be written to a Google Cloud Storage bucket named my_bucket.

I have read the gcsfuse guidelines (https://github.com/maciekrb/gcs-fuse-sample) on how to mount a bucket in Kubernetes pods, and I have also read about boto (https://cloud.google.com/storage/docs/boto-plugin#streaming-transfers), which is used for streaming transfers to storage buckets.

To mount my_bucket in tmp/processed, I have added the following lines to my app's deployment file (YAML):

        lifecycle:
          postStart:
            exec:
              command:
              - gcsfuse
              - -o
              - nonempty
              - my_bucket
              - tmp/processed
          preStop:
            exec:
              command:
              - fusermount
              - -u
              - tmp/processed/
        securityContext:
          capabilities:
            add:
            - SYS_ADMIN

I haven't used boto yet and thought that mounting alone might be enough. However, my app gives me an input/output error when it tries to generate the video file.

My question is: do I need to use both gcsfuse and boto, or is mounting the bucket in my GKE pod enough? And am I doing the mounting correctly?


UPDATE: I verified that I did the mount correctly using the following command:

kubectl exec -it [POD_NAME] -- bash
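
Once inside the pod, one way to confirm the mount is to look for a FUSE entry at the mount point in /proc/mounts. Below is a minimal Python sketch of that check. It assumes that gcsfuse mounts report a filesystem type beginning with "fuse" and that the relative path tmp/processed resolves against the app's working directory. The helper names find_fuse_mount and check_mount are illustrative, not part of any library.

```python
import os


def find_fuse_mount(mounts_text, mount_point):
    """Return (device, fstype) for a FUSE mount at mount_point, or None.

    mounts_text is the content of /proc/mounts; each line has the form
    "<device> <mount point> <fstype> <options> <dump> <pass>".
    """
    target = os.path.normpath(mount_point)
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        device, path, fstype = fields[:3]
        # Assumption: a gcsfuse mount shows up with an fstype starting with "fuse".
        if os.path.normpath(path) == target and fstype.startswith("fuse"):
            return device, fstype
    return None


def check_mount(mount_point, mounts_file="/proc/mounts"):
    """Read the mounts file and look up mount_point (made absolute first)."""
    with open(mounts_file) as f:
        return find_fuse_mount(f.read(), os.path.abspath(mount_point))
```

Calling check_mount("tmp/processed") from the app's working directory should return a tuple if the bucket is mounted there and None otherwise. A successful mount does not by itself prove that writes will succeed.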

-- Ai Da
boto
bucket
gcsfuse
google-kubernetes-engine
kubernetes

1 Answer

1/30/2019

Problem solved! I only had to mount my bucket within the pod, and that was it. The mount configuration in my question was correct. The input/output error was caused by my GKE cluster having insufficient permissions: the cluster didn't have permission to read from or write to storage, and the project needed a couple of other permissions as well. So I created a new cluster using the following command:

gcloud container clusters create [MY_CLUSTER_NAME] \
  --scopes=https://www.googleapis.com/auth/userinfo.email,cloud-platform,https://www.googleapis.com/auth/devstorage.read_write,storage-rw,trace,https://www.googleapis.com/auth/trace.append,https://www.googleapis.com/auth/servicecontrol,compute-rw,https://www.googleapis.com/auth/compute,https://www.googleapis.com/auth/service.management.readonly,https://www.googleapis.com/auth/taskqueue \
  --num-nodes 4 --zone "us-central1-c"

To be able to read from and write to a storage bucket, the cluster had to have the https://www.googleapis.com/auth/devstorage.read_write scope.
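
To see which scopes an existing cluster's nodes actually have, the metadata server can be queried from inside a pod. Below is a minimal Python sketch. It assumes the pod can reach metadata.google.internal and uses the node's default service account, which may not hold on clusters that restrict metadata access. The function names and the WRITE_SCOPES set are illustrative.

```python
import urllib.request

# Metadata-server endpoint listing the OAuth scopes of the default service account.
SCOPES_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/scopes"
)

# Scopes that include write access to Cloud Storage objects.
WRITE_SCOPES = {
    "https://www.googleapis.com/auth/devstorage.read_write",
    "https://www.googleapis.com/auth/devstorage.full_control",
    "https://www.googleapis.com/auth/cloud-platform",
}


def can_write_storage(scopes_text):
    """Return True if any whitespace-separated scope grants storage writes."""
    return bool(set(scopes_text.split()) & WRITE_SCOPES)


def fetch_scopes(timeout=5):
    """Fetch the granted scopes from the metadata server (only works on GCP)."""
    req = urllib.request.Request(SCOPES_URL, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Inside a pod, can_write_storage(fetch_scopes()) returning False would point to the kind of scope problem described here. The service account also needs suitable IAM permissions on the bucket, which this check does not cover.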

Also, there was no need to use boto; mounting through gcsfuse was enough for me to stream-write video files to my_bucket.
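
With the bucket mounted, writing from the app looks like ordinary file I/O. Below is a minimal sketch that assumes the video data arrives as an iterable of byte chunks. The write_stream helper and the file name are hypothetical, not taken from my app. My understanding is that gcsfuse uploads the object when the file is closed, so the file should be closed as soon as the video is complete.

```python
def write_stream(chunks, dest_path):
    """Write an iterable of bytes chunks to dest_path; return bytes written."""
    total = 0
    # The with-block closes the file when done, which is when a FUSE-backed
    # filesystem gets the chance to flush the finished object.
    with open(dest_path, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
            total += len(chunk)
    return total
```

For example, write_stream(frames, "tmp/processed/output.mp4") would write into the mounted directory, where frames and output.mp4 are placeholders.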

-- Ai Da
Source: StackOverflow