Copy files between S3 and a Kubernetes container

10/20/2020

I'm aware of how to copy files between a local machine and a Kubernetes container:

kubectl cp /local_dir/some_file container:/remote_dir/some_file

And from an S3 bucket to an EC2 instance using the AWS CLI:

aws s3 cp s3://bucket/remote_dir/some_file /local_dir/some_file

But I'm unsure how to copy files between S3 and a Kubernetes container. While Googling the issue I've come across the skbn library, but I'm wondering if there's a way to do it without installing a new package.

-- triphook
amazon-s3
amazon-web-services
kubernetes

1 Answer

10/20/2020

Define an initContainer that mounts a PersistentVolume and uses the AWS CLI (or some other tool) to retrieve files from your S3 bucket and copy them onto the PV. Then mount the same PV in your main container as well and reference the retrieved files accordingly.

Bonus points if you configure your initContainer to accept your AWS Access Key ID and Secret Access Key, as well as the S3 bucket name, via environment variables and pass these in at runtime.
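A minimal sketch of what that Pod spec could look like, assuming a PersistentVolumeClaim named shared-data, a Secret named aws-credentials holding the keys, and the amazon/aws-cli image; the bucket, paths, and application image are placeholders, not anything from the question:

apiVersion: v1
kind: Pod
metadata:
  name: app-with-s3-files
spec:
  initContainers:
    - name: fetch-from-s3
      image: amazon/aws-cli:latest              # official AWS CLI image
      command: ["aws", "s3", "cp", "s3://my-bucket/remote_dir/", "/data/", "--recursive"]
      env:
        - name: AWS_ACCESS_KEY_ID               # credentials injected from an assumed Secret
          valueFrom:
            secretKeyRef:
              name: aws-credentials
              key: access-key-id
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: aws-credentials
              key: secret-access-key
      volumeMounts:
        - name: shared-data
          mountPath: /data                      # initContainer writes the files here
  containers:
    - name: app
      image: my-app:latest                      # placeholder application image
      volumeMounts:
        - name: shared-data                     # same volume, so the app sees the files
          mountPath: /remote_dir
  volumes:
    - name: shared-data
      persistentVolumeClaim:
        claimName: shared-data

Because the initContainer must complete before the main container starts, the files are guaranteed to be in place by the time your application runs.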

Alternatively, retrieve the files as part of a CI/CD pipeline before deploying the containers to Kubernetes, and volume mount them in accordingly using file or block storage.
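One lightweight variant of such a pipeline step, suitable only for small, non-sensitive files because it ships them into the cluster as a ConfigMap rather than onto a PV (bucket, directory, and ConfigMap names are placeholders):

aws s3 cp s3://my-bucket/remote_dir/ ./s3-files/ --recursive   # pull the files during the pipeline
kubectl create configmap app-files --from-file=./s3-files/ \
  --dry-run=client -o yaml | kubectl apply -f -                # package them for the cluster

The resulting ConfigMap can then be mounted into your Pods as a volume; for larger data, copy the files onto the storage backing your PersistentVolume instead.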

EDIT: Looks like an AWS S3 Operator exists too. However, the project is not actively maintained, nor was it officially created by AWS, so I would not recommend using it, especially when you can codify this process as part of a pipeline or your Deployment manifest.

EDIT 2: I also feel compelled to mention that containers are intended to be stateless, which may be why you're asking this question. While kubectl cp exists, it is by and large a non-standard way of getting files into Kubernetes; normally you would do this through a PersistentVolume provisioned by a StorageClass. With Docker, any changes made to the container's filesystem at runtime are not persisted: the writable layer is discarded when the container is removed. So rather than copying files into a container as you would with a VM, try to use volume mounts to ensure the files are there from the start.
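For completeness, a minimal PersistentVolumeClaim sketch that a StorageClass can satisfy; the shared-data name matches the Pod example above, and the gp2 class is an assumption about an EKS-style cluster:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2          # assumed EBS-backed class; use whatever your cluster provides
  resources:
    requests:
      storage: 1Gi

The claim is then referenced from spec.volumes and mounted into containers, as in the initContainer example above, so the files are present before the application starts rather than being copied in afterwards.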

-- TJ Zimmerman
Source: StackOverflow