High memory usage when uploading a multipart file to Amazon S3 via streaming?

10/5/2019

The method below in my Java Spring application streams a file directly to an Amazon S3 bucket. From my research, using streams should make uploading large files (> 100 MB videos in my use case) more memory efficient. However, when I tested the method with a 25 MB file, the memory usage of my Java Spring application (running in a Kubernetes cluster) spiked by 200 MB. With a 200 MB file, memory usage spiked again to roughly 2 GB. No out-of-memory exceptions were thrown, but the memory usage never drops back down. Why does this happen?

public void uploadFile(MultipartFile file, String saveFileName) {
    try {
        ObjectMetadata metadata = new ObjectMetadata();

        if (file.getContentType() != null) {
            metadata.setContentType(file.getContentType());
        }

        metadata.setContentLength(file.getSize());

        saveFileName = saveFileName.replaceAll(" ", "");

        // Stream the multipart file's bytes straight to S3.
        try (InputStream stream = file.getInputStream()) {
            PutObjectRequest request = new PutObjectRequest(bucketName, saveFileName, stream, metadata);
            s3client.putObject(request);
        }
    } catch (AmazonClientException | IOException exception) {
        // handle exception
    }
}
-- Andy Tang
amazon-s3
file-upload
java
kubernetes
spring

1 Answer

10/6/2019

There are multiple ways to handle large file uploads:

  1. Transfer the bytes to a temporary file on disk and upload that file to S3, possibly from a background task (see the sketch after this list).
  2. Keep the whole object in memory and upload it directly (a bad option unless you enforce a very low file-size limit).
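A minimal sketch of option 1, assuming an AWS SDK for Java v1 `AmazonS3` client and a `bucketName` field like the ones in your code; the temp-file naming and cleanup strategy are illustrative choices, not a prescribed implementation:

    import java.io.File;
    import java.io.IOException;
    import java.nio.file.Files;

    import org.springframework.web.multipart.MultipartFile;

    import com.amazonaws.AmazonClientException;
    import com.amazonaws.services.s3.AmazonS3;

    public class DiskBackedS3Uploader {

        private final AmazonS3 s3client;
        private final String bucketName;

        public DiskBackedS3Uploader(AmazonS3 s3client, String bucketName) {
            this.s3client = s3client;
            this.bucketName = bucketName;
        }

        public void uploadViaTempFile(MultipartFile file, String saveFileName) throws IOException {
            // Spill the upload to disk instead of holding it in heap memory.
            File tempFile = Files.createTempFile("s3-upload-", ".tmp").toFile();
            try {
                file.transferTo(tempFile);

                // Uploading a File lets the SDK read the content length and
                // stream from disk, so the payload is not buffered in memory.
                s3client.putObject(bucketName, saveFileName.replaceAll(" ", ""), tempFile);
            } catch (AmazonClientException e) {
                throw new IOException("Upload to S3 failed", e);
            } finally {
                // Remove the temporary file whether or not the upload succeeded.
                Files.deleteIfExists(tempFile.toPath());
            }
        }
    }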

Take a look at this Git repo for how the above approaches can be implemented.

I don't know your full use case, but if you also control the UI, consider uploading the files directly from the UI using pre-signed S3 URLs, so the file bytes never pass through your Spring application at all.
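A minimal sketch of generating a pre-signed upload URL with the AWS SDK for Java v1, assuming the same `s3client` and `bucketName`; the 15-minute expiry and PUT method are illustrative assumptions:

    import java.net.URL;
    import java.util.Date;

    import com.amazonaws.HttpMethod;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.model.GeneratePresignedUrlRequest;

    public class PresignedUploadUrlFactory {

        private final AmazonS3 s3client;
        private final String bucketName;

        public PresignedUploadUrlFactory(AmazonS3 s3client, String bucketName) {
            this.s3client = s3client;
            this.bucketName = bucketName;
        }

        // Returns a URL the browser can PUT the file to directly,
        // so the upload bypasses the Spring application entirely.
        public URL createUploadUrl(String key) {
            Date expiration = new Date(System.currentTimeMillis() + 15 * 60 * 1000); // 15 minutes

            GeneratePresignedUrlRequest request =
                    new GeneratePresignedUrlRequest(bucketName, key)
                            .withMethod(HttpMethod.PUT)
                            .withExpiration(expiration);

            return s3client.generatePresignedUrl(request);
        }
    }

The UI then issues an HTTP PUT of the raw file bytes to the returned URL before the expiry time.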

-- Lokare Sathyam
Source: StackOverflow