How to read a .tar.gz file in a GCS bucket from a Spring Boot app in Google Kubernetes Engine

6/7/2018

I have a GCS bucket where I upload .tar.gz files. Whenever a new file is uploaded, Google Pub/Sub sends a notification to the app running in Kubernetes Engine. The app has a listener that receives the notification and then has to read the .tar.gz file.

I have set up the topic/subscriber notification from GCS to the app, but I am unable to determine how to read the .tar.gz file when a notification arrives.

I tried the code below to read the .tar.gz file, but it is not what I should be doing: it printed unreadable characters, and I am not decompressing the file. I used the google-cloud-storage library.

if (map.containsValue("OBJECT_FINALIZE")) {
    Storage storage = StorageOptions.newBuilder()
            .setProjectId(projectId)
            .setCredentials(creaCredentialsProvider.getCredentials())
            .build()
            .getService();

    try (ReadChannel reader = storage.reader(archiveMessage.getBucket(), archiveMessage.getName())) {
        ByteBuffer bytes = ByteBuffer.allocate(64 * 1024);
        while (reader.read(bytes) > 0) {
            bytes.flip();
            System.out.println((char) bytes.get());
            bytes.clear();
        }
    }
}

I looked at persistent disks and transcoding, neither of which looks straightforward. Any help with the direction I should take, or code samples, would be really appreciated.

-- karthik
google-cloud-platform
google-cloud-storage
google-kubernetes-engine

1 Answer

8/14/2018

Transcoding in Google Cloud Platform offers no feature for the two-level decompression that a .tar.gz file needs (gunzip first, then untar). However, according to the response in this Stack Overflow link, there is a programmatic workaround: in Python, the tarfile.TarFile.extractfile method returns an archive member as a file-like object, which avoids having to extract the archive to disk.
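
Since the question's code is Java, here is a rough sketch of the same idea in that language: wrap the object's byte stream in a GZIPInputStream, then walk the tar layer block by block, all in memory. TarGzReader and readEntries are hypothetical names made up for this illustration. The parser is deliberately minimal and assumes plain ustar headers, short entry names, and entries small enough to hold in memory; it does not verify checksums or handle GNU/pax long names.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;

/** Hypothetical helper: reads the regular files of a .tar.gz stream into memory. */
final class TarGzReader {
    // Tar headers and file data are both padded to 512-byte blocks.
    private static final int BLOCK = 512;

    /** Returns entry name -> content for every regular file in the stream. */
    static Map<String, byte[]> readEntries(InputStream tarGz) throws IOException {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        // First layer: gzip. Second layer: tar, parsed by hand below.
        try (DataInputStream in = new DataInputStream(new GZIPInputStream(tarGz))) {
            byte[] header = new byte[BLOCK];
            while (readBlock(in, header)) {
                if (isAllZero(header)) {
                    break; // an all-zero block marks the end of the archive
                }
                String name = field(header, 0, 100);              // entry name
                String sizeText = field(header, 124, 12).trim();  // size, octal text
                long size = sizeText.isEmpty() ? 0 : Long.parseLong(sizeText, 8);
                byte type = header[156];                          // type flag

                // Assumption: each entry fits in a byte array (under 2 GiB).
                byte[] data = new byte[(int) size];
                in.readFully(data);
                // Consume the padding up to the next 512-byte boundary.
                int padding = (int) ((BLOCK - size % BLOCK) % BLOCK);
                in.readFully(new byte[padding]);

                // '0' (or NUL in old archives) means a regular file.
                if (type == '0' || type == 0) {
                    entries.put(name, data);
                }
            }
        }
        return entries;
    }

    /** Fills one block; returns false if the stream ended before the block began. */
    private static boolean readBlock(DataInputStream in, byte[] block) throws IOException {
        if (in.read(block, 0, 1) < 0) {
            return false;
        }
        in.readFully(block, 1, block.length - 1);
        return true;
    }

    /** Decodes a NUL-terminated ASCII header field. */
    private static String field(byte[] header, int offset, int length) {
        int end = offset;
        while (end < offset + length && header[end] != 0) {
            end++;
        }
        return new String(header, offset, end - offset, StandardCharsets.US_ASCII);
    }

    private static boolean isAllZero(byte[] block) {
        for (byte b : block) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }
}
```

With the google-cloud-storage library, the ReadChannel from the question could probably be passed in as TarGzReader.readEntries(Channels.newInputStream(reader)), since ReadChannel is a ReadableByteChannel and java.nio.channels.Channels can adapt it to an InputStream. For real archives, a maintained tar implementation such as TarArchiveInputStream from Apache Commons Compress is likely a safer choice than this hand-rolled parser.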

-- oakinlaja
Source: StackOverflow