Sending .gz file via CURL to RESTful put creating ZipException in GZIPInputStream

5/31/2018

The application I am creating takes a gzipped file sent to a RESTful PUT, unzips the file and then does further processing like so:

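import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

import javax.servlet.http.HttpServletRequest;
import javax.ws.rs.PUT;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.core.Context;
import javax.ws.rs.core.Response;

public class Service {

  @PUT
  @Path("/{filename}")
  public Response doPut(@Context HttpServletRequest request,
      @PathParam("filename") String filename,
      InputStream inputStream) {

    try {
      // Wrap the raw request body; the constructor reads and validates the GZIP header
      GZIPInputStream gzipInputStream = new GZIPInputStream(inputStream);

      // Do Stuff with GZIPInputStream
    } catch (IOException e) {
      e.printStackTrace();
    }
    return null;
  }
}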

I am able to successfully send a gzipped file in a unit test like so:

InputStream inputStream = new FileInputStream("src/main/resources/testFile.gz");
Service service = new Service();
service.doPut(mockHttpServletRequest, "testFile.gz", inputStream);
// Verify processing stuff happens

But when I build the application and attempt to CURL the same file from the src/main/resources dir with the following command, I get a ZipException:

curl -v -k -X PUT --user USER:Password -H "Content-Type: application/gzip" --data-binary @testFile.gz https://myapp.dev.com/testFile.gz

The exception is:

java.util.zip.ZipException: Not in GZIP format
    at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:165)
    at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:79)
    at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:91)
    at Service.doPut(Service.java:23)
    // etc.

So does anyone have any idea why sending the file via CURL causes the ZipException?

Update: I ended up taking a look at the actual bytes arriving via the InputStream and figured out where the ZipException: Not in GZIP format error was coming from. The first two bytes of a GZIP file must be 1F and 8B, in that order, for GZIPInputStream to recognize the data as GZIP. Instead, the 8B byte, along with every other byte in the stream that doesn't form a valid UTF-8 sequence, was replaced with the bytes EF, BF, BD, which are the UTF-8 encoding of the replacement character (U+FFFD). So the server is reading the GZIP data as UTF-8 text rather than as binary, and is corrupting it in the process.
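For illustration, the corruption described above can be reproduced in isolation. This is a minimal sketch, not the application's code: the class name GzipUtf8CorruptionDemo and its helper methods are hypothetical, and it assumes the offending layer behaves like `new String(bytes, UTF_8)`, which substitutes U+FFFD for malformed input before the text is re-encoded.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipException;

// Hypothetical demo class; none of these names come from the application.
public class GzipUtf8CorruptionDemo {

    /** Gzip-compresses a short text payload in memory. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    /** Simulates a layer that decodes the body as UTF-8 text and re-encodes it. */
    static byte[] roundTripThroughUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    /** Returns true if GZIPInputStream accepts the header of the given bytes. */
    static boolean opensAsGzip(byte[] data) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            return true;
        } catch (ZipException e) {
            return false; // e.g. "Not in GZIP format"
        }
    }

    /** Formats the first `count` bytes as space-separated uppercase hex. */
    static String hex(byte[] data, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Math.min(count, data.length); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = gzip("hello");
        byte[] mangled = roundTripThroughUtf8(original);
        System.out.println("original: " + hex(original, 4)); // 1F 8B 08 00
        System.out.println("mangled:  " + hex(mangled, 4));  // 1F EF BF BD
        System.out.println("original opens as gzip: " + opensAsGzip(original)); // true
        System.out.println("mangled opens as gzip:  " + opensAsGzip(mangled));  // false
    }
}
```

The lone 8B byte is a UTF-8 continuation byte with no lead byte, so the decoder replaces it, and the header check in GZIPInputStream then fails on the second byte.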

The issue I am having now is that I can't figure out where I need to change the configuration to get the server to treat the compressed data as binary rather than UTF-8. The application uses JAX-RS on a Jersey server with Spring Boot, is deployed in a Kubernetes pod, and is run as a service, so something in the setup of one of those technologies needs to be tweaked to prevent the wrong encoding from being applied to the data.
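One way to narrow down which layer is responsible might be to log the leading bytes at each point where the body is accessible as a stream, without consuming them. The following is only a sketch under that assumption: GzipMagicCheck, Peeked, and peek are hypothetical names, and the helper uses nothing beyond java.io.PushbackInputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical diagnostic helper; not part of Jersey, Spring Boot, or the application.
public class GzipMagicCheck {

    /** Result of a peek: the verdict, the leading bytes as hex, and a stream that still yields every byte. */
    static final class Peeked {
        final boolean looksLikeGzip;
        final String firstBytesHex;
        final InputStream stream;

        Peeked(boolean looksLikeGzip, String firstBytesHex, InputStream stream) {
            this.looksLikeGzip = looksLikeGzip;
            this.firstBytesHex = firstBytesHex;
            this.stream = stream;
        }
    }

    /** Reads up to four bytes, pushes them back, and reports whether they start with 1F 8B. */
    static Peeked peek(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 4);
        byte[] head = new byte[4];
        int n = 0;
        while (n < head.length) {
            int r = pushback.read(head, n, head.length - n);
            if (r < 0) break; // end of stream before four bytes arrived
            n += r;
        }
        if (n > 0) pushback.unread(head, 0, n); // restore the bytes for the real consumer

        boolean gzip = n >= 2 && (head[0] & 0xFF) == 0x1F && (head[1] & 0xFF) == 0x8B;
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) hex.append(' ');
            hex.append(String.format("%02X", head[i] & 0xFF));
        }
        return new Peeked(gzip, hex.toString(), pushback);
    }
}
```

In doPut, the returned stream (rather than the original inputStream) would then be passed to GZIPInputStream, and firstBytesHex logged when looksLikeGzip is false. If the bytes are already mangled at that point, the corruption is happening before the resource method is invoked.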

I have tried adding -H "Content-Encoding: gzip" to the curl command, registering EncodingFilter.class and GZipEncoder.class in the Jersey ResourceConfig class, adding application/gzip to server.compression.mime-types in application.properties, adding the @Consumes("application/gzip") annotation to the doPut method above, and several other things I can't remember off the top of my head, but nothing seems to have any effect.

I am seeing the following in the verbose CURL logs:

> PUT /src/main/resources/testFile.gz
> HOST: my.host.com
> Authorization: Basic <authorization stuff>
> User-Agent: curl/7.54.1
> Accept: */*
> Content-Encoding: gzip
> Content-Type: application/gzip
> Content-Length: 31
>
} [31 bytes data]
* upload completely sent off: 31 out of 31 bytes
< HTTP/1.1 500
< X-Application-Context: application
< Content-Type: application/json;charset=UTF-8
< Transfer-Encoding: chunked
< Date: <date stuff>
...etc

Nothing I have done has affected the receiving side

Content-Type: application/json;charset=UTF-8

portion, which I suspect is the issue.

-- user3389672
gzip
jax-rs
jersey
kubernetes
spring-boot

0 Answers