Tuesday, January 7, 2014

Making HTTP content compression work in netty 4

Netty is a really great framework that provides everything needed to build a high-performance HTTP server. The nice thing is that nearly everything comes out of the box and just has to be put together in the right way. Content compression (gzip or deflate) is no exception. However, when it came to compressing static content, I stumbled quite a few times before everything worked as expected:

Update: First of all, widely used tools like wget speak HTTP 1.0, not HTTP 1.1 - therefore we cannot always deliver a chunked response (in that case we have to live with disabling compression, as sketched below). Also note that the netty guys have since had pretty much the same idea: HttpChunkedInput. The problem with HTTP 1.0 or non-compressible responses (see SmartContentCompressor below) remains, however...
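A minimal sketch of that HTTP 1.0 fallback, assuming request, response and fileLength are available in the handler (the "identity" encoding is exactly what SmartContentCompressor, described below, looks for):

// Fall back to a plain, uncompressed response for HTTP 1.0 clients (sketch)
if (request.getProtocolVersion().equals(HttpVersion.HTTP_1_0)) {
    // Mark the response so the compressor leaves it alone...
    response.headers().set(HttpHeaders.Names.CONTENT_ENCODING, HttpHeaders.Values.IDENTITY);
    // ...and announce the length instead of using chunked transfer encoding
    HttpHeaders.setContentLength(response, fileLength);
}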

Based on the http/file example provided by netty, I used the following approach to serve static files (same as in netty 3.6.6):

RandomAccessFile raf = new RandomAccessFile(file, "r");
HttpResponse response = new DefaultHttpResponse(HTTP_1_1, OK);
ctx.write(response);

if (useSendFile) {
    // Zero-copy transfer: the kernel copies the file straight to the socket
    ctx.write(new DefaultFileRegion(raf.getChannel(), 0, fileLength));
} else {
    // Chunked transfer: the file is read and written in 8 KiB blocks
    ctx.write(new ChunkedFile(raf, 0, fileLength, 8192));
}
However, as soon as I added an HttpContentCompressor to the pipeline, Firefox failed with a message like "invalid content encoding".
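For context, the pipeline at that point looked roughly like this (a sketch modeled on netty's HTTP examples; the handler names are illustrative):

ChannelPipeline p = ch.pipeline();
p.addLast("codec", new HttpServerCodec());
p.addLast("compressor", new HttpContentCompressor()); // the newly added handler
p.addLast("chunkedWriter", new ChunkedWriteHandler());
p.addLast("handler", new HttpStaticFileServerHandler());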
As it turns out, the HttpContentCompressor expects HttpContent objects as the input chunks to be compressed. The ChunkedWriteHandler, however, sent plain ByteBufs downstream. Sending a FileRegion (useSendFile = true) likewise left the content compressor unimpressed.
In order to overcome this problem I created a class named ChunkedInputAdapter, which takes a ChunkedInput&lt;ByteBuf&gt; and presents it as a ChunkedInput&lt;HttpContent&gt;. However, two things still weren't satisfying: first, FileRegions and the zero-copy capability still couldn't be used, and second, already compressed files like JPEGs would be compressed again. Therefore I subclassed HttpContentCompressor with a class called SmartContentCompressor. This class checks whether a "Content-Encoding: Identity" header, a content type of an already compressed format, or a content length of less than 1 kB is present. In these cases content compression is bypassed. Both classes are sketched below.
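A minimal sketch of the adapter idea (simplified; the actual class in SIRIUS may differ):

import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.http.DefaultHttpContent;
import io.netty.handler.codec.http.HttpContent;
import io.netty.handler.stream.ChunkedInput;

public class ChunkedInputAdapter implements ChunkedInput&lt;HttpContent&gt; {

    private final ChunkedInput&lt;ByteBuf&gt; input;

    public ChunkedInputAdapter(ChunkedInput&lt;ByteBuf&gt; input) {
        this.input = input;
    }

    @Override
    public boolean isEndOfInput() throws Exception {
        return input.isEndOfInput();
    }

    @Override
    public void close() throws Exception {
        input.close();
    }

    @Override
    public HttpContent readChunk(ChannelHandlerContext ctx) throws Exception {
        ByteBuf buf = input.readChunk(ctx);
        if (buf == null) {
            return null;
        }
        // Wrapping the buffer makes the chunk visible to HttpContentCompressor
        return new DefaultHttpContent(buf);
    }
}

And a sketch of the compressor subclass. HttpContentCompressor inherits beginEncode from HttpContentEncoder; returning null there skips encoding, which is the hook used here (the list of already compressed content types is abbreviated):

import io.netty.handler.codec.http.HttpContentCompressor;
import io.netty.handler.codec.http.HttpHeaders;
import io.netty.handler.codec.http.HttpResponse;

public class SmartContentCompressor extends HttpContentCompressor {

    @Override
    protected Result beginEncode(HttpResponse response, String acceptEncoding) throws Exception {
        HttpHeaders headers = response.headers();
        String contentEncoding = headers.get("Content-Encoding");
        if ("identity".equalsIgnoreCase(contentEncoding)) {
            // Explicitly marked as not to be compressed
            return null;
        }
        String contentType = headers.get("Content-Type");
        if (contentType != null &amp;&amp; (contentType.startsWith("image/jpeg")
                || contentType.startsWith("application/zip"))) {
            // Already compressed content would only grow
            return null;
        }
        String contentLength = headers.get("Content-Length");
        if (contentLength != null &amp;&amp; Long.parseLong(contentLength) &lt; 1024) {
            // Too small to benefit from compression
            return null;
        }
        return super.beginEncode(response, acceptEncoding);
    }
}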

Using this combination permits both: content compression where it is useful, and the zero-copy capability where the file is already compressed.
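In the file handler this boils down to choosing the write path per response (a sketch; willCompress() is a hypothetical helper mirroring the SmartContentCompressor rules):

if (willCompress(response)) {
    // The compressor needs HttpContent chunks, so no zero-copy here
    ctx.write(new ChunkedInputAdapter(new ChunkedFile(raf, 0, fileLength, 8192)));
} else {
    // Already compressed (or tiny) content: keep the zero-copy transfer
    ctx.write(new DefaultFileRegion(raf.getChannel(), 0, fileLength));
}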
All the sources mentioned above are open-sourced under the MIT license and are part of the SIRIUS framework.