Achieving better compression with Deflater

Posted by mister__m on December 26, 2003 at 10:48 AM PST

I've recently been working more closely with CVS - until now I had always used either IDE support for it or whichever nice GUI client was available - and found out more about GZIP compression than I knew before. That's my main motivation for this post.

Java has supported ZLIB compression through its API for quite a while - since JDK 1.1, according to the javadocs. The package java.util.zip contains classes for manipulating the GZIP and ZIP formats, as well as the Inflater and Deflater classes for coding against the compression library directly.

So, getting straight to code, if you want to compress an object you are writing to a stream:

   public void writeCompressed(OutputStream os, Object toWrite) throws IOException {
      ObjectOutputStream oos = null;

      try {
         oos = new ObjectOutputStream(new GZIPOutputStream(os));
         oos.writeObject(toWrite);
      } finally {
         if (oos != null) {
            try {
                oos.close();
            } catch (IOException ioe) {
                /*
                 * The day someone gives me a sensible explanation why this method
                 * throws an exception (as if there was something I could do about it or
                 * if I cared!), I will be sooooo grateful :-D
                 */
                ioe.printStackTrace();
            }
         }
      }
   }

Besides the ugly try inside the finally block - which deserves a whole post to itself, called "API design we don't get", probably better posted by Hani - it's pretty simple. I've been working a lot with Prevayler - put simply, a very good open-source substitute for databases, and faster by far - and since it works with serialization, I thought it would be a good idea to compress the serialized stream it generates. I've written a class you can use with Prevayler that does just that, as part of my open-source project, reusable-components, and it had been a while since I last modified it. However, after some time dealing with CVS manually, I noticed that GZIP streams can have different compression levels and started wondering whether java.util.zip provided support for playing with these.

Indeed, Deflater supports compression levels through a method named setLevel(int). The argument this method takes is yet another magical int constant: its value ranges from 0, a.k.a. NO_COMPRESSION, through 1, a.k.a. BEST_SPEED, to 9, a.k.a. BEST_COMPRESSION, and DEFAULT_COMPRESSION (-1) selects the library's default level. Deflater is used internally by DeflaterOutputStream, which is the superclass of GZIPOutputStream, used in the above example. So if there is a method for setting the compression level, it must be pretty simple to do it, right? Hmm, it's easy, but it could be easier.
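To make the level constants concrete, here is a minimal sketch that drives Deflater directly, without any stream wrapper, and prints the compressed size at a few levels. The class name DeflaterLevels, its helper methods and the sample payload are made up for illustration; only the java.util.zip calls come from the JDK. The actual sizes depend entirely on your data.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class, not part of java.util.zip or reusable-components.
final class DeflaterLevels {

    /** Builds a repetitive, made-up payload that compresses well. */
    static byte[] sampleData() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            sb.append("entry ").append(i % 100).append(" checked in to CVS\n");
        }
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    /** Compresses input in ZLIB format at the given level (0-9, or Deflater.DEFAULT_COMPRESSION). */
    static byte[] deflate(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish(); // no more input will follow
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release the native zlib state
        }
    }

    /** Reverses deflate(); the compression level does not need to be known to decompress. */
    static byte[] inflate(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("truncated or unsupported input");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] data = sampleData();
        int[] levels = { Deflater.NO_COMPRESSION, Deflater.BEST_SPEED,
                         Deflater.DEFAULT_COMPRESSION, Deflater.BEST_COMPRESSION };
        for (int level : levels) {
            byte[] packed = deflate(data, level);
            boolean roundTrip = Arrays.equals(data, inflate(packed));
            System.out.println("level " + level + ": " + data.length + " -> "
                    + packed.length + " bytes, round trip ok: " + roundTrip);
        }
    }
}
```

Note that the reader never needs to be told which level was used: the level only changes how hard the compressor works, not the format it produces.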

The problem is that DeflaterOutputStream maintains a reference to its Deflater instance via a protected field named def, and offers no public accessor for it. It means it is not possible to simply get the Deflater instance from the outside and set its compression level. As it is a protected field, though, subclassing GZIPOutputStream will make it accessible. A simple way - in terms of a practical solution, not a very readable one - to do it is using an anonymous inner class with the so-called "anonymous constructor" (formally, an instance initializer block) as shown below:

   public void writeCompressed(OutputStream os, Object toWrite) throws IOException {
      ObjectOutputStream oos = null;

      try {
         oos = new ObjectOutputStream(new GZIPOutputStream(os) {
               {
                   def.setLevel(Deflater.BEST_COMPRESSION);
               }
         });
         oos.writeObject(toWrite);
      } finally {
         if (oos != null) {
            try {
                oos.close();
            } catch (IOException ioe) {
                /*
                 * The day someone gives me a sensible explanation why this method
                 * throws an exception (as if there was something I could do about it or
                 * if I cared!), I will be sooooo grateful :-D
                 */
                ioe.printStackTrace();
            }
         }
      }
   }

Using Deflater.BEST_COMPRESSION instead of the default compression level reduces the total size of a reasonably large stream (more than 20 KB) by around 10%, according to my tests. GZIP compression makes my serialized objects 80% smaller, which is good, at least for me. The same technique can be used to fine-tune the compression level the other way, so that fewer CPU cycles are spent compressing something before it is transmitted over the network, for example. After some experimenting, you may be able to figure out an ideal value for your specific case and use it as the compression level for your own GZIPOutputStream. Yet another obscure feature hidden inside the API, recently found out. :-D
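If you want to do that experimenting yourself, something along these lines may help. It is only a sketch: LeveledGZIPOutputStream and GzipLevelProbe are names invented for this example (they are not part of the JDK or of reusable-components), the list of strings stands in for whatever object you actually serialize, and the try-with-resources statements assume Java 7 or later. The named subclass does the same thing as the anonymous one above, just in a reusable form.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.zip.Deflater;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical named equivalent of the anonymous subclass shown earlier.
final class LeveledGZIPOutputStream extends GZIPOutputStream {
    LeveledGZIPOutputStream(OutputStream out, int level) throws IOException {
        super(out);
        // 'def' is the protected Deflater inherited from DeflaterOutputStream.
        def.setLevel(level);
    }
}

// Hypothetical probe that serializes one object at several levels.
final class GzipLevelProbe {

    /** Serializes value through a GZIP stream running at the given level. */
    static byte[] serialize(Object value, int level) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos =
                 new ObjectOutputStream(new LeveledGZIPOutputStream(bytes, level))) {
            oos.writeObject(value);
        } // close() finishes the GZIP stream and writes its trailer
        return bytes.toByteArray();
    }

    /** Reads the object back; GZIPInputStream needs no knowledge of the level. */
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois =
                 new ObjectInputStream(new GZIPInputStream(new ByteArrayInputStream(data)))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Made-up stand-in for a real snapshot object.
        ArrayList<String> snapshot = new ArrayList<String>();
        for (int i = 0; i < 2000; i++) {
            snapshot.add("customer-record-" + (i % 50));
        }
        for (int level = Deflater.NO_COMPRESSION; level <= Deflater.BEST_COMPRESSION; level++) {
            long start = System.nanoTime();
            byte[] packed = serialize(snapshot, level);
            long micros = (System.nanoTime() - start) / 1000;
            System.out.println("level " + level + ": " + packed.length
                    + " bytes in " + micros + " us");
        }
    }
}
```

The timings from a single run like this are rough at best; for a real decision, run it on your own data several times and compare the size saved against the time spent.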

If you happen to be using Prevayler and would like to get smaller snapshots, take a look at reusable-components and download the latest version from here. Also, the Enum class has been enhanced to support anonymous subclasses and minor javadoc clarifications have been made, thanks to Jonathan O'Connor's suggestions. If you want to join the project, I'd be glad to have you.

Comments

Excellent post - albeit quite old. Strange how even in Java 7 GZIPOutputStream still doesn't have the compression level as a constructor option. Until I read this post I was about to add the compression level to an overridden write method in GZIPOutputStream - an anonymous constructor didn't even cross my mind! Thanks!