Scott Oaks's Blog

Don't guess -- test

Posted by sdo on December 09, 2005 at 09:28 AM | Comments (3)

One of the things that always interests me is the relative performance of the collection classes. Recently, I discovered a particular anomaly of the ConcurrentHashMap class.

I've always considered the ConcurrentHashMap class as something to be used in special cases: use a Hashtable, and if you notice a lot of contention for your hashtable, then switch to a ConcurrentHashMap. Of course, you always write your code in terms of the Map interface so that such a switch will be trivial, right?
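As a minimal sketch of what "writing in terms of the Map interface" buys you, the class below names a concrete implementation on exactly one line. The class name `SessionAttributes` and the `expectContention` flag are hypothetical, invented only for this illustration.

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical holder class, used only for illustration.
class SessionAttributes {
    // Declared as Map, so no other code in the class depends on the concrete type.
    private final Map<String, Object> attributes;

    SessionAttributes(boolean expectContention) {
        // The only place a concrete class is named; switching is a one-line change.
        this.attributes = expectContention
                ? new ConcurrentHashMap<String, Object>()
                : new Hashtable<String, Object>();
    }

    void put(String name, Object value) {
        attributes.put(name, value);
    }

    Object get(String name) {
        return attributes.get(name);
    }

    int size() {
        return attributes.size();
    }
}
```

Callers see the same behavior either way; only the construction site knows which implementation is in use.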

This conviction stems partly from habit, partly from the fact that I strongly believe that simple code is faster (the Hashtable class is a much simpler implementation), and partly from some microbenchmarks I've run showing that when there is little or no contention, Hashtable is a faster implementation of the Map interface than ConcurrentHashMap. This is particularly true on recent VMs, which do a much better job of uncontended lock acquisition. [On the other hand, the ConcurrentHashMap greatly increases throughput when there is moderate to severe contention for the map.]

Recently I ran across some newly written code that used ConcurrentHashMap in its initial implementation. It passed its unit tests, of course, and we ran some simple performance tests on it, and it was still fine. And then we ran into an interesting test case, where we created thousands of the ConcurrentHashMap objects at a time (each one embedded in an HTTP session object).

It turns out that the size of an empty ConcurrentHashMap object is 1272 bytes; an empty Hashtable object is just 96 bytes. So forget any minor performance difference in storage and retrieval that might exist between the two; in this case, our GC times when using the ConcurrentHashMap dominated everything else. A simple one-line change in the code, and we were back in business.
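If you want to check figures like these on your own VM, one rough approach is to allocate many empty maps and divide the growth in used heap by the number of instances. The sketch below assumes that approach; the class and method names are hypothetical, `System.gc()` is only a hint to the VM, and the result is an estimate that will vary with JVM version and settings, so it may not match the numbers above.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical measurement helper; gives a rough estimate, not an exact size.
class EmptyMapFootprint {

    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Approximate bytes of heap consumed per map produced by the factory.
    static long approxBytesPerInstance(Supplier<Map<String, Object>> factory, int count) {
        // Holding references keeps the maps reachable while we measure.
        List<Map<String, Object>> keep = new ArrayList<>(count);
        System.gc(); // a hint only; the VM may ignore it
        long before = usedHeap();
        for (int i = 0; i < count; i++) {
            keep.add(factory.get());
        }
        System.gc();
        long after = usedHeap();
        if (keep.size() != count) {
            throw new IllegalStateException("unexpected instance count");
        }
        return (after - before) / count;
    }

    public static void main(String[] args) {
        int count = 100_000;
        System.out.println("Hashtable:         ~"
                + approxBytesPerInstance(() -> new Hashtable<String, Object>(), count)
                + " bytes each");
        System.out.println("ConcurrentHashMap: ~"
                + approxBytesPerInstance(() -> new ConcurrentHashMap<String, Object>(), count)
                + " bytes each");
    }
}
```

Run it a few times and with a large enough `count` that measurement noise is small relative to the total; a heap profiler will give more trustworthy numbers.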

Will you see this type of thing in your app? Maybe not -- it is admittedly an unusual use case of the collection classes. But I like this example, since it reinforces my basic programming principles: start by using the simpler code, be prepared for changes, and don't expect that you'll understand the performance of your application until you test it under a variety of circumstances.


Comments

  • Why do you prefer Hashtable over Collections.synchronizedMap()?

    Posted by: coxcu on December 12, 2005 at 07:24 AM

  • IMHO, it suffices to read the API doc of the java.util.concurrent Collection classes to be put off completely.
    If a class's behaviour is so complicated by design, how can you expect anyone to make a substantiated decision about an appropriate use case?
    Following the bug fixes from the initial Java 5 release up to the current Mustang builds further convinced me to shy away from java.util.Concurrent*. I continue to use simple HashMaps, synchronized at the application level (which makes the most sense, anyway).

    Posted by: mcnepp on December 13, 2005 at 06:07 AM

  • I don't really have a strong reason for preferring Hashtable over Collections.synchronizedMap, other than force of habit. Collections.synchronizedMap does have the advantage of working with any kind of map (not just HashMap, the direct equivalent to Hashtable) -- although WeakHashMap is one thing I tend to stay away from (again for performance reasons; there's a performance penalty paid for inserting into a WeakHashMap that to me rarely makes up for its ability to sometimes reclaim memory).

    However, there is a subtle usage reason why using a standard HashMap and synchronizing at the application level is often exactly the right thing to do. Often, the operation you need to perform on elements in a map is complex, and the entire operation needs to be synchronized, e.g.:

    Map m = ...;
    Object o = m.get(someKey);   // look up the existing value
    if (o == null)
        o = ...;                 // create it if absent
    o.increment();               // modify it
    m.put(someKey, o);           // store it back

    I've had to debug code where developers use a Hashtable for this, assuming that because the Hashtable is synchronized, their code is threadsafe. But of course it isn't; the entire operation must be synchronized to make the operation threadsafe. So if you must handle the synchronization yourself, then I'd prefer HashMap.

    Posted by: sdo on December 13, 2005 at 11:18 AM
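The point of the last comment, that the whole lookup-create-modify-store sequence must run under one lock, can be sketched as follows. This is only an illustration: `HitCounter` and `Counter` are hypothetical names standing in for the elided map and the object with `increment()`, and locking on the map itself is one choice among several.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class; not from the original code being discussed.
class HitCounter {

    // Hypothetical mutable value with an increment() method.
    static final class Counter {
        private int value;

        void increment() {
            value++;
        }

        int get() {
            return value;
        }
    }

    // A plain HashMap: thread safety comes from the synchronized blocks below.
    private final Map<String, Counter> counts = new HashMap<>();

    void hit(String key) {
        // One lock spans get, create, increment and put, so no other thread
        // can interleave between the null check and the put.
        synchronized (counts) {
            Counter c = counts.get(key);
            if (c == null) {
                c = new Counter();
            }
            c.increment();
            counts.put(key, c);
        }
    }

    int count(String key) {
        // Readers take the same lock so they see a consistent value.
        synchronized (counts) {
            Counter c = counts.get(key);
            return c == null ? 0 : c.get();
        }
    }
}
```

Swapping the HashMap for a Hashtable here would add a second, redundant lock on every call without making the compound operation any safer.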




