Skip to main content

Generified and cached empty arrays

Posted by malenkov on December 7, 2009 at 1:53 AM PST

Caching of an empty array is a well-known pattern to improve performance. However, it is difficult to use it in generified classes.

Out of curiosity, I created a custom implementation of the array creation method based on Array.newInstance. To cache empty arrays, I use synchronized WeakHashMap, which maps any given component type to a weak reference to the corresponding empty array. This is not the fastest way, but it does not lead to memory leaks.

private static final Map<Class<?>, Reference<?>> map =
         new WeakHashMap<Class<?>, Reference<?>>();

@SuppressWarnings("unchecked")
public static <T> T[] newInstance(Class<T> type,
                                        int length,
                                        boolean cache) {
    if (!cache || length != 0) {
        return (T[]) Array.newInstance(type, length);
    }
    synchronized (map) {
        Reference<?> ref = map.get(type);
        Object array = (ref == null) ? null : ref.get();
        if (array == null) {
            array = Array.newInstance(type, length);
            map.put(type, new WeakReference(array));
        }
        return (T[]) array;
    }
}

Note that the method returns an array of a given type. I used the SuppressWarnings annotation to suppress warnings about unchecked casts.

Consider the results of the test application. The left column shows the number of iterations. During each iteration, an empty array is created for every given component type. The middle column shows the average time of an empty array creation. The right column shows the average time of creating and retrieving the shared empty array. Frankly, I was surprised that the caching of a single array is slower than caching of several ones.

Attempts Average time (ns)
normal cached
1 490 65676
10 490 1813
100 388 961
1000 392 980
10000 248 223
100000 274 56

1 class per attempt

Attempts Average time (ns)
normal cached
1 262 2278
10 261 69
100 247 64
1000 262 63
10000 257 61
100000 258 62

88 different classes per attempt

Cached empty arrays can be useful for the JVM too. For example, when calling varargs methods without arguments a new empty array is created every time. I do not think that anyone expects such behavior.

What do you think?

Related Topics >>

Comments

Measuring performance...

... that way is always difficult and error prone. There're rounding errors (dividing long values, for instance) and the garbage collector is possibly kicking in during the test. Try this one out: Classes: 1 1 112292.17/463086.43 (ns) 10 718.62/2157.6 (ns) 100 636.96/1973.8 (ns) 1000 57.3/431.0 (ns) 10000 42.54/133.34 (ns) 100000 36.65/115.75 (ns) Classes: 88 1 424.07/1180.59 (ns) 10 102.43/202.93 (ns) 100 24.09/86.4 (ns) 1000 31.45/115.06 (ns) 10000 24.75/109.38 (ns) private static void test(Class<?>... types) { System.out.println("Classes: " + types.length); Class<?>[] empty = new Class<?>[types.length]; int count = 1; test(types, count, false); map.clear(); for (int i = 0; i < 6; i++, count *= 10) { long common = test(empty, count, false); long normal = test(types, count, false) - common; long cached = test(types, count, true) - common; System.out.println(count + "\t" + (normal/100.0) + "/" + (cached/100.0) + " (ns)"); } } private static long test(Class<?>[] types, int count, boolean cache) { long time = -System.nanoTime(); for( int repeat=0; repeat<100; repeat++ ) { for (int i = 0; i < count; i++) { for (Class<?> type : types) { if (type != null) { newInstance(type, 0, cache); } } } time += System.nanoTime(); time /= (long) count; time /= (long) types.length; } return time; } Amazingly, the results are all the way round! As Knuth once said: We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil

optimization

My code is not optimized. I spent more time to write the post on English then to create the code.

Your test code does not show a time needed for creation a single class. The repeat cycle calls the newInstance method at least 100 times. My table has the "100 attempts" row for such case.

JIT

You should run your routine a number of times before you start measuring the time. Otherwise your first attempt will be before your code is compiled by the JIT - which might explain the performance penalty.

Optimized

SoftReferences would be more appropriate here I think, since they are kept as long as there is memory available, WeakReferences will just fade away and imho are seldomly appropriate for caches. See e.g. http://www.javaspecialists.eu/archive/Issue015.html

Did you test also using concurrent threads? Just guessing: the synchronization may eat-up or reverse any performance gained by the caching.

Excuse me, but it's a mistake

Excuse me, but it's a mistake in last method. time = -System.nanoTime(); goes before additional 100-times loop , and time = +System.nanoTime(); goes inside it. Let's change it a little bit: private static long test(Class<?>[] types, int count, boolean cache) { long totalTime = 0; for (int repeat = 0; repeat < 100; repeat++) { long time = -System.nanoTime(); for (int i = 0; i < count; i++) { for (Class<?> type : types) { if (type != null) { newInstance(type, 0, cache); } } } time += System.nanoTime(); time /= (long) count; time /= (long) types.length; totalTime += time; } return totalTime / 100; } We'll get: Classes: 1 1 290/2419 (ns) 10 616/1038 (ns) 100 243/336 (ns) 1000 199/109 (ns) 10000 182/107 (ns) 100000 180/103 (ns) Classes: 88 1 183/123 (ns) 10 192/110 (ns) 100 191/111 (ns) 1000 191/109 (ns) 10000 189/109 (ns) 100000 191/111 (ns) Let's remove synchronized block in the newInstance method (really, we don't need this - really, it's rather low probability of creating 2 different instances of empty array, so possible overhead for this is very small), we'll get: Classes: 1 1 322/2232 (ns) 10 324/1031 (ns) 100 202/211 (ns) 1000 223/67 (ns) 10000 177/70 (ns) 100000 179/69 (ns) Classes: 88 1 184/88 (ns) 10 196/73 (ns) 100 193/72 (ns) 1000 195/73 (ns) 10000 193/72 (ns) 100000 193/73 (ns)

before and after

Look at the common variable. It is used to remove a time you mention.

Cached arrays in the JVM

Well, usually the arrays created for varargs are tiny - and therefor cheap to allocate. WeakReference's aren't free and add a runtime overhead, and when taking multi-threading into account you have to add synchornization / ThreadLocal storage to the game. Furthermore I don't think your test is fair - uncached arrays are also created by reflection - which usually isn't nescessary ;) - Clemens

Soft vs WeakReference

Have you thought about using A SoftReference instead of a WeakReference. I would think you would want to cache empty arrays as long as possible. In my experimentation weak referenced object only live until the next garbage collection and soft references last until the VM decides it really needs that memory for something else. Also is using WeakHashMap with Class objects seems strange, is this to keep from holding a reference to the classloader?

agree with soft references

I use WeakHashMap for automatical cleaning if no more custom class loader.

generified and cached empty arrays

Hello Sergey, 2 things I'd like to comment: You are using a WeakHashMap with Classes as keys. OK, this allows for custom classloaders to unload classes. But why do you also use WeakReferences as the map's values? If you'd simple put the empty arrays directory into the WeakHashMap you'd gain some performance, and the (finite) number of empty array instances would be freed upon JVM termination! Secondly, you're trying to improve 2 things at once: performance and type-safety. Unfortunately, you cannot replace the non-generic "newInstance" method with a generic one because of the lack of support for primitve types. I suggest you rename the method to "newObjectArray" and add an argument check such as if(type.isPrimitive()) throw new IllegalArgumentException("cannot create primitive array"); to prevent the user from receiving a surprising ClassCastException! Cheers, Gernot

primitive types are not allowed

Thanks! I forgot about primitives...

But why do you also use WeakReferences as the map's values?

Because WeakHashMap uses weak references only for keys. Values are hard references. Note that an array instance has a hard reference to the component type.

Fix in the core

A library is the wrong place for such fix. This could be solved trivially: for each loaded Class (including primitives but not including arrays), allocate a zero-length array of that class and reference it from a private field of the reflection metadata, eg. Class.emptyArray. The memory overhead is relatively tiny (a loaded Class needs a ton of metadata), and the GC could be arranged to ignore the emptyArray field so it doesn't block class-GC but doesn't incur any overhead of using a soft reference (gazillions of soft refs in the heap = hell for the GC). Finally, update the array allocator (the bytecode "newarray" or the equivalent native code produced by the JIT) to just return type.emptyArray when length==0. (That's an extra comparison and branch in the allocator, but this is only necessary for arrays which is a small fraction of all allocations so it's not a sensible hit). The only limitation is for arrays-of-arrays, this can't be handled by the proposed design to avoid an infinite loop (as you load class X it creates a cached X[0], that creates a cached X[0][0], etc.), but that's a minor issue.

Now I know there's a compatibility catch - code relying that new X[0] returns a unique object, not == to any existing object, and with a separate monitor, will fail. I suppose such code should be extremely rare because I can't imagine any non-stupid reason to do that. But perhaps the caching optimization could be optional, enabled by a runtime switch. In the worst case, just create a completely new API, similar to Array.newInstance() but with the caching, and let tuned code (or cleaner languages-for-the-JVM) invoke it. A portable library could use your code in JRE < 7, or just delegate to the new AI in JRE >= 7. Everyone's happy.

create empty array on class loaded

It is a good idea.