Using Enumerations in for-each statements - five times faster than the JRE, without RAM limitations

Posted by mkarg on July 4, 2010 at 10:02 AM PDT

Update (2012-09-24): Maven users, you can directly link this class (LGPL), as I have uploaded it into Maven Central. Simply add a dependency to:

<dependency>
    <groupId>eu.headcrashing.treasure-chest</groupId>
    <artifactId>EnumerationsClass</artifactId>
    <version>1.0.0</version>
</dependency>

I was so happy when, years back, the JRE was extended by the Collections Framework. It is intuitive to use and provides virtually anything the average programmer needs. Compared to the older Enumeration interface, this was much better. Even better was the introduction of the for-each statement, which made iterating over a Collection a breeze. Unfortunately, the old Enumeration interface was already used in lots of APIs, such as JNDI, which couldn't be changed without breaking compatibility, so what to do? The JRE authors solved this problem by providing compatibility factory methods in the Collections class, namely Collections.enumeration() and Collections.list(). While the first turns a Collection into an Enumeration, the latter allows using for-each with Enumerations by turning an Enumeration into an ArrayList. So far, so good.
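As a minimal sketch of the two bridge methods named above (the class name `BridgeMethodsDemo` and its helper methods are made up for illustration), this is roughly how they are used:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

class BridgeMethodsDemo {

    // Collection -> Enumeration: a thin view over the collection's iterator.
    static Enumeration<Integer> asEnumeration(List<Integer> values) {
        return Collections.enumeration(values);
    }

    // Enumeration -> for-each: Collections.list() copies all elements
    // into a new ArrayList before the loop starts.
    static int sum(Enumeration<Integer> values) {
        int total = 0;
        for (int value : Collections.list(values)) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        Enumeration<Integer> legacy = asEnumeration(Arrays.asList(1, 2, 3, 4));
        System.out.println(sum(legacy)); // prints 10
    }
}
```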

Unfortunately the implementation of Collections.list() is not very smart. While Collections.enumeration() just wraps the collection's iterator in a thin implementation of the Enumeration interface (which needs virtually no time and space), it's a mystery to me why the authors chose to make Collections.list() return an actual ArrayList: this decision forces the implementation to allocate enough RAM to store the complete content of the Enumeration, and then to copy that content. While this sounds harmless these days, don't be surprised: it's much slower than you think, and needs much more RAM than you think. Ever wanted to use for-each with an Enumeration filled with ten million integers? Good for you if not.
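The eager copy can be made visible with a small sketch. The `CountingEnumeration` helper below is hypothetical and exists only to count how many elements have been pulled from the source; the point is that the whole Enumeration has already been consumed by the time the loop body runs for the first time.

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.NoSuchElementException;

class EagerCopyDemo {

    // Hypothetical helper: enumerates 0..count-1 and records how many
    // elements have been pulled so far.
    static final class CountingEnumeration implements Enumeration<Integer> {
        private final int count;
        int pulled = 0;

        CountingEnumeration(int count) {
            this.count = count;
        }

        public boolean hasMoreElements() {
            return pulled < count;
        }

        public Integer nextElement() {
            if (pulled >= count) {
                throw new NoSuchElementException();
            }
            return pulled++;
        }
    }

    // Returns how many elements had been pulled from the source at the
    // moment the for-each body ran for the first time.
    static int pulledAtFirstIteration(int count) {
        CountingEnumeration source = new CountingEnumeration(count);
        for (Integer ignored : Collections.list(source)) {
            return source.pulled;
        }
        return source.pulled; // only reached for an empty source
    }

    public static void main(String[] args) {
        System.out.println(pulledAtFirstIteration(1000)); // prints 1000
    }
}
```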

So if CPU time and RAM consumption are of any interest, it might be useful to choose a different solution: allocate no RAM for the elements, and map calls directly to the Enumeration, just as Collections.enumeration() does, only the other way round. I wonder why the JRE does not provide this solution, but until it does (and possibly it never will) one has to write this code on one's own again and again. Or just download mine, if your project accepts GPLv3: www.headcrashing.eu/code.html

A short test confirms that this code not only handles endless Enumerations where Collections.list() blasts your RAM, but is also five times faster than Collections.list(). Time to file an RFE now to get Collections extended by this implementation... ;-)

/*
* Copyright 2009-2010 Markus KARG
*
* This file is free software: you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This file is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
* GNU Lesser General Public License for more details.
*
* For a copy of the GNU Lesser General Public License
* see .
*/

package eu.headcrashing.util;

import java.util.Collections;
import java.util.Enumeration;
import java.util.Iterator;

/**
 * This class allows Enumerations to be iterated.
 *
 * @author Markus KARG (markus@headcrashing.eu)
 */
public final class Enumerations {

    /**
     * Allows using an {@link Enumeration} with the for-each statement. The
     * implementation does not copy the elements into heap space and thus is
     * able to serve virtually endless Enumerations, while
     * {@link Collections#list} is limited by available RAM. As a result, this
     * implementation is much faster than Collections.list.
     * <p>
     * Every Iterator obtained from the returned Iterable reads from the same
     * Enumeration, so the elements can be iterated only once.
     *
     * @param enumeration
     *            The original enumeration.
     * @return An {@link Iterable} directly calling the original Enumeration.
     */
    public static final <T> Iterable<T> iterable(final Enumeration<T> enumeration) {
        return new Iterable<T>() {
            public final Iterator<T> iterator() {
                return new Iterator<T>() {
                    public final boolean hasNext() {
                        return enumeration.hasMoreElements();
                    }

                    public final T next() {
                        return enumeration.nextElement();
                    }

                    /**
                     * This method is not implemented as it is impossible to
                     * remove something from an Enumeration.
                     *
                     * @throws UnsupportedOperationException
                     *             always.
                     */
                    public final void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }

}
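The listing above can be tried out with a sketch like the following. It is not the test mentioned earlier: `EndlessEnumerationDemo`, `naturals()` and `sumOfFirst()` are hypothetical names, and the adapter is repeated in compact form only so that the example compiles on its own. It runs a for-each loop over an Enumeration that never ends, which would exhaust memory if passed to Collections.list().

```java
import java.util.Enumeration;
import java.util.Iterator;

class EndlessEnumerationDemo {

    // Hypothetical endless source: 0, 1, 2, ... without end.
    static Enumeration<Long> naturals() {
        return new Enumeration<Long>() {
            private long next = 0;

            public boolean hasMoreElements() {
                return true;
            }

            public Long nextElement() {
                return next++;
            }
        };
    }

    // Compact copy of the adapter from the listing above.
    static <T> Iterable<T> iterable(final Enumeration<T> enumeration) {
        return new Iterable<T>() {
            public Iterator<T> iterator() {
                return new Iterator<T>() {
                    public boolean hasNext() {
                        return enumeration.hasMoreElements();
                    }

                    public T next() {
                        return enumeration.nextElement();
                    }

                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }

    // Sums the first 'limit' elements and then leaves the loop; the rest
    // of the endless Enumeration is never requested.
    static long sumOfFirst(int limit) {
        long sum = 0;
        int seen = 0;
        for (Long value : iterable(naturals())) {
            if (seen++ == limit) {
                break;
            }
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOfFirst(5)); // prints 10 (0 + 1 + 2 + 3 + 4)
    }
}
```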

Have fun with it!
Markus


An overview of all my publications can be found on my web site Head Crashing Informations: www.headcrashing.eu

Comments

Interesting - Read about a similar example in Head First DP

Markus,

Just the other day, I was reading about the Adapter pattern in the fantastic HF Design Patterns book, and they gave a similar example of adapting an enumeration into iterator and vice-versa. Was thinking of doing something similar, but you already did it.

I agree, something like this should be in the JDK!

-DW

List not iterable

Unfortunately the implementation of Collections.list() is not very smart.

Really? It seems to me this method does exactly what the documentation says it should. It turns an enumeration into an ArrayList.

Here is a snippet from the JavaDoc:

    Returns an array list containing the elements returned by the specified enumeration in the order they are returned by the enumeration.

Your method does not even return an instance of List, so I feel it is unfair to compare it to Collections.list. You have written an adapter from Enumeration->Iterator->Iterable, which is great and probably should be in the JDK (I have one just like it in my util code), but the headline is inflammatory and misleading. Collections.list is very clear about what it's doing, and it should be apparent that it is a poor choice for use in a for-each loop.

Sorry for confusion

I never doubted that Collections.list() does what the API says. Actually I wanted to say that the API itself is not very smart, as it enforces a complete copy of the elements, while few programmers need it.

T is missing

Hi Markus, I don't think your implementation will work. You did not specify T. :) Bye, Daniel

Damned markup. ;-)

Daniel, thanks for the tip. Actually the real code works pretty well and includes the 'T'. The blogging software screwed up the generics as it interpreted them as markup. Just fixed it.

`Iterable.iterator` should

`Iterable.iterator` should return a fresh `Iterator` every time it is called. That's not too difficult to correct.

It actually does what you asked for.

I don't understand why you say "should": it already returns a new instance each time. Anyway, an Enumeration can only be iterated once, just like an Iterator. One cannot repeat either.

Re: it actually does what you asked for

If you run two for-each (enhanced for) loops on the result of your function, the first one will consume all elements of the enumeration, so the second one will never enter its body.
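A minimal sketch of this behaviour, assuming the adapter from the post (repeated here in compact form; `SinglePassDemo` and `countTwoPasses()` are names made up for the example):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.Iterator;

class SinglePassDemo {

    // Compact copy of the adapter from the post.
    static <T> Iterable<T> iterable(final Enumeration<T> enumeration) {
        return new Iterable<T>() {
            public Iterator<T> iterator() {
                return new Iterator<T>() {
                    public boolean hasNext() {
                        return enumeration.hasMoreElements();
                    }

                    public T next() {
                        return enumeration.nextElement();
                    }

                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }

    // Runs two for-each loops over the same Iterable and reports how many
    // elements each loop saw.
    static int[] countTwoPasses(Enumeration<String> source) {
        Iterable<String> view = iterable(source);
        int first = 0;
        for (String ignored : view) {
            first++;
        }
        int second = 0;
        for (String ignored : view) {
            second++; // never reached: the Enumeration is already exhausted
        }
        return new int[] { first, second };
    }

    public static void main(String[] args) {
        int[] counts = countTwoPasses(Collections.enumeration(Arrays.asList("a", "b", "c")));
        System.out.println(counts[0] + " " + counts[1]); // prints "3 0"
    }
}
```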

the signature should be:
  <T> Iterable<T> iterable(final Enumeration<? extends T> enumeration)

All your 'final' modifiers are useless except the one on the parameter 'enumeration'.
Because:
  - final on a static method has no meaning,
  - any VM can easily see that there are no overridden methods here,
  - all anonymous classes are marked final by javac.

Use final on a non-static method only if you have a security concern.

Rémi

Hi, what would be the reason for using ...

Hi,
what would be the reason for using Enumeration<? extends T> instead of Enumeration<T> in the static method's signature?
I suspected it would allow assigning the iterable created from e.g. an Enumeration<BufferedInputStream> to an Iterable<InputStream>.
But this does not work.
I can, however, write a loop like
for (InputStream in : Enumerations.iterable(someEnum))
where someEnum is an Enumeration<BufferedInputStream>.
But this works for both signatures, and I can't see what other advantage the suggested signature might have over the original one. Is there any?
Thanks,
Martin
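One way to look at the question, offered as a sketch rather than a definitive answer: with `Enumeration<? extends T>` the caller may choose T to be a supertype of the Enumeration's element type. The example below uses Integer and Number instead of the stream classes, and `WildcardSignatureDemo` is a made-up name. The explicit type argument `<Number>` compiles only with the wildcard signature; as far as I know, Java 8 and later can also infer it from the assignment target, which compilers of 2010 could not.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.Iterator;

class WildcardSignatureDemo {

    // Adapter with the signature suggested in the comments.
    static <T> Iterable<T> iterable(final Enumeration<? extends T> enumeration) {
        return new Iterable<T>() {
            public Iterator<T> iterator() {
                return new Iterator<T>() {
                    public boolean hasNext() {
                        return enumeration.hasMoreElements();
                    }

                    public T next() {
                        return enumeration.nextElement();
                    }

                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }

    // An API that insists on Iterable<Number>.
    static double sum(Iterable<Number> numbers) {
        double total = 0;
        for (Number n : numbers) {
            total += n.doubleValue();
        }
        return total;
    }

    static double sumOfIntegers(Enumeration<Integer> integers) {
        // T is explicitly chosen as Number. If the parameter were declared
        // as Enumeration<T>, this call would not compile, because an
        // Enumeration<Integer> is not an Enumeration<Number>.
        Iterable<Number> numbers = WildcardSignatureDemo.<Number>iterable(integers);
        return sum(numbers);
    }

    public static void main(String[] args) {
        System.out.println(sumOfIntegers(Collections.enumeration(Arrays.asList(1, 2, 3)))); // prints 6.0
    }
}
```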

Don't understand your problem

The goal of this blog entry was not to be able to iterate more than once, but only to be able to use for-each with Enumerations without squandering CPU time and heap space. It was explicitly not the idea to be able to iterate over an Enumeration more than once, as this is not the nature of an Enumeration. If you want to do that, use Collections.list(). It works pretty well for THAT use case (which obviously is a different one).

The discussion about unnecessary use of final is out of scope for this blog entry. In my case it is just a personal style of writing, which I will not discuss further.