Garbage collection - not a panacea
Working with a good profiler can give you very valuable insights into performance bottlenecks. Of course, there is the convention of "don't start profiling until your code is production quality", but there are a few steps you can take early on to spare yourself a lot of trouble.
I've been looking at the source code of the JXM project as part of the bindmark initiative. Although this project appears to be dead (the last posting dates back to September 2003), it is a perfect demonstration of techniques that should be avoided from the very beginning. Its performance, in both time and memory, was dead last among all the frameworks tested, and a quick look in a profiler reveals why.
The class com.lifecde.jxm.XMLToClassMap provides a method called getField():
public FieldMap getField(Object obj, String name) {
    FieldMap field = new FieldMap();
    ...
}
The class FieldMap's constructor looks innocuous enough:
TimeZone z = TimeZone.getTimeZone("GMT");
timeFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss z");
dateFormat = new SimpleDateFormat("yyyy-MM-dd z");
timestampFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
However, this code is a real performance killer (and not in a good way). First, the format members are used only when converting dates and timestamps, yet they are built for every FieldMap. Second, since the formats are constant, they should have been declared as static members and initialized only once (bearing in mind that SimpleDateFormat is not thread-safe, so a shared instance needs synchronization or a per-thread copy). Another option would be to cache them in a HashMap of some sort. You might say it's not that bad: we allocate three objects, which then get garbage-collected. No harm, no foul. Wrong.
The getField() method is called for every XML tag. While parsing 2,000 moderately sized XML strings (about 1 KB each), the FieldMap constructor accounts for 2,411,079 allocated objects, which is 25.4% of all allocated memory. In total, 120,000 FieldMap instances are created in 106.82 seconds, which is 27.6% of the total running time. How do 120,000 constructor calls turn into 2.4 million allocations? That is about 20 objects per FieldMap (2,411,079 / 120,000 ≈ 20.1); just look at the constructor of SimpleDateFormat in the JDK and see for yourself. In this case, the simple technique of making the format objects static would have saved roughly 25% of both memory and time.
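As a rough illustration of the "initialize once" idea, here is a minimal sketch rather than JXM's actual code. The class name SharedFormats and its accessor methods are hypothetical, and applying GMT and Locale.ROOT to all three formats is an assumption made so the output is repeatable. Because SimpleDateFormat is not thread-safe, the sketch keeps one instance per thread in a static ThreadLocal instead of a plain static field.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical holder class; JXM itself keeps these formats as FieldMap fields.
final class SharedFormats {
    // SimpleDateFormat is not thread-safe, so each thread lazily builds its own
    // instance once and then reuses it on every later call.
    private static final ThreadLocal<SimpleDateFormat> TIME =
            ThreadLocal.withInitial(() -> gmtFormat("yyyy-MM-dd HH:mm:ss z"));
    private static final ThreadLocal<SimpleDateFormat> DATE =
            ThreadLocal.withInitial(() -> gmtFormat("yyyy-MM-dd z"));
    private static final ThreadLocal<SimpleDateFormat> TIMESTAMP =
            ThreadLocal.withInitial(() -> gmtFormat("yyyy-MM-dd HH:mm:ss"));

    private SharedFormats() {}

    // Assumption: GMT and Locale.ROOT are chosen here for repeatable output.
    private static SimpleDateFormat gmtFormat(String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format;
    }

    static SimpleDateFormat timeFormat()      { return TIME.get(); }
    static SimpleDateFormat dateFormat()      { return DATE.get(); }
    static SimpleDateFormat timestampFormat() { return TIMESTAMP.get(); }
}

final class SharedFormatsDemo {
    public static void main(String[] args) {
        // The same instance is returned on every call within a thread.
        System.out.println(SharedFormats.timestampFormat() == SharedFormats.timestampFormat());
        System.out.println(SharedFormats.timestampFormat().format(new Date(0L)));
    }
}
```

With this in place, a FieldMap-like constructor would allocate no format objects at all; the cost is paid once per thread, and only when a date is actually converted.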
In addition, reflection (or introspection) is heavily used throughout the library. Class.forName() alone accounts for 12% of the time and 16.2% of the memory (1,441,039 allocations). Caching the looked-up classes in a HashMap (possibly with soft references) would have yielded an additional speedup.
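The lookup cache suggested above might look something like the following sketch. ClassCache is a hypothetical name, not part of JXM. It uses a ConcurrentHashMap in place of the plain HashMap mentioned in the text so that it is safe to call from several threads, and it leaves out soft references for brevity, which means cached classes stay strongly referenced.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper that remembers the result of Class.forName() per name.
final class ClassCache {
    private static final ConcurrentMap<String, Class<?>> CACHE =
            new ConcurrentHashMap<>();

    private ClassCache() {}

    static Class<?> forName(String name) throws ClassNotFoundException {
        Class<?> cached = CACHE.get(name);
        if (cached == null) {
            // Only the first request for a given name pays for the real lookup.
            cached = Class.forName(name);
            CACHE.putIfAbsent(name, cached);
        }
        return cached;
    }

    // Number of distinct class names resolved so far.
    static int size() {
        return CACHE.size();
    }
}

final class ClassCacheDemo {
    public static void main(String[] args) throws ClassNotFoundException {
        Class<?> first = ClassCache.forName("java.util.ArrayList");
        Class<?> second = ClassCache.forName("java.util.ArrayList");
        // Both calls return the same Class object; the second is served from the map.
        System.out.println(first == second);
    }
}
```

A failed lookup is not cached here, so an unknown class name throws ClassNotFoundException on every call; whether that matters depends on how often the library asks for names that do not exist.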
To sum up: the profiler is not some extravagant toy. It should be used from the very beginning. In addition, use static members for constant fields, and hash-based caches for lookups that are repeated over and over.