
XML Utility Library

Posted by joshy on June 28, 2010 at 8:15 PM PDT

As part of some open source stuff I've been doing on the side I've had to generate and parse a lot of XML. I like working with the DOM because its tree structure cleanly matches my needs, but the W3C API is *so* cumbersome. The DOM was designed to be implemented in any language, not just clean OO languages like Java, so any code using it will work but be ugly. After considering a few other XML libraries I decided to write a new one that would work with modern Java 5 language features like generics, the enhanced for loop, and varargs. This library is super tiny because it simply wraps the standard javax.xml libraries in the JRE, but it gives you a much nicer interface to work with. Here's how to use it (or download it here):

Generating XML

The XMLWriter class provides the methods start() and end() to generate nested XML elements. start() takes varargs so you can set any number of attributes on your element. For example, to write out the element foo with attributes version=0.01 and type=bar, do the following:

out.start("foo","version","0.01","type","bar");

XMLWriter also uses method chaining to let you start and end an element on the same line. Here is a complete example of generating XML to the foo.xml file with a standard XML header, varargs, and method chaining:

XMLWriter out = new XMLWriter(new File("foo.xml"));
out.header();
out.start("foo", "version","0.01","type","bar");
for(int i=0; i<3; i++) {
   out.start("bar","id",""+i).end();
}
out.end();
out.close();

produces:

contents of foo.xml

<?xml version="1.0"?>
<foo
    version='0.01'
    type='bar'
    >
    <bar
        id='0'
        >
    </bar>
    <bar
        id='1'
        >
    </bar>
    <bar
        id='2'
        >
    </bar>
</foo>
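
For a rough idea of how a varargs start() with chaining can work, here is a minimal sketch. It is not the library's source: the class name VarargsWriterSketch, the single-line output into a string, and the attribute escaping are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch, not the library's XMLWriter.
class VarargsWriterSketch {
    private final StringBuilder out = new StringBuilder();
    // Names of the elements that are currently open, innermost first.
    private final Deque<String> open = new ArrayDeque<String>();

    // attrs is a flat list: name, value, name, value, ...
    public VarargsWriterSketch start(String name, String... attrs) {
        if (attrs.length % 2 != 0) {
            throw new IllegalArgumentException("attributes must come in name/value pairs");
        }
        out.append('<').append(name);
        for (int i = 0; i < attrs.length; i += 2) {
            out.append(' ').append(attrs[i])
               .append("='").append(escape(attrs[i + 1])).append('\'');
        }
        out.append('>');
        open.push(name);
        return this; // returning this allows start(...).end() on one line
    }

    // Closes the most recently started element.
    public VarargsWriterSketch end() {
        out.append("</").append(open.pop()).append('>');
        return this;
    }

    // Escapes the characters that are unsafe inside a single-quoted attribute.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace("'", "&apos;");
    }

    @Override
    public String toString() {
        return out.toString();
    }

    public static void main(String[] args) {
        VarargsWriterSketch w = new VarargsWriterSketch();
        w.start("foo", "version", "0.01", "type", "bar");
        for (int i = 0; i < 3; i++) {
            w.start("bar", "id", "" + i).end();
        }
        w.end();
        System.out.println(w);
    }
}
```

The odd-length check is the price of the flat varargs list: the compiler cannot tell a name from a value, so a missing value can only be caught at runtime.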

Parsing XML

The XMLParser class uses a DOM parser and XPath to extract the parts of the document you want. Combined with generics and iterators, this lets you conveniently parse your XML in a loop. For example, to parse the document from the previous example back in, grab all of the bar elements, and print out their id attributes:

Doc doc = XMLParser.parse(new File("foo.xml"));
for(Elem e : doc.xpath("//bar")) {
    System.out.println("id = " + e.attr("id"));
}
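
For comparison, here is roughly what the same loop looks like against the plain JDK APIs. This is a sketch that uses only standard javax.xml and org.w3c.dom calls; the class name RawDomExample and the barIds helper are made up for illustration, and the XML is inlined as a string instead of being read from foo.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: the same query written directly against the JDK.
class RawDomExample {
    // Returns the id attribute of every <bar> element in the given XML text.
    public static List<String> barIds(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // evaluate() returns Object, so the result has to be cast by hand.
            NodeList nodes =
                    (NodeList) xpath.evaluate("//bar", doc, XPathConstants.NODESET);

            List<String> ids = new ArrayList<String>();
            // NodeList is not Iterable, so no for-each here.
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                ids.add(e.getAttribute("id"));
            }
            return ids;
        } catch (Exception ex) {
            // The JDK calls throw several checked exceptions; wrap them all.
            throw new IllegalArgumentException("could not parse or query XML", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<foo version='0.01' type='bar'>"
                + "<bar id='0'/><bar id='1'/><bar id='2'/></foo>";
        for (String id : barIds(xml)) {
            System.out.println("id = " + id);
        }
    }
}
```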

Details

This XML library uses the standard W3C DOM and javax.xml parsers underneath. Each DOM element is wrapped by a custom class that provides the convenience methods. Only elements returned from an XPath query are wrapped, so if you skip most of the document then most of it will never get wrapped. The Doc and Elem objects hold references to the underlying W3C DOM objects.
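
To make the wrapping idea concrete, here is a minimal sketch of how such wrappers could look. It is not the library's actual source: the nested Doc and Elem classes only borrow their names from the parsing example, parse() takes a string here rather than a File, and the internals are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch of the wrapper approach, not the library's code.
class WrapperSketch {

    // Thin wrapper that keeps a reference to the underlying W3C element.
    static class Elem {
        final Element node;

        Elem(Element node) {
            this.node = node;
        }

        public String attr(String name) {
            return node.getAttribute(name);
        }
    }

    // Thin wrapper around the parsed W3C document.
    static class Doc {
        final Document dom;

        Doc(Document dom) {
            this.dom = dom;
        }

        public static Doc parse(String xml) {
            try {
                return new Doc(DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder()
                        .parse(new InputSource(new StringReader(xml))));
            } catch (Exception ex) {
                throw new IllegalArgumentException("could not parse XML", ex);
            }
        }

        // Wraps only the elements the query matches; the rest of the
        // tree is never touched by the wrapper classes.
        public List<Elem> xpath(String expr) {
            try {
                NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                        .evaluate(expr, dom, XPathConstants.NODESET);
                List<Elem> result = new ArrayList<Elem>();
                for (int i = 0; i < nodes.getLength(); i++) {
                    Node n = nodes.item(i);
                    if (n instanceof Element) {
                        result.add(new Elem((Element) n));
                    }
                }
                return result;
            } catch (XPathExpressionException ex) {
                throw new IllegalArgumentException("bad XPath: " + expr, ex);
            }
        }
    }

    public static void main(String[] args) {
        Doc doc = Doc.parse("<foo><bar id='0'/><bar id='1'/></foo>");
        for (Elem e : doc.xpath("//bar")) {
            System.out.println("id = " + e.attr("id"));
        }
    }
}
```

Returning a List gives the for-each loop for free, and the public node and dom fields stand in for the references to the W3C objects that the wrappers keep.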

I have no real plans for this library. I just found it useful for me and thought you might be interested in it. I'll release new versions as I fix bugs and add (tiny) features.

Docs and download here

Comments

Download Links

Joshua,

The URLs for the downloads don't work for me. For the JAR I see this URL: http://hudson.joshy.org:9001/job/XMLUtil/lastSuccessfulBuild/artifact/di...

And get this error:

Server Error
The following error occurred: [code=CANT_CONNECT] Could not connect because of networking problems. Contact your system administrator.

Thanks,
Randy Stegbauer

sorry

sorry about that. i'm not sure what happened. i've rebooted hudson.

attributes

For readability I would go for chaining add attribute methods: out.start("foo").attr("version","0.01").attr("type","bar")

Hi. I think the current

Hi. I think the current vararg params are quite readable with line separators:
out.start(
"foo",
"version","0.01",
"type","bar"
);


And why not use Object instead of String? It's less typesafe but a little bit more readable ;-)
out.start("bar","id",i).end();

Fluid API

I agree with tbee, "fluid API" is a nice way to get things done while keeping readability without being verbose. Except that I would use out.tag("foo").attr("version", "0.01").attr("type", "bar").end(); Actually, I see in the (terse!) JavaDoc that there is already an attr() method, except it is void, so it doesn't allow chaining calls. Also I see no way to add textual data (Value). Last remark: the library is BSD but I see no way to get the source code... :-)

done

I've updated the code to add more method chaining on attr() and end(). I've also added a source bundle on the download page. Sorry 'bout that.