
The FI Project - An Open Source Implementation of a Binary XML Standard

Posted by pelegri on January 6, 2005 at 4:12 AM EST

Last June I talked about Fast Infoset (formally ITU-T Rec. X.891 | ISO/IEC 24824-1), which is being standardized at the ITU-T and ISO. In a nutshell, Fast Infoset (FI) is a binary encoding of the XML Infoset designed to strike a good tradeoff between reducing encoding and decoding time and reducing the encoded size. The encoding is independent of the schema of the document being encoded, although it is possible to use external dictionaries and binary encoding algorithms to improve its performance for specific applications. You can think of Fast Infoset as GZIP for XML: like GZIP, you only need to know that the file is encoded to decode it, but unlike GZIP, encoding and decoding performance is as important as encoded size.
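To make the GZIP half of that analogy concrete, here is a minimal sketch using only the JDK's `java.util.zip` classes. It is not Fast Infoset and uses no FI code; the class name `GzipXmlDemo`, its helper methods, and the sample document are made up for illustration. The point is that the decoder needs no schema, only the knowledge that the bytes are GZIP-encoded.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative only: plain GZIP over XML text, not a Fast Infoset encoder.
public class GzipXmlDemo {

    // Builds a small, repetitive XML document (hypothetical sample data).
    public static String sampleXml(int orders) {
        StringBuilder sb = new StringBuilder("<orders>");
        for (int i = 0; i < orders; i++) {
            sb.append("<order id=\"").append(i).append("\">")
              .append("<status>shipped</status></order>");
        }
        return sb.append("</orders>").toString();
    }

    // Compresses the UTF-8 bytes of the XML text with GZIP.
    public static byte[] compress(String xml) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Decompresses without any knowledge of the document's schema.
    public static String decompress(byte[] data) throws IOException {
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(data))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        String xml = sampleXml(200);
        byte[] packed = compress(xml);
        System.out.println("text bytes:   "
            + xml.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("gzip bytes:   " + packed.length);
        System.out.println("round trip ok: " + decompress(packed).equals(xml));
    }
}
```

GZIP optimizes for size alone and still leaves the consumer to parse text XML after decompression, which is the cost the time-versus-size tradeoff described above is meant to address.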

Interest in Binary XML is increasing. Our group at Sun has been investigating performance bottlenecks in Web Services for a while. We first reported on some of that work in a session at JavaOne 2003, and we later hosted a W3C Workshop on Binary Interchange of XML Infoset. Both events drew a lot of interest, which was later confirmed by very positive feedback on a Technology Preview of the Fast WebServices technology. We have seen a substantial increase in interest over the last few months, reflecting the desire to use XML and WS in more situations; here is one public indicator. We, and other companies, are working in the W3C XML Binary Characterization WG to determine whether there is a single binary XML standard that will work for all use cases. This process will take some time, but in the meantime we believe a substantial number of customers have needs today that the Fast Infoset standard will satisfy.

A standard is only as useful as its adoption; to that end, Sun has created the FI project at Java.Net to develop and make available our implementation of Fast Infoset. The implementation is available under an Open Source license (ASL 2.0) and is intended to be a high-quality implementation for use in production artifacts, including producers, consumers, and intermediaries of Web Services, XML readers and writers, and other XML applications. The implementation should be directly usable by Java applications, but it should also help implementors in other languages understand the standard and encourage its wide adoption.

The existing code contains partial implementations of the Fast Infoset specification for the SAX API and the StAX API. We will expand the implementation quickly, but it is already functional, and we have run a number of micro and macro benchmarks with very encouraging performance numbers. Your mileage can (will?) vary, but we are seeing 3x to 4x time improvements in micro-benchmarks and 50% improvements in a WS-centered macro-benchmark we use internally when integrating FI into JAX-RPC. We will be contributing to the JAX-RPC project at Java.Net to support FI in that code base. That support can then be used in different containers, including Sun's free J2EE AppServer (yes, this is a plug for our product :-}).
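Targeting SAX and StAX matters because application code written against those interfaces does not care what bytes sit underneath. The sketch below uses only the JDK's text-XML StAX factories; the class `StaxRoundTrip` and its methods are hypothetical, and no FI project classes appear in it. The assumption being illustrated is that a Fast Infoset implementation of `XMLStreamWriter` and `XMLStreamReader` could be handed to the same two methods unchanged.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

// Illustrative only: uses the JDK's text-XML StAX implementation.
public class StaxRoundTrip {

    // Depends only on the XMLStreamWriter interface, not on the encoding.
    public static void writeOrder(XMLStreamWriter out, String id, String status)
            throws XMLStreamException {
        out.writeStartDocument();
        out.writeStartElement("order");
        out.writeAttribute("id", id);
        out.writeStartElement("status");
        out.writeCharacters(status);
        out.writeEndElement(); // closes status
        out.writeEndElement(); // closes order
        out.writeEndDocument();
    }

    // Depends only on the XMLStreamReader interface; collects element names.
    public static List<String> elementNames(XMLStreamReader in)
            throws XMLStreamException {
        List<String> names = new ArrayList<>();
        while (in.hasNext()) {
            if (in.next() == XMLStreamConstants.START_ELEMENT) {
                names.add(in.getLocalName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        StringWriter text = new StringWriter();
        XMLStreamWriter writer =
            XMLOutputFactory.newInstance().createXMLStreamWriter(text);
        writeOrder(writer, "42", "shipped");
        writer.close();

        XMLStreamReader reader = XMLInputFactory.newInstance()
            .createXMLStreamReader(new StringReader(text.toString()));
        System.out.println(elementNames(reader));
        reader.close();
    }
}
```

Only the two factory calls in `main` choose the concrete encoding, so swapping the serialization is a change at the edges rather than in the application logic.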

Finally, to encourage discussion on Fast Infoset, Fast Webservices, and any other topic related to binary XML, binary WS, et environs, we asked the Java.Net editors to create a new Binary XML and WS forum. We encourage your comments there.
