
Is binary XML an oxymoron?

Posted by jbob on March 23, 2005 at 12:53 PM PST

news.com recently reported that a W3C committee is recommending that the group create a standard for a binary XML format. The problem they are trying to solve is the inherent inefficiency of text.

Is this a memory lapse?

It seems we've forgotten what the notion of a Markup Language is all about. XML, like other markup languages such as HTML and WML, tags portions of text documents for one reason or another. HTML marks up text for formatting purposes and XML marks up text to make data embedded in a text document more machine readable.

All of these things are about making documents more useful. Formatting documents, embedding data in documents, and so on: that is the purpose of markup languages.

The other thing we are forgetting is that binary formats are platform optimized. This optimization is a leading cause for incompatibility between dissimilar systems.

Finally, does anyone actually expect there to be a single binary standard if the W3C actually pursues this? Many in the industry, including Microsoft, are already calling for multiple binary standards for XML.

Multiple binary standards for XML?! This whole thing is becoming a mess before it gets out of the gate.

I like XML. I think it's useful for certain purposes and use it myself for configuration files and for storing offline data. The things that make XML particularly useful are that it's human readable and that it is a standard. Daniel Steinberg provides an excellent example of why human readable data is valuable in his 2003 article on transforming iCal files with Java on O'Reilly's Mac Dev Center site.

I believe the problem with the binary XML movement is that, once again, we are looking for a silver bullet. There are no silver bullets, and XML is not one either. Rather than embracing a wonderful technology for what it's good for, we will wind up jeopardizing it as we try to get it to do things that it isn't well designed for. The Fast Infoset Project (FI) provides some immediate relief for document size and performance. I think FI is solving the problem correctly.

All of this reminds me of when the whole Web Services craze started. Everyone just stopped thinking. Everything needed to be XML and everything needed to be Web Services. It was crazy.

During the early years of web services I would give talks to people deciding when and if to adopt emerging technologies. I typically praised XML and warned against what I thought was inefficient use.

More than four years later, my position remains the same. Given the current state of Internet and wireless bandwidth, along with text processing performance, it just doesn't seem desirable to use text as the basis for high volume data transmissions. Text is fat and inefficient for high volume use. Additionally, to secure that text you must encrypt it, which adds bandwidth, processing, and memory overhead.

I think the FI project is fixing the problem in the right place and is better than pretending we can all agree on a single binary format for XML. Eduardo Pelegri describes Fast Infoset in his blog as "GZIP for XML" and I think this is the right approach.
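To make the "GZIP for XML" analogy a little more concrete, here is a minimal Java sketch. It is not Fast Infoset itself and does not use any FI API. It only runs a repetitive XML document through the standard java.util.zip GZIP streams and compares byte counts. The class name, the element names, and the record count are all made up for illustration, and the numbers you see will depend entirely on your own documents.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative only: plain GZIP over XML text, not the Fast Infoset encoding.
public class XmlGzipSketch {

    // Builds a deliberately repetitive XML document (hypothetical element names).
    static String sampleXml(int records) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<orders>\n");
        for (int i = 0; i < records; i++) {
            sb.append("  <order><id>").append(i)
              .append("</id><status>shipped</status></order>\n");
        }
        return sb.append("</orders>\n").toString();
    }

    // Compresses the UTF-8 bytes of the text with GZIP.
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reverses gzip(): the receiver gets the original, human-readable XML back.
    static String gunzip(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String xml = sampleXml(200);
        byte[] packed = gzip(xml);
        System.out.println("plain XML bytes:   " + xml.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("gzipped XML bytes: " + packed.length);
        System.out.println("round trip intact: " + gunzip(packed).equals(xml));
    }
}
```

The point of the sketch is where the work happens: the document stays ordinary text XML at both ends, and the size reduction is applied only on the wire.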

Let's use XML for what it's good at and get better at using it. This includes more efficient document design. Don't put everything including the kitchen sink in your messages and documents, and learn to normalize them. I believe the answer is a new efficient standard or improvements in text compression and processing.

Whatever happens, I'm counting on Java to continue to make it easy for me to manage and process XML.

Thanks for reading.
