
Overview of XML parsing in Java: different methods and libraries

Posted by kalali on April 24, 2006 at 5:00 AM PDT

What technologies and implementations are available for simply parsing XML files?
What does Java provide for parsing XML files?
In this blog entry, after answering these questions, I will introduce (by name and parsing method) some of the XML parsers available in Java land. We may talk more about each of them in later entries.
First, let's look at the different methods of parsing XML files. There are two major XML parsing methods: the older, standard method uses a DOM (Document Object Model), and the other is stream- and event-based parsing.

  • In the DOM method we load the entire XML document into memory and build an object representation of it. We can then traverse the document using the rich set of methods that the DOM interfaces define, so we can access the document tree at any time and change its elements or attributes on demand.
    However, it needs a large amount of memory to store the document tree, so it is not suitable when resources (memory) are limited. As you know, current browsers use this model, which is why we can easily use JavaScript to access elements and change them at runtime based on our needs. Using DOM you can also create and write XML files.
  • The other model uses streaming and is event based. What does that mean? The streaming parser goes through the document from start to end, and as soon as it sees a text node, an attribute, a malformed element and so on, it triggers an event that you are listening for. It is one-way parsing, in which you cannot change elements or their attributes as you can with DOM. The streaming model does not need much memory, because no model of the document is built in memory. Some stream-based APIs can also be used to create XML files.
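To make the difference between the two models concrete, here is a minimal sketch using only the JAXP classes bundled with the JDK. The class name `DomVersusSax`, the method names and the `<library>`/`<book>` sample document are made up for illustration. The DOM method loads the tree and changes an attribute in place; the SAX method only listens for start-element events as the parser walks the document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class DomVersusSax {

    // DOM: the whole document becomes a tree in memory that we can read and modify.
    public static String renameFirstBookWithDom(String xml, String newTitle) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Element first = (Element) doc.getElementsByTagName("book").item(0);
        first.setAttribute("title", newTitle); // in-place change of the tree
        return first.getAttribute("title");
    }

    // SAX: the parser walks the document once and calls back into our handler.
    public static List<String> collectTitlesWithSax(String xml) throws Exception {
        List<String> titles = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("book".equals(qName)) {
                    titles.add(attrs.getValue("title"));
                }
            }
        };
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return titles;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document, not taken from any real data set.
        String xml = "<library><book title=\"A\"/><book title=\"B\"/></library>";
        System.out.println(renameFirstBookWithDom(xml, "Z"));
        System.out.println(collectTitlesWithSax(xml));
    }
}
```

Note that the SAX handler never sees the document as a whole; it can only react to each event as it passes, which is why it cannot go back and change anything.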

We should also note that there is another way of categorizing XML parsers, based on the communication mechanism between the parser and the client that uses it. By this categorization we have two kinds of parsers:

  • Push parser
  • Pull parser

The parsers we had before StAX are push parsers: the parser pushes XML data to the client whether or not the client needs that data, or is even ready to receive it.
Pull parsers, which arrived with StAX, provide another mechanism for handing data to the client. They give the client the ability to ask for XML data, so the client receives data only when it asks for it.
Now you should know that DOM and SAX are both push parsers, and the only pull parser implementation I have used is StAX.
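As a rough illustration of the pull style, the sketch below uses the `javax.xml.stream` API (the StAX API as it ships in the JDK). The class name `StaxPullExample` and the `<item>` element are assumptions made up for this example. The point to notice is that nothing happens until our own loop calls `next()`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxPullExample {

    // Pull parsing: the client asks for the next event whenever it is ready.
    public static List<String> readItemTexts(String xml) throws Exception {
        List<String> texts = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                int event = reader.next(); // we pull; the parser does not push
                if (event == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    // Reads the text content and leaves the reader on the end tag.
                    texts.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document.
        String xml = "<items><item>one</item><item>two</item></items>";
        System.out.println(readItemTexts(xml));
    }
}
```

Because the loop belongs to the client, it could just as well stop after the first `<item>` and never read the rest of the document.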
The features below make StAX a suitable choice when you just need to parse XML files and do not need to modify the document in memory. Keep in mind that StAX is a one-way (forward-only) parser, as SAX parsers are.

  • With StAX you can write XML documents too, whereas SAX gives you no API for writing XML documents.
  • Using StAX, as I will show you in the next part, is much easier than using SAX.
  • In pull parsing your own code controls the parsing thread, because it is you who drives the parser. With a push parser, the parser is in the driver's seat: it parses and hands the parsed data to you whether you ask for it or not.
  • You can work with several XML streams (parsing and processing them) in pull parsing mode while your client code is a single thread; you cannot easily do this in push parsing mode.
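Two of the points above, writing XML and handling several streams from one thread, can be sketched together. The example below is only an illustration under assumed names (`StaxMergeExample`, `<item>`, `<merged>`): it pulls items alternately from two `XMLStreamReader` instances in a single thread and writes them out with an `XMLStreamWriter`.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

public class StaxMergeExample {

    // Advances the reader to the next <item> and returns its text,
    // or null once the input is exhausted.
    static String nextItem(XMLStreamReader reader) throws Exception {
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && "item".equals(reader.getLocalName())) {
                return reader.getElementText();
            }
        }
        return null;
    }

    static void writeItem(XMLStreamWriter writer, String text) throws Exception {
        writer.writeStartElement("item");
        writer.writeCharacters(text);
        writer.writeEndElement();
    }

    // Reads two documents in lock step from one thread and writes a merged one.
    public static String mergeAlternating(String leftXml, String rightXml) throws Exception {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        XMLStreamReader left = inputFactory.createXMLStreamReader(new StringReader(leftXml));
        XMLStreamReader right = inputFactory.createXMLStreamReader(new StringReader(rightXml));

        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("merged");

        String a = nextItem(left);
        String b = nextItem(right);
        while (a != null || b != null) {
            if (a != null) { writeItem(writer, a); a = nextItem(left); }
            if (b != null) { writeItem(writer, b); b = nextItem(right); }
        }

        writer.writeEndElement();
        writer.close();
        left.close();
        right.close();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample documents.
        System.out.println(mergeAlternating(
                "<l><item>a1</item><item>a2</item></l>",
                "<r><item>b1</item></r>"));
    }
}
```

With a push parser each document would be parsed to completion inside its own `parse` call, so interleaving them like this would typically need extra threads or buffering.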

Now let's see which XML parsers support which models:

Model / XML parsing library            Crimson  Xerces  JAXP  NekoPull [1]  Piccolo  StAX    dom4j
                                       1.1.3    1.4.4   1.3   0.2.4         1.04     1.2 RC  1.6.1
Push parsing                           Yes      Yes     Yes   No            Yes      No      Yes
Pull parsing                           No       No      No    Yes           No       Yes     No
Stream parsing implementation (SAX)    Yes      Yes     Yes   Yes           Yes      Yes     Yes
DOM parsing implementation             Yes      Yes     Yes   No            No       No      Yes

 

[1] NekoPull extends XNI (the Xerces Native Interface) to provide pull parsing.

