StAX Parser - Overview


StAX (Streaming API for XML) is a Java-based API for parsing XML documents in a streaming fashion, much as a SAX parser does. There are two major differences between the two APIs −

  • StAX is a pull API, whereas SAX is a push API. With StAX, the client application asks the parser for the next piece of information whenever it needs it. With SAX, the parser drives the process and calls back into the client application when information is available.

  • The StAX API can both read and write XML documents. The SAX API can only read XML.

The following are the features of the StAX API −

  • Reads an XML document from top to bottom, recognizing the tokens that make up a well-formed XML document.

  • Tokens are processed in the same order as they appear in the document.

  • Reports to the application program the nature of the tokens the parser has encountered, as they occur.

  • The application program obtains an "event" reader, which acts as an iterator over the parsing events and returns each one as an object. The other reader available is the "cursor" reader, which acts as a pointer to the current XML node.

  • As events are identified, XML elements can be retrieved from the event object and processed further.

When to use?

You should use a StAX parser when −

  • You can process the XML document in a linear fashion from top to bottom.

  • The document is not deeply nested.

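  • You are processing a very large XML document whose DOM tree would consume too much memory. A DOM tree usually takes considerably more memory than the XML text it represents; a commonly quoted rule of thumb is about ten bytes of memory per byte of XML, though the actual ratio depends on the implementation and the document.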

  • The problem to be solved involves only part of the XML document.

  • Data is available as soon as it is seen by the parser, so StAX works well for an XML document that arrives over a stream.

Disadvantages of StAX

  • We have no random access to an XML document since it is processed in a forward-only manner.

  • If you need to keep track of data the parser has seen or change the order of items, you must write the code and store the data on your own.
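
As a rough illustration of the last point, the sketch below keeps its own stack of open elements so that it can report the full path of each element, since the parser itself only exposes the current position. The class name PathTracker and the method elementPaths are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: the application, not the parser, remembers which elements are open.
public class PathTracker {

    // Returns the slash-separated path of every element, in document order.
    public static List<String> elementPaths(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        Deque<String> open = new ArrayDeque<>();   // elements currently open, outermost first
        List<String> paths = new ArrayList<>();
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                open.addLast(reader.getLocalName());
                paths.add("/" + String.join("/", open));
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                open.removeLast();
            }
        }
        reader.close();
        return paths;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(elementPaths("<class><student><name>A</name></student></class>"));
    }
}
```

Anything the application wants to remember beyond the current event, such as parent elements or previously seen values, has to be stored this way.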

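XMLEventReader Interface

This interface provides an iterator over events, which can be used to step through events as they occur while parsing the XML document. Each event is returned as an XMLEvent object, and the following XMLEvent methods give access to the specific event type −

  • StartElement asStartElement() − Returns the event as a StartElement, from which the name and attributes of the element can be retrieved.

  • EndElement asEndElement() − Returns the event as an EndElement, which marks the end of an element.

  • Characters asCharacters() − Returns the event as a Characters object, which holds character data such as text, CDATA, or whitespace.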
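
A minimal sketch of the event-based approach follows. It assumes a small document with student and name elements and a rollno attribute; those names, along with the class EventReaderDemo and the method describe, are illustrative and not part of the API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class: turns each parsing event into a short description.
public class EventReaderDemo {

    public static List<String> describe(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to merge adjacent character events into one.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));

        List<String> out = new ArrayList<>();
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();   // the client pulls the next event
            if (event.isStartElement()) {
                StartElement start = event.asStartElement();
                // "rollno" is an assumed attribute name for this example.
                Attribute rollno = start.getAttributeByName(new QName("rollno"));
                String suffix = rollno != null ? "[rollno=" + rollno.getValue() + "]" : "";
                out.add("start:" + start.getName().getLocalPart() + suffix);
            } else if (event.isCharacters()) {
                Characters chars = event.asCharacters();
                if (!chars.isWhiteSpace()) {
                    out.add("text:" + chars.getData());
                }
            } else if (event.isEndElement()) {
                out.add("end:" + event.asEndElement().getName().getLocalPart());
            }
        }
        reader.close();
        return out;
    }

    public static void main(String[] args) throws XMLStreamException {
        describe("<student rollno=\"393\"><name>Dinkar</name></student>")
                .forEach(System.out::println);
    }
}
```

The isStartElement(), isCharacters() and isEndElement() checks guard each conversion, because calling an as...() method on an event of the wrong type throws a ClassCastException.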

XMLEventWriter Interface

This interface specifies methods for writing events to an XML output.

  • add(XMLEvent event) − Adds an event, such as the start or end of an element, to the output.
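
The following sketch shows one way to produce a small document with add(). The events themselves are created with XMLEventFactory; the class EventWriterDemo, the method writeStudent, and the element and attribute names are assumptions made for this example.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;

// Hypothetical demo class: writes one student element event by event.
public class EventWriterDemo {

    public static String writeStudent(String rollno, String name) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory events = XMLEventFactory.newInstance();

        writer.add(events.createStartDocument());
        // Empty prefix and namespace URI: the elements are not namespace-qualified.
        writer.add(events.createStartElement("", "", "student"));
        writer.add(events.createAttribute("rollno", rollno));
        writer.add(events.createStartElement("", "", "name"));
        writer.add(events.createCharacters(name));
        writer.add(events.createEndElement("", "", "name"));
        writer.add(events.createEndElement("", "", "student"));
        writer.add(events.createEndDocument());
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(writeStudent("393", "Dinkar"));
    }
}
```

The exact form of the XML declaration at the start of the output depends on the StAX implementation in use.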

XMLStreamReader Interface

This interface provides a cursor that moves forward through the XML document. The methods below advance the cursor and read data from its current position −

  • int next() − Advances to the next parsing event and returns its type as an integer code.

  • boolean hasNext() − Checks whether further events exist.

  • String getText() − Returns the text of the current event, such as the character data inside an element.

  • String getLocalName() − Returns the local name of the current element.
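
A short sketch of the cursor-based approach, using only the four methods listed above plus the event-type constants from XMLStreamConstants. The class StreamReaderDemo and the method textOf are hypothetical names, and the sample document in main is made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class: collects the text of every element with a given name.
public class StreamReaderDemo {

    public static List<String> textOf(String xml, String elementName) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> values = new ArrayList<>();
        StringBuilder current = null;   // non-null only while inside a matching element

        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    if (reader.getLocalName().equals(elementName)) {
                        current = new StringBuilder();
                    }
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                    // Text may arrive in several chunks, so append rather than assign.
                    if (current != null) {
                        current.append(reader.getText());
                    }
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    if (current != null && reader.getLocalName().equals(elementName)) {
                        values.add(current.toString());
                        current = null;
                    }
                    break;
                default:
                    break;
            }
        }
        reader.close();
        return values;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<class><student><name>Dinkar</name></student>"
                + "<student><name>Vineet</name></student></class>";
        System.out.println(textOf(xml, "name"));
    }
}
```

Unlike the event reader, the cursor reader creates no event objects; the application reads data directly from the reader at its current position, which must be done before calling next() again.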

XMLStreamWriter Interface

This interface specifies methods for writing XML output directly.

  • writeStartElement(String localName) − Writes a start tag with the given name.

  • writeEndElement() − Writes the end tag that closes the most recently opened element. The method takes no arguments.

  • writeAttribute(String localName, String value) − Writes an attribute to the current start tag.
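
The methods above can be combined as in the sketch below. The class StreamWriterDemo, the method writeStudent, and the element and attribute names are assumptions made for this example.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class: writes one student element with the cursor-style writer.
public class StreamWriterDemo {

    public static String writeStudent(String rollno, String name) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

        writer.writeStartDocument();
        writer.writeStartElement("student");
        // Attributes must be written right after the start tag they belong to.
        writer.writeAttribute("rollno", rollno);
        writer.writeStartElement("name");
        writer.writeCharacters(name);   // special characters such as < and & are escaped
        writer.writeEndElement();       // closes name
        writer.writeEndElement();       // closes student
        writer.writeEndDocument();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(writeStudent("393", "Dinkar"));
    }
}
```

Because writeEndElement() takes no name, the writer itself keeps track of which element is open, so end tags always match their start tags.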