After my post on the XSLT recommendation of XSL, I would like to present simple examples of parsing an XML stream with JAXP (Java API for XML Processing), a common interface for creating, parsing and manipulating XML documents using the standard SAX, DOM and XSLT APIs.
XML parsing
XML has become indispensable in information system architectures and in J2EE. Standardized by the W3C and used as a standard format for data exchange, XML documents are found throughout applications and databases, and are at the heart of EAI exchanges.
For this reason, knowledge of XML parsing APIs such as DOM and SAX is often necessary when developing a J2EE application. Understanding the differences, strengths and weaknesses of these APIs is important to avoid the performance problems that can arise when they are misused.
To process XML documents, an application needs an XML parser to tokenize the XML stream and retrieve its data. An XML parser is the program that sits between the application and the XML documents: it reads an XML stream, ensures that it is well-formed, and may validate the document against a DTD or an XML Schema definition (XSD).
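To make the well-formedness check concrete, here is a minimal sketch using the JAXP SAX parser. The class name WellFormedCheck and the method isWellFormed are hypothetical names chosen for this illustration; the sketch only checks well-formedness and does not validate against a DTD or XSD.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration: reports whether a text is well-formed XML.
public class WellFormedCheck {

    // Returns true when the parser accepts the text as well-formed XML.
    public static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            // DefaultHandler ignores content events; only a parse failure matters here.
            factory.newSAXParser().parse(new ByteArrayInputStream(bytes), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // A fatal parse error (for example a missing end tag) surfaces as a SAXException.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Properly nested elements
        System.out.println(isWellFormed("<person><name>Malcolm X</name></person>"));
        // The <name> element is never closed
        System.out.println(isWellFormed("<person><name>Malcolm X</person>"));
    }
}
```

The first call should report true and the second false, because the unclosed name element makes the parser raise a fatal error.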
There are two standard APIs for parsing XML documents:
1. SAX (Simple API for XML)
2. DOM (Document Object Model)
JAXP (Java API for XML Processing) provides a common interface for creating, parsing and manipulating XML documents using the standard SAX, DOM and XSLT APIs.
XML document
Before starting with the presentation and the examples, here is the XML document people.xml used in the examples below:
<?xml version="1.0" encoding="UTF-8"?>
<people>
    <person ID="01245cdf45x">
        <name>Malcolm X</name>
        <name>Malik Shabazz</name>
        <name>Malcolm Little</name>
        <born>19 May 1925</born>
        <died>21 February 1965</died>
        <nationality>american</nationality>
    </person>
    <person ID="012qsabc3456002">
        <name>Mahatma Gandhi</name>
        <born>2 October 1869</born>
        <died>30 January 1948</died>
        <nationality>Indian</nationality>
    </person>
    <person ID="0457d7887897">
        <name>John F. Kennedy</name>
        <name>Jack Kennedy</name>
        <born>20 January 1961</born>
        <died>22 November 1963</died>
        <nationality>american</nationality>
    </person>
</people>
SAX (Simple API for XML)
SAX is an event-driven API. A SAX parser reports a document to the application as a series of events, delivered through the callback methods of a handler. These callbacks are invoked as events occur during parsing: document start, document end, element start tags, element end tags, attributes, text content, entities, processing instructions, comments and others.
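As a rough illustration of this callback sequence, the following sketch records the events reported for a small in-memory document. The class SaxEventTrace and its trace method are hypothetical names used only for this example, and only a handful of the possible callbacks are overridden.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration: lists SAX events in the order they are reported.
public class SaxEventTrace {

    public static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() { events.add("startDocument"); }
            @Override public void endDocument() { events.add("endDocument"); }
            @Override public void startElement(String uri, String localName, String qName, Attributes atts) {
                events.add("startElement " + qName);
            }
            @Override public void endElement(String uri, String localName, String qName) {
                events.add("endElement " + qName);
            }
            @Override public void characters(char[] ch, int start, int length) {
                // Skip whitespace-only text so the trace stays readable
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) events.add("characters " + text);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : trace("<person ID=\"1\"><name>Mahatma Gandhi</name></person>")) {
            System.out.println(event);
        }
    }
}
```

The trace starts with startDocument and ends with endDocument, with each element's start event reported before its children and its end event after them. Note that a SAX parser is allowed to deliver one run of text through several characters calls, so code that needs the complete text of an element should accumulate the chunks rather than rely on a single call.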

Below is a simple JAXP SAX parser that displays every person in people.xml:
import java.io.InputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class TestParsingXmlWithSAX {
    private String currentElement = "";
    private int peopleCount = 1;
    public TestParsingXmlWithSAX() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            // people.xml is loaded from the classpath, next to this class
            InputStream xmlStream = TestParsingXmlWithSAX.class.getResourceAsStream("people.xml");
            saxParser.parse(xmlStream, new MySaxHandler());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    public static void main(String[] args) {
        new TestParsingXmlWithSAX();
    }

    // Inner class, so that it shares the state of the enclosing instance
    class MySaxHandler extends DefaultHandler {
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            currentElement = qName;
            if (currentElement.equals("person")) {
                System.out.println("Person " + peopleCount);
                peopleCount++;
                String personId = attributes.getValue("ID");
                System.out.println("\tID:\t" + personId);
            }
        }
        public void endElement(String uri, String localName, String qName) throws SAXException {
            // Reset so that whitespace between elements is not printed as content
            currentElement = "";
        }
        public void characters(char[] chars, int start, int length) throws SAXException {
            if (currentElement.equals("title")) {
                System.out.println("\tTitle:\t" + new String(chars, start, length));
            } else if (currentElement.equals("name")) {
                System.out.println("\tName:\t" + new String(chars, start, length));
            }
        }
    }
}
… the output in the console would be:
DOM (Document Object Model)
DOM is an object-oriented API. A DOM parser builds a tree structure that represents the XML document, and the application can then manipulate the nodes of this tree. The DOM API defines the mechanisms for querying, traversing and manipulating the resulting object model.
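The three operations mentioned here (querying, traversing and manipulating) can be sketched on a small in-memory document. The class DomTreeSketch and its methods parse and addNationality are hypothetical names introduced for this illustration only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration: basic operations on a DOM tree.
public class DomTreeSketch {

    // Builds the whole document tree in memory from an XML string.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Appends <nationality>value</nationality> to the first <person> element.
    public static void addNationality(Document doc, String value) {
        Element person = (Element) doc.getElementsByTagName("person").item(0);
        Element nationality = doc.createElement("nationality");
        nationality.setTextContent(value);
        person.appendChild(nationality);
    }

    public static void main(String[] args) {
        Document doc = parse("<people><person ID=\"1\"><name>Mahatma Gandhi</name></person></people>");

        // Querying: look up an element by tag name and read an attribute
        Element person = (Element) doc.getElementsByTagName("person").item(0);
        System.out.println(person.getAttribute("ID"));

        // Traversing: move from a node to its first child
        System.out.println(person.getFirstChild().getTextContent());

        // Manipulating: add a new node to the tree, then read it back
        addNationality(doc, "Indian");
        System.out.println(doc.getElementsByTagName("nationality").item(0).getTextContent());
    }
}
```

Unlike SAX, the whole document stays available after parsing, which is what makes navigating back and forth and modifying the tree possible, at the cost of holding the tree in memory.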

Below is a simple JAXP DOM parser that displays every person in people.xml:
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class TestParsingXmlWithDOM {
    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder docBuilder = factory.newDocumentBuilder();
        // people.xml is loaded from the classpath, next to this class
        InputStream xmlStream = TestParsingXmlWithDOM.class.getResourceAsStream("people.xml");
        // The whole document tree is built in memory
        Document doc = docBuilder.parse(xmlStream);
        int peopleCount = 1;
        // "*" matches every element, in document order
        NodeList list = doc.getElementsByTagName("*");
        for (int i = 0; i < list.getLength(); i++) {
            Element element = (Element) list.item(i);
            String nodeName = element.getNodeName();
            if (nodeName.equals("person")) {
                System.out.println("PERSON " + peopleCount);
                peopleCount++;
                String personId = element.getAttribute("ID");
                System.out.println("\tID:\t" + personId);
            } else if (nodeName.equals("title")) {
                System.out.println("\tTitle:\t" + element.getChildNodes().item(0).getNodeValue());
            } else if (nodeName.equals("name")) {
                System.out.println("\tName:\t" + element.getChildNodes().item(0).getNodeValue());
            }
        }
    }
}
… the output in the console would be:
Source: test_xml_parsing.zip
That’s all!!!
Huseyin OZVEREN