Tuesday, February 4, 2014

Reading XML with Java DOM API

Let's learn how to read an XML file with the Java DOM API. There are other APIs we can use to parse and manipulate XML files, such as SAX and StAX (all of them, DOM included, ship with the JDK as part of JAXP). But the DOM API is very simple, and in my opinion any Java developer needs to know it regardless of which other API he or she ends up choosing.

Ok.. Let's say we need to parse the following XML file.
<?xml version = "1.0"?>

<letter>
    <contact type="sender">
        <name>  Jane Doe  </name>
        <address1>Box 12345</address1>
        <address2>15 Any Ave.</address2>
        <city>Othertown</city>
        <state>Otherstate</state>
        <zip>67890</zip>
        <phone>555-4321</phone>
        <flag gender="F" age="23">
        </flag>
    </contact>
    <contact type="receiver">
        <name>John Doe</name>
        <address1>123 Main St.</address1>
        <address2></address2>
        <city>Anytown</city>
        <state>Anystate</state>
        <zip>12345</zip>
        <phone>555-1234</phone>
        <flag gender="M">
        </flag>
    </contact>
    <salutation>Dear Sir:   </salutation>
    <paragraph>It is our privilege to inform you about our new database managed with XML. This new system allows you to reduce the load on your inventory list server by having the client machine perform the work of sorting and filtering the data. </paragraph>
    <paragraph>Please visit our website for availability and pricing.   </paragraph>
    <closing>Sincerely, </closing>
    <signature>Ms. Jane Doe </signature>
</letter>
Ok... Shall we start looking into Java code that will read content of this XML?
package xmlreading;

import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class XmlReading {
    public static void main(String[] args) {
        try {
            File xmlFile = new File("letter.xml");
         
            DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
         
            Document document = documentBuilder.parse(xmlFile);
         
            Element documentElement = document.getDocumentElement() ;
            documentElement.normalize();
         
            NodeList rootLists = documentElement.getElementsByTagName("contact");
         
            if(rootLists != null && rootLists.getLength() > 0) {
                for(int k = 0 ; k < rootLists.getLength() ; k++) {
                    Node contactNode = rootLists.item(k);
                 
                    if(contactNode != null && contactNode.getNodeType() == Node.ELEMENT_NODE) {
                        Element nodeElement = (Element) contactNode ;
                     
                        System.out.println("Contact type : " + nodeElement.getAttribute("type"));
                     
                        // This is one way of reading with exact tag name
                        System.out.println(nodeElement.getElementsByTagName("name").item(0).getTextContent());
                        System.out.println(nodeElement.getElementsByTagName("address1").item(0).getTextContent());
                        System.out.println(nodeElement.getElementsByTagName("address2").item(0).getTextContent());
                        System.out.println(nodeElement.getElementsByTagName("city").item(0).getTextContent());
                        System.out.println(nodeElement.getElementsByTagName("state").item(0).getTextContent());
                        System.out.println(nodeElement.getElementsByTagName("zip").item(0).getTextContent());
                        System.out.println(nodeElement.getElementsByTagName("phone").item(0).getTextContent());
                     
                        // The following reads each child node's name and its content programmatically
                        NodeList childNodes = contactNode.getChildNodes();
                     
                        if(childNodes != null && childNodes.getLength() > 0) {
                            for(int j = 0 ; j < childNodes.getLength() ; j++) {
                                Node childNode = childNodes.item(j);
                                // Skip whitespace text nodes; only element children are printed
                                if(childNode != null && childNode.getNodeType() == Node.ELEMENT_NODE) {
                                    System.out.println("Contact child : " + childNode.getNodeName() + "  "
                                            + childNode.getTextContent());
                                }
                            }
                        }
                    }
                }
            }
         
        } catch (ParserConfigurationException | SAXException ex) {
            System.err.println("error occurred : " + ex);
        } catch (IOException ex) {
            System.err.println("error occurred : " + ex);
        }
    }
}
The output of the code is as below.
Contact type : sender
  Jane Doe
Box 12345
15 Any Ave.
Othertown
Otherstate
67890
555-4321
Contact child : name    Jane Doe
Contact child : address1  Box 12345
Contact child : address2  15 Any Ave.
Contact child : city  Othertown
Contact child : state  Otherstate
Contact child : zip  67890
Contact child : phone  555-4321
Contact child : flag
  
Contact type : receiver
John Doe
123 Main St.

Anytown
Anystate
12345
555-1234
Contact child : name  John Doe
Contact child : address1  123 Main St.
Contact child : address2
Contact child : city  Anytown
Contact child : state  Anystate
Contact child : zip  12345
Contact child : phone  555-1234
Contact child : flag
To give a short insight into the code: we first create a File instance for the XML file we need to parse. Then we create a DocumentBuilderFactory instance and obtain a DocumentBuilder from its factory method newDocumentBuilder(). Next we invoke the parse() method on the DocumentBuilder, which parses the XML and returns a Document object. The call to documentElement.normalize() is not strictly needed, but it's highly recommended: it merges adjacent text nodes and removes empty ones. We can then invoke methods on the Element instance, documentElement, to read data out.
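
To see what normalize() does, here is a minimal sketch. It does not parse letter.xml; instead it builds a tiny document in memory so that two text nodes end up next to each other, which is the situation normalize() cleans up. The class name NormalizeDemo and the method childCountsAroundNormalize() are made up for this illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class, not part of the XmlReading program above.
class NormalizeDemo {

    // Builds a <name> element holding two adjacent text nodes and returns
    // the number of child nodes before and after calling normalize().
    static int[] childCountsAroundNormalize() {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();

            Element name = document.createElement("name");
            document.appendChild(name);

            // Two separate text nodes sitting side by side
            name.appendChild(document.createTextNode("Jane "));
            name.appendChild(document.createTextNode("Doe"));

            int before = name.getChildNodes().getLength();
            name.normalize(); // merges the adjacent text nodes into one
            int after = name.getChildNodes().getLength();

            return new int[] { before, after };
        } catch (ParserConfigurationException ex) {
            throw new IllegalStateException("Could not create a DocumentBuilder", ex);
        }
    }

    public static void main(String[] args) {
        int[] counts = childCountsAroundNormalize();
        System.out.println("Child nodes before normalize(): " + counts[0]);
        System.out.println("Child nodes after normalize():  " + counts[1]);
    }
}
```

The text content of the element is the same before and after; only the number of text nodes holding it changes. That is why the program still works without the call, but code that walks child nodes one by one is easier to reason about with it.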

documentElement.getElementsByTagName("contact");  

This returns a NodeList (a list of nodes) containing every descendant element named "contact". We then iterate over the list and take a Node instance from each position with item(). The getNodeType() method of Node tells us the node type; if it is ELEMENT_NODE, the Node instance is an Element and can be cast to one. Element is a subtype of Node, so all the Node methods can be invoked on an Element instance as well.
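
The ELEMENT_NODE check matters because getChildNodes() also returns the whitespace between tags as text nodes. The sketch below shows that, and also reads the gender and age attributes of the flag element, which the program above never prints. It parses a trimmed-down copy of the letter from a String instead of a file so that it runs on its own. The class ContactChildren and its helper methods are names I made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demo class, not part of the XmlReading program above.
class ContactChildren {

    // Trimmed-down copy of the letter used in this post
    static final String SAMPLE =
            "<letter>\n"
          + "    <contact type=\"sender\">\n"
          + "        <name>  Jane Doe  </name>\n"
          + "        <flag gender=\"F\" age=\"23\">\n"
          + "        </flag>\n"
          + "    </contact>\n"
          + "    <contact type=\"receiver\">\n"
          + "        <name>John Doe</name>\n"
          + "        <flag gender=\"M\">\n"
          + "        </flag>\n"
          + "    </contact>\n"
          + "</letter>";

    static Element parseRoot(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return document.getDocumentElement();
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse XML", ex);
        }
    }

    // Counts every direct child of the root, whitespace text nodes included
    static int rawChildCount(String xml) {
        return parseRoot(xml).getChildNodes().getLength();
    }

    // Returns only the direct children that are elements
    static List<Element> childElements(Element parent) {
        List<Element> elements = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                elements.add((Element) child);
            }
        }
        return elements;
    }

    // Builds one summary line per <contact> element
    static List<String> describeContacts(String xml) {
        List<String> lines = new ArrayList<>();
        for (Element contact : childElements(parseRoot(xml))) {
            if (!contact.getTagName().equals("contact")) {
                continue;
            }
            String name = "";
            String gender = "";
            String age = "unknown";
            for (Element child : childElements(contact)) {
                if (child.getTagName().equals("name")) {
                    // trim() drops the padding around "  Jane Doe  "
                    name = child.getTextContent().trim();
                } else if (child.getTagName().equals("flag")) {
                    gender = child.getAttribute("gender");
                    // getAttribute() returns "" for a missing attribute,
                    // so hasAttribute() is used to tell the two cases apart
                    if (child.hasAttribute("age")) {
                        age = child.getAttribute("age");
                    }
                }
            }
            lines.add(contact.getAttribute("type") + ": name=" + name
                    + ", gender=" + gender + ", age=" + age);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println("Direct children of letter: " + rawChildCount(SAMPLE));
        for (String line : describeContacts(SAMPLE)) {
            System.out.println(line);
        }
    }
}
```

Note that the receiver's flag has no age attribute. getAttribute("age") would quietly return an empty string there, so the sketch checks hasAttribute("age") first and falls back to "unknown".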

I recommend you look at the Java API documentation for a complete reference to the Java DOM API. I hope you find this post a quick start for reading XML. Enjoy your day!!!
