How to read an XML file in Java – (DOM Parser)
DOM is the easiest XML parsing interface to understand and use. It parses an entire XML document, loads it into memory, and models it as a tree of objects for easy traversal and manipulation.
Note
The DOM parser is slow and consumes a lot of memory when it loads an XML document that contains a lot of data. In that case, consider the SAX parser instead: SAX is faster than DOM and uses less memory.
DOM Parser Example
A DOM XML parser reads the XML file below and prints each element one by one.
File : file.xml
File : ReadXMLFile.java – A Java class to read the above XML file.
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import java.io.File;
public class ReadXMLFile {

    public static void main(String[] args) {
        try {
            File fXmlFile = new File("c:\\file.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(fXmlFile);

            // Merge adjacent text nodes so each element's text is a single node.
            doc.getDocumentElement().normalize();

            System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
            NodeList nList = doc.getElementsByTagName("staff");
            System.out.println("-----------------------");

            // Print the child values of every <staff> element.
            for (int temp = 0; temp < nList.getLength(); temp++) {
                Node nNode = nList.item(temp);
                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("First Name : " + getTagValue("firstname", eElement));
                    System.out.println("Last Name : " + getTagValue("lastname", eElement));
                    System.out.println("Nick Name : " + getTagValue("nickname", eElement));
                    System.out.println("Salary : " + getTagValue("salary", eElement));
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Returns the text of the first sTag element under eElement,
    // or an empty string if there is no such element.
    private static String getTagValue(String sTag, Element eElement) {
        NodeList nlList = eElement.getElementsByTagName(sTag);
        if (nlList.getLength() == 0) {
            return "";
        }
        return nlList.item(0).getTextContent();
    }
}
Result:
Root element :company
-----------------------
First Name : yong
Last Name : mook kim
Nick Name : mkyong
Salary : 100000
First Name : low
Last Name : yin fong
Nick Name : fong fong
Salary : 200000
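Because the whole document sits in memory as a tree, DOM can also be used to edit it, not just to read it. The following is a minimal sketch of that idea and is not part of the original example: the class name DomEditSketch, its helper methods, and the inline XML string (a cut-down guess at file.xml, based on the output above) are all assumptions.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DomEditSketch {

    // Assumed sample data, reconstructed from the output shown above.
    static final String XML =
        "<company>"
      + "<staff><firstname>yong</firstname><salary>100000</salary></staff>"
      + "<staff><firstname>low</firstname><salary>200000</salary></staff>"
      + "</company>";

    // Parses the XML string into an in-memory DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Adds `amount` to every <salary> element and returns the new total.
    static long raiseSalaries(Document doc, long amount) {
        NodeList salaries = doc.getElementsByTagName("salary");
        long total = 0;
        for (int i = 0; i < salaries.getLength(); i++) {
            Element salary = (Element) salaries.item(i);
            long updated = Long.parseLong(salary.getTextContent().trim()) + amount;
            salary.setTextContent(Long.toString(updated)); // modifies the tree in place
            total += updated;
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        System.out.println("Total after raise : " + raiseSalaries(doc, 1000));
    }
}
```

The change lives only in the in-memory tree; writing it back to a file would need a separate serialization step, which this sketch leaves out.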
How to read an XML file in Java – (SAX Parser)
The SAX parser works differently from the DOM parser: it neither loads the XML document into memory nor creates an object representation of it. Instead, the SAX parser uses callback methods (org.xml.sax.helpers.DefaultHandler) to inform clients of the XML document structure.
Note
The SAX parser is faster and uses less memory than the DOM parser.
See the following SAX callback methods:
- startDocument() and endDocument() – called at the start and end of an XML document.
- startElement() and endElement() – called at the start and end of a document element.
- characters() – called with the text content between the start and end tags of an XML document element.
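To make the order of these callbacks concrete, here is a minimal sketch that records every event the parser fires for a small inline document. The class name SaxEventsSketch, the events() helper, and the sample XML are assumptions for illustration. The sketch collects text in a StringBuilder because a SAX parser is allowed to deliver the text of one element in several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventsSketch {

    // Records each callback as a line of text, in the order the parser fires them.
    static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startDocument() {
                log.add("startDocument");
            }

            @Override
            public void endDocument() {
                log.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                text.setLength(0); // reset the buffer for the new element
                log.add("startElement " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may be called more than once per element
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                if (!value.isEmpty()) {
                    log.add("text " + value);
                }
                text.setLength(0);
                log.add("endElement " + qName);
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        // Assumed one-record sample, shaped like the staff data in this article.
        String xml = "<company><staff><firstname>yong</firstname></staff></company>";
        for (String event : events(xml)) {
            System.out.println(event);
        }
    }
}
```

Emitting the text in endElement() rather than in characters() is a design choice of this sketch: by then the buffer is guaranteed to hold the complete text of the element.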
File : file.xml
File : ReadXMLFile.java – A Java class to read the above XML file.
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class ReadXMLFile {

    public static void main(String[] args) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();

            DefaultHandler handler = new DefaultHandler() {

                // Each flag is set when its element starts and cleared once its text is printed.
                boolean bfname = false;
                boolean blname = false;
                boolean bnname = false;
                boolean bsalary = false;

                @Override
                public void startElement(String uri, String localName, String qName,
                        Attributes attributes) throws SAXException {
                    System.out.println("Start Element :" + qName);
                    if (qName.equalsIgnoreCase("FIRSTNAME")) {
                        bfname = true;
                    }
                    if (qName.equalsIgnoreCase("LASTNAME")) {
                        blname = true;
                    }
                    if (qName.equalsIgnoreCase("NICKNAME")) {
                        bnname = true;
                    }
                    if (qName.equalsIgnoreCase("SALARY")) {
                        bsalary = true;
                    }
                }

                @Override
                public void endElement(String uri, String localName,
                        String qName) throws SAXException {
                    System.out.println("End Element :" + qName);
                }

                // Prints the text of whichever element was opened last.
                @Override
                public void characters(char[] ch, int start, int length) throws SAXException {
                    if (bfname) {
                        System.out.println("First Name : " + new String(ch, start, length));
                        bfname = false;
                    }
                    if (blname) {
                        System.out.println("Last Name : " + new String(ch, start, length));
                        blname = false;
                    }
                    if (bnname) {
                        System.out.println("Nick Name : " + new String(ch, start, length));
                        bnname = false;
                    }
                    if (bsalary) {
                        System.out.println("Salary : " + new String(ch, start, length));
                        bsalary = false;
                    }
                }
            };

            saxParser.parse("c:\\file.xml", handler);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Result:
Start Element :company
Start Element :staff
Start Element :firstname
First Name : yong
End Element :firstname
Start Element :lastname
Last Name : mook kim
End Element :lastname
Start Element :nickname
Nick Name : mkyong
End Element :nickname
Start Element :salary
Salary : 100000
End Element :salary
End Element :staff
Start Element :staff
Start Element :firstname
First Name : low
End Element :firstname
Start Element :lastname
Last Name : yin fong
End Element :lastname
Start Element :nickname
Nick Name : fong fong
End Element :nickname
Start Element :salary
Salary : 200000
End Element :salary
End Element :staff
End Element :company
Warning
This example may throw an exception for some UTF-8 XML files.
If you parse an XML file that contains some special UTF-8 characters, the parser throws an “Invalid byte 1 of 1-byte UTF-8 sequence” exception.
com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException:
Invalid byte 1 of 1-byte UTF-8 sequence.
See the following XML file, which contains the special UTF-8 character “§”:
To fix it, override the SAX input source like this:
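// Needs java.io.File, java.io.FileInputStream, java.io.InputStream,
// java.io.InputStreamReader, java.io.Reader and org.xml.sax.InputSource.
File file = new File("c:\\file-utf.xml");
InputStream inputStream = new FileInputStream(file);
// Decode the bytes as UTF-8 explicitly instead of letting the parser detect the encoding.
Reader reader = new InputStreamReader(inputStream, "UTF-8");
InputSource is = new InputSource(reader);
is.setEncoding("UTF-8");
saxParser.parse(is, handler);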
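As a self-contained variant of the fix above, the following sketch writes a small UTF-8 file containing “§” and parses it through an explicit UTF-8 Reader. The class name SaxUtf8Sketch, the readText() helper, the temporary file, and the sample XML are assumptions for illustration; the InputSource and InputStreamReader usage mirrors the snippet above.

```java
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxUtf8Sketch {

    // Parses the file through an explicit UTF-8 Reader and returns all character data.
    static String readText(Path file) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // try-with-resources closes the reader (and the underlying stream) after parsing.
        try (Reader reader = new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8)) {
            InputSource source = new InputSource(reader);
            source.setEncoding("UTF-8");
            parser.parse(source, handler);
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical temporary file standing in for the file-utf.xml used above.
        Path file = Files.createTempFile("file-utf", ".xml");
        // The section sign is written as a Unicode escape so the source compiles under any encoding.
        Files.write(file, "<company><name>\u00A7 mkyong</name></company>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readText(file));
        Files.delete(file);
    }
}
```

Passing StandardCharsets.UTF_8 instead of the string "UTF-8" avoids the checked UnsupportedEncodingException; either form selects the same charset.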

