Java XML - SPLessons

Java XML XPath Parser

Description

XPath stands for XML Path Language. It is a W3C standard used for finding information in an XML document. XPath can quickly become a complex subject, one that almost warrants a title of its own. It is a fundamental part of XSLT and of other XML technologies such as XQuery and XPointer. An XPath expression literally defines a path into an XML document, much as a file path defines the route to a file in a directory. For example, if an XML document has a root tag, an element_a tag underneath the root, and an element_b tag underneath element_a, the path starts with a forward slash (/) and can be written as below.
- /root_tag/element_a/element_b
Unlike a file path, an XPath expression does not need to identify a unique node. In a file system, only one file with a given name can exist in a directory at one time, otherwise name conflicts occur. In an XML document, several elements with the same name may exist at the same level. In that case XPath returns not just one node but a collection of nodes, known as a node set. XPath can be used on its own, but it is most often used together with other XML technologies.
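The node-set behaviour described above can be sketched with the standard javax.xml.xpath API. This is a minimal illustration, not part of the lesson's own example: the class name NodeSetSketch, the helper select, and the two-element sample document are assumptions made for the sketch.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeSetSketch {
    // Hypothetical sample document: two element_b siblings share one parent.
    static final String XML =
        "<root_tag><element_a>"
        + "<element_b>first</element_b>"
        + "<element_b>second</element_b>"
        + "</element_a></root_tag>";

    // Parses the XML text and returns every node the expression matches.
    static NodeList select(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
        } catch (Exception e) {
            // Checked parser and XPath exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        NodeList matches = select(XML, "/root_tag/element_a/element_b");
        System.out.println("Matches: " + matches.getLength());
        for (int i = 0; i < matches.getLength(); i++) {
            System.out.println(matches.item(i).getTextContent());
        }
    }
}
```

With this sample document the single path matches both element_b elements, which is the situation a file path could never produce.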

XPath Importance

  • Syntax: XPath has a very compact syntax that is quick to pick up.
  • Path Expression: A path expression is composed of a sequence of location steps.
  • Context Node: The context node is the point in the document from which the expression is evaluated, i.e. the starting point in the XML document from which the path sets out to find its result. Some paths are absolute and always start from the document root. Others are relative to the context node, much as an HTML file can refer to a style sheet by either an absolute or a relative path.
  • Axis: An axis is the relationship between the context node and the nodes selected by the expression.
  • Predicates: Predicates further refine the selection, e.g. so that only elements with certain text or attributes are found.
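The ideas of context node, axis, and predicate can be tried out in a few lines of Java. The sketch below is illustrative only: the class ContextNodeSketch, its helper methods, and the two-student sample document are assumptions, while the XPath calls come from the standard javax.xml.xpath package.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ContextNodeSketch {
    // Hypothetical sample document with two students.
    static final String XML =
        "<class>"
        + "<student rollno=\"393\"><firstname>John</firstname></student>"
        + "<student rollno=\"493\"><firstname>Rafeal</firstname></student>"
        + "</class>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the first student element, used below as the context node.
    static Node firstStudent(Document doc) {
        try {
            return (Node) XPathFactory.newInstance().newXPath()
                    .evaluate("/class/student[1]", doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Evaluates the expression from the context node and returns the result as a string.
    static String text(String expression, Object contextNode) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expression, contextNode);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node context = firstStudent(parse(XML));
        // Relative path: resolved against the context node.
        System.out.println(text("firstname", context));
        // Axis: moves from the context node to its following siblings.
        System.out.println(text("following-sibling::student/firstname", context));
        // Absolute path with a predicate: independent of the context node.
        System.out.println(text("/class/student[@rollno='493']/firstname", context));
    }
}
```

The first expression depends on which node is passed as the context, whereas the absolute path gives the same answer from anywhere in the document.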

XPath Expressions

Description

A path expression is written as a series of "location steps" and is used to select a node or a list of nodes from an XML document. The following are common XPath expressions.

Path Expression               Description
/                             Selects the document root, the parent of the root tag.
/root_tag                     Selects the root tag, but only if it is named "root_tag".
//element_a                   Selects all the element_a tags, wherever they appear in the document.
text()                        Selects the text content of the current node.
@name                         Selects the "name" attribute of the current node.
/doc/chapter[5]/section[2]    Selects the second section of the fifth chapter of the document.
body/p[last()]                Selects the last "p" tag in the "body" tag.
..                            Selects the parent of the current node.
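Several of the expressions listed above can be checked against a small document. The following is a sketch under stated assumptions: the class PathExpressionSketch, its helpers, and the tiny html/body/p sample document are made up for illustration, and only standard JAXP calls are used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PathExpressionSketch {
    // Hypothetical sample document.
    static final String XML =
        "<html><body name=\"main\"><p>one</p><p>two</p><p>three</p></body></html>";

    static final XPath XPATH = XPathFactory.newInstance().newXPath();

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns all nodes matched by the expression, starting from the context node.
    static NodeList nodes(String expression, Object context) {
        try {
            return (NodeList) XPATH.evaluate(expression, context, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the first matched node in document order, or null if nothing matches.
    static Node node(String expression, Object context) {
        try {
            return (Node) XPATH.evaluate(expression, context, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        Node html = node("/html", doc);                 // root tag, matched by name
        Node lastP = node("body/p[last()]", html);      // relative to the html element
        System.out.println(nodes("//p", doc).getLength());             // all p tags
        System.out.println(node("text()", lastP).getNodeValue());      // text of last p
        System.out.println(node("@name", node("body", html)).getNodeValue());
        System.out.println(node("..", lastP).getNodeName());           // parent of last p
    }
}
```

Relative expressions such as body/p[last()], text(), @name, and .. only make sense once a context node is supplied, which is why each call passes one explicitly.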

Example

The following example uses XPath to parse an XML document.

Step-1: First, import the XML-related packages.
[java]
import org.w3c.dom.*;
import org.xml.sax.*;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
import java.io.*;
[/java]
In the above code, org.w3c.dom defines the DOM programming interfaces for XML documents as specified by the W3C. javax.xml.parsers defines the DocumentBuilderFactory and DocumentBuilder classes; a DocumentBuilder returns an object that implements the W3C Document interface.

Step-2: Create the DocumentBuilderFactory and DocumentBuilder mentioned above.
[java]
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
[/java]

Step-3: Create an XML document from a file or a stream. In the code below, parse() is used to parse the document.
[java]
StringBuilder xmlStringBuilder = new StringBuilder();
// The quotes inside the XML declaration must be escaped in a Java string literal.
xmlStringBuilder.append("<?xml version=\"1.0\"?> <class> </class>");
ByteArrayInputStream input = new ByteArrayInputStream(
    xmlStringBuilder.toString().getBytes("UTF-8"));
Document doc = builder.parse(input);
[/java]

Step-4: Build the XPath object.
[java]
XPath xPath = XPathFactory.newInstance().newXPath();
[/java]

Step-5: Write the path expression and evaluate it.
[java]
String expression = "/class/student";
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(doc, XPathConstants.NODESET);
[/java]

Step-6: Iterate over the nodes in the NodeList with a for loop.
[java]
for (int i = 0; i < nodeList.getLength(); i++) {
    Node nNode = nodeList.item(i);
    // ...
}
[/java]

Step-7: Examine the attributes and sub-elements.
[java]
// returns the value of the named attribute
eElement.getAttribute("attributeName");
// returns a NamedNodeMap of the element's attribute names and values
eElement.getAttributes();
[/java]
[java]
// returns a NodeList of the descendant elements with the specified name
eElement.getElementsByTagName("subelementName");
// returns a NodeList of all child nodes
eElement.getChildNodes();
[/java]

Putting all the above steps together, below is the input XML document that has to be parsed. The program reads it from a file named input.txt.
[xml]
<?xml version="1.0"?>
<class>
    <student rollno="393">
        <firstname>John</firstname>
        <lastname>Mike</lastname>
        <nickname>Jom</nickname>
        <marks>85</marks>
    </student>
    <student rollno="493">
        <firstname>Rafeal</firstname>
        <lastname>Nadal</lastname>
        <nickname>Rafa</nickname>
        <marks>95</marks>
    </student>
    <student rollno="593">
        <firstname>Samuel</firstname>
        <lastname>Johnson</lastname>
        <nickname>Sam</nickname>
        <marks>90</marks>
    </student>
</class>
[/xml]

XPathParserDemo.java
[java]
package com.splessons.xml;

import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

public class XPathParserDemo {
    public static void main(String[] args) {
        try {
            // Parse the input file into a DOM document.
            File inputFile = new File("input.txt");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(inputFile);
            doc.getDocumentElement().normalize();

            // Select every student element under the class root.
            XPath xPath = XPathFactory.newInstance().newXPath();
            String expression = "/class/student";
            NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(doc, XPathConstants.NODESET);

            for (int i = 0; i < nodeList.getLength(); i++) {
                Node nNode = nodeList.item(i);
                System.out.println("\nCurrent Element :" + nNode.getNodeName());
                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    System.out.println("Student roll no : " + eElement.getAttribute("rollno"));
                    System.out.println("First Name : "
                        + eElement.getElementsByTagName("firstname").item(0).getTextContent());
                    System.out.println("Last Name : "
                        + eElement.getElementsByTagName("lastname").item(0).getTextContent());
                    System.out.println("Nick Name : "
                        + eElement.getElementsByTagName("nickname").item(0).getTextContent());
                    System.out.println("Marks : "
                        + eElement.getElementsByTagName("marks").item(0).getTextContent());
                }
            }
        } catch (ParserConfigurationException | SAXException | IOException | XPathExpressionException e) {
            e.printStackTrace();
        }
    }
}
[/java]
The printStackTrace() method prints the throwable and its stack trace to the standard error stream, i.e. the stream that is the value of the field System.err.

Output:
[java]

Current Element :student
Student roll no : 393
First Name : John
Last Name : Mike
Nick Name : Jom
Marks : 85

Current Element :student
Student roll no : 493
First Name : Rafeal
Last Name : Nadal
Nick Name : Rafa
Marks : 95

Current Element :student
Student roll no : 593
First Name : Samuel
Last Name : Johnson
Nick Name : Sam
Marks : 90
[/java]
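As a possible variation on the program above, a predicate can move the filtering into the path expression itself instead of the Java loop. The sketch below is hypothetical: the class MarksPredicateSketch and the method firstNamesWithMarksAbove are invented for illustration, and the XML is a trimmed in-memory copy of the lesson's input rather than the input.txt file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class MarksPredicateSketch {
    // Trimmed copy of the lesson's input document (only the fields used here).
    static final String XML =
        "<class>"
        + "<student rollno=\"393\"><firstname>John</firstname><marks>85</marks></student>"
        + "<student rollno=\"493\"><firstname>Rafeal</firstname><marks>95</marks></student>"
        + "<student rollno=\"593\"><firstname>Samuel</firstname><marks>90</marks></student>"
        + "</class>";

    // Returns the first names of students whose marks exceed the threshold.
    static List<String> firstNamesWithMarksAbove(String xml, int threshold) {
        // The predicate in square brackets keeps only the matching student elements.
        String expression = "/class/student[marks > " + threshold + "]/firstname";
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getTextContent());
            }
            return names;
        } catch (Exception e) {
            // Checked parser and XPath exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstNamesWithMarksAbove(XML, 85));
    }
}
```

With a threshold of 85 the node set holds the firstname elements of the two students who scored 95 and 90, in document order.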

Summary

Key Points

  • XPath is a syntax used to describe parts of an XML document.
  • An XPath expression can return not just one node but a collection of nodes, known as a node set.
  • A path expression selects a node or a list of nodes from an XML document.