Java XPath Parser - Overview


XPath is an official recommendation of the World Wide Web Consortium (W3C). It defines a language for finding information in an XML document and is used to traverse the document's elements and attributes. XPath provides various types of expressions that can be used to query relevant information from an XML document.

What is XPath?

  • Structure Definitions - XPath defines the parts of an XML document, such as element, attribute, text, namespace, processing-instruction, comment, and document nodes.

  • Path Expressions - XPath provides powerful path expressions to select nodes or lists of nodes in XML documents.

  • Standard Functions - XPath provides a rich library of standard functions for manipulating string values, numeric values, date and time comparison, node and QName manipulation, sequence manipulation, Boolean values, etc.

  • Major part of XSLT - XPath is one of the major elements of the XSLT standard, and knowing it is essential for working with XSLT documents.

  • W3C recommendation - XPath is an official recommendation of the World Wide Web Consortium (W3C).
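
As a small illustration of the standard functions mentioned above, the following sketch evaluates a few XPath function calls from Java using the JDK's javax.xml.xpath package. Note that this package implements XPath 1.0, so only the 1.0 function library (for example count, string, concat) is available there; the date/time and sequence functions belong to later XPath versions. The class name XPathFunctionsSketch, the helper eval, and the sample XML (a class root element with three empty student elements) are assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathFunctionsSketch {
    // Hypothetical sample document: the element content is assumed for illustration.
    static final String XML =
          "<class>"
        + "<student rollno=\"393\"/>"
        + "<student rollno=\"493\"/>"
        + "<student rollno=\"593\"/>"
        + "</class>";

    // Evaluates an XPath 1.0 expression against XML and returns the result as a string.
    static String eval(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Numeric function: number of student elements under class.
        System.out.println(eval("count(/class/student)"));
        // String function: string value of the first student's rollno attribute.
        System.out.println(eval("string(/class/student[1]/@rollno)"));
        // String function: concatenation of a literal and an attribute value.
        System.out.println(eval("concat('roll-', /class/student[last()]/@rollno)"));
    }
}
```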

Here is the input text file we need to parse:

<?xml version="1.0"?>
   <student rollno="393">
   <student rollno="493">
   <student rollno="593">
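
Parsing such a file with XPath in Java could be sketched roughly as follows. The listing above is not complete, so this example assumes a well-formed version of it: a class root element (inferred from the expressions used later in this section) containing three empty student elements. The class name StudentParserSketch and the method rollNumbers are hypothetical; the XML is embedded as a string so the example runs without an external file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class StudentParserSketch {
    // Assumed well-formed completion of the input; the real file may contain more data.
    static final String XML =
          "<?xml version=\"1.0\"?>"
        + "<class>"
        + "<student rollno=\"393\"/>"
        + "<student rollno=\"493\"/>"
        + "<student rollno=\"593\"/>"
        + "</class>";

    // Parses the XML into a DOM tree and collects the rollno of every student.
    static List<String> rollNumbers() {
        try {
            // Step 1: build a DOM document from the XML input.
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));

            // Step 2: compile the XPath expression once so it can be reused.
            XPathExpression students =
                    XPathFactory.newInstance().newXPath().compile("/class/student");

            // Step 3: evaluate it against the document and walk the node list.
            NodeList nodes = (NodeList) students.evaluate(doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element student = (Element) nodes.item(i);
                result.add(student.getAttribute("rollno"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rollNumbers()); // [393, 493, 593]
    }
}
```

To read from a real file instead, builder.parse also accepts a java.io.File.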

XPath Expressions

XPath uses path expressions to select a node or a list of nodes from an XML document. The following is a list of useful paths and expressions for selecting a node or a list of nodes from an XML document.

  • nodename - Selects all nodes with the given name "nodename".
  • / - Selection starts from the root node.
  • // - Selects nodes that match the selection anywhere below the current node, no matter where they are.
  • . - Selects the current node.
  • .. - Selects the parent of the current node.
  • @ - Selects attributes.
  • student - Example: selects all nodes with the name "student".
  • class/student - Example: selects all student elements that are children of class.
  • //student - Example: selects all student elements, no matter where they are in the document.
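
The path expressions listed above can be tried out with a short Java sketch like the one below. It evaluates each expression relative to a context node and returns the names of the selected nodes. The class PathExpressionSketch, its helpers load and select, and the sample XML (a class root element with three empty student elements) are assumptions for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PathExpressionSketch {
    // Hypothetical sample document; no whitespace between elements, so the
    // first child of <class> is the first <student> element.
    static final String XML =
          "<class>"
        + "<student rollno=\"393\"/>"
        + "<student rollno=\"493\"/>"
        + "<student rollno=\"593\"/>"
        + "</class>";

    // Parses the sample XML into a DOM document.
    static Document load() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    // Evaluates the expression with the given context node and returns
    // the node names of everything it selects.
    static List<String> select(Node context, String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, context, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        Document doc = load();
        Node firstStudent = doc.getDocumentElement().getFirstChild();

        System.out.println(select(doc, "/class"));           // absolute path from the root
        System.out.println(select(doc, "class/student"));    // relative to the document node
        System.out.println(select(doc, "//student"));        // students anywhere in the document
        System.out.println(select(firstStudent, "."));       // the context node itself
        System.out.println(select(firstStudent, ".."));      // its parent, the class element
        System.out.println(select(firstStudent, "@rollno")); // its rollno attribute
    }
}
```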


Predicates are used to find a specific node or a node that contains a specific value. They are written in square brackets [...].

  • /class/student[1] - Selects the first student element that is a child of the class element.
  • /class/student[last()] - Selects the last student element that is a child of the class element.
  • /class/student[last()-1] - Selects the last but one student element that is a child of the class element.
  • //student[@rollno='493'] - Selects all the student elements that have an attribute named rollno with a value of '493'.
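
The predicates above can be checked with a minimal Java sketch such as the following, which returns the rollno of each student a given expression selects. The class PredicateSketch, the method rollNumbers, and the sample XML (a class root element with three empty student elements carrying the roll numbers from the input) are assumptions for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PredicateSketch {
    // Hypothetical sample document used only for illustration.
    static final String XML =
          "<class>"
        + "<student rollno=\"393\"/>"
        + "<student rollno=\"493\"/>"
        + "<student rollno=\"593\"/>"
        + "</class>";

    // Evaluates an expression that selects student elements and
    // returns their rollno attribute values in document order.
    static List<String> rollNumbers(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(XML)),
                              XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(((Element) nodes.item(i)).getAttribute("rollno"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rollNumbers("/class/student[1]"));        // [393]
        System.out.println(rollNumbers("/class/student[last()]"));   // [593]
        System.out.println(rollNumbers("/class/student[last()-1]")); // [493]
        System.out.println(rollNumbers("//student[@rollno='493']")); // [493]
    }
}
```

Note that XPath positions start at 1, not 0, so student[1] is the first student element.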