XPath

You use XPath with XModify to address parts of a document. XPath is important in Nexaweb application development for XModify statements, XModify server responses, Macros and Document Data Binding.

This document provides a reference to some of the XPath syntax for your convenience. For complete information about XPath, see the XPath Specification.

Overview

The primary purpose of XPath is to address parts of an XML document. XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax. XPath gets its name from its use of a path notation as in URLs for navigating through the hierarchical structure of an XML document.

Node Tree

XPath operates on an XML document as a tree.

The tree contains the following types of nodes:

  • Root
  • Element
  • Text
  • Attribute
  • Namespace
  • Processing instruction
  • Comment

For every type of node, there is a way of determining a string-value for it. For some node types, the string-value is part of the node; for others, the string-value is computed from the string-value of descendant nodes.

Some node types also have an expanded-name, which is a pair consisting of a local part and a namespace URI.

Node Type | Description
Root

The root of the tree only. The element node for the document element is a child of the root node.

Also has as children the processing instruction and comment nodes for processing instructions and comments that occur in the prolog and after the end of the document element.

Expanded-name: N/A

String-value: The concatenation of the string-values of all text node descendants of the root node in document order.

Element

One for every element in the document.

Expanded-name: Computed by expanding the qualified name (QName) of the element specified in the tag in accordance with the XML Namespaces Recommendation [XML Names]. The namespace URI of the element's expanded-name will be null if the QName has no prefix and there is no applicable default namespace.

String-value: The concatenation of the string-values of all text node descendants of the element node in document order.

Attribute

Each element node has an associated set of attribute nodes; the element is the parent of each of these attribute nodes; however, an attribute node is not a child of its parent element.

Expanded-name: Computed by expanding the QName specified in the tag in the XML document in accordance with the XML Namespaces Recommendation [XML Names]. The namespace URI of the attribute's name is null if the QName of the attribute does not have a prefix.

String-value: The normalized value as specified by the XML Recommendation [XML]. An attribute whose normalized value is a zero-length string is not treated specially: it results in an attribute node whose string-value is a zero-length string.

Namespace

Each element has an associated set of namespace nodes, one for each distinct namespace prefix that is in scope for the element and one for the default namespace if one is in scope for the element. The element is the parent of each of these namespace nodes; however, a namespace node is not a child of its parent element. An element has a namespace node for:

  • Every attribute on the element whose name starts with xmlns:
  • Every attribute on an ancestor element whose name starts with xmlns: unless the element itself or a nearer ancestor redeclares the prefix
  • An xmlns attribute, if the element or some ancestor has an xmlns attribute, and the value of the xmlns attribute for the nearest such element is non-empty

Expanded-name: The local part is the namespace prefix (empty if the namespace node is for the default namespace); the namespace URI is always null.

String-value: The namespace URI that is being bound to the namespace prefix. If it is relative, it must be resolved just like a namespace URI in an expanded-name.

Processing Instruction

There is a processing instruction node for every processing instruction, except for one that occurs within the document type declaration.

Expanded-name: The local part is the processing instruction's target; the namespace URI is null.

String-value: The part of the processing instruction following the target and any whitespace. It does not include the terminating ?>.

Comment

One for every comment, except for comments occurring within the document type declaration.

Expanded-name: N/A

String-value: Content of the comment not including the opening <!-- or the closing -->.
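
The string-value rules for element and attribute nodes can be illustrated with a minimal Python sketch. It assumes the standard-library xml.etree.ElementTree module, which is not an XPath engine; the sample document and the string_value helper are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: an element with mixed content and an empty attribute.
para = ET.fromstring('<para type="">Read the <em>XPath</em> specification first.</para>')

def string_value(element):
    """Concatenate the text of all descendant text nodes in document order."""
    return "".join(element.itertext())

print(string_value(para))            # string-value of the para element
print(string_value(para.find("em"))) # string-value of the nested em element
print(repr(para.get("type")))        # attribute string-value: a zero-length string
```

The element's string-value is computed from its descendants, while the attribute's string-value is part of the node itself.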

 

Document Order

The tree includes a document order defined on all the nodes in the document as follows:

  • The root node is the first node.
  • Element nodes occur in order of their start-tag in the XML (after expansion of entities).
  • Element nodes occur before their children.
  • Attribute and namespace nodes of an element occur before the children of the element.
  • Namespace nodes of an element occur before its attribute nodes.
  • The relative order of namespace nodes among themselves, and of attribute nodes among themselves, is implementation-dependent.

Root nodes and element nodes have an ordered list of child nodes as follows:

  • Nodes never share children.
  • Every node other than the root node has exactly one parent, which is either an element node or the root node.
  • A root node or an element node is the parent of each of its child nodes.
  • Descendants of a node are the children of the node and the descendants of the children of the node. 
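
As a rough illustration of document order and the single-parent rule, the following Python sketch walks a small tree with the standard-library xml.etree.ElementTree module. This is an assumption-laden approximation: ElementTree exposes element nodes only (no separate root, attribute, or namespace nodes), and the sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
root = ET.fromstring(
    "<doc><chapter><title/><para/></chapter><appendix><para/></appendix></doc>"
)

# iter() visits an element before its children, in start-tag order.
order = [element.tag for element in root.iter()]
print(order)

# Every element other than the document element has exactly one parent.
parent_of = {child: parent for parent in root.iter() for child in parent}
print(len(parent_of))
```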
     

Location Paths 

Path Type | Description
Absolute location path

Consists of / optionally followed by a relative location path. A / by itself selects the root node of the document containing the context node. If it is followed by a relative location path, then the location path selects the set of nodes that would be selected by the relative location path relative to the root node of the document containing the context node.

Relative location path

Consists of a sequence of one or more location steps separated by /.

The steps in a relative location path are composed together from left to right. Each step in turn selects a set of nodes relative to a context node.

The initial sequence of steps selects a set of nodes relative to a context node. Each node in that set is used as a context node for the following step. The sets of nodes identified by that step are unioned together. The set of nodes identified by the composition of the steps is this union. For example, child::div/child::para selects the para element children of the div element children of the context node, or, in other words, the para element grandchildren that have div parents.

A location step has three parts:

  • An axis, which specifies the tree relationship between the nodes selected by the location step and the context node.
  • A node test, which specifies the node type and expanded-name of the nodes selected by the location step.
  • Zero or more predicates, which use arbitrary expressions to further refine the set of nodes selected by the location step.
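
The left-to-right composition of steps can be sketched in Python. This is a minimal sketch, not an XPath implementation: step_child is a hypothetical helper that applies one child:: step with a name test, and the sample document is invented. It assumes the standard-library xml.etree.ElementTree module, whose findall accepts the abbreviated path div/para.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the context node is the body element.
body = ET.fromstring(
    "<body><div><para>a</para><para>b</para></div>"
    "<para>c</para><div><para>d</para></div></body>"
)

def step_child(context_nodes, name):
    """Apply one child::name step to each context node and union the results."""
    selected = []
    for node in context_nodes:
        for child in node:
            if child.tag == name and child not in selected:
                selected.append(child)
    return selected

# child::div/child::para, evaluated one step at a time.
divs = step_child([body], "div")
paras = step_child(divs, "para")
print([p.text for p in paras])

# The same selection through ElementTree's abbreviated path syntax.
print([p.text for p in body.findall("div/para")])
```

The para element with text c is not selected, because its parent is not a div.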

Location Path Abbreviated Syntax

Syntax | Selects
para | The para element children of the context node.
* | All element children of the context node.
text() | All text node children of the context node.
@name | The name attribute of the context node.
@* | All the attributes of the context node.
para[1] | The first para child of the context node.
para[last()] | The last para child of the context node.
*/para | All para grandchildren of the context node.
/doc/chapter[5]/section[2] | The second section of the fifth chapter of the doc.
chapter//para | The para element descendants of the chapter element children of the context node.
//para | All the para descendants of the document root, and thus all para elements in the same document as the context node.
.. | The parent of the context node.
../@lang | The lang attribute of the parent of the context node.
para[@type="warning"] | All para children of the context node that have a type attribute with value warning.
para[@type="warning"][5] | The fifth para child of the context node that has a type attribute with value warning.
para[5][@type="warning"] | The fifth para child of the context node if that child has a type attribute with value warning.
chapter[title="Introduction"] | The chapter children of the context node that have one or more title children with string-value equal to Introduction.
chapter[title] | The chapter children of the context node that have one or more title children.
employee[@secretary and @assistant] | All the employee children of the context node that have both a secretary attribute and an assistant attribute.
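
Several of these abbreviated forms can be tried out with Python's standard-library xml.etree.ElementTree module. Treat this as a sketch under assumptions: ElementTree supports only a subset of the abbreviated syntax (for example, it has no text() or @name selection and no and operator), the sample document is hypothetical, and texts is a helper invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring("""
<doc>
  <chapter><title>Introduction</title>
    <para type="warning">w1</para><para>p2</para><para type="warning">w3</para>
  </chapter>
  <chapter><para>p4</para></chapter>
</doc>
""")
first_chapter = doc.find("chapter")  # used as the context node below

def texts(nodes):
    """Return the text of each selected element, for easy printing."""
    return [node.text for node in nodes]

print(texts(first_chapter.findall("para")))                    # para
print(texts(first_chapter.findall("para[1]")))                 # para[1]
print(texts(first_chapter.findall("para[last()]")))            # para[last()]
print(texts(doc.findall("*/para")))                            # */para
print(texts(doc.findall(".//para")))                           # .//para
print(texts(first_chapter.findall("para[@type='warning']")))   # para[@type="warning"]
print(len(doc.findall("chapter[title='Introduction']")))       # chapter[title="Introduction"]
print(len(doc.findall("chapter[title]")))                      # chapter[title]
```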

 

General Abbreviation Guidelines

 

Syntax Element | Description
child::

Is effectively the default axis; therefore, you can omit it from a location step.

For example: div/para is short for child::div/child::para.

@

Short for: attribute::

For example: para[@type="warning"] is short for child::para[attribute::type="warning"].

Each selects para children with a type attribute with value equal to warning.

//

Short for: /descendant-or-self::node()/

For example: //para is short for /descendant-or-self::node()/child::para.

Each selects every para element in the document (even a para element that is the document element).

div//para

Short for: div/descendant-or-self::node()/child::para.

Each selects all para descendants of div children.

.

Short for: self::node()

Useful in conjunction with //.

For example: .//para is short for self::node()/descendant-or-self::node()/child::para.

Each selects all para descendant elements of the context node.

..

Short for: parent::node()

For example: ../title = parent::node()/child::title

Each selects the title children of the parent of the context node.

Axes

This section lists and provides descriptions of the axis types and axes available for use in specifying a path.

Axis Types

 

Type | Description
Forward axis | Contains only the context node or nodes that are after it in document order.
Reverse axis | Contains only the context node or nodes that are before it in document order.

List of Axes 

 

Axis | Description | Type
ancestor | Contains the ancestors of the context node, which consist of the parent of the context node, the parent's parent, and so on. Thus, the ancestor axis always includes the root node, unless the context node is the root node. | Reverse
ancestor-or-self | Contains the context node and the ancestors of the context node. Thus, the ancestor-or-self axis always includes the root node. | Reverse
attribute | Contains the attributes of the context node. This axis is empty unless the context node is an element. | Forward
child | Contains the children of the context node. | Forward
descendant | Contains the descendants of the context node. Since a descendant is a child or a child of a child and so on, the descendant axis never contains attribute or namespace nodes. | Forward
descendant-or-self | Contains the context node and the descendants of the context node. | Forward
following | Contains all nodes in the same document as the context node that are after the context node in document order, excluding any descendants and excluding attribute nodes and namespace nodes. | Forward
following-sibling | Contains all the following siblings of the context node. If the context node is an attribute node or namespace node, the following-sibling axis is empty. | Forward
namespace | Contains the namespace nodes of the context node. This axis is empty unless the context node is an element. | Forward
parent | Contains the parent of the context node, if one exists. | Forward
preceding | Contains all nodes in the same document as the context node that are before the context node in document order, excluding any ancestors and excluding attribute nodes and namespace nodes. | Reverse
preceding-sibling | Contains all the preceding siblings of the context node. If the context node is an attribute node or namespace node, the preceding-sibling axis is empty. | Reverse
self | Contains only the context node itself. | Forward

Every axis has a principal node type. If an axis can contain elements, then the principal node type is element; otherwise, it is the type of the nodes that the axis can contain.

 

Axis | Principal Node Type
attribute | attribute
namespace | namespace
All other axes | element
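
To make the difference between forward and reverse axes concrete, here is a small Python sketch that computes three axes by hand. It is an approximation under stated assumptions: it uses the standard-library xml.etree.ElementTree module, which models element nodes only, so the separate root node that XPath places on the ancestor axis does not appear. The helper functions and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring(
    "<doc><chapter><para id='a'/><para id='b'/><para id='c'/></chapter></doc>"
)
parent_of = {child: parent for parent in doc.iter() for child in parent}

def ancestor_axis(node):
    """Reverse axis: the nearest ancestor comes first."""
    result = []
    while node in parent_of:
        node = parent_of[node]
        result.append(node)
    return result

def siblings_of(node):
    """All children of the node's parent, in document order."""
    parent = parent_of.get(node)
    return list(parent) if parent is not None else [node]

def following_sibling_axis(node):
    """Forward axis: siblings after the node, in document order."""
    siblings = siblings_of(node)
    return siblings[siblings.index(node) + 1:]

def preceding_sibling_axis(node):
    """Reverse axis: siblings before the node, nearest first."""
    siblings = siblings_of(node)
    return siblings[:siblings.index(node)][::-1]

c = doc.find("chapter/para[@id='c']")
print([n.tag for n in ancestor_axis(c)])
print([n.get("id") for n in preceding_sibling_axis(c)])
print([n.get("id") for n in following_sibling_axis(doc.find("chapter/para[@id='a']"))])
```

Because preceding-sibling is a reverse axis, the first node in its list is the sibling nearest to the context node.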

Node Test

Specifies the node type and expanded-name of the nodes selected by the location step.

The axis and node test together generate an initial node-set. If the location step has predicates, the initial node-set is filtered by the first predicate to generate a new node-set; this new node-set is then filtered using the second predicate, and so on. The final node-set is the node-set selected by the location step.
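
The successive filtering can be sketched in Python to show why the order of predicates matters. This is a minimal sketch, not an XPath evaluator: apply_predicates and the three predicate functions are hypothetical names, each predicate receives the node, its 1-based position, and the size of the current node-set, and the sample document is invented. It assumes the standard-library xml.etree.ElementTree module only for parsing.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: three para children, two of them warnings.
doc = ET.fromstring(
    "<doc><para type='warning'>1</para><para>2</para>"
    "<para type='warning'>3</para></doc>"
)
paras = list(doc)  # initial node-set, as child::para would select it

def apply_predicates(nodes, predicates):
    """Filter the node-set by each predicate in turn, renumbering positions each time."""
    for predicate in predicates:
        size = len(nodes)
        nodes = [n for pos, n in enumerate(nodes, start=1) if predicate(n, pos, size)]
    return nodes

def is_warning(node, position, size):
    return node.get("type") == "warning"

def is_second(node, position, size):
    return position == 2

def is_last(node, position, size):
    return position == size

# Like para[@type="warning"][2]: the second of the warning paras.
print([n.text for n in apply_predicates(paras, [is_warning, is_second])])
# Like para[2][@type="warning"]: the second para, only if it is a warning.
print([n.text for n in apply_predicates(paras, [is_second, is_warning])])
# Like para[last()].
print([n.text for n in apply_predicates(paras, [is_last])])
```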

A node test * is true for any node of the principal node type.

Examples:

Node Test | Description
child::* | Selects all element children of the context node.
attribute::* | Selects all attributes of the context node.

The node test text() is true for any text node.

Examples:

Node Test | Description
child::text() | Selects the text node children of the context node.
attribute::text() | Selects nothing: the attribute axis contains only attribute nodes, so no node on it is a text node.

The node test comment() is true for any comment node.

The node test processing-instruction() is true for any processing instruction. This node test may have an argument that is a Literal; if so, it is true for any processing instruction whose name is equal to the value of the Literal.

Predicates

Proximity Positions:

Axis Type | Proximity Position
Forward axis | Position of the node within the node-set in document order.
Reverse axis | Position of the node within the node-set in reverse document order.

  • The first position is 1.
  • para[3] is equivalent to para[position()=3].

Functions

This section lists the core function library that all XPath implementations support. Each function takes zero or more arguments and returns a single result.

For a function in the core function library, arguments and result are of the four basic types:

  • node-set (an unordered collection of nodes without duplicates)
  • boolean (true or false)
  • number (a floating-point number)
  • string (a sequence of UCS characters)

Function Type | Function | Syntax | Description
Node Set | last | number last() | Returns a number equal to the context size from the expression evaluation context.
Node Set | position | number position() | Returns a number equal to the context position from the expression evaluation context.
Node Set | count | number count(node-set) | Returns the number of nodes in the argument node-set.
Node Set | id | node-set id(object) | Selects elements by their unique ID. For example, id("foo") selects the element with unique ID foo, and id("foo")/child::para[position()=5] selects the fifth para child of the element with unique ID foo.
Node Set | local-name | string local-name(node-set?) | Returns the local part of the expanded-name of the node in the argument node-set that is first in document order.
Node Set | namespace-uri | string namespace-uri(node-set?) | Returns the namespace URI of the expanded-name of the node in the argument node-set that is first in document order.
Node Set | name | string name(node-set?) | Returns a string containing a qualified name (QName) representing the expanded-name of the node in the argument node-set that is first in document order.
String | string | string string(object?) | Converts an object to a string.
String | concat | string concat(string, string, string*) | Returns the concatenation of its arguments.
String | starts-with | boolean starts-with(string, string) | Returns true if the first argument string starts with the second argument string, and otherwise returns false.
String | contains | boolean contains(string, string) | Returns true if the first argument string contains the second argument string, and otherwise returns false.
String | substring-before | string substring-before(string, string) | Returns the substring of the first argument string that precedes the first occurrence of the second argument string in the first argument string, or the empty string if the first argument string does not contain the second argument string. For example, substring-before("1999/04/01","/") returns 1999.
String | substring-after | string substring-after(string, string) | Returns the substring of the first argument string that follows the first occurrence of the second argument string in the first argument string, or the empty string if the first argument string does not contain the second argument string. For example, substring-after("1999/04/01","/") returns 04/01, and substring-after("1999/04/01","19") returns 99/04/01.
String | substring | string substring(string, number, number?) | Returns the substring of the first argument starting at the position specified in the second argument with length specified in the third argument. For example, substring("12345",2,3) returns "234". If the third argument is not specified, it returns the substring starting at the position specified in the second argument and continuing to the end of the string. For example, substring("12345",2) returns "2345".
String | string-length | number string-length(string?) | Returns the number of characters in the string.
String | normalize-space | string normalize-space(string?) | Returns the argument string with whitespace normalized by stripping leading and trailing whitespace and replacing sequences of whitespace characters by a single space.
String | translate | string translate(string, string, string) | Returns the first argument string with occurrences of characters in the second argument string replaced by the character at the corresponding position in the third argument string. For example, translate("bar","abc","ABC") returns the string BAr.
Boolean | boolean | boolean boolean(object) | Converts its argument to a boolean as follows:

  • Number: true if and only if it is neither positive nor negative zero nor NaN.
  • Node-set: true if and only if it is non-empty.
  • String: true if and only if its length is non-zero.
  • Object of a type other than the four basic types: converted to a boolean in a way that is dependent on that type.

Boolean | not | boolean not(boolean) | Returns true if its argument is false, and false otherwise.
Boolean | true | boolean true() | Returns true.
Boolean | false | boolean false() | Returns false.
Boolean | lang | boolean lang(string) | Returns true or false depending on whether the language of the context node as specified by xml:lang attributes is the same as or is a sublanguage of the language specified by the argument string.
Number | number | number number(object?) | Converts its argument to a number as follows:

  • A string that consists of optional whitespace followed by an optional minus sign followed by a Number followed by whitespace is converted to the IEEE 754 number that is nearest (according to the IEEE 754 round-to-nearest rule) to the mathematical value represented by the string; any other string is converted to NaN.
  • Boolean true is converted to 1; boolean false is converted to 0.
  • A node-set is first converted to a string as if by a call to the string function and then converted in the same way as a string argument.
  • An object of a type other than the four basic types is converted to a number in a way that is dependent on that type.

Number | sum | number sum(node-set) | Returns the sum, for each node in the argument node-set, of the result of converting the string-values of the node to a number.
Number | floor | number floor(number) | Returns the largest (closest to positive infinity) number that is not greater than the argument and that is an integer.
Number | ceiling | number ceiling(number) | Returns the smallest (closest to negative infinity) number that is not less than the argument and that is an integer.
Number | round | number round(number) | Returns the number that is closest to the argument and that is an integer. If there are two such numbers, then the one that is closest to positive infinity is returned. The result for special argument values is as follows:

  • NaN: NaN is returned.
  • Positive infinity: positive infinity is returned.
  • Negative infinity: negative infinity is returned.
  • Positive zero: positive zero is returned.
  • Negative zero: negative zero is returned.
  • Less than zero, but greater than or equal to -0.5: negative zero is returned.
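
The behavior of a few of these functions can be approximated in plain Python, which may help when checking expected results by hand. The sketch below makes several assumptions: the function names are hypothetical Python stand-ins rather than part of any XPath library, substring accepts integer arguments only, and none of this replaces a real XPath engine.

```python
import math

def substring_before(s, sep):
    """Part of s before the first occurrence of sep, or "" if sep is absent."""
    if not sep:
        return ""
    head, found, _ = s.partition(sep)
    return head if found else ""

def substring_after(s, sep):
    """Part of s after the first occurrence of sep, or "" if sep is absent."""
    if not sep:
        return s
    _, found, tail = s.partition(sep)
    return tail if found else ""

def substring(s, start, length=None):
    """1-based substring; this sketch assumes integer start and length."""
    begin = max(start, 1)
    if length is None:
        return s[begin - 1:]
    end = start + length  # exclusive end position, 1-based
    return s[begin - 1:max(end - 1, begin - 1)]

def translate(s, source, target):
    """Replace characters of source with those at the same position in target.

    Characters of source with no counterpart in target are removed.
    """
    table = {}
    for index, char in enumerate(source):
        if ord(char) not in table:  # the first occurrence in source wins
            table[ord(char)] = target[index] if index < len(target) else None
    return s.translate(table)

def xpath_round(x):
    """Round to the nearest integer; ties go toward positive infinity."""
    if math.isnan(x) or math.isinf(x) or x == 0:
        return x
    if -0.5 <= x < 0:
        return -0.0
    lower = math.floor(x)
    return float(lower + 1) if x - lower >= 0.5 else float(lower)

print(substring_before("1999/04/01", "/"))
print(substring_after("1999/04/01", "19"))
print(substring("12345", 2, 3))
print(translate("bar", "abc", "ABC"))
print(xpath_round(2.5), xpath_round(-2.5), xpath_round(-0.2))
```

Note that the rounding rule differs from Python's built-in round, which rounds ties to the nearest even integer.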