XPath
You use XPath with XModify to address parts of a document. XPath is important in Nexaweb application development for XModify statements, XModify server responses, Macros and Document Data Binding.
This document provides a reference to some of the XPath syntax for your convenience. For complete information about XPath, see the XPath Specification.
Overview
The primary purpose of XPath is to address parts of an XML document. XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax. XPath gets its name from its use of a path notation, as in URLs, for navigating through the hierarchical structure of an XML document.
Node Tree
XPath operates on an XML document as a tree.
The tree contains the following types of nodes:
- Root
- Element
- Text
- Attribute
- Namespace
- Processing instruction
- Comment
For every type of node, there is a way of determining a string-value for it. For some node types, the string-value is part of the node; for others, the string-value is computed from the string-value of descendant nodes.
Some node types also have an expanded-name, which is a pair consisting of a local part and a namespace URI.
Node Type | Description |
Root | The root of the tree only. The element node for the document element is a child of the root node. The root node also has as children the processing instruction and comment nodes for processing instructions and comments that occur in the prolog and after the end of the document element. Expanded-name: N/A String-value: The concatenation of the string-values of all text node descendants of the root node in document order. |
Element | One for every element in the document. Expanded-name: Computed by expanding the qualified name (QName) of the element specified in the tag in accordance with the XML Namespaces Recommendation [XML Names]. The namespace URI of the element's expanded-name will be null if the QName has no prefix and there is no applicable default namespace. String-value: The concatenation of the string-values of all text node descendants of the element node in document order. |
Attribute | Each element node has an associated set of attribute nodes; the element is the parent of each of these attribute nodes; however, an attribute node is not a child of its parent element. Expanded-name: Computed by expanding the QName specified in the tag in the XML document in accordance with the XML Namespaces Recommendation [XML Names]. The namespace URI of the attribute's name is null if the QName of the attribute does not have a prefix. String-value: The normalized value as specified by the XML Recommendation [XML]. An attribute whose normalized value is a zero-length string is not treated specially: it results in an attribute node whose string-value is a zero-length string. |
Namespace | Each element has an associated set of namespace nodes, one for each distinct namespace prefix that is in scope for the element and one for the default namespace if one is in scope for the element. The element is the parent of each of these namespace nodes; however, a namespace node is not a child of its parent element. An element has a namespace node for:
Expanded-name: The local part is the namespace prefix (empty if the namespace node is for the default namespace); the namespace URI is always null. String-value: The namespace URI that is being bound to the namespace prefix. If it is relative, it must be resolved just like a namespace URI in an expanded-name. |
Processing Instruction | There is a processing instruction node for every processing instruction, except for one that occurs within the document type declaration. Expanded-name: The local part is the processing instruction's target; the namespace URI is null. String-value: The part of the processing instruction following the target and any whitespace. It does not include the terminating ?>. |
Comment | One for every comment, except for comments occurring within the document type declaration. Expanded-name: N/A String-value: Content of the comment not including the opening <!-- or the closing -->. |
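As a rough illustration of how an element's string-value is computed, the following Python sketch concatenates all descendant text in document order. It relies on the standard library module xml.etree.ElementTree, which is an assumption of this example and not part of Nexaweb; the sample markup and the helper name string_value are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample element, used only for illustration.
para = ET.fromstring('<para type="warning">Hello, <b>XPath</b> world!</para>')


def string_value(element):
    """Concatenate every descendant text node in document order."""
    return "".join(element.itertext())


element_value = string_value(para)   # text of para plus text inside b
attribute_value = para.get("type")   # an attribute's string-value is its value

print(element_value)
print(attribute_value)
```

Note that ElementTree does not model attributes, text, or the root as separate node objects, so this only approximates the XPath data model.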
Document Order
The tree includes a document order defined on all the nodes in the document as follows:
- The root node is the first node.
- Element nodes occur in order of their start-tag in the XML (after expansion of entities).
- Element nodes occur before their children.
- Attribute and namespace nodes of an element occur before the children of the element.
- Namespace nodes occur before attribute nodes.
- The relative order of namespace nodes among themselves, and of attribute nodes among themselves, is implementation-dependent.
Root nodes and element nodes have an ordered list of child nodes as follows:
- Nodes never share children.
- Every node other than the root node has exactly one parent, which is either an element node or the root node.
- A root node or an element node is the parent of each of its child nodes.
- Descendants of a node are the children of the node and the descendants of the children of the node.
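The ordering and parent rules above can be observed with a small Python sketch. It assumes the standard library module xml.etree.ElementTree, which only models element nodes (there is no separate root node, and attributes are not nodes), so it illustrates the element part of document order only. The sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring(
    "<doc><chapter><title/><para/></chapter><appendix><para/></appendix></doc>"
)

# Element.iter() walks the tree depth-first, so every element is visited
# before its children and in the order of the start-tags.
document_order = [element.tag for element in doc.iter()]
print(document_order)

# Child -> parent map: every element except the top one has exactly one parent.
parent_of = {child: parent for parent in doc.iter() for child in parent}
print(len(parent_of))
```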
Location Paths
Path Type | Description |
Absolute location path | Consists of / optionally followed by a relative location path. A / by itself selects the root node of the document containing the context node. If it is followed by a relative location path, then the location path selects the set of nodes that would be selected by the relative location path relative to the root node of the document containing the context node. |
Relative location path | Consists of a sequence of one or more location steps separated by /. The steps in a relative location path are composed together from left to right. Each step in turn selects a set of nodes relative to a context node. The initial sequence of steps selects a set of nodes relative to a context node. Each node in that set is used as a context node for the following step. The sets of nodes identified by that step are unioned together. The set of nodes identified by the composition of the steps is this union. For example, child::div/child::para selects the para element children of the div element children of the context node, or, in other words, the para element grandchildren that have div parents. |
A location step has three parts:
- An axis, which specifies the tree relationship between the nodes selected by the location step and the context node.
- A node test, which specifies the node type and expanded-name of the nodes selected by the location step.
- Zero or more predicates, which use arbitrary expressions to further refine the set of nodes selected by the location step.
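The way steps in a relative location path compose can be sketched in Python. The example evaluates child::div/child::para one step at a time and compares the result with evaluating the abbreviated path div/para in a single call. The use of xml.etree.ElementTree (which supports only a subset of XPath) and the sample markup are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical context node with div and para children.
context = ET.fromstring(
    "<body>"
    "<div><para>a</para><para>b</para></div>"
    "<div><para>c</para></div>"
    "<para>d</para>"
    "</body>"
)

# Step 1: child::div relative to the context node.
divs = context.findall("div")

# Step 2: child::para relative to each div, with the per-div results unioned.
composed = []
for div in divs:
    composed.extend(div.findall("para"))

# The whole path evaluated in one call.
direct = context.findall("div/para")

composed_text = [para.text for para in composed]
print(composed_text)
```

The para element with text d is not selected, because its parent is the context node and not a div.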
Location Path Abbreviated Syntax
Syntax | Selects |
para | The para element children of the context node. |
* | All element children of the context node. |
text() | All text node children of the context node. |
@name | The name attribute of the context node. |
@* | All the attributes of the context node. |
para[1] | The first para child of the context node. |
para[last()] | The last para child of the context node. |
*/para | All para grandchildren of the context node. |
/doc/chapter[5]/section[2] | The second section of the fifth chapter of the doc. |
chapter//para | The para element descendants of the chapter element children of the context node. |
//para | All the para descendants of the document root, and thus all para elements in the same document as the context node. |
.. | The parent of the context node. |
../@lang | The lang attribute of the parent of the context node. |
para[@type="warning"] | All para children of the context node that have a type attribute with value warning. |
para[@type="warning"][5] | The fifth para child of the context node that has a type attribute with value warning. |
para[5][@type="warning"] | The fifth para child of the context node if that child has a type attribute with value warning. |
chapter[title="Introduction"] | The chapter children of the context node that have one or more title children with string-value equal to Introduction. |
chapter[title] | The chapter children of the context node that have one or more title children. |
employee[@secretary and @assistant] | All the employee children of the context node that have both a secretary attribute and an assistant attribute. |
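Several of the abbreviated paths in the table can be tried out with a short Python sketch. It assumes the standard library module xml.etree.ElementTree, which implements only a subset of XPath (for example, it does not evaluate text() or @name as selections), so only paths inside that subset are shown. The sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring(
    "<doc>"
    "<chapter><title>Introduction</title>"
    "<para type='warning'>w1</para><para>p2</para></chapter>"
    "<chapter><para>p3</para></chapter>"
    "</doc>"
)
first_chapter = doc.find("chapter")


def texts(elements):
    """Return the text of each selected element."""
    return [element.text for element in elements]


results = {
    # Context node: doc
    "*": [e.tag for e in doc.findall("*")],
    "*/para": texts(doc.findall("*/para")),
    ".//para": texts(doc.findall(".//para")),
    "chapter[title]": len(doc.findall("chapter[title]")),
    "chapter[title='Introduction']": len(
        doc.findall("chapter[title='Introduction']")
    ),
    # Context node: the first chapter
    "para[1]": texts(first_chapter.findall("para[1]")),
    "para[last()]": texts(first_chapter.findall("para[last()]")),
    "para[@type='warning']": texts(
        first_chapter.findall("para[@type='warning']")
    ),
}

for path, value in results.items():
    print(path, "->", value)
```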
General Abbreviation Guidelines
Syntax Element | Description |
child:: | Is effectively the default axis, so you can omit it from a location step. For example: div/para = child::div/child::para |
@ | Short for: attribute:: For example: para[@type="warning"] is short for child::para[attribute::type="warning"]. Each selects para children with a type attribute with value equal to warning. |
// | Short for: /descendant-or-self::node()/ For example: //para = /descendant-or-self::node()/child::para Each selects every para element in the document (even a para element that is the document element). |
div//para | Short for: div/descendant-or-self::node()/child::para. Each selects all para descendants of div children. |
. | Short for: self::node(). Useful in conjunction with //. For example: .//para = self::node()/descendant-or-self::node()/child::para Each selects all para descendant elements of the context node. |
.. | Short for: parent::node() For example: ../title = parent::node()/child::title Each selects the title children of the parent of the context node. |
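To see why .//para and its expanded form select the same nodes, the following Python sketch evaluates both. It assumes xml.etree.ElementTree from the standard library, which does not accept the unabbreviated axis syntax, so the expanded form is emulated step by step with Element.iter() and findall(). The sample markup is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical context node.
context = ET.fromstring(
    "<div><para>a<para>nested</para></para><note><para>b</para></note></div>"
)

# Abbreviated form: .//para
abbreviated = context.findall(".//para")

# Expanded form, emulated one step at a time:
#   self::node()/descendant-or-self::node()  -> context.iter()
#   child::para                              -> node.findall("para")
expanded = [para for node in context.iter() for para in node.findall("para")]

# Node-sets are unordered, so the two results are compared as sets.
same_nodes = set(abbreviated) == set(expanded)
print(same_nodes, len(abbreviated))
```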
Axes
This section lists and provides descriptions of the axis types and axes available for use in specifying a path.
Axis Types
Type | Description |
Forward Axis | Only ever contains the context node or nodes that are after it in document order. |
Reverse Axis | Only ever contains the context node or nodes before it in document order. |
List of Axes
Axis | Description | Type |
ancestor | Contains the ancestors of the context node, which consist of the parent of the context node and the parent's parent and so on. Thus, the ancestor axis always includes the root node, unless the context node is the root node. | Reverse |
ancestor-or-self | Contains the context node and the ancestors of the context node. Thus, the ancestor-or-self axis always includes the root node. | Reverse |
attribute | Contains the attributes of the context node. This axis will be empty unless the context node is an element. | Forward |
child | Contains the children of the context node. | Forward |
descendant | Contains the descendants of the context node. Since a descendant is a child or a child of a child and so on, the descendant axis never contains attribute or namespace nodes. | Forward |
descendant-or-self | Contains the context node and the descendants of the context node. | Forward |
following | Contains all nodes in the same document as the context node that are after the context node in document order, excluding any descendants and excluding attribute nodes and namespace nodes. | Forward |
following-sibling | Contains all the following siblings of the context node. If the context node is an attribute node or namespace node, the following-sibling axis is empty. | Forward |
namespace | Contains the namespace nodes of the context node. This axis will be empty unless the context node is an element. | Forward |
parent | Contains the parent of the context node, if one exists. | Forward |
preceding | Contains all nodes in the same document as the context node that are before the context node in document order, excluding any ancestors and excluding attribute nodes and namespace nodes. | Reverse |
preceding-sibling | Contains all the preceding siblings of the context node. If the context node is an attribute node or namespace node, the preceding-sibling axis is empty. | Reverse |
self | Contains only the context node itself. | Forward |
Every axis has a principal node type. If an axis can contain elements, then the principal node type is element; otherwise, it is the type of the nodes that the axis can contain.
Axis | Principal Node Type |
attribute | attribute |
namespace | namespace |
All other axes | element |
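The difference between forward and reverse axes can be sketched in Python by emulating a few axes by hand. This is only an approximation under stated assumptions: it uses xml.etree.ElementTree, which models element nodes only, so the emulated ancestor axis stops at the document element instead of a separate root node. The helper names and sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring("<doc><a><b/><c/><d/></a><e/></doc>")
parent_of = {child: parent for parent in doc.iter() for child in parent}


def ancestors(node):
    """Emulate the ancestor axis (reverse): nearest ancestor first."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        found.append(node)
    return found


def following_siblings(node):
    """Emulate the following-sibling axis (forward): document order."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]


def preceding_siblings(node):
    """Emulate the preceding-sibling axis (reverse): nearest sibling first."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[:siblings.index(node)][::-1]


c = doc.find("a/c")
d = doc.find("a/d")
ancestor_tags = [n.tag for n in ancestors(c)]
following_tags = [n.tag for n in following_siblings(c)]
preceding_tags = [n.tag for n in preceding_siblings(d)]
print(ancestor_tags, following_tags, preceding_tags)
```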
Node Test
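A node test specifies the node type and expanded-name of the nodes selected by the location step.
The node-set selected by a location step is built by generating an initial node-set from the axis and node test and then filtering it by each predicate in turn: the initial node-set is filtered by the first predicate to generate a new node-set; this new node-set is then filtered using the second predicate, and so on. The final node-set is the node-set selected by the location step.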
A node test * is true for any node of the principal node type.
Examples:
Node Test | Description |
child::* | Selects all element children of the context node. |
attribute::* | Selects all attributes of the context node. |
The node test text() is true for any text node.
Examples:
Node Test | Description |
child::text() | Selects the text node children of the context node. |
attribute::text() | Selects nothing: the attribute axis contains only attribute nodes, and text() is true only for text nodes. |
The node test comment() is true for any comment node.
The node test processing-instruction() is true for any processing instruction. This node test may take an argument that is a Literal; if so, it is true for any processing instruction whose name is equal to the value of the Literal.
Predicates
Proximity Positions:
Position | Description |
Forward axis | Position of node within node set in document order. |
Reverse Axis | Position of node within node set in reverse document order. |
- The first position is 1.
- para[3] is equivalent to para[position()=3].
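Because predicates filter one after another and positions are recomputed after each filter, the order of predicates matters. The following Python sketch emulates para[@type="warning"][2] and para[2][@type="warning"] with plain list operations rather than an XPath engine, so that each filtering pass is visible. The helper names and sample document are hypothetical, and xml.etree.ElementTree is used only to parse the markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical context node with three para children.
context = ET.fromstring(
    "<doc>"
    "<para type='warning'>w1</para>"
    "<para>p2</para>"
    "<para type='warning'>w3</para>"
    "</doc>"
)
paras = context.findall("para")  # initial node-set: child::para


def has_warning(para):
    """Emulate the predicate [@type="warning"]."""
    return para.get("type") == "warning"


def at_position(nodes, n):
    """Keep only the node whose 1-based proximity position is n."""
    return nodes[n - 1:n]


# para[@type="warning"][2]: filter by attribute first, then take position 2.
warning_then_second = at_position([p for p in paras if has_warning(p)], 2)

# para[2][@type="warning"]: take position 2 first, then filter by attribute.
second_then_warning = [p for p in at_position(paras, 2) if has_warning(p)]

print([p.text for p in warning_then_second])
print([p.text for p in second_then_warning])
```

The first expression selects the second warning paragraph, while the second selects nothing because the second para child has no type attribute.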
Functions
This section lists the core function library that all XPath implementations support. Each function takes zero or more arguments and returns a single result.
For a function in the core function library, arguments and result are of the four basic types:
- node-set (an unordered collection of nodes without duplicates)
- boolean (true or false)
- number (a floating-point number)
- string (a sequence of UCS characters)
Function Type | Function | Syntax | Description |
Node Set | last | number last() | Returns a number equal to the context size from the expression evaluation context. |
Node Set | position | number position() | Returns a number equal to the context position from the expression evaluation context. |
Node Set | count | number count(node-set) | Returns the number of nodes in the argument node-set. |
Node Set | id | node-set id(object) | Selects elements by their unique ID. For example: |
Node Set | local-name | string local-name(node-set?) | Returns the local part of the expanded-name of the node in the argument node-set that is first in document order. |
Node Set | namespace-uri | string namespace-uri(node-set?) | Returns the namespace URI of the expanded-name of the node in the argument node-set that is first in document order. |
Node Set | name | string name(node-set?) | Returns a string containing a qualified name (QName) representing the expanded-name of the node in the argument node-set that is first in document order. |
String | string | string string(object?) | Converts an object to a string. |
String | concat | string concat(string, string, string*) | Returns the concatenation of its arguments. |
String | starts-with | boolean starts-with(string, string) | Returns true if the first argument string starts with the second argument string, and otherwise returns false. |
String | contains | boolean contains(string, string) | Returns true if the first argument string contains the second argument string, and otherwise returns false. |
String | substring-before | string substring-before(string, string) | Returns the substring of the first argument string that precedes the first occurrence of the second argument string in the first argument string, or the empty string if the first argument string does not contain the second argument string. For example, substring-before("1999/04/01","/") returns 1999. |
String | substring-after | string substring-after(string, string) | Returns the substring of the first argument string that follows the first occurrence of the second argument string in the first argument string, or the empty string if the first argument string does not contain the second argument string. For example, substring-after("1999/04/01","/") returns 04/01, and substring-after("1999/04/01","19") returns 99/04/01. |
String | substring | string substring(string, number, number?) | Returns the substring of the first argument starting at the position specified in the second argument with length specified in the third argument. For example, substring("12345",2,3) returns "234". If the third argument is not specified, it returns the substring starting at the position specified in the second argument and continuing to the end of the string. For example, substring("12345",2) returns "2345". |
String | string-length | number string-length(string?) | Returns the number of characters in the string. |
String | normalize-space | string normalize-space(string?) | Returns the argument string with whitespace normalized by stripping leading and trailing whitespace and replacing sequences of whitespace characters by a single space. |
String | translate | string translate(string, string, string) | Returns the first argument string with occurrences of characters in the second argument string replaced by the character at the corresponding position in the third argument string. For example, translate("bar","abc","ABC") returns the string BAr. |
Boolean | boolean | boolean boolean(object) | Converts its argument to a boolean as follows: |
Boolean | not | boolean not(boolean) | Returns true if its argument is false, and false otherwise. |
Boolean | true | boolean true() | Returns true. |
Boolean | false | boolean false() | Returns false. |
Boolean | lang | boolean lang(string) | Returns true or false depending on whether the language of the context node as specified by xml:lang attributes is the same as or is a sublanguage of the language specified by the argument string. |
Number | number | number number(object?) | Converts its argument to a number as follows: |
Number | sum | number sum(node-set) | Returns the sum, for each node in the argument node-set, of the result of converting the string-value of the node to a number. |
Number | floor | number floor(number) | Returns the largest (closest to positive infinity) number that is not greater than the argument and that is an integer. |
Number | ceiling | number ceiling(number) | Returns the smallest (closest to negative infinity) number that is not less than the argument and that is an integer. |
Number | round | number round(number) | Returns the number that is closest to the argument and that is an integer. If there are two such numbers, then the one that is closest to positive infinity is returned. If the argument is: |
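The behavior of some of the string functions can be approximated in Python to check the examples from the table. This is a minimal sketch, not an XPath implementation: the function names are hypothetical Python analogues, and substring here assumes whole-number arguments and ignores the rounding and NaN rules of the full specification.

```python
import re


def substring_before(text, marker):
    """Text before the first occurrence of marker, or "" if it is absent."""
    if not marker:
        return ""
    head, found, _ = text.partition(marker)
    return head if found else ""


def substring_after(text, marker):
    """Text after the first occurrence of marker, or "" if it is absent."""
    if not marker:
        return text
    _, found, tail = text.partition(marker)
    return tail if found else ""


def substring(text, start, length=None):
    """Substring using 1-based positions (whole-number arguments assumed)."""
    begin = max(start - 1, 0)
    if length is None:
        return text[begin:]
    end = max(start - 1 + length, 0)
    return text[begin:end]


def translate(text, source, target):
    """Replace each character of source with the one at the same position in
    target; characters of source with no counterpart are removed."""
    table = {}
    for index, char in enumerate(source):
        if ord(char) not in table:  # the first occurrence in source wins
            table[ord(char)] = target[index] if index < len(target) else None
    return text.translate(table)


def normalize_space(text):
    """Strip leading and trailing whitespace and collapse inner runs."""
    return " ".join(re.findall(r"[^ \t\r\n]+", text))


print(substring_before("1999/04/01", "/"))
print(substring_after("1999/04/01", "/"))
print(substring("12345", 2, 3))
print(translate("bar", "abc", "ABC"))
print(normalize_space("  two   words \n"))
```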