XPath

You use XPath with XModify to address parts of a document. XPath is important in Nexaweb application development for XModify statements, XModify server responses, Macros and Document Data Binding.

This document provides a reference to some of the XPath syntax for your convenience. For complete information about XPath, see the XPath Specification.

Overview

The primary purpose of XPath is to address parts of an XML document. XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax. XPath gets its name from its use of a path notation as in URLs for navigating through the hierarchical structure of an XML document.

Node Tree

XPath operates on an XML document as a tree.

The tree contains the following types of nodes:

  • Root

  • Element

  • Text

  • Attribute

  • Namespace

  • Processing instruction

  • Comment

For every type of node, there is a way of determining a string-value for it. For some node types, the string-value is part of the node; for others, the string-value is computed from the string-value of descendant nodes.

Some node types also have an expanded-name, which is a pair consisting of a local part and a namespace URI.

Node Type

Description

Root

The root of the tree only. The element node for the document element is a child of the root node.

Also has as children the processing instruction and comment nodes for processing instructions and comments that occur in the prolog and after the end of the document element.

Expanded-name: N/A

String-value: The concatenation of the string-values of all text node descendants of the root node in document order.

Element

One for every element in the document.

Expanded-name: Computed by expanding the qualified name (QName) of the element specified in the tag in accordance with the XML Namespaces Recommendation [XML Names]. The namespace URI of the element's expanded-name will be null if the QName has no prefix and there is no applicable default namespace.

String-value: The concatenation of the string-values of all text node descendants of the element node in document order.

Attribute

Each element node has an associated set of attribute nodes; the element is the parent of each of these attribute nodes; however, an attribute node is not a child of its parent element.

Expanded-name: Computed by expanding the QName specified in the tag in the XML document in accordance with the XML Namespaces Recommendation [XML Names]. The namespace URI of the attribute's name is null if the QName of the attribute does not have a prefix.

String-value: The normalized value as specified by the XML Recommendation [XML]. An attribute whose normalized value is a zero-length string is not treated specially: it results in an attribute node whose string-value is a zero-length string.

Namespace

Each element has an associated set of namespace nodes, one for each distinct namespace prefix that is in scope for the element and one for the default namespace if one is in scope for the element. The element is the parent of each of these namespace nodes; however, a namespace node is not a child of its parent element. An element has a namespace node for:

  • Every attribute on the element whose name starts with xmlns:

  • Every attribute on an ancestor element whose name starts with xmlns:, unless the element itself or a nearer ancestor redeclares the prefix

  • An xmlns attribute, if the element or some ancestor has an xmlns attribute, and the value of the xmlns attribute for the nearest such element is non-empty

Expanded-name: The local part is the namespace prefix (empty if the namespace node is for the default namespace); the namespace URI is always null.

String-value: The namespace URI that is being bound to the namespace prefix. If it is relative, it must be resolved just like a namespace URI in an expanded-name.

Processing Instruction

There is a processing instruction node for every processing instruction, except for one that occurs within the document type declaration.

Expanded-name: The local part is the processing instruction's target; the namespace URI is null.

String-value: The part of the processing instruction following the target and any whitespace. It does not include the terminating ?>.

Comment

One for every comment, except for comments occurring within the document type declaration.

Expanded-name: N/A

String-value: Content of the comment not including the opening <!-- or the closing -->.
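
As a rough illustration of the string-value rule for root and element nodes, the following sketch uses Python's standard xml.etree.ElementTree module rather than any Nexaweb API. The sample document and the string_value helper are hypothetical names introduced only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the reference above.
doc = ET.fromstring(
    "<doc><title>XPath</title><para>Node <b>tree</b> basics</para></doc>"
)

def string_value(element):
    """Concatenate the text of all descendant text nodes in document order."""
    return "".join(element.itertext())

para_value = string_value(doc.find("para"))
doc_value = string_value(doc)

print(para_value)  # Node tree basics
print(doc_value)   # XPathNode tree basics
```

The nested b element contributes its text in place, which is why the string-value of para reads as one continuous string.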

 

Document Order

The tree includes a document order defined on all the nodes in the document as follows:

  • The root node is the first node.

  • Element nodes occur in order of their start-tag in the XML (after expansion of entities).

  • Element nodes occur before their children.

  • Attribute and namespace nodes of an element occur before the children of the element.

  • Namespace nodes occur before attribute nodes.

  • The relative order of namespace nodes among themselves is implementation-dependent, as is the relative order of attribute nodes among themselves.

Root nodes and element nodes have an ordered list of child nodes as follows:

  • Nodes never share children.

  • Every node other than the root node has exactly one parent, which is either an element node or the root node.

  • A root node or an element node is the parent of each of its child nodes.

  • Descendants of a node are the children of the node and the descendants of the children of the node. 
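
The element part of document order can be observed with Python's standard xml.etree.ElementTree module, whose Element.iter() method walks elements in document order. This is only a sketch: ElementTree does not model attribute or namespace nodes, and it has no root node separate from the document element. The sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document used only to show the traversal order.
doc = ET.fromstring(
    "<doc>"
    "<chapter><title/><para/></chapter>"
    "<appendix><para/></appendix>"
    "</doc>"
)

# Each element appears before its children, in start-tag order.
order = [element.tag for element in doc.iter()]
print(order)
# ['doc', 'chapter', 'title', 'para', 'appendix', 'para']
```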
     

Location Paths 

Path Type

Description

Absolute location path

Consists of / optionally followed by a relative location path. A / by itself selects the root node of the document containing the context node. If it is followed by a relative location path, then the location path selects the set of nodes that would be selected by the relative location path relative to the root node of the document containing the context node.

Relative location path

Consists of a sequence of one or more location steps separated by /.

The steps in a relative location path are composed together from left to right. Each step in turn selects a set of nodes relative to a context node.

The initial sequence of steps selects a set of nodes relative to a context node. Each node in that set is used as a context node for the following step. The sets of nodes identified by that step are unioned together. The set of nodes identified by the composition of the steps is this union. For example, child::div/child::para selects the para element children of the div element children of the context node, or, in other words, the para element grandchildren that have div parents.

A location step has three parts:

  • An axis, which specifies the tree relationship between the nodes selected by the location step and the context node.

  • A node test, which specifies the node type and expanded-name of the nodes selected by the location step.

  • Zero or more predicates, which use arbitrary expressions to further refine the set of nodes selected by the location step.
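
The left-to-right composition of steps can be sketched in Python. The child_step helper below is a hypothetical stand-in for a single child:: step: it applies the step to every context node and unions the results. The sample document is also hypothetical, and the final comparison uses the path support in the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; body acts as the context node.
doc = ET.fromstring(
    "<body>"
    "<div><para>a</para><para>b</para></div>"
    "<section><para>c</para></section>"
    "<div><para>d</para></div>"
    "</body>"
)

def child_step(context_nodes, name):
    """Apply child::name to each context node and union the results."""
    selected = []
    for node in context_nodes:
        for child in node:
            if child.tag == name and child not in selected:
                selected.append(child)
    return selected

# child::div/child::para, evaluated one step at a time.
divs = child_step([doc], "div")
paras = child_step(divs, "para")

composed = [p.text for p in paras]
direct = [p.text for p in doc.findall("div/para")]
print(composed)  # ['a', 'b', 'd']
print(direct)    # ['a', 'b', 'd']
```

The para inside section is skipped because its parent is not one of the nodes selected by the first step.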

Location Path Abbreviated Syntax

Syntax

Selects

para

The para element children of the context node.

*

All element children of the context node.

text()

All text node children of the context node.

@name

The name attribute of the context node.

@*

All the attributes of the context node.

para[1]

The first para child of the context node.

para[last()]

The last para child of the context node.

*/para

All para grandchildren of the context node.

/doc/chapter[5]/section[2]

The second section of the fifth chapter of the doc.

chapter//para

The para element descendants of the chapter element children of the context node.

//para

All the para descendants of the document root, and thus all para elements in the same document as the context node.

..

The parent of the context node.

../@lang

The lang attribute of the parent of the context node.

para[@type="warning"]

All para children of the context node that have a type attribute with value warning.

para[@type="warning"][5]

The fifth para child of the context node that has a type attribute with value warning.

para[5][@type="warning"]

The fifth para child of the context node if that child has a type attribute with value warning.

chapter[title="Introduction"]

The chapter children of the context node that have one or more title children with string-value equal to Introduction.

chapter[title]

The chapter children of the context node that have one or more title children.

employee[@secretary and @assistant]

All the employee children of the context node that have both a secretary attribute and an assistant attribute.
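
Several of these abbreviated paths can be tried with Python's standard xml.etree.ElementTree module. This is a sketch under stated assumptions: ElementTree implements only a subset of XPath (for example, it does not accept and inside a predicate, and an Element cannot evaluate an absolute path such as //para, so .//para is used instead). The sample document and the texts helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the doc element acts as the context node.
doc = ET.fromstring(
    "<doc>"
    "<chapter><title>Introduction</title>"
    "<para>one</para><para type='warning'>two</para></chapter>"
    "<chapter><para>three</para></chapter>"
    "<para>four</para>"
    "</doc>"
)

def texts(path):
    """Return the text of every element the path selects, in order."""
    return [element.text for element in doc.findall(path)]

results = {
    "para": texts("para"),
    "*/para": texts("*/para"),
    ".//para": texts(".//para"),
    "chapter/para[1]": texts("chapter/para[1]"),
    "chapter/para[last()]": texts("chapter/para[last()]"),
    'chapter/para[@type="warning"]': texts('chapter/para[@type="warning"]'),
}
chapters_with_title = len(doc.findall("chapter[title]"))
intro_chapters = len(doc.findall('chapter[title="Introduction"]'))

for path, selected in results.items():
    print(path, "->", selected)
```

Note that para selects only the direct child ("four"), while .//para reaches every para descendant of the context node.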

 

General Abbreviation Guidelines

 

Syntax Element

Description

child::

Is effectively the default axis; therefore you can omit it from a location step.

For example: div/para is short for child::div/child::para.

@

Short for: attribute::

For example: para[@type="warning"] is short for child::para[attribute::type="warning"].

Each selects para children with a type attribute with value equal to warning.

//

Short for: /descendant-or-self::node()/

For example: //para = /descendant-or-self::node()/child::para

Each selects every para element in the document (even a para element that is the document element).

div//para

Short for: div/descendant-or-self::node()/child::para.

Each selects all para descendants of div children.

.

Short for: self::node()

Useful in conjunction with //.

For example: .//para = self::node()/descendant-or-self::node()/child::para

Each selects all para descendant elements of the context node.

..

Short for: parent::node()

For example: ../title = parent::node()/child::title

Each selects the title children of the parent of the context node.

Axes

This section lists and provides descriptions of the axis types and axes available for use in specifying a path.

Axis Types

 

Type

Description

Forward Axis

Only ever contains the context node or nodes that are after it in document order.

Reverse Axis

Only ever contains the context node or nodes before it in document order.

List of Axes 

 

Axis

Description

Type

ancestor

Contains the ancestors of the context node, which consist of the parent of the context node, the parent's parent, and so on. Thus, the ancestor axis always includes the root node, unless the context node is the root node.

Reverse

ancestor-or-self

Contains the context node and the ancestors of the context node. Thus, the ancestor-or-self axis always includes the root node.

Reverse

attribute

Contains the attributes of the context node. This axis will be empty unless the context node is an element.

Forward

child

Contains the children of the context node.

Forward

descendant

Contains the descendants of the context node. Since a descendant is a child or a child of a child and so on, the descendant axis never contains attribute or namespace nodes.

Forward

descendant-or-self

Contains the context node and the descendants of the context node.

Forward

following

Contains all nodes in the same document as the context node that are after the context node in document order, excluding any descendants and excluding attribute nodes and namespace nodes.

Forward

following-sibling

Contains all the following siblings of the context node. If the context node is an attribute node or namespace node, the following-sibling axis is empty.

Forward

namespace

Contains the namespace nodes of the context node. This axis is empty unless the context node is an element.

Forward

parent

Contains the parent of the context node, if one exists.

 Forward

preceding

Contains all nodes in the same document as the context node that are before the context node in document order, excluding any ancestors and excluding attribute nodes and namespace nodes.

 Reverse

preceding-sibling

Contains all the preceding siblings of the context node. If the context node is an attribute node or namespace node, the preceding-sibling axis is empty.

Reverse

self

Contains only the context node itself.

Forward
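
To make the forward and reverse distinction concrete, the sketch below builds three of these axes by hand in Python on top of the standard xml.etree.ElementTree module. The helper names (ancestors, following_siblings, preceding_siblings), the parent_of map, and the sample document are hypothetical. ElementTree has no separate root node, so the ancestor list stops at the document element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring("<doc><a/><b><c/></b><d/><e/></doc>")

# Map every element to its parent; the document element has no entry.
parent_of = {child: parent for parent in doc.iter() for child in parent}

def ancestors(node):
    """ancestor axis: nearest ancestor first (reverse document order)."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        found.append(node)
    return found

def following_siblings(node):
    """following-sibling axis: document order."""
    if node not in parent_of:
        return []
    siblings = list(parent_of[node])
    return siblings[siblings.index(node) + 1:]

def preceding_siblings(node):
    """preceding-sibling axis: nearest sibling first (reverse document order)."""
    if node not in parent_of:
        return []
    siblings = list(parent_of[node])
    return siblings[:siblings.index(node)][::-1]

ancestor_tags = [n.tag for n in ancestors(doc.find("b/c"))]
following_tags = [n.tag for n in following_siblings(doc.find("b"))]
preceding_tags = [n.tag for n in preceding_siblings(doc.find("e"))]

print(ancestor_tags)   # ['b', 'doc']
print(following_tags)  # ['d', 'e']
print(preceding_tags)  # ['d', 'b', 'a']
```

Listing the reverse axes nearest-first mirrors how positions are counted along them: the first node on the preceding-sibling axis of e is d, the closest sibling before it.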

Every axis has a principal node type. If an axis can contain elements, then the principal node type is element; otherwise, it is the type of the nodes that the axis can contain.

 

Axis

Principal Node Type

attribute

attribute

namespace

namespace

All other axes

element

Node Test

Specifies the node type and expanded-name of the nodes selected by the location step.

The axis and node test generate an initial node-set, which is then filtered by each predicate in turn: the first predicate produces a new node-set, the second predicate filters that node-set, and so on. The final node-set is the node-set selected by the location step.

A node test * is true for any node of the principal node type.

Examples:

Node Test

Description

child::*

Selects all element children of the context node.

attribute::*

Selects all attributes of the context node.

The node test text() is true for any text node.

Examples:

Node Test

Description

child::text()

Selects the text node children of the context node.

attribute::text()

Selects nothing, because the attribute axis contains only attribute nodes and text() is true only for text nodes.

The node test comment() is true for any comment node.

The node test processing-instruction() is true for any processing instruction. This node test may have an argument that is a Literal; if so, it is true for any processing instruction whose name is equal to the value of the Literal.

Predicates

Proximity Positions:

Position

Description

Forward axis

Position of node within node set in document order.

Reverse Axis

Position of node within node set in reverse document order.

  • The first position is 1.

  • para[3] is equivalent to para[position()=3].
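
Because positions start at 1 and each predicate filters the node-set left by the previous one, the order of predicates matters. The following Python sketch models predicates as plain list filters; it is not an XPath engine, and the sample data and the by_type and by_position helpers are hypothetical.

```python
# Hypothetical para children of a context node, in document order.
paras = [
    {"id": "p1", "type": "note"},
    {"id": "p2", "type": "warning"},
    {"id": "p3", "type": "warning"},
    {"id": "p4", "type": "note"},
]

def by_type(nodes, value):
    """Model of [@type="value"]: keep nodes whose type attribute matches."""
    return [node for node in nodes if node.get("type") == value]

def by_position(nodes, position):
    """Model of [position]: keep the node at the 1-based position, if any."""
    return nodes[position - 1:position]

def ids(nodes):
    return [node["id"] for node in nodes]

# para[@type="warning"][2]: filter by type, then take the second survivor.
filter_then_index = ids(by_position(by_type(paras, "warning"), 2))

# para[2][@type="warning"]: take the second para, then test its type.
index_then_filter = ids(by_type(by_position(paras, 2), "warning"))

# para[1][@type="warning"]: the first para is a note, so nothing is selected.
first_if_warning = ids(by_type(by_position(paras, 1), "warning"))

print(filter_then_index)  # ['p3']
print(index_then_filter)  # ['p2']
print(first_if_warning)   # []
```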

Functions

This section lists the core function library that all XPath implementations support. Each function takes zero or more arguments and returns a single result.

For a function in the core function library, arguments and result are of the four basic types:

  • node-set (an unordered collection of nodes without duplicates)

  • boolean (true or false)

  • number (a floating-point number)

  • string (a sequence of UCS characters)

Function Type

Function

Syntax

Description

Node Set

last

number last()

Returns a number equal to the context size from the expression evaluation context.

 

position

number position()

Returns a number equal to the context position from the expression evaluation context.

 

count

number count(node-set)

Returns the number of nodes in the argument node-set.

 

id

node-set id(object)

Selects elements by their unique ID.

For example:
id("foo") selects the element with unique ID foo
id("foo")/child::para[position()=5] selects the fifth para child of the element with unique ID foo

 

local-name

string local-name(node-set?)

Returns the local part of the expanded-name of the node in the argument node-set that is first in document order.

 

namespace-uri

string namespace-uri(node-set?)

Returns the namespace URI of the expanded-name of the node in the argument node-set that is first in document order. 

 

name

string name(node-set?)

Returns a string containing a qualified name (QName) representing the expanded-name of the node in the argument node-set that is first in document order.

String

string

string string(object?)

Converts an object to a string.

 

concat

string concat(string, string, string*)

Returns the concatenation of its arguments.

 

starts-with

boolean starts-with(string, string)

Returns true if the first argument string starts with the second argument string, and otherwise returns false.

 

contains

boolean contains(string, string)

Returns true if the first argument string contains the second argument string, and otherwise returns false.

 

substring-before

string substring-before(string, string)

Returns the substring of the first argument string that precedes the first occurrence of the second argument string in the first argument string, or the empty string if the first argument string does not contain the second argument string. For example, substring-before("1999/04/01","/") returns 1999.

 

substring-after

string substring-after(string, string)

Returns the substring of the first argument string that follows the first occurrence of the second argument string in the first argument string, or the empty string if the first argument string does not contain the second argument string. For example, substring-after("1999/04/01","/") returns 04/01, and substring-after("1999/04/01","19") returns 99/04/01.

 

substring

string substring(string, number, number?)

Returns the substring of the first argument starting at the position specified in the second argument with length specified in the third argument. For example, substring("12345",2,3) returns "234". If the third argument is not specified, it returns the substring starting at the position specified in the second argument and continuing to the end of the string. For example, substring("12345",2) returns "2345".

 

string-length

number string-length(string?)

Returns the number of characters in the string.

 

normalize-space

string normalize-space(string?)

Returns the argument string with whitespace normalized by stripping leading and trailing whitespace and replacing sequences of whitespace characters by a single space.

 

translate

string translate(string, string, string)

Returns the first argument string with occurrences of characters in the second argument string replaced by the character at the corresponding position in the third argument string. For example, translate("bar","abc","ABC") returns the string BAr.
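
The string examples above can be checked against rough Python equivalents. These helpers are hypothetical analogs written for illustration, not an XPath implementation: substring here assumes integer arguments and skips the rounding rules of the XPath Specification.

```python
def substring_before(text, marker):
    """Part of text before the first marker, or "" if marker is absent."""
    index = text.find(marker)
    return text[:index] if index >= 0 else ""

def substring_after(text, marker):
    """Part of text after the first marker, or "" if marker is absent."""
    index = text.find(marker)
    return text[index + len(marker):] if index >= 0 else ""

def substring(text, start, length=None):
    """Characters from 1-based position start, optionally limited to length."""
    first = max(start, 1)
    end = len(text) + 1 if length is None else start + length
    return text[first - 1:max(end - 1, first - 1)]

def translate(text, source, target):
    """Replace characters found in source with the character at the same
    position in target; drop them when target has no such position."""
    table = {}
    for position, char in enumerate(source):
        if ord(char) not in table:  # the first occurrence in source wins
            table[ord(char)] = target[position] if position < len(target) else None
    return text.translate(table)

print(substring_before("1999/04/01", "/"))  # 1999
print(substring_after("1999/04/01", "/"))   # 04/01
print(substring_after("1999/04/01", "19"))  # 99/04/01
print(substring("12345", 2, 3))             # 234
print(substring("12345", 2))                # 2345
print(translate("bar", "abc", "ABC"))       # BAr
```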

Boolean