.\" Automatically generated by Pod::Man 4.10 (Pod::Simple 3.35) .\" .\" Standard preamble: .\" ======================================================================== .de Sp \" Vertical space (when we can't use .PP) .if t .sp .5v .if n .sp .. .de Vb \" Begin verbatim text .ft CW .nf .ne \\$1 .. .de Ve \" End verbatim text .ft R .fi .. .\" Set up some character translations and predefined strings. \*(-- will .\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left .\" double quote, and \*(R" will give a right double quote. \*(C+ will .\" give a nicer C++. Capital omega is used to do unbreakable dashes and .\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff, .\" nothing in troff, for use with C<>. .tr \(*W- .ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p' .ie n \{\ . ds -- \(*W- . ds PI pi . if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch . if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch . ds L" "" . ds R" "" . ds C` "" . ds C' "" 'br\} .el\{\ . ds -- \|\(em\| . ds PI \(*p . ds L" `` . ds R" '' . ds C` . ds C' 'br\} .\" .\" Escape single quotes in literal strings from groff's Unicode transform. .ie \n(.g .ds Aq \(aq .el .ds Aq ' .\" .\" If the F register is >0, we'll generate index entries on stderr for .\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index .\" entries marked with X<> in POD. Of course, you'll have to process the .\" output yourself in some meaningful fashion. .\" .\" Avoid warning from groff about undefined register 'F'. .de IX .. .nr rF 0 .if \n(.g .if rF .nr rF 1 .if (\n(rF:(\n(.g==0)) \{\ . if \nF \{\ . de IX . tm Index:\\$1\t\\n%\t"\\$2" .. . if !\nF==2 \{\ . nr % 0 . nr F 2 . \} . 
\} .\} .rr rF .\" ======================================================================== .\" .IX Title "XML::XQL::Tutorial 3pm" .TH XML::XQL::Tutorial 3pm "2019-03-01" "perl v5.28.1" "User Contributed Perl Documentation" .\" For nroff, turn off justification. Always turn off hyphenation; it makes .\" way too many mistakes in technical documents. .if n .ad l .nh .SH "NAME" XML::XQL::Tutorial \- Describes the XQL query syntax .SH "DESCRIPTION" .IX Header "DESCRIPTION" This document describes the basic features of the \s-1XML\s0 Query Language (\s-1XQL\s0). A proposal for the \s-1XML\s0 Query Language (\s-1XQL\s0) specification was submitted to the \s-1XSL\s0 Working Group in September 1998. The spec can be found at . Since it is only a proposal at this point, things may change, but it is very likely that the final version will be close to the proposal. Most of this document was copied straight from the spec. .PP See also the \s-1XML::XQL\s0 man page. .SH "INTRODUCTION" .IX Header "INTRODUCTION" \&\s-1XQL\s0 (\s-1XML\s0 Query Language) provides a natural extension to the \s-1XSL\s0 pattern language. It builds upon the capabilities \s-1XSL\s0 provides for identifying classes of nodes, by adding Boolean logic, filters, indexing into collections of nodes, and more. .PP \&\s-1XQL\s0 is designed specifically for \s-1XML\s0 documents. It is a general purpose query language, providing a single syntax that can be used for queries, addressing, and patterns. \&\s-1XQL\s0 is concise, simple, and powerful. .PP \&\s-1XQL\s0 is designed to be used in many contexts. Although it is a superset of \s-1XSL\s0 patterns, it is also applicable to providing links to nodes, for searching repositories, and for many other applications. .PP Note that the term \s-1XQL\s0 is a working term for the language described in this proposal. The authors of the proposal do not intend this term to be used permanently. 
Also, beware that another query language exists called XML-QL, which uses a syntax very similar to \s-1SQL.\s0 .PP The \s-1XML::XQL\s0 module has added functionality to the \s-1XQL\s0 spec, called \fI\s-1XQL+\s0\fR. To allow only \s-1XQL\s0 functionality as described in the spec, use the XML::XQL::Strict module. Note that the \s-1XQL\s0 spec makes the distinction between core \s-1XQL\s0 and \s-1XQL\s0 extensions. This implementation makes no distinction and the Strict module, therefore, implements everything described in the \s-1XQL\s0 spec. See the \s-1XML::XQL\s0 man page for more information about the Strict module. This tutorial will clearly indicate when referring to \s-1XQL+.\s0 .SH "XQL Patterns" .IX Header "XQL Patterns" This section describes the core \s-1XQL\s0 notation. These features should be part of every \s-1XQL\s0 implementation, and serve as the base level of functionality for its use in different technologies. .PP The basic syntax for \s-1XQL\s0 mimics the \s-1URI\s0 directory navigation syntax, but instead of specifying navigation through a physical file structure, the navigation is through elements in the \s-1XML\s0 tree. .PP For example, the following \s-1URI\s0 means find the foo.jpg file within the bar directory: .PP .Vb 1 \& bar/foo.jpg .Ve .PP Similarly, in \s-1XQL,\s0 the following means find the collection of fuz elements within baz elements: .PP .Vb 1 \& baz/fuz .Ve .PP Throughout this document you will find numerous samples. They refer to the data shown in the sample file at the end of this man page. .SH "Context" .IX Header "Context" A \fIcontext\fR is the set of nodes against which a query operates. For the entire query, which is passed to the XML::XQL::Query constructor through the \fIExpr\fR option, the context is the list of input nodes that is passed to the \fBquery()\fR method. .PP \&\s-1XQL\s0 allows a query to select between using the current context as the input context and using the 'root context' as the input context. 
The 'root context' is a context containing only the root-most element of the document. When using \s-1XML::DOM,\s0 this is the Document object. .PP By default, a query uses the current context. A query prefixed with '/' (forward slash) uses the root context. A query may explicitly state that it is using the current context by using the './' (dot, forward slash) prefix. Both of these notations are analogous to the notations used to navigate directories in a file system. .PP The './' prefix is only required in one situation. A query may use the '//' operator to indicate recursive descent. When this operator appears at the beginning of the query, the initial '/' causes the recursive descent to be performed relative to the root of the document or repository. The prefix './/' allows a query to perform a recursive descent relative to the current context. .IP "Examples:" 4 .IX Item "Examples:" Find all author elements within the current context. Since the period is really not used alone, this example forward-references other features: .Sp .Vb 1 \& ./author .Ve .Sp Note that this is equivalent to: .Sp .Vb 1 \& author .Ve .Sp Find the root element (bookstore) of this document: .Sp .Vb 1 \& /bookstore .Ve .Sp Find all author elements anywhere within the current document: .Sp .Vb 1 \& //author .Ve .Sp Find all books where the value of the style attribute on the book is equal to the value of the specialty attribute of the bookstore element at the root of the document: .Sp .Vb 1 \& book[/bookstore/@specialty = @style] .Ve .SH "Query Results" .IX Header "Query Results" The collection returned by an \s-1XQL\s0 expression preserves document order, hierarchy, and identity, to the extent that these are defined. That is, a collection of elements will always be returned in document order without repeats. Note that the spec states that the order of attributes within an element is undefined, but that this implementation does keep attributes in document order. 
See the \s-1XML::XQL\s0 man page for more details regarding \&\fIDocument Order\fR. .SH "Collections \- 'element' and '.'" .IX Header "Collections - 'element' and '.'" The collection of all elements with a certain tag name is expressed using the tag name itself. This can be qualified by showing that the elements are selected from the current context './', but the current context is assumed and often need not be noted explicitly. .IP "Examples:" 4 .IX Item "Examples:" Find all first-name elements. These examples are equivalent: .Sp .Vb 1 \& ./first\-name \& \& first\-name .Ve .Sp Find all unqualified book elements: .Sp .Vb 1 \& book .Ve .Sp Find all first.name elements: .Sp .Vb 1 \& first.name .Ve .SH "Selecting children and descendants \- '/' and '//'" .IX Header "Selecting children and descendants - '/' and '//'" The collection of elements of a certain type can be determined using the path operators ('/' or '//'). These operators take as their arguments a collection (left side) from which to query elements, and a collection indicating which elements to select (right side). The child operator ('/') selects from immediate children of the left-side collection, while the descendant operator ('//') selects from arbitrary descendants of the left-side collection. In effect, the '//' can be thought of as a substitute for one or more levels of hierarchy. Note that the path operators change the context as the query is performed. By stringing them together, users can 'drill down' into the document. .IP "Examples:" 4 .IX Item "Examples:" Find all first-name elements within an author element. 
Note that the author children of the current context are found, and then first-name children are found relative to the context of the author elements: .Sp .Vb 1 \& author/first\-name .Ve .Sp Find all title elements, one or more levels deep in the bookstore (arbitrary descendants): .Sp .Vb 1 \& bookstore//title .Ve .Sp Note that this is different from the following query, which finds all title elements that are grandchildren of bookstore elements: .Sp .Vb 1 \& bookstore/*/title .Ve .Sp Find emph elements anywhere inside book excerpts, anywhere inside the bookstore: .Sp .Vb 1 \& bookstore//book/excerpt//emph .Ve .Sp Find all titles, one or more levels deep in the current context. Note that this situation is essentially the only one where the period notation is required: .Sp .Vb 1 \& .//title .Ve .SH "Collecting element children \- '*'" .IX Header "Collecting element children - '*'" An element can be referenced without using its name by substituting the '*' collection. The '*' collection returns all elements that are children of the current context, regardless of their tag name. .IP "Examples:" 4 .IX Item "Examples:" Find all element children of author elements: .Sp .Vb 1 \& author/* .Ve .Sp Find all last-names that are grand-children of books: .Sp .Vb 1 \& book/*/last\-name .Ve .Sp Find the grandchildren elements of the current context: .Sp .Vb 1 \& */* .Ve .Sp Find all elements with specialty attributes. Note that this example uses subqueries, which are covered in Filters, and attributes, which are discussed in Finding an attribute: .Sp .Vb 1 \& *[@specialty] .Ve .SH "Finding an attribute \- '@'" .IX Header "Finding an attribute - '@'" Attribute names are preceded by the '@' symbol. \s-1XQL\s0 is designed to treat attributes and sub-elements impartially, and capabilities are equivalent between the two types wherever possible. .PP Note: attributes cannot contain subelements. Thus, attributes cannot have path operators applied to them in a query. 
Such expressions will result in a syntax error. The \s-1XQL\s0 spec states that attributes are inherently unordered and indices cannot be applied to them, but this implementation allows it. .IP "Examples:" 4 .IX Item "Examples:" Find the style attribute of the current element context: .Sp .Vb 1 \& @style .Ve .Sp Find the exchange attribute on price elements within the current context: .Sp .Vb 1 \& price/@exchange .Ve .Sp The following example is not valid: .Sp .Vb 1 \& price/@exchange/total .Ve .Sp Find all books with style attributes. Note that this example uses subqueries, which are covered in Filters: .Sp .Vb 1 \& book[@style] .Ve .Sp Find the style attribute for all book elements: .Sp .Vb 1 \& book/@style .Ve .SH "XQL Literals" .IX Header "XQL Literals" \&\s-1XQL\s0 query expressions may contain literal values (i.e. constants.) Numbers (integers and floats) are wrapped in XML::XQL::Number objects and strings in XML::XQL::Text objects. Booleans (as returned by \fBtrue()\fR and \fBfalse()\fR) are wrapped in XML::XQL::Boolean objects. .PP Strings must be enclosed in single or double quotes. Since \s-1XQL\s0 does not allow escaping of special characters, it's impossible to create a string with both a single and a double quote in it. To remedy this, \s-1XQL+\s0 has added the q// and qq// string delimiters which behave just like they do in Perl. .PP For Numbers, exponential notation is not allowed. Use the \s-1XQL+\s0 function \fBeval()\fR to circumvent this problem. See \s-1XML::XQL\s0 man page for details. .PP The empty list or undef is represented by [] (i.e. reference to empty array) in this implementation. 
.IP "Example" 4 .IX Item "Example" Integer Numbers: .Sp .Vb 2 \& 234 \& \-456 .Ve .Sp Floating point Numbers: .Sp .Vb 2 \& 1.23 \& \-0.99 .Ve .Sp Strings: .Sp .Vb 2 \& "some text with \*(Aqsingle\*(Aq quotes" \& \*(Aqtext with "double" quotes\*(Aq .Ve .Sp Not allowed: .Sp .Vb 1 \& 1.23E\-4 (use eval("1.23E\-4", "Number") in XQL+) \& \& "can\*(Aqt use \e"double \e"quotes" (use q/can\*(Aqt use "double" quotes/ in XQL+) .Ve .SH "Grouping \- '()'" .IX Header "Grouping - '()'" Parentheses can be used to group collection operators for clarity or where the normal precedence is inadequate to express an operation. .SH "Filters \- '[]'" .IX Header "Filters - '[]'" Constraints and branching can be applied to any collection by adding a filter clause '[ ]' to the collection. The filter is analogous to the \s-1SQL WHERE\s0 clause with \s-1ANY\s0 semantics. The filter contains a query within it, called the subquery. The subquery evaluates to a Boolean, and is tested for each element in the collection. Any elements in the collection failing the subquery test are omitted from the result collection. .PP For convenience, if a collection is placed within the filter, a Boolean \s-1TRUE\s0 is generated if the collection contains any members, and a \s-1FALSE\s0 is generated if the collection is empty. In essence, an expression such as author/degree implies a collection-to-Boolean conversion function like the following mythical 'there\-exists\-a' method. .PP .Vb 1 \& author[.there\-exists\-a(degree)] .Ve .PP Note that any number of filters can appear at a given level of an expression. Empty filters are not allowed. 
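The collection-to-Boolean rule described above can be sketched in Python. This is only an illustration of the semantics of author[degree], not how XML::XQL evaluates filters. The sample XML and the helper name xql_filter are made up for this example and do not come from the spec or the module.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, not the sample file referenced by this man page.
DOC = ET.fromstring(
    "<bookstore>"
    "<author><name>A</name><degree>PhD</degree></author>"
    "<author><name>B</name></author>"
    "<author><name>C</name><degree>MSc</degree><degree>BSc</degree></author>"
    "</bookstore>"
)


def xql_filter(collection, subquery):
    """Keep the elements for which the subquery is true.

    The subquery returns a collection. A non-empty collection counts as
    TRUE and an empty one as FALSE, mirroring author[degree].
    """
    return [elem for elem in collection if len(subquery(elem)) > 0]


authors = DOC.findall("author")
# Rough equivalent of the XQL query author[degree]
with_degree = xql_filter(authors, lambda a: a.findall("degree"))
names = [a.findtext("name") for a in with_degree]
print(names)
```

Author B has no degree child, so its subquery yields an empty collection and it is dropped from the result.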
.IP "Examples:" 4 .IX Item "Examples:" Find all books that contain at least one excerpt element: .Sp .Vb 1 \& book[excerpt] .Ve .Sp Find all titles of books that contain at least one excerpt element: .Sp .Vb 1 \& book[excerpt]/title .Ve .Sp Find all authors of books where the book contains at least one excerpt, and the author has at least one degree: .Sp .Vb 1 \& book[excerpt]/author[degree] .Ve .Sp Find all books that have authors with at least one degree: .Sp .Vb 1 \& book[author/degree] .Ve .Sp Find all books that have an excerpt and a title: .Sp .Vb 1 \& book[excerpt][title] .Ve .SS "Any and all semantics \- '$any$' and '$all$'" .IX Subsection "Any and all semantics - '$any$' and '$all$'" Users can explicitly indicate whether to use any or all semantics through the \f(CW$any\fR$ and \f(CW$all\fR$ keywords. .PP \&\f(CW$any\fR$ flags that a condition will hold true if any item in a set meets that condition. \f(CW$all\fR$ means that all elements in a set must meet the condition for the condition to hold true. .PP \&\f(CW$any\fR$ and \f(CW$all\fR$ are keywords that appear before a subquery expression within a filter. .IP "Examples:" 4 .IX Item "Examples:" Find all author elements where one of the last names is Bob: .Sp .Vb 1 \& author[last\-name = \*(AqBob\*(Aq] \& \& author[$any$ last\-name = \*(AqBob\*(Aq] .Ve .Sp Find all author elements where none of the last-name elements are Bob: .Sp .Vb 1 \& author[$all$ last\-name != \*(AqBob\*(Aq] .Ve .Sp Find all author elements where the first last name is Bob: .Sp .Vb 1 \& author[last\-name[0] = \*(AqBob\*(Aq] .Ve .SH "Indexing into a collection \- '[]' and '$to$'" .IX Header "Indexing into a collection - '[]' and '$to$'" \&\s-1XQL\s0 makes it easy to find a specific node within a set of nodes. Simply enclose the index ordinal within square brackets. The ordinal is 0 based. .PP A range of elements can be returned. 
To do so, specify an expression rather than a single value inside of the subscript operator (square brackets). Such expressions can be a comma separated list of any of the following: .PP .Vb 4 \& n Returns the nth element \& \-n Returns the element that is n\-1 units from the last element. \& E.g., \-1 means the last element. \-2 is the next to last element. \& m $to$ n Returns elements m through n, inclusive .Ve .IP "Examples:" 4 .IX Item "Examples:" Find the first author element: .Sp .Vb 1 \& author[0] .Ve .Sp Find the third author element that has a first-name: .Sp .Vb 1 \& author[first\-name][2] .Ve .Sp Note that indices are relative to the parent. In other words, consider the following data: .Sp .Vb 8 \& <x> \& <y/> \& <y/> \& </x> \& <x> \& <y/> \& <y/> \& </x> .Ve .Sp The following expression will return the first y from each of the x's: .Sp .Vb 1 \& x/y[0] .Ve .Sp The following will return the first y from the entire set of y's within x's: .Sp .Vb 1 \& (x/y)[0] .Ve .Sp The following will return the first y from the first x: .Sp .Vb 1 \& x[0]/y[0] .Ve .Sp Find the first and fourth author elements: .Sp .Vb 1 \& author[0,3] .Ve .Sp Find the first through fourth author elements: .Sp .Vb 1 \& author[0 $to$ 3] .Ve .Sp Find the first, the third through fifth, and the last author elements: .Sp .Vb 1 \& author[0, 2 $to$ 4, \-1] .Ve .Sp Find the last author element: .Sp .Vb 1 \& author[\-1] .Ve .SH "Boolean Expressions" .IX Header "Boolean Expressions" Boolean expressions can be used within subqueries. For example, one could use Boolean expressions to find all nodes of a particular value, or all nodes with nodes in particular ranges. Boolean expressions are of the form ${op}$, where {op} may be any expression of the form {b|a} \- that is, the operator takes lvalue and rvalue arguments and returns a Boolean result. .PP Note that the \s-1XQL\s0 Extensions section defines additional Boolean operations. 
.SS "Boolean \s-1AND\s0 and \s-1OR\s0 \- '$and$' and '$or$'" .IX Subsection "Boolean AND and OR - '$and$' and '$or$'" \&\f(CW$and\fR$ and \f(CW$or\fR$ are used to perform Boolean ands and ors. .PP The Boolean operators, in conjunction with grouping parentheses, can be used to build very sophisticated logical expressions. .PP Note that spaces are not significant and can be omitted, or included for clarity as shown here. .IP "Examples:" 4 .IX Item "Examples:" Find all author elements that contain at least one degree and one award. .Sp .Vb 1 \& author[degree $and$ award] .Ve .Sp Find all author elements that contain at least one degree or award and at least one publication. .Sp .Vb 1 \& author[(degree $or$ award) $and$ publication] .Ve .SS "Boolean \s-1NOT\s0 \- '$not$'" .IX Subsection "Boolean NOT - '$not$'" \&\f(CW$not\fR$ is a Boolean operator that negates the value of an expression within a subquery. .IP "Examples:" 4 .IX Item "Examples:" Find all author elements that contain at least one degree element and that contain no publication elements. .Sp .Vb 1 \& author[degree $and$ $not$ publication] .Ve .Sp Find all author elements that contain publications elements but do not contain either degree elements or award elements. .Sp .Vb 1 \& author[$not$ (degree $or$ award) $and$ publication] .Ve .SH "Union and intersection \- '$union$', '|' and '$intersect$'" .IX Header "Union and intersection - '$union$', '|' and '$intersect$'" The \f(CW$union\fR$ operator (shortcut is '|') returns the combined set of values from the query on the left and the query on the right. Duplicates are filtered out. The resulting list is sorted in document order. .PP Note: because this is a union, the set returned may include 0 or more elements of each element type in the list. To restrict the returned set to nodes that contain at least one of each of the elements in the list, use a filter, as discussed in Filters. 
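The three properties of a union named above (combined set, duplicates removed, result in document order) can be sketched in Python. This is a model of the described behavior under simple assumptions, not the module's implementation. The sample data and the function name union are hypothetical, and document order is approximated by a depth-first walk of the tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data for illustration only.
DOC = ET.fromstring(
    "<author>"
    "<first-name>Joe</first-name>"
    "<last-name>Bob</last-name>"
    "<first-name>Mary</first-name>"
    "</author>"
)

# Document order is modeled as the position of each element in a
# depth-first walk of the tree.
ORDER = {id(elem): pos for pos, elem in enumerate(DOC.iter())}


def union(left, right):
    """Combine two node lists, drop repeats, and sort in document order."""
    seen = {}
    for elem in list(left) + list(right):
        seen[id(elem)] = elem  # node identity, not value, decides duplicates
    return sorted(seen.values(), key=lambda e: ORDER[id(e)])


firsts = DOC.findall("first-name")
lasts = DOC.findall("last-name")
# The right-hand side repeats the first-name nodes on purpose.
result = [e.text for e in union(firsts, lasts + firsts)]
print(result)
```

The last-name element ends up between the two first-name elements because that is where it sits in the document, regardless of which operand it came from.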
.PP The \f(CW$intersect\fR$ operator returns the set of elements in common between two sets. .IP "Examples:" 4 .IX Item "Examples:" Find all first-names and last-names: .Sp .Vb 1 \& first\-name $union$ last\-name .Ve .Sp Find all books and magazines from a bookstore: .Sp .Vb 1 \& bookstore/(book | magazine) .Ve .Sp Find all books and all authors: .Sp .Vb 1 \& book $union$ book/author .Ve .Sp Find the first-names, last-names, or degrees from authors within either books or magazines: .Sp .Vb 1 \& (book $union$ magazine)/author/(first\-name $union$ last\-name $union$ degree) .Ve .Sp Find all books with author/first\-name equal to 'Bob' and all magazines with price less than 10: .Sp .Vb 1 \& book[author/first\-name = \*(AqBob\*(Aq] $union$ magazine[price $lt$ 10] .Ve .SH "Equivalence \- '$eq$', '=', '$ne$' and '!='" .IX Header "Equivalence - '$eq$', '=', '$ne$' and '!='" The '=' sign is used for equality; '!=' for inequality. Alternatively, \f(CW$eq\fR$ and \f(CW$ne\fR$ can be used for equality and inequality. .PP Single or double quotes can be used for string delimiters in expressions. This makes it easier to construct and pass \s-1XQL\s0 from within scripting languages. .PP For comparing values of elements, the \fBvalue()\fR method is implied. That is, last-name < 'foo' really means last\-name!\fBvalue()\fR < 'foo'. .PP Note that filters are always with respect to a context. That is, the expression book[author] means for every book element that is found, see if it has an author subelement. Likewise, book[author = 'Bob'] means for every book element that is found, see if it has a subelement named author whose value is 'Bob'. One can examine the value of the context as well, by using the . (period). For example, book[. = 'Trenton'] means for every book that is found, see if its value is 'Trenton'. 
.IP "Examples:" 4 .IX Item "Examples:" Find all author elements whose last name is Bob: .Sp .Vb 1 \& author[last\-name = \*(AqBob\*(Aq] \& \& author[last\-name $eq$ \*(AqBob\*(Aq] .Ve .Sp Find all degrees where the from attribute is not equal to 'Harvard': .Sp .Vb 1 \& degree[@from != \*(AqHarvard\*(Aq] \& \& degree[@from $ne$ \*(AqHarvard\*(Aq] .Ve .Sp Find all authors where the last-name is the same as the /guest/last\-name element: .Sp .Vb 1 \& author[last\-name = /guest/last\-name] .Ve .Sp Find all authors whose text is 'Matthew Bob': .Sp .Vb 1 \& author[. = \*(AqMatthew Bob\*(Aq] \& \& author = \*(AqMatthew Bob\*(Aq .Ve .SS "Comparison \- '<', '<=', '>', '>=', '$lt$', '$ilt$' etc." .IX Subsection "Comparison - '<', '<=', '>', '>=', '$lt$', '$ilt$' etc." A set of binary comparison operators is available for comparing numbers and strings and returning Boolean results. \&\f(CW$lt\fR$, \f(CW$le\fR$, \f(CW$gt\fR$, \f(CW$ge\fR$ are used for less than, less than or equal, greater than, or greater than or equal. These operators, along with \f(CW$eq\fR$ and \f(CW$ne\fR$, are also available in a case-insensitive form: \f(CW$ieq\fR$, \f(CW$ine\fR$, \f(CW$ilt\fR$, \&\f(CW$ile\fR$, \f(CW$igt\fR$, \f(CW$ige\fR$. .PP <, <=, > and >= are allowed shortcuts for \f(CW$lt\fR$, \f(CW$le\fR$, \f(CW$gt\fR$ and \f(CW$ge\fR$. 
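The difference between the plain and the case-insensitive operators can be sketched in Python. The sketch assumes that the case-insensitive forms compare lower-cased copies of both strings. This tutorial does not say how XML::XQL implements them, so treat that as an assumption. The function name compare and the table OPS are hypothetical.

```python
import operator

# Plain string comparison operators, keyed by the name between the $ signs.
OPS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "le": operator.le,
    "gt": operator.gt,
    "ge": operator.ge,
}


def compare(op_name, lvalue, rvalue):
    """Evaluate $op$ or its case-insensitive $iop$ form on two strings.

    Assumption: the 'i' forms fold both operands to lower case first.
    """
    if op_name.startswith("i") and op_name[1:] in OPS:
        lvalue, rvalue = lvalue.lower(), rvalue.lower()
        op_name = op_name[1:]
    return OPS[op_name](lvalue, rvalue)


# With $ge$, every lower-case name sorts after 'M' in ASCII order.
print(compare("ge", "adams", "M"))
# With $ige$, 'adams' is compared against 'm' and no longer qualifies.
print(compare("ige", "adams", "M"))
print(compare("ige", "miller", "M"))
```

This shows why a query such as author[last-name $ge$ 'M'] can return more than expected when the data mixes upper and lower case.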
.IP "Examples:" 4 .IX Item "Examples:" Find all author elements whose last name is Bob and whose price is > 50: .Sp .Vb 1 \& author[last\-name = \*(AqBob\*(Aq $and$ price $gt$ 50] .Ve .Sp Find all degrees where the from attribute is not equal to 'Harvard': .Sp .Vb 1 \& degree[@from != \*(AqHarvard\*(Aq] .Ve .Sp Find all authors whose last name begins with 'M' or greater: .Sp .Vb 1 \& author[last\-name $ge$ \*(AqM\*(Aq] .Ve .Sp Find all authors whose last name begins with 'M', 'm' or greater: .Sp .Vb 1 \& author[last\-name $ige$ \*(AqM\*(Aq] .Ve .Sp Find the first three books: .Sp .Vb 1 \& book[index() $le$ 2] .Ve .Sp Find all authors who have more than 10 publications: .Sp .Vb 1 \& author[publications!count() $gt$ 10] .Ve .SS "\s-1XQL+\s0 Match operators \- '$match$', '$no_match$', '=~' and '!~'" .IX Subsection "XQL+ Match operators - '$match$', '$no_match$', '=~' and '!~'" \&\s-1XQL+\s0 defines additional operators for pattern matching. The \f(CW$match\fR$ operator (shortcut is '=~') returns \s-1TRUE\s0 if the lvalue matches the pattern described by the rvalue. The \f(CW$no_match\fR$ operator (shortcut is '!~') returns \s-1FALSE\s0 if they match. Both lvalue and rvalue are first cast to strings. .PP The rvalue string should have the syntax of a Perl rvalue, that is the delimiters should be included and modifiers are allowed. When using delimiters other than slashes '/', the 'm' should be included. The rvalue should be a string, so don't forget the quotes! (Or use the q// or qq// delimiters in \s-1XQL+,\s0 see \s-1XML::XQL\s0 man page.) .PP Note that you can't use the Perl substitution operator s/// here. Try using the \&\s-1XQL+\s0 \fBsubst()\fR function instead. 
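The shape of the rvalue string (delimiters included, optional modifiers, a leading 'm' for delimiters other than slashes) can be illustrated with a small Python sketch. It only models the two forms used in this tutorial, such as '/[Bb]ob/' and 'm!trenton!i'. It does not handle bracketed delimiters such as m{...}, and Perl and Python regular expressions are not fully compatible. The names parse_perl_match, match and no_match are hypothetical.

```python
import re

# Perl modifiers that have a direct Python counterpart (partial list).
FLAGS = {"i": re.IGNORECASE, "s": re.DOTALL, "m": re.MULTILINE, "x": re.VERBOSE}


def parse_perl_match(rvalue):
    """Split '/pat/mods' or 'm!pat!mods' into a compiled Python regex."""
    body = rvalue[1:] if rvalue.startswith("m") else rvalue
    delim = body[0]
    end = body.rindex(delim)
    if end == 0:
        raise ValueError("missing closing delimiter in " + repr(rvalue))
    pattern, mods = body[1:end], body[end + 1:]
    flags = 0
    for mod in mods:
        flags |= FLAGS[mod]
    return re.compile(pattern, flags)


def match(lvalue, rvalue):
    """Model of $match$: both sides are treated as strings."""
    return parse_perl_match(str(rvalue)).search(str(lvalue)) is not None


def no_match(lvalue, rvalue):
    """Model of $no_match$: FALSE when the pattern matches."""
    return not match(lvalue, rvalue)


print(match("Joe Bob", "/[Bb]ob/"))
print(no_match("Trenton Times", "m!trenton!i"))
print(no_match("History of Art", "m!trenton!i"))
```

Forgetting the delimiters, for example passing 'bob' instead of '/bob/', is the kind of mistake the paragraph above warns about. In this sketch it raises an error because no closing delimiter is found.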
.IP "Examples:" 4 .IX Item "Examples:" Find all authors whose first name contains bob or Bob: .Sp .Vb 1 \& author[first\-name =~ \*(Aq/[Bb]ob/\*(Aq] .Ve .Sp Find all books whose title doesn't contain 'Trenton' (case-insensitive): .Sp .Vb 1 \& book[title !~ \*(Aqm!trenton!i\*(Aq] .Ve .SS "Other \s-1XQL+\s0 comparison operators \- '$isa$', '$can$'" .IX Subsection "Other XQL+ comparison operators - '$isa$', '$can$'" See the \s-1XML::XQL\s0 man page for other operators available in \s-1XQL+.\s0 .SS "Comparisons and vectors" .IX Subsection "Comparisons and vectors" The lvalue of a comparison can be a vector or a scalar. The rvalue of a comparison must be a scalar or a value that can be cast at runtime to a scalar. .PP If the lvalue of a comparison is a set, then any (exists) semantics are used for the comparison operators. That is, the result of a comparison is true if any item in the set meets the condition. .SS "Comparisons and literals" .IX Subsection "Comparisons and literals" The spec states that the lvalue of an expression cannot be a literal. That is, \fI'1' = a\fR is not allowed. This implementation allows it, but it's not clear how useful that is. .SS "Casting of literals during comparison" .IX Subsection "Casting of literals during comparison" Elements, attributes and other \s-1XML\s0 node types are cast to strings (Text) by applying the \fBvalue()\fR method. The \fBvalue()\fR method calls the \fBtext()\fR method by default, but this behavior can be altered by the user, so the \fBvalue()\fR method may return other \s-1XQL\s0 data types. .PP When two values are compared, they are first cast to the same type. See the \s-1XML::XQL\s0 man page for details on casting. .PP Note that the \s-1XQL\s0 spec is not very clear on how values should be cast for comparison. Discussions with the authors of the \s-1XQL\s0 spec revealed that there was some disagreement and their implementations differed on this point. 
This implementation is closest to that of Joe Lapp from webMethods, Inc. .SH "Methods \- '\fBmethod()\fP' or 'query!\fBmethod()\fP'" .IX Header "Methods - 'method()' or 'query!method()'" \&\s-1XQL\s0 makes a distinction between functions and methods. See the \s-1XML::XQL\s0 man page for details. .PP \&\s-1XQL\s0 provides methods for advanced manipulation of collections. These methods provide specialized collections of nodes (see Collection methods), as well as information about sets and nodes. .PP Methods are of the form \fImethod(arglist)\fR .PP Consider the query book[author]. It will find all books that have authors. Formally, we call the book corresponding to a particular author the reference node for that author. That is, every author element that is examined is an author for one of the book elements. (See the Annotated \s-1XQL BNF\s0 Appendix for a much more thorough definition of reference node and other terms. See also the \&\s-1XML::XQL\s0 man page.) Methods always apply to the reference node. .PP For example, the \fBtext()\fR method returns the text contained within a node, minus any structure. (That is, it is the concatenation of all text nodes contained within an element and its descendants.) The following expression will return all authors named 'Bob': .PP .Vb 1 \& author[text() = \*(AqBob\*(Aq] .Ve .PP The following will return all authors containing a first-name child whose text is 'Bob': .PP .Vb 1 \& author[first\-name!text() = \*(AqBob\*(Aq] .Ve .PP The following will return all authors containing a child named Bob: .PP .Vb 1 \& author[*!text() = \*(AqBob\*(Aq] .Ve .PP Method names are case sensitive. See the \s-1XML::XQL\s0 man page on how to define your own methods and functions. .SS "Information methods" .IX Subsection "Information methods" The following methods provide information about nodes in a collection. These methods return strings or numbers, and may be used in conjunction with comparison operators within subqueries. 
.IP "Method: \fBtext()\fR" 4 .IX Item "Method: text()" The \fBtext()\fR method concatenates text of the descendants of a node, normalizing white space along the way. White space will be preserved for a node if the node has the xml:space attribute set to 'preserve', or if the nearest ancestor with the xml:space attribute has the attribute set to \&'preserve'. When white space is normalized, it is normalized across the entire string. Spaces are used to separate the text between nodes. When entity references are used in a document, spacing is not inserted around the entity refs when they are expanded. .Sp In this implementation, the method may receive an optional parameter to indicate whether the \fBtext()\fR of Element nodes should include the \fBtext()\fR of its Element descendants. See \s-1XML::XQL\s0 man page for details. .Sp Examples: .Sp Find the authors whose last name is 'Bob': .Sp .Vb 1 \& author[last\-name!text() = \*(AqBob\*(Aq] .Ve .Sp Note this is equivalent to: .Sp .Vb 1 \& author[last\-name = \*(AqBob\*(Aq] .Ve .Sp Find the authors with value 'Matthew Bob': .Sp .Vb 1 \& author[text() = \*(AqMatthew Bob\*(Aq] \& \& author[. = \*(AqMatthew Bob\*(Aq] \& \& author = \*(AqMatthew Bob\*(Aq .Ve .IP "Method: \fBrawText()\fR" 4 .IX Item "Method: rawText()" The \fBrawText()\fR method is similar to the \fBtext()\fR method, but it does not normalize whitespace. .Sp In this implementation, the method may receive an optional parameter to indicate whether the \fBrawText()\fR of Element nodes should include the \&\fBrawText()\fR of its Element descendants. See \s-1XML::XQL\s0 man page for details. .IP "Method: \fBvalue()\fR" 4 .IX Item "Method: value()" Returns a type cast version of the value of a node. If no data type is provided, returns the same as \fBtext()\fR. .RS 4 .IP "Shortcuts" 4 .IX Item "Shortcuts" For the purposes of comparison, value() is implied if omitted. In other words, when two items are compared, the comparison is between the value of the two items. 
Remember that in the absence of type information, \&\fBvalue()\fR returns \fBtext()\fR. .Sp The following examples are equivalent: .Sp .Vb 1 \& author[last\-name!value() = \*(AqBob\*(Aq $and$ first\-name!value() = \*(AqJoe\*(Aq] \& \& author[last\-name = \*(AqBob\*(Aq $and$ first\-name = \*(AqJoe\*(Aq] \& \& price[@intl!value() = \*(Aqcanada\*(Aq] \& \& price[@intl = \*(Aqcanada\*(Aq] .Ve .RE .RS 4 .RE .IP "Method: \fBnodeType()\fR" 4 .IX Item "Method: nodeType()" Returns a number to indicate the type of the node. The values were based on the node type values in the \s-1DOM:\s0 .Sp .Vb 9 \& element 1 \& attribute 2 \& text 3 \& entity 6 (not in XQL spec) \& PI 7 \& comment 8 \& document 9 \& doc. fragment 10 (not in XQL spec) \& notation 11 (not in XQL spec) .Ve .Sp Note that in \s-1XQL,\s0 CDATASection nodes and EntityReference nodes also return 3, whereas in the \s-1DOM\s0 CDATASection returns 4 and EntityReference returns 5. Use the \s-1XQL+\s0 method \fBDOM_nodeType()\fR to get \s-1DOM\s0 node type values. See the \s-1XML::DOM\s0 man page for node type values of nodes not mentioned here. .IP "Method: nodeTypeString" 4 .IX Item "Method: nodeTypeString" Returns the name of the node type in lowercase or an empty string. The following node types are currently supported: 1 (element), 2 (attribute), 3 (text), 7 (processing_instruction), 8 (comment), 9 (document). .IP "Method: \fBnodeName()\fR" 4 .IX Item "Method: nodeName()" Returns the tag name for Element nodes and the attribute name of attributes. .SS "Collection index methods" .IX Subsection "Collection index methods" .IP "Method: \fBindex()\fR" 4 .IX Item "Method: index()" Returns the index of the value within the search context (i.e. within the input list of the subquery.) This is not necessarily the same as the index of a node within its parent node. Note that the \s-1XQL\s0 spec doesn't explain it well. 
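The distinction between the index within the search context and the index within the parent node can be shown with a short Python sketch. It models a query such as degree[index() $lt$ 3] evaluated against a single author element. The sample data is hypothetical, and the code illustrates the described semantics rather than the module's own logic.

```python
import xml.etree.ElementTree as ET

# Hypothetical author element with other nodes between the degrees.
AUTHOR = ET.fromstring(
    "<author>"
    "<name>Joe</name>"
    "<degree>BSc</degree>"
    "<award>Prize</award>"
    "<degree>MSc</degree>"
    "</author>"
)

degrees = AUTHOR.findall("degree")  # search context for degree[...]
children = list(AUTHOR)             # every element child of the parent

pairs = []
for index, degree in enumerate(degrees):
    # index is what index() would report; the second number is the
    # position of the same node among all children of its parent.
    position_in_parent = children.index(degree)
    pairs.append((degree.text, index, position_in_parent))

print(pairs)
```

The second degree has index 1 in the search context even though it is the fourth child of its parent, because the name and award elements are skipped.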
.RS 4 .IP "Examples:" 4 .IX Item "Examples:" Find the first 3 degrees: .Sp .Vb 1 \& degree[index() $lt$ 3] .Ve .Sp Note that it skips over other nodes that may exist between the degree elements. .Sp Consider the following data: .Sp .Vb 8 \& <x> \& <y/> \& <y/> \& </x> \& <x> \& <y/> \& <y/> \& </x> .Ve .Sp The following expression will return the first y from each x: .Sp .Vb 1 \& x/y[index() = 0] .Ve .Sp This could also be accomplished by (see Indexing into a Collection): .Sp .Vb 1 \& x/y[0] .Ve .RE .RS 4 .RE .IP "Method: \fBend()\fR" 4 .IX Item "Method: end()" The \fBend()\fR method returns true for the last element in the search context. Again, the \s-1XQL\s0 spec does not explain it well. .RS 4 .IP "Examples:" 4 .IX Item "Examples:" Find the last book: .Sp .Vb 1 \& book[end()] .Ve .Sp Find the last author for each book: .Sp .Vb 1 \& book/author[end()] .Ve .Sp Find the last author from the entire set of authors of books: .Sp .Vb 1 \& (book/author)[end()] .Ve .RE .RS 4 .RE .SS "Aggregate methods" .IX Subsection "Aggregate methods" .IP "Method: count( [\s-1QUERY\s0] )" 4 .IX Item "Method: count( [QUERY] )" Returns the number of values inside the search context. In \s-1XQL+,\s0 when the optional \s-1QUERY\s0 parameter is supplied, it returns the number of values returned by the \s-1QUERY.\s0 .SS "Namespace methods" .IX Subsection "Namespace methods" The following methods can be applied to a node to return namespace information. .IP "Method: \fBbaseName()\fR" 4 .IX Item "Method: baseName()" Returns the local name portion of the node, excluding the prefix. Local names are defined only for element nodes and attribute nodes. The local name of an element node is the local portion of the node's element type name. The local name of an attribute node is the local portion of the node's attribute name. If a local name is not defined for the reference node, the method evaluates to the empty set. .IP "Method: \fBnamespace()\fR" 4 .IX Item "Method: namespace()" Returns the \s-1URI\s0 for the namespace of the node.
Namespace URIs are defined only for element nodes and attribute nodes. The namespace \s-1URI\s0 of an element node is the namespace \s-1URI\s0 associated with the node's element type name. The namespace \s-1URI\s0 of an attribute node is the namespace \s-1URI\s0 associated with the node's attribute name. If a namespace \&\s-1URI\s0 is not defined for the reference node, the method evaluates to the empty set. .IP "Method: \fBprefix()\fR" 4 .IX Item "Method: prefix()" Returns the prefix for the node. Namespace prefixes are defined only for element nodes and attribute nodes. The namespace prefix of an element node is the shortname for the namespace of the node's element type name. The namespace prefix of an attribute node is the shortname for the namespace of the node's attribute name. If a namespace prefix is not defined for the reference node, the method evaluates to the empty set. .Sp The spec states: A node's namespace prefix may be defined within the query expression, within the document under query, or within both the query expression and the document under query. If it is defined in both places the prefixes may not agree. In this case, the prefix assigned by the query expression takes precedence. In this implementation you cannot define the namespace for a query, so this can never happen. .RS 4 .IP "Examples:" 4 .IX Item "Examples:" Find all unqualified book elements. Note that this does not return my:book elements: .Sp .Vb 1 \& book .Ve .Sp Find all book elements with the prefix 'my'. 
Note that this query does not return unqualified book elements: .Sp .Vb 1 \& my:book .Ve .Sp Find all book elements with a 'my' prefix that have an author subelement: .Sp .Vb 1 \& my:book[author] .Ve .Sp Find all book elements with a 'my' prefix that have an author subelement with a 'my' prefix: .Sp .Vb 1 \& my:book[my:author] .Ve .Sp Find all elements with a prefix of 'my': .Sp .Vb 1 \& my:* .Ve .Sp Find all book elements from any namespace: .Sp .Vb 1 \& *:book .Ve .Sp Find any element from any namespace: .Sp .Vb 1 \& * .Ve .Sp Find the style attribute with a 'my' prefix within a book element: .Sp .Vb 1 \& book/@my:style .Ve .RE .RS 4 .Sp All attributes of an element can be returned using @*. This is potentially useful for applications that treat attributes as fields in a record. .IP "Examples:" 4 .IX Item "Examples:" Find all attributes of the current element context: .Sp .Vb 1 \& @* .Ve .Sp Find style attributes from any namespace: .Sp .Vb 1 \& @*:style .Ve .Sp Find all attributes from the 'my' namespace, including unqualified attributes on elements from the 'my' namespace: .Sp .Vb 1 \& @my:* .Ve .RE .RS 4 .RE .SH "Functions" .IX Header "Functions" This section defines the functions of \s-1XQL.\s0 The spec states that: \&\s-1XQL\s0 defines two kinds of functions: collection functions and pure functions. Collection functions use the search context of the Invocation instance, while pure functions ignore the search context, except to evaluate the function's parameters. A collection function evaluates to a subset of the search context, and a pure function evaluates to either a constant value or to a value that depends only on the function's parameters. .PP Don't worry if you don't get it. Just use them! .SS "Collection functions" .IX Subsection "Collection functions" The collection functions provide access to the various types of nodes in a document. Any of these collections can be constrained and indexed.
The collections return the set of children of the reference node meeting the particular restriction. .IP "Function: \fBtextNode()\fR" 4 .IX Item "Function: textNode()" The collection of text nodes. .IP "Function: \fBcomment()\fR" 4 .IX Item "Function: comment()" The collection of comment nodes. .IP "Function: \fBpi()\fR" 4 .IX Item "Function: pi()" The collection of processing instruction nodes. .IP "Function: element( [\s-1NAME\s0] )" 4 .IX Item "Function: element( [NAME] )" The collection of all element nodes. If the optional text parameter is provided, it only returns element children matching that particular name. .IP "Function: attribute( [\s-1NAME\s0] )" 4 .IX Item "Function: attribute( [NAME] )" The collection of all attribute nodes. If the optional text parameter is provided, it only returns attributes matching that particular name. .IP "Function: \fBnode()\fR" 4 .IX Item "Function: node()" The collection of all non-attribute nodes. .RS 4 .IP "Examples:" 4 .IX Item "Examples:" Find the second text node in each p element in the current context: .Sp .Vb 1 \& p/textNode()[1] .Ve .Sp Find the second comment anywhere in the document. See Context for details on setting the context to the document root: .Sp .Vb 1 \& //comment()[1] .Ve .RE .RS 4 .RE .SS "Other \s-1XQL\s0 Functions" .IX Subsection "Other XQL Functions" .IP "Function: ancestor(\s-1QUERY\s0)" 4 .IX Item "Function: ancestor(QUERY)" Finds the nearest ancestor matching the provided query. It returns either a single element result or an empty set []. Note that this node is never the reference node itself. .RS 4 .IP "Examples:" 4 .IX Item "Examples:" Find the nearest book ancestor of the current element: .Sp .Vb 1 \& ancestor(book) .Ve .Sp Find the nearest ancestor author element that is contained in a book element: .Sp .Vb 1 \& ancestor(book/author) .Ve .RE .RS 4 .RE .IP "Function: id(\s-1NAME\s0)" 4 .IX Item "Function: id(NAME)" Pure function that evaluates to a set. 
The set contains an element node that has an 'id' attribute whose value is identical to the string that the Text parameter quotes. The element node may appear anywhere within the document under query. If more than one element node meets these criteria, the function evaluates to a set that contains the first node appearing in a document ordering of the nodes. .IP "Function: \fBtrue()\fR and \fBfalse()\fR" 4 .IX Item "Function: true() and false()" Pure functions that each evaluate to a Boolean. \*(L"\fBtrue()\fR\*(R" evaluates to 'true', and \*(L"\fBfalse()\fR\*(R" evaluates to 'false'. These functions are useful in expressions that are constructed using entity references or variable substitution, since they may replace an expression found in an instance of Subquery without violating the syntax required by the instance of Subquery. They return an object of type XML::XQL::Boolean. .IP "Function: date(\s-1QUERY\s0)" 4 .IX Item "Function: date(QUERY)" \&\*(L"date\*(R" is a pure function that typecasts the value of its parameter to a set of dates. If the parameter matches a single string, the value of the function is a set containing a single date. If the parameter matches a \s-1QUERY,\s0 the value of the function is a set of dates, where the set contains one date for each member of the set to which the parameter evaluates. .Sp \&\s-1XQL\s0 does not define the representation of the date value, nor does it define how the function translates parameter values into dates. This implementation uses the Date::Manip module to parse dates, which accepts almost any imaginable format. See \s-1XML::XQL\s0 to plug in your own Date implementation. 
.Sp Include the XML::XQL::Date package to add the \s-1XQL\s0 date type and the \fBdate()\fR function, like this: .Sp .Vb 1 \& use XML::XQL::Date; .Ve .IP "Perl builtin functions and other \s-1XQL+\s0 functions" 4 .IX Item "Perl builtin functions and other XQL+ functions" \&\s-1XQL+\s0 provides \s-1XQL\s0 function wrappers for most Perl builtin functions. It also provides other cool functions like \fBsubst()\fR, \fBmap()\fR, and \fBeval()\fR that allow you to modify documents and embed Perl code. If this is still not enough, you can add your own functions and methods. See the \s-1XML::XQL\s0 man page for details. .SH "Sequence Operators \- ';' and ';;'" .IX Header "Sequence Operators - ';' and ';;'" The white paper 'The Design of \s-1XQL\s0' by Jonathan Robie describes the sequence operators ';;' (precedes) and ';' (immediately precedes). Although these operators are not included in the \s-1XQL\s0 spec, I thought I'd add them anyway. .SS "Immediately Precedes \- ';'" .IX Subsection "Immediately Precedes - ';'" .IP "Example:" 4 .IX Item "Example:" With the following input: .Sp .Vb 12 \& \& \& \& \& \& \& \& \& \& \& \&
\& <TR> \& <TD>Shady Grove</TD> \& <TD>Aeolian</TD> \& </TR>
\& <TR> \& <TD>Over the River, Charlie</TD> \& <TD>Dorian</TD> \& </TR>
.Ve .Sp Find the \s-1TD\s0 node that contains \*(L"Shady Grove\*(R" and the \s-1TD\s0 node that immediately follows it: .Sp .Vb 1 \& //(TD="Shady Grove" ; TD) .Ve .PP Note that in \s-1XML::DOM\s0 there is actually a text node with white space between the two \s-1TD\s0 nodes, but such white-space-only text nodes are ignored by this operator, unless the text node has 'xml:space' set to 'preserve'. See ??? for details. .SS "Precedes \- ';;'" .IX Subsection "Precedes - ';;'" .IP "Example:" 4 .IX Item "Example:" With the following input (from Hamlet): .Sp .Vb 9 \& <SPEECH> \& <SPEAKER>MARCELLUS</SPEAKER> \& <LINE>Tis gone!</LINE> \& <STAGEDIR>Exit Ghost</STAGEDIR> \& <LINE>We do it wrong, being so majestical,</LINE> \& <LINE>To offer it the show of violence;</LINE> \& <LINE>For it is, as the air, invulnerable,</LINE> \& <LINE>And our vain blows malicious mockery.</LINE> \& </SPEECH> .Ve .Sp Return the \s-1STAGEDIR\s0 and all the LINEs that follow it: .Sp .Vb 1 \& SPEECH//( STAGEDIR ;; LINE ) .Ve .Sp Suppose an actor playing the ghost wants to know when to exit; that is, he wants to know who says what line just before he is supposed to exit. The line immediately precedes the stagedir, but the speaker may occur at any time before the line. In this query, we will use the \*(L"precedes\*(R" operator (\*(L";;\*(R") to identify a speaker that precedes the line somewhere within a speech. Our ghost can find the required information with the following query, which selects the speaker, the line, and the stagedir: .Sp .Vb 1 \& SPEECH//( SPEAKER ;; LINE ; STAGEDIR="Exit Ghost") .Ve .SH "Operator Precedence" .IX Header "Operator Precedence" The following table lists operators in precedence order, highest precedence first, where operators of a given row have the same precedence. The table also lists the associated productions: .PP .Vb 10 \& Production Operator(s) \& \-\-\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-\-\-\- \& Grouping ( ) \& Filter [ ] \& Subscript [ ] \& Bang !
\& Path / // \& Match $match$ $no_match$ =~ !~ (XQL+ only) \& Comparison = != < <= > >= $eq$ $ne$ $lt$ $le$ $gt$ \& $ge$ $ieq$ $ine$ $ilt$ $ile$ $igt$ $ige$ \& Intersection $intersect$ \& Union $union$ | \& Negation $not$ \& Conjunction $and$ \& Disjunction $or$ \& Sequence ; ;; .Ve .SH "Sample XML Document \- bookstore.xml" .IX Header "Sample XML Document - bookstore.xml" This file is also stored in samples/bookstore.xml that comes with the \&\s-1XML::XQL\s0 distribution. .PP .Vb 10 \& \& \& \& \& Seven Years in Trenton \& \& Joe \& Bob \& Trenton Literary Review Honorable Mention \& \& 12 \& \& \& History of Trenton \& \& Mary \& Bob \& \& Selected Short Stories of \& Mary Bob \& \& \& 55 \& \& \& Tracking Trenton \& 2.50 \& \& \& \& Trenton Today, Trenton Tomorrow \& \& Toni \& Bob \& B.A. \& Ph.D. \& Pulizer \& Still in Trenton \& Trenton Forever \& \& 6.50 \& \&

It was a dark and stormy night.

\&

But then all nights in Trenton seem dark and \& stormy to someone who has gone through what \& I have.

\& \& Trenton \& misery \& \&
\&
\& \& Who\*(Aqs Who in Trenton \& Robert Bob \& \&
.Ve .SH "SEE ALSO" .IX Header "SEE ALSO" The Japanese version of this document can be found on-line at .PP \&\s-1XML::XQL\s0, XML::XQL::Date, XML::XQL::Query and \s-1XML::XQL::DOM\s0