NAME¶
treeql - Query tree objects
SYNOPSIS¶
package require
Tcl 8.2
package require
snit
package require
struct::list
package require
struct::set
package require
treeql ?1.3.1?
treeql objectname -tree tree ?
-query
query? ?
-nodes nodes? ?
args...?
qo query args...
qo result
qo discard
DESCRIPTION¶
This package provides objects which can be used to query and transform tree
objects following the API of tree objects created by the package
struct::tree.
The tree query and manipulation language used here, TreeQL, is inspired by Cost
(See section
References for more information).
treeql, the package, is a fairly thin query facility over tree-structured
data types. It implements an ordered set of nodes (really a list) which are
generated and filtered through the application of TreeQL operators to each
node in turn.
API¶
TREEQL CLASS API¶
The command
treeql is a
snit::type which implements the Treeql
Query Language. This means that it follows the API for class commands as
specified by the package
snit. Its general syntax is
- treeql objectname -tree tree
?-query query? ?-nodes nodes?
?args...?
- The command creates a new tree query object and returns the
fully qualified name of the object command as its result. The API the
returned command is following is described in the section TreeQL OBJECT
API
Each query object is associated with a single tree object. This is
the object all queries will be run against.
If the option -nodes was specified then its argument is treated as a
list of nodes. This list is used to initialize the node set. It defaults
to the empty list.
If the option -query was specified then its argument will be
interpreted as an object, the parent query of this query. It
defaults to the object itself. All queries will be interpreted in the
environment of this object.
Any arguments coming after the options are treated as a query and run
immediately, after the node set has been initialized. This uses the
same syntax for the query as the method query.
The operations of the TreeQL available for this are explained in the section
about The Tree Query Language. This section also explains the term
node set used above.
TREEQL OBJECT API¶
As
treeql has been implemented in
snit all the standard methods of
snit-based classes are available to the user and therefore not listed
here. Please read the documentation for
snit for what they are and what
functionality they provide
The methods provided by the package
treeql itself are listed and
explained below.
- qo query args...
- This method interprets its arguments as a series of TreeQL
operators and interpretes them from the left to right (i.e. first to
last). Note that the first operator uses the node set currently
known to the object to perform its actions. In other words, the node
set is not cleared, or modified in other ways, before the query
is run. This allows the user to run several queries one after the other
and have each use the results of the last. Any initialization has to be
done by any query itself, using TreeQL operators. The result of the method
is the node set after the last operator of the query has been
executed.
Note that uncaught errors will leave the node set of the
object in an intermediate state, per the TreeQL operators which were
executed successfully before the error occurred.
The above means in detail that:
- [1]
- The first argument is interpreted as the name of a query
operator, the number of arguments required by that operator is then
determined, and taken from the immediately following arguments.
Because of this operators cannot have optional arguments, all arguments have
to be present as defined. Failure to do this will, at least, confuse the
query interpreter, but more likely cause errors.
- [2]
- The operator is applied to the current node set, yielding a
new node set, and/or manipulating the tree object the query object is
connected to.
- [3]
- The arguments used (i.e. operator name and arguments) are
removed from the list of method arguments, and then the whole process is
repeated from step [1], until the list of arguments is empty or an error
occurred.
# q is the query object.
q query root children get data
# The above query
# - Resets the node set to the root node - root
# - Adds the children of root to the set - children
# - Replaces the node set with the - get data
# values for the attribute 'data',
# for all nodes in the set which
# have such an attribute.
# - And returns this information.
# Below we can see the same query, but rewritten
# to show the structure as it is seen by the query
# interpreter.
q query \\
root \\
children \\
get data
The operators of the TreeQL language available for this are explained in the
section about
The Tree Query Language. This section also explains the
term
node set used above.
- qo result
- This method returns a list containing the current node
set.
- qo discard
- This method returns the current node set (like method
result), but also destroys the query object ( qo). This is
useful when constructing and using sub-queries (%AUTO% objects immediately
destroyed after use).
THE TREE QUERY LANGUAGE¶
This and the following sections specify the Tree Query Language used by the
query objects of this package in detail.
First we explain the general concepts underneath the language which are required
to comprehend it. This is followed by the specifications for all the available
query operators. They fall into eight categories, and each category has its
own section.
- [1]
- TreeQL Concepts
- [2]
- Structural generators
- [3]
- Attribute Filters
- [4]
- Attribute Mutators
- [5]
- Attribute String Accessors
- [6]
- Sub-queries
- [7]
- Node Set Operators
- [8]
- Node Set Iterators
- [9]
- Typed node support
TREEQL CONCEPTS¶
The main concept which has to be understood is that of the
node set. Each
query object maintains exactly one such
node set, and essentially all
operators use it and input argument and for their result. This structure
simply contains the handles of all nodes which are currently of interest to
the query object. To name it a
set is a bit of a misnomer, because
- [1]
- A node (handle) can occur in the structure more than once,
and
- [2]
- the order of nodes in the structure is important as well.
Whenever an operator processes all nodes in the node set it will do so in
the order they occur in the structure.
Regarding the possible multiple occurrence of a node, consider a node set
containing two nodes A and B, both having node P as their immediate parent.
Application of the TreeQL operator "parent" will then add P to the
new node set twice, once per node it was parent of. I.e. the new node set will
then be {P P}.
STRUCTURAL GENERATORS¶
All tree-structural operators locate nodes in the tree based on a structural
relation ship to the nodes currently in the set and then replace the current
node set with the set of nodes found Nodes which fulfill such a relationship
multiple times are added to the result as often as they fulfill the
relationship.
It is important to note that the found nodes are collected in a separate storage
area while processing the node set, and are added to (or replacing) the
current node set only after the current node set has been processed
completely. In other words, the new nodes are
not processed by the
operator as well and do not affect the iteration.
When describing an operator the variable
N will be used to refer to any
node in the node set.
- ancestors
- Replaces the current node set with the ancestors for all
nodes N in the node set, should N have a parent. In other
words, nodes without a parent do not contribute to the new node set. In
other words, uses all nodes on the path from node N to root, in
this order (root last), for all nodes N in the node set. This
includes the root, but not the node itself.
- rootpath
- Replaces the current node set with the ancestors for all
nodes N in the node set, should N have a parent. In other
words, nodes without a parent do not contribute to the new node set. In
contrast to the operator ancestors the nodes are added in reverse
order however, i.e. the root node first.
- parent
- Replaces the current node set with the parent of node
N, for all nodes N in the node set, should N have a
parent. In other words, nodes without a parent do not contribute to the
new node set.
- children
- Replaces the current node set with the immediate children
of node N, for all nodes N in the node set, should N
have children. In other words, nodes without children do not contribute to
the new node set.
- left
- Replaces the current node set with the previous/left
sibling for all nodes N in the node set, should N have
siblings to the left. In other words, nodes without left siblings do not
contribute to the new node set.
- right
- Replaces the current node set with the next/right sibling
for all nodes N in the node set, should N have siblings to
the right. In other words, nodes without right siblings do not contribute
to the new node set.
- prev
- Replaces the current node set with all previous/left
siblings of node N, for all nodes N in the node set, should
N have siblings to the left. In other words, nodes without left
siblings are ignored. The left sibling adjacent to the node is added
first, and the leftmost sibling last (reverse tree order).
- esib
- Replaces the current node set with all previous/left
siblings of node N, for all nodes N in the node set, should
N have siblings to the left. In other words, nodes without left
siblings are ignored. The leftmost sibling is added first, and the left
sibling adjacent to the node last (tree order).
The method name is a shorthand for Earlier SIBling.
- next
- Replaces the current node set with all next/right siblings
of node N, for all nodes N in the node set, should N
have siblings to the right. In other words, nodes without right siblings
do not contribute to the new node set. The right sibling adjacent to the
node is added first, and the rightmost sibling last (tree order).
- root
- Replaces the current node set with a node set containing a
single node, the root of the tree.
- tree
- Replaces the current node set with a node set containing
all nodes found in the tree. The nodes are added in pre-order (parent
first, then children, the latter from left to right, first to last).
- descendants
- Replaces the current node set with the nodes in all
subtrees rooted at node N, for all nodes N in the node set,
should N have children. In other words, nodes without children do
not contribute to the new node set.
This is like the operator children, but covers the children of
children as well, i.e. all the proper descendants. "Rooted at
N" means that N itself is not added to the new set,
which is also implied by proper descendants.
- subtree
- Like operator descendants, but includes the node
N. In other words:
Replaces the current node set with the nodes of the subtree of node
N, for all nodes N in the node set, should N have
children. In other words, nodes without children do not contribute to the
new node set. I.e this is like the operator children, but covers
the children of children, etc. as well. "Of N" means that
N itself is added to the new set.
- forward
- Replaces the current node set with the nodes in the
subtrees rooted at the right siblings of node N, for all nodes
N in the node set, should N have right siblings, and they
children. In other words, nodes without right siblings, and them without
children are ignored.
This is equivalent to the operator sequence
- later
- This is an alias for the operator forward.
- backward
- Replaces the current node set with the nodes in the
flattened previous subtrees, in reverse tree order.
This is nearly equivalent to the operator sequence
- The only difference is that this uses the nodes in reverse
order.
- earlier
- Replaces the current node set with the nodes in the
flattened previous subtrees, in tree order.
This is equivalent to the operator sequence
ATTRIBUTE FILTERS¶
These operators filter the node set by reference to attributes of nodes and
their properties. Filter means that all nodes not fulfilling the criteria are
removed from the node set. In other words, the node set is replaced by the set
of nodes fulfilling the filter criteria.
- hasatt attr
- Reduces the node set to nodes which have an attribute named
attr.
- withatt attr value
- Reduces the node set to nodes which have an attribute named
attr, and where the value of that attribute is equal to
value (The "==" operator is string equal
-nocase).
- withatt! attr val
- This is the same as withatt, but all nodes in the
node set have to have the attribute, and the "==" operator is
string equal, i.e. no -nocase. The operator will fail with
an error if they don't have the attribute.
- attof attr vals
- Reduces the node set to nodes which which have an attribute
named attr and where the value of that attribute is contained in
the list vals of legal values. The contained-in operator used here
does glob matching (using the attribute value as pattern) and ignores the
case of the attribute value, but not for the elements of
vals.
- attmatch attr match
- Same as withatt, but string match is used as
the "==" operator, and match is the pattern checked for.
Note that match is a interpreted as a partial argument
list for string match. This means that it is interpreted as
a list containing the pattern, and the pattern element can be preceded by
options understand by string match, like -nocase. This is
especially important should the pattern contain spaces. It has to be
wrapped into a list for correct interpretation by this operator
ATTRIBUTE MUTATORS¶
These operators change node attributes within the underlying tree. In other
words, all these operators have
side effects.
- set attr val
- Sets the attribute attr to the value val, for
all nodes N in the node set. The operator will fail if a node does
not have an attribute named attr. The tree will be left in a
partially modified state.
- unset attr
- Unsets the attribute attr, for all nodes N in
the node set. The operator will fail if a node does not have an attribute
named attr. The tree will be left in a partially modified
state.
ATTRIBUTE STRING ACCESSORS¶
These operators retrieve the values of node attributes from the underlying tree.
The collected results are stored in the node set, but are not actually nodes.
In other words, they redefine the semantics of the node set stored by the query
object to contain non-node data after their completion.
The query interpreter will terminate after it has finished processing one of
these operators, silently discarding any later query elements. It also means
that our talk about maintenance of a node set is not quite true. It is a node
set while the interpreter is processing commands, but can be left as an
attribute value set at the end of query processing.
- string op attr
- Applies the string operator op to the attribute
named attr, for all nodes N in the node set, collects the
results of that application and places them into the node set.
The operator will fail if a node does not have an attribute named
attr.
The argument op is interpreted as partial argument list for the
builtin command string. Its first word has to be any of the
sub-commands understood by string. This has to be followed by all
arguments required for the subcommand, except the last. that last argument
is supplied by the attribute value.
- get pattern
- For all nodes N in the node set it determines all
their attributes with names matching the glob pattern, then the
values of these attributes, at last it replaces the node set with the list
of these attribute values.
- attlist
- This is a convenience definition for the operator
getvals *. In other words, it replaces the node set with a list of
the attribute values for all attributes for all nodes N in the node
set.
- attrs glob
- Replaces the current node set with a list of attribute
lists, one attribute list per for all nodes N in the node set.
- attval attname
- Reduces the current node set with the operator
hasatt, and then replaces it with a list containing the values of
the attribute named attname for all nodes N in the node
set.
SUB-QUERIES¶
Sub-queries yield node sets which are then used to augment, reduce or replace
the current node set.
- andq query
- Replaces the node set with the set-intersection of the node
set generated by the sub-query query and itself.
The execution of the sub-query uses the current node set as its own initial
node set.
- orq query
- Replaces the node set with the set-union of the node set
generated by the sub-query query and itself. Duplicate nodes are
removed.
The execution of the sub-query uses the current node set as its own initial
node set.
- notq query
- Replaces the node set with the set of nodes generated by
the sub-query query which are also not in the current node set. In
other word the set difference of itself and the node set generated by the
sub-query.
The execution of the sub-query uses the current node set as its own initial
node set.
NODE SET OPERATORS¶
These operators change the node set directly, without referring to the tree.
- unique
- Removes duplicate nodes from the node set, preserving
order. In other words, the earliest occurrence of a node handle is
preserved, every other occurrence is removed.
- select
- Replaces the current node set with a node set containing
only the first node from the current node set
- transform query var body
- First it interprets the sub-query query, using the
current node set as its initial node set. Then it iterates over the result
of that query, binding the handle of each node to the variable named in
var, and executing the script body. The collected results of
these executions is made the new node set, replacing the current one.
The script body is executed in the context of the caller.
- map var body
- Iterates over the current node set, binding the handle of
each node to the variable named in var, and executing the script
body. The collected results of these executions is made the new
node set, replacing the current one.
The script body is executed in the context of the caller.
- quote val
- Appends the literal value val to the current node
set.
- replace val
- Replaces the current node set with the literal list value
val.
NODE SET ITERATORS¶
- foreach query var body
- Interprets the sub-query query, then performs the
equivalent of operator over on the nodes in the node set created by
that query. The current node set is not changed, except through side
effects from the script body.
The script body is executed in the context of the caller.
- with query body
- Interprets the query, then runs the script
body on the node set generated by the query. At last it restores
the current node set as it was before the execution of the query.
The script body is executed in the context of the caller.
- over var body
- Executes the script body for each node in the node
set, with the variable named by var bound to the name of the
current node. The script body is executed in the context of the
caller.
This is like the builtin foreach, with the node set as the source of
the list to iterate over.
The results of executing the body are ignored.
- delete
- Deletes all the nodes contained in the current node set
from the tree.
TYPED NODE SUPPORT¶
These filters and accessors assume the existence of an attribute called
@type, and are short-hand forms useful for cost-like tree query, html
tree editing, and so on.
- nodetype
- Returns the node type of nodes. Attribute string accessor.
This is equivalent to
- oftype t
- Reduces the node set to nodes whose type is equal to
t, with letter case ignored.
- nottype t
- Reduces the node set to nodes whose type is not equal to
t, with letter case ignored.
- oftypes attrs
- Reduces set to nodes whose @type is an element in the list
attrs of types. The value of @type is used as a glob pattern, and
letter case is relevant.
EXAMPLES¶
... TODO ...
REFERENCES¶
- [1]
- COST [http://wiki.tcl.tk/COST] on the Tcler's
Wiki.
- [2]
- TreeQL [http://wiki.tcl.tk/treeql] on the Tcler's
Wiki. Discuss this package there.
BUGS, IDEAS, FEEDBACK¶
This document, and the package it describes, will undoubtedly contain bugs and
other problems. Please report such in the category
treeql of the
Tcllib SF Trackers [
http://sourceforge.net/tracker/?group_id=12883].
Please also report any ideas for enhancements you may have for either package
and/or documentation.
KEYWORDS¶
Cost, DOM, TreeQL, XPath, XSLT, structured queries, tree, tree query language
CATEGORY¶
Data structures
COPYRIGHT¶
Copyright (c) 2004 Colin McCormack <coldstore@users.sourceforge.net>
Copyright (c) 2004 Andreas Kupries <andreas_kupries@users.sourceforge.net>